Arena 0 has a dedicated initialization path, which differs from the
initialization path of other arenas. The main difference for the purpose
of this change is that we initialize arena 0 before we initialize
background threads. HPA shard options have a `deferral_allowed` flag, which
should be equal to the return value of `background_thread_enabled()`. That
wasn't the case before this change, because `background_thread_enabled()`
was initialized correctly only after the arena 0 initialization phase had
already ended.
Below is the initialization sequence for arena 0 after this commit, to
illustrate that everything should still be initialized correctly.
* `hpa_central_init` initializes HPA Central before we initialize any
HPA shard (including arena 0's).
* `background_thread_boot1` initializes `background_thread_enabled()`
return value.
* `pa_shard_enable_hpa` initializes arena 0 HPA shard.
```
                  malloc_init_hard -------------
                 /           /                  \
                /           /                    \
               /           /                      \
malloc_init_hard_a0_locked  background_thread_boot1  pa_shard_enable_hpa
       /                          /                        \
      /                          /                          \
     /                          /                            \
arena_boot     background_thread_enabled_seta          hpa_shard_init
     |
     |
pa_central_init
     |
     |
hpa_central_init
```
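The ordering problem can be illustrated with a small toy model. This is only a sketch: every `toy_` name is hypothetical and stands in for the real functions named above; it assumes nothing about their actual signatures. It shows that a flag snapshotted before it is settled ends up stale, while the same snapshot taken afterwards matches.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; these are not the allocator's real types or functions. */
typedef struct {
    bool deferral_allowed;
} toy_hpa_opts_t;

static bool toy_bg_thread_enabled = false;

/* Plays the role of background_thread_boot1: settles the flag. */
static void toy_background_thread_boot(bool enable) {
    toy_bg_thread_enabled = enable;
}

/* Plays the role of pa_shard_enable_hpa: snapshots the flag into the options. */
static void toy_enable_hpa(toy_hpa_opts_t *opts) {
    opts->deferral_allowed = toy_bg_thread_enabled;
}

/* Old order: the shard options are filled in before the flag is settled. */
static bool toy_init_shard_first(bool enable) {
    toy_hpa_opts_t opts;
    toy_bg_thread_enabled = false;
    toy_enable_hpa(&opts);
    toy_background_thread_boot(enable);
    return opts.deferral_allowed;
}

/* New order: the flag is settled first, so the snapshot matches it. */
static bool toy_init_flag_first(bool enable) {
    toy_hpa_opts_t opts;
    toy_bg_thread_enabled = false;
    toy_background_thread_boot(enable);
    toy_enable_hpa(&opts);
    return opts.deferral_allowed;
}
```

With background threads requested, `toy_init_shard_first(true)` reports `deferral_allowed` as false, whereas `toy_init_flag_first(true)` reports it as true.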
Currently, the hugepage-aware allocator (HPA) backend works together with the
classic one, which acts as a fallback for allocations that are not yet
supported. When background threads are enabled, the wake-up time for the
classic backend interferes with HPA, because the HPA purging logic had no
check that we are not purging too frequently. Without such a check, if a
background thread is running and `hpa_should_purge` returns true, we purge
even if the previous purge happened less than `hpa_min_purge_interval_ms` ago.
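A minimal sketch of a minimum-interval check of this kind follows. The struct, its fields, and `toy_try_purge` are hypothetical names chosen for illustration, and the sketch assumes a monotonic millisecond clock supplied by the caller; it is not the actual purging code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical purge bookkeeping; field names are illustrative only. */
typedef struct {
    uint64_t last_purge_ms;          /* time of the previous purge */
    uint64_t min_purge_interval_ms;  /* minimum gap between two purges */
    bool     purged_before;          /* false until the first purge happens */
} toy_purge_state_t;

/*
 * Returns true if a purge is performed at now_ms. A purge is skipped when
 * it is not wanted at all, or when the previous purge was too recent.
 * Assumes now_ms never goes backwards.
 */
static bool toy_try_purge(toy_purge_state_t *st, bool should_purge,
                          uint64_t now_ms) {
    if (!should_purge) {
        return false;
    }
    if (st->purged_before &&
        now_ms - st->last_purge_ms < st->min_purge_interval_ms) {
        return false; /* too soon since the last purge */
    }
    st->last_purge_ms = now_ms;
    st->purged_before = true;
    return true;
}
```

The point of the design is that an extra wake-up caused by another backend becomes harmless: the caller may ask as often as it likes, and the interval check decides whether any work is done.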
Under high concurrency / heavy test load (e.g. using run_tests.sh), the
background thread may not get scheduled for a longer period of time. Retry 100
times max before bailing out.
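A bounded retry loop of this shape might look like the sketch below. The names `toy_wait_for`, `toy_cond_fn`, and `toy_countdown` are hypothetical, and the sleep between attempts is left as a comment; only the limit of 100 comes from the text above.

```c
#include <stdbool.h>

#define TOY_MAX_RETRIES 100

/* Hypothetical predicate type: returns true once the awaited event happened. */
typedef bool (*toy_cond_fn)(void *ctx);

/*
 * Polls cond up to TOY_MAX_RETRIES times. Returns the number of the attempt
 * that succeeded (starting at 1), or -1 if every attempt failed.
 */
static int toy_wait_for(toy_cond_fn cond, void *ctx) {
    for (int attempt = 1; attempt <= TOY_MAX_RETRIES; attempt++) {
        if (cond(ctx)) {
            return attempt;
        }
        /* A real test would sleep briefly here before polling again. */
    }
    return -1; /* bail out instead of waiting forever */
}

/* Example predicate: becomes true once the counter has reached zero. */
static bool toy_countdown(void *ctx) {
    int *remaining = ctx;
    if (*remaining == 0) {
        return true;
    }
    (*remaining)--;
    return false;
}
```

Capping the attempts keeps a heavily loaded test run from hanging when the background thread is starved, at the cost of a possible spurious failure after the last retry.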
Many profiling-related tests make assumptions about the profiling settings,
e.g. that opt_prof is off by default, and that prof_active defaults to on when
opt_prof is on. However, the default settings can be changed via
--with-malloc-conf at build time. Fix the tests by adding the assumed settings
explicitly.
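One way a C program can pin such settings explicitly is jemalloc's `malloc_conf` global, sketched below. This is an illustration of the idea, not necessarily the mechanism the tests use. It assumes the binary is linked against an unprefixed jemalloc built with profiling support; the option string shown is only an example.

```c
/*
 * Assumption: linked against an unprefixed jemalloc built with profiling
 * support. Without jemalloc, this is just an unused string.
 *
 * Stating both options explicitly means the program no longer depends on
 * whatever defaults were baked in at build time.
 */
const char *malloc_conf = "prof:true,prof_active:true";
```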
Adding guarded extents, which are regular extents surrounded by guard pages
(mprotected). To reduce syscalls, small guarded extents are cached as a
separate eset in ecache, and decay through the dirty / muzzy / retained pipeline
as usual.
This change allows every allocator conforming to PAI to communicate that it
deferred some work for the future. Without it, if a background thread goes into
indefinite sleep, there is no way to notify it about upcoming deferred work.
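The idea can be sketched as an out-parameter that the allocator sets whenever it postpones work, which the caller then uses to wake a sleeping thread. Everything named `toy_` below is hypothetical, including the interface layout; the sketch models the wake-up with a counter instead of real thread synchronization.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page-allocator interface; not the actual PAI definition. */
typedef struct toy_pai_s toy_pai_t;
struct toy_pai_s {
    /* Frees npages; sets *deferred_work_generated if cleanup was postponed. */
    void (*dalloc)(toy_pai_t *self, size_t npages,
                   bool *deferred_work_generated);
    size_t dirty_pages; /* pages waiting for a later purge */
};

/* A lazy implementation: it only records the pages and defers the purge. */
static void toy_lazy_dalloc(toy_pai_t *self, size_t npages,
                            bool *deferred_work_generated) {
    self->dirty_pages += npages;
    *deferred_work_generated = (self->dirty_pages > 0);
}

/* Stand-in for the background thread's sleep state. */
typedef struct {
    bool     sleeping_indefinitely;
    unsigned wakeups;
} toy_bg_thread_t;

/* Caller side: wake the thread only if work was deferred and it is asleep. */
static void toy_free_pages(toy_pai_t *pai, toy_bg_thread_t *bg,
                           size_t npages) {
    bool deferred = false;
    pai->dalloc(pai, npages, &deferred);
    if (deferred && bg->sleeping_indefinitely) {
        bg->sleeping_indefinitely = false;
        bg->wakeups++;
    }
}
```

Because the flag is part of the interface rather than of one implementation, the caller does not need to know which allocator it is talking to in order to decide whether a wake-up is needed.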