Convert the production source files in src/ (69 .c/.cpp) and
test/jemalloc_test.h.in to list the headers they actually use, then
delete the umbrella. Three consolidated headers (peak_event.h,
prof_sys.h, sz.h) also gain explicit transitive includes.
Every translation unit now declares what it uses. A missing include
now fails at the failing file rather than silently working because
something upstream pulled in the world.
arena_types.h + arena_structs.h + arena_externs.h merged into arena.h,
keeping the three logical sections (TYPES / STRUCTS / EXTERNS) with
explicit dividers. arena_inlines_a.h and arena_inlines_b.h stay
separate; arena_inlines_b.h now carries a comment explaining why
merging the two would reintroduce a real #include cycle through
tcache_inlines.h -> arena_choose (the asymmetric cycle-breaker).
arena_decay_constants.h (new): minimal header for
ARENA_DECAY_NTICKS_PER_UPDATE.
Both components had a four-way split (_types, _structs, _externs,
_inlines) that predates explicit per-file includes. With the edata
<-> prof_types coupling broken in the prior commit, merging _types +
_structs + _externs no longer risks an include cycle.
- prof.h replaces prof_types.h + prof_structs.h + prof_externs.h.
- tcache.h replaces tcache_types.h + tcache_structs.h + tcache_externs.h.
prof_inlines.h and tcache_inlines.h stay separate: prof_inlines.h
sits at the bottom of the dependency layering, and tcache_inlines.h's
include of arena_externs.h is the asymmetric cycle-breaker that keeps
the arena <-> tcache symbol cycle from becoming an include cycle.
Fold historical *_types/_structs/_externs/_inlines splits where the
layering is no longer load-bearing.
- large_externs.h -> large.h: rename; it was a single-purpose
function-prototype file.
- background_thread_structs.h + background_thread_externs.h ->
background_thread.h: merge. background_thread_inlines.h stays
separate because it depends on arena_inlines_a.h.
- bin_types.h -> bin.h: BIN_SHARDS_MAX / N_BIN_SHARDS_DEFAULT join the
bin_t struct and the bin_dalloc_locked_info_t type. bin_inlines.h
stays separate so TUs that only need the bin_t type don't pull in
the tcache-flush inline closure. The two bin_stats_* helpers move
from bin.h into bin_inlines.h so all bin inlines live together.
- tsd_binshards.h (new): houses tsd_binshards_t and its zero
initializer. Lets tsd_internals.h pull it in for X-macro expansion
without dragging mutex.h -- mutex.h depends on TSD machinery, which
would form a cycle through bin.h.
jemalloc_internal_includes.h drops the references to the deleted/
merged headers.
- test_hooks.h: drop the #include of jemalloc_preamble.h. preamble
pulls test_hooks.h, and test_hooks.h needs nothing from preamble.
- edata.h: drop the #include of prof_types.h in favor of forward
declarations of prof_tctx_t and prof_recent_t (used only as pointer
types). prof_structs.h can then drop its #include of edata.h,
severing the edata <-> prof_types coupling.
Pull the tcache-aware routing helpers out of arena into a layer that
sits directly below the public malloc interface:
arena_malloc -> malloc_dispatch_malloc
arena_palloc -> malloc_dispatch_palloc
arena_ralloc -> malloc_dispatch_ralloc
arena_dalloc* -> malloc_dispatch_dalloc*
arena_sdalloc* -> malloc_dispatch_sdalloc*
arena_dalloc_promoted -> malloc_dispatch_dalloc_promoted
The new module (malloc_dispatch.h, malloc_dispatch_inlines.h,
src/malloc_dispatch.c) owns the tcache-vs-fall-through decision; the
only consumer is jemalloc_internal_inlines_c.h. arena keeps a narrower
arena_prof_demote() for the sampled-allocation demotion path.
Fix FreeBSD postfork child handler never being called: FreeBSD's libthr
calls _malloc_postfork in both parent and child (see freebsd-src
lib/libthr/thread/thr_fork.c), but jemalloc mapped it to the parent
handler only. Detect the child via getpid() and route to
jemalloc_postfork_child, which resets nthreads and rebuilds the
descriptor queue.
Remove the child_survivor_bytes vs pre_survivor_bytes comparison: on
macOS where jemalloc registers as the default zone, internal allocations
during the postfork handler (pthread_mutex_init) can inflate the
surviving thread's tcache.
Add double-fork test to verify prefork pid is refreshed correctly when a
child process forks again.
Zero-sized arrays are not allowed by ISO C.
C99 introduced a way to express this.
Type-checking fails, because all_bins is asigned malloced
storage of length > 0.
Found by GCC and Clang (-Wpedantic).
This change includes the following improvements:
- Remove the hpa_sec_batch_fill_extra parameter.
- Refactor the hpa_alloc() code and helper functions to be able to
allocate more than one extent out of a single pageslab. This way
we can amortize the per-pageslab costs (active bitmap iteration,
pageslab metadata updates) across multiple extents.
- Decide on a min and max number of extents that will be allocated
in hpa_alloc(). The code will try to allocate at least the min
and allocate up to the max as long as we can allocate additional
ones from the pageslab we already have, as additional allocations
are relatively cheap.
- Add extent allocation distribution stats.
- Amend hpa_sec_integration.c unit test.
The orchestrator looks up the surviving descriptor via
tcache_postfork_arena_descriptor and threads it into
arena_postfork_child, eliminating arena's call into tcache. Also reset
cache_bin_array_descriptor_ql_mtx right before the queue rebuild it
protects.
tcache.c was reaching into arena->cache_bin_array_descriptor_ql{,_mtx}
directly to register / unregister / postfork-relink its descriptor.
That queue and mutex are owned by arena, so the locking and ql_*
operations belong in arena.c.
After tcache_init runs, tcache_slow->tcache == tcache always holds, so
the tcache_t parameter to the three association helpers is derivable
from tcache_slow.
Drop the duplicate arena->tcache_ql; stats merging walks the
cache_bin_array_descriptor_ql directly. Rename the protecting mutex
from tcache_ql_mtx to cache_bin_array_descriptor_ql_mtx to match. Add
an assertion in test_thread_migrate_arena that the dissociate-time
flush zeros cache_bin->tstats.nrequests.
bin_t is an arena implementation detail; tcache should not reach into
it. Extract the slab-address lookup into bin.c as bin_current_slab_addr,
and expose it to tcache only through arena_locality_hint(tsdn, arena,
szind), which composes bin_choose + bin_current_slab_addr.
After replacing PAI vtable dispatch with direct calls in the previous
commit, the embedded pai_t member in pac_t and hpa_shard_t is dead
weight, and pai.h has no remaining users. Remove them.
Changes:
- Drop pai_t pai member (and "must be first member" comment) from
pac_t and hpa_shard_t.
- Replace #include "jemalloc/internal/pai.h" with the actually-needed
edata.h / tsd_types.h in pac.h, hpa.h, sec.h, pa.h.
- Update extent_pai_t comment in edata.h to no longer reference pai.h.
- Update three remaining test files (hpa_thp_always,
hpa_vectorized_madvise, hpa_vectorized_madvise_large_batch) to call
hpa_*(tsdn, shard, ...) directly instead of pai_*(tsdn, &shard->pai,
...).
- Delete include/jemalloc/internal/pai.h.
No behavioral changes.
The pai_t interface implements C-style polymorphism via function pointers
to abstract over PAC and HPA. This abstraction provides no real benefit:
only two implementations exist, the dispatcher already knows which one to
use, and HPA stubs 2 of 5 operations. Remove the runtime dispatch in
favor of direct calls.
This commit:
- Promotes pac_alloc/expand/shrink/dalloc/time_until_deferred_work to
external linkage and replaces the pai_t *self parameter with pac_t *pac.
- Promotes hpa_alloc/expand/shrink/dalloc/time_until_deferred_work to
external linkage and replaces pai_t *self with hpa_shard_t *shard.
- Updates hpa_dalloc_batch's signature to take hpa_shard_t * directly
and removes the hpa_from_pai container-of helper. Updates internal
callers in hpa_alloc, hpa_dalloc, and hpa_sec_flush_impl.
- Drops the vtable assignments from pac_init() and hpa_shard_init().
- Replaces pai_alloc/dalloc/etc. dispatch in pa.c with direct calls.
HPA expand and shrink (which are unconditional failure stubs) are
skipped entirely for HPA-owned extents.
- Removes the pa_get_pai() helper.
- Updates tests in test/unit/hpa.c and test/unit/hpa_sec_integration.c
to call hpa_alloc/dalloc/etc. directly.
The pai_t struct field stays as dead weight in pac_t and hpa_shard_t;
it is removed in the next commit along with pai.h itself.
No behavioral changes.