After the per-TU explicit-include conversion in batches 1-7, no .c
file in src/ needs the umbrella anymore. test/jemalloc_test.h.in is
the last consumer; it's the template for the header used by unit
and stress tests that want all of jemalloc's internals visible, so
its #include of the umbrella is replaced with the same set of
explicit includes the umbrella used to expand to. No behavioral
change for tests.
With that, the umbrella is gone. Every translation unit now
declares the headers it actually uses, and the hidden transitive
includes that motivated this cleanup can no longer be introduced
silently -- a missing include now fails to compile in the file that
needs it rather than working by accident because something upstream
pulled in the world.
Step 6 (Option B) of the cyclical-dep cleanup, complete.
arena_types.h + arena_structs.h + arena_externs.h merged into arena.h,
keeping the three logical sections (TYPES / STRUCTS / EXTERNS) with
explicit dividers. arena_inlines_a.h and arena_inlines_b.h stay
separate; arena_inlines_b.h now carries a comment explaining why
merging the two would reintroduce a real #include cycle through
tcache_inlines.h -> arena_choose (the asymmetric cycle-breaker).
Two ordering gotchas this consolidation surfaced:
1. tsd_internals.h is included from tsd.h via tsd_generic.h, sometimes
long before arena.h is loaded (e.g. ckh.c includes ckh.h -> tsd.h
before jemalloc_internal_includes.h). TSD_INITIALIZER's expansion
in tsd_generic.h's function bodies references
ARENA_DECAY_NTICKS_PER_UPDATE, so it must already be defined.
Factor the constant into a new minimal header,
arena_decay_constants.h, that pulls nothing but jemalloc_preamble.h,
and include it from both arena.h and tsd_internals.h. arena_t is
still added as a forward decl in tsd_internals.h -- including
arena.h there would trigger arena_stats.h -> mutex.h -> tsd.h ->
re-entry into this very file.
2. extent_dss.h previously included arena_types.h for the arena_t
pointer type, but arena.h now includes extent_dss.h (it was a
STRUCTS-section dep). Forward-decl arena_t in extent_dss.h to
break that cycle.
Additional forward decls in tcache.h and large.h (arena_t *). These
were previously satisfied by the master include order loading
arena_types.h before everything else; with arena.h now in the EXTERNS
section, large.h and tcache.h are parsed earlier than arena.h, so
they need to declare arena_t themselves.
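The forward-declaration pattern used above can be sketched in a single
file. All demo_* names are hypothetical stand-ins, not jemalloc
identifiers; the point is only that a header which needs arena_t * can
declare the type itself instead of including the header that defines it.

```c
#include <assert.h>
#include <stddef.h>

/* "large.h"-style section: needs only the pointer type, so it
 * forward-declares the struct instead of including its header. */
typedef struct demo_arena_s demo_arena_t;

typedef struct {
	demo_arena_t *owner; /* pointer to an incomplete type is legal */
	size_t size;
} demo_large_t;

/* Prototypes may also mention the incomplete type. */
size_t demo_arena_id(const demo_arena_t *arena);

/* "arena.h"-style section: the full definition arrives later. */
struct demo_arena_s {
	size_t id;
};

size_t
demo_arena_id(const demo_arena_t *arena) {
	return arena->id;
}

size_t
demo_large_owner_id(const demo_large_t *large) {
	return demo_arena_id(large->owner);
}
```

Only code that dereferences the pointer needs the full definition, which
is why the include order between the two sections stops mattering.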
jemalloc_internal_externs.h's #include of arena_types.h was
vestigial -- the file uses no arena symbols. Dropped.
Folds several historical *_types/_structs/_externs/_inlines splits where
the layering is no longer load-bearing.
- large_externs.h -> large.h: renamed; it was a single-purpose
function-prototype file.
- background_thread_structs.h + background_thread_externs.h ->
background_thread.h: merged. background_thread_inlines.h is kept
separate because it depends on arena_inlines_a.h.
- bin_inlines.h folded into bin.h, along with BIN_SHARDS_MAX /
N_BIN_SHARDS_DEFAULT from bin_types.h. bin.h carries a forward decl
of arena_binind_div_info (declared in arena_externs.h) so it stays
hermetic without re-introducing the bin.h <-> arena_externs.h cycle.
- tsd_binshards.h (new): houses tsd_binshards_t and its zero
initializer. Keeping these out of bin.h lets tsd_internals.h pull in
just what it needs during X-macro expansion, avoiding bin.h's mutex.h
dependency (mutex.h itself depends on TSD machinery, so routing it
through tsd_internals.h forms a chicken-and-egg).
jemalloc_internal_includes.h: drops the now-redundant references to
the deleted/merged headers.
Fix FreeBSD postfork child handler never being called: FreeBSD's libthr
calls _malloc_postfork in both parent and child (see freebsd-src
lib/libthr/thread/thr_fork.c), but jemalloc mapped it to the parent
handler only. Detect the child via getpid() and route to
jemalloc_postfork_child, which resets nthreads and rebuilds the
descriptor queue.
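A minimal sketch of the pid-based routing described above, assuming the
prefork hook records the pid. The demo_* names are hypothetical and the
real handlers do far more than return a role.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

typedef enum {
	DEMO_POSTFORK_PARENT,
	DEMO_POSTFORK_CHILD
} demo_postfork_role_t;

/* Pid recorded by the prefork hook; refreshed on every fork. */
static pid_t demo_prefork_pid;

void
demo_prefork(void) {
	demo_prefork_pid = getpid();
}

/* Pure decision: a changed pid means we are running in the child. */
demo_postfork_role_t
demo_postfork_role(pid_t prefork_pid, pid_t current_pid) {
	return current_pid == prefork_pid ? DEMO_POSTFORK_PARENT
	    : DEMO_POSTFORK_CHILD;
}

/* Single entry point invoked in both processes after fork(). */
demo_postfork_role_t
demo_postfork(void) {
	/* Real code would call the parent or child handler here. */
	return demo_postfork_role(demo_prefork_pid, getpid());
}
```

Recording the pid in the prefork hook, rather than once at startup, is
what keeps the check correct when a child forks again.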
Remove the child_survivor_bytes vs pre_survivor_bytes comparison: on
macOS where jemalloc registers as the default zone, internal allocations
during the postfork handler (pthread_mutex_init) can inflate the
surviving thread's tcache, so the comparison is not reliable there.
Add double-fork test to verify prefork pid is refreshed correctly when a
child process forks again.
This change includes the following improvements:
- Remove the hpa_sec_batch_fill_extra parameter.
- Refactor the hpa_alloc() code and helper functions to be able to
allocate more than one extent out of a single pageslab. This way
we can amortize the per-pageslab costs (active bitmap iteration,
pageslab metadata updates) across multiple extents.
- Decide on a min and max number of extents that will be allocated
in hpa_alloc(). The code will try to allocate at least the min
and allocate up to the max as long as we can allocate additional
ones from the pageslab we already have, as additional allocations
are relatively cheap.
- Add extent allocation distribution stats.
- Amend hpa_sec_integration.c unit test.
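The min/max policy above can be illustrated with a toy model in which
each pageslab is reduced to a count of free extent slots. This is a
sketch under that assumption, with hypothetical demo_* names; it is not
the hpa_alloc() implementation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Take extents from successive slabs. A new slab is visited only while
 * fewer than min extents have been produced; once inside a slab, keep
 * taking from it up to max, since the per-slab cost is already paid.
 */
size_t
demo_batch_alloc(size_t *slab_free, size_t nslabs, size_t min,
    size_t max) {
	size_t nallocated = 0;
	for (size_t i = 0; i < nslabs && nallocated < min; i++) {
		while (slab_free[i] > 0 && nallocated < max) {
			slab_free[i]--;
			nallocated++;
		}
	}
	return nallocated;
}
```

With slabs holding {3, 5} free slots, min 2 and max 8, the loop drains
the first slab and stops at 3 extents without touching the second.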
Drop the duplicate arena->tcache_ql; stats merging walks the
cache_bin_array_descriptor_ql directly. Rename the protecting mutex
from tcache_ql_mtx to cache_bin_array_descriptor_ql_mtx to match. Add
an assertion in test_thread_migrate_arena that the dissociate-time
flush zeros cache_bin->tstats.nrequests.
bin_t is an arena implementation detail; tcache should not reach into
it. Extract the slab-address lookup into bin.c as bin_current_slab_addr,
and expose it to tcache only through arena_locality_hint(tsdn, arena,
szind), which composes bin_choose + bin_current_slab_addr.
After replacing PAI vtable dispatch with direct calls in the previous
commit, the embedded pai_t member in pac_t and hpa_shard_t is dead
weight, and pai.h has no remaining users. Remove them.
Changes:
- Drop pai_t pai member (and "must be first member" comment) from
pac_t and hpa_shard_t.
- Replace #include "jemalloc/internal/pai.h" with the actually-needed
edata.h / tsd_types.h in pac.h, hpa.h, sec.h, pa.h.
- Update extent_pai_t comment in edata.h to no longer reference pai.h.
- Update three remaining test files (hpa_thp_always,
hpa_vectorized_madvise, hpa_vectorized_madvise_large_batch) to call
hpa_*(tsdn, shard, ...) directly instead of pai_*(tsdn, &shard->pai,
...).
- Delete include/jemalloc/internal/pai.h.
No behavioral changes.
The pai_t interface implements C-style polymorphism via function pointers
to abstract over PAC and HPA. This abstraction provides no real benefit:
only two implementations exist, the dispatcher already knows which one to
use, and HPA stubs 2 of 5 operations. Remove the runtime dispatch in
favor of direct calls.
This commit:
- Promotes pac_alloc/expand/shrink/dalloc/time_until_deferred_work to
external linkage and replaces the pai_t *self parameter with pac_t *pac.
- Promotes hpa_alloc/expand/shrink/dalloc/time_until_deferred_work to
external linkage and replaces pai_t *self with hpa_shard_t *shard.
- Updates hpa_dalloc_batch's signature to take hpa_shard_t * directly
and removes the hpa_from_pai container-of helper. Updates internal
callers in hpa_alloc, hpa_dalloc, and hpa_sec_flush_impl.
- Drops the vtable assignments from pac_init() and hpa_shard_init().
- Replaces pai_alloc/dalloc/etc. dispatch in pa.c with direct calls.
HPA expand and shrink (which are unconditional failure stubs) are
skipped entirely for HPA-owned extents.
- Removes the pa_get_pai() helper.
- Updates tests in test/unit/hpa.c and test/unit/hpa_sec_integration.c
to call hpa_alloc/dalloc/etc. directly.
The pai_t struct field stays as dead weight in pac_t and hpa_shard_t;
it is removed in the next commit along with pai.h itself.
No behavioral changes.
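A reduced sketch of the direct-call shape described above. The demo_*
types and the owner enum are hypothetical, and "true means failure" is
an assumption borrowed from the return convention mentioned elsewhere in
this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { DEMO_OWNER_PAC, DEMO_OWNER_HPA } demo_owner_t;

typedef struct {
	int nexpand_calls; /* counts calls, for illustration only */
} demo_pac_t;

typedef struct {
	demo_pac_t pac;
} demo_pa_shard_t;

/* Stand-in for the PAC expand path; returns true on failure. */
bool
demo_pac_expand(demo_pac_t *pac, size_t old_size, size_t new_size) {
	pac->nexpand_calls++;
	return new_size < old_size;
}

/* The caller already knows the owner, so no vtable is consulted. */
bool
demo_pa_expand(demo_pa_shard_t *shard, demo_owner_t owner,
    size_t old_size, size_t new_size) {
	if (owner == DEMO_OWNER_HPA) {
		/* Expand always fails for HPA; skip the call entirely. */
		return true;
	}
	return demo_pac_expand(&shard->pac, old_size, new_size);
}
```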
Some pages (e.g., hugetlb pages) cannot be purged, and should be
prioritized for reuse. A custom extent_alloc hook signals this by
OR'ing EXTENT_ALLOC_FLAG_PINNED into the low bits of the returned
pointer; jemalloc strips the flag bits and caches pinned extents in
a dedicated ecache_pinned, separate from the dirty/muzzy decay
pipeline.
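The low-bit tagging scheme can be sketched as follows. The flag value
and mask below are assumptions chosen for illustration (the value of
EXTENT_ALLOC_FLAG_PINNED is not stated here), and the demo_* helpers are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: a 4 KiB-aligned pointer has 12 spare low bits. */
#define DEMO_FLAG_PINNED ((uintptr_t)0x1)
#define DEMO_FLAG_MASK ((uintptr_t)0xfff)

/* Hook side: mark a page-aligned address as pinned. */
void *
demo_tag_pinned(void *addr) {
	assert(((uintptr_t)addr & DEMO_FLAG_MASK) == 0);
	return (void *)((uintptr_t)addr | DEMO_FLAG_PINNED);
}

/* Allocator side: read the flag, then strip all flag bits. */
bool
demo_is_pinned(void *tagged) {
	return ((uintptr_t)tagged & DEMO_FLAG_PINNED) != 0;
}

void *
demo_strip_flags(void *tagged) {
	return (void *)((uintptr_t)tagged & ~DEMO_FLAG_MASK);
}
```

The scheme relies on the returned address being aligned strictly more
coarsely than the flag mask, so the flag bits never collide with address
bits.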
Pinned extents do not coalesce eagerly, except for ones larger than
SC_LARGE_MINCLASS. A prefer-small policy reuses the smallest fitting
pinned extent, to avoid unnecessary split/fragmentation.
Three changes to make pa_microbench easier to drive for fragmentation
experiments:
- Replace HPA_SHARD_OPTS_DEFAULT use with a single editable g_hpa_opts
global. The microbench does not consult MALLOC_CONF for HPA shard opts,
so this is the place to set the baseline configuration (slab_max_alloc,
hugification_threshold, dirty_mult, hugify_delay_ms, purge_threshold,
hugify_style, etc.).
- Add -n/--nshards N to override the shard count derived from the trace.
When set, each event is routed to (event->shard_ind % N), letting us
study the impact of arena consolidation. Without the flag the behavior
is unchanged (num_shards = max_shard_id + 1).
- Bump MAX_ALLOCATIONS from 10M to 200M so the full ~50M-event adfinder
trace (and similar) fits in the in-memory event buffer.
pa_microbench was creating its own emap_t per shard on top of the
arena_emap_global that JET malloc initializes during jet_malloc(16)
at startup, breaking the production assumption of one rtree per
process. Fix it by reusing the existing JET emap.
When san_bump_grow_locked fails, it sets sba->curr_reg to NULL.
The old curr_reg (saved in to_destroy) was never freed or restored,
leaking the virtual memory extent. Restore sba->curr_reg from
to_destroy on failure so the old region remains usable.
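The restore-on-failure shape of that fix, reduced to a toy. demo_bump_t
and demo_grow are hypothetical, a NULL new region simulates the failed
grow, and true means failure by assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
	void *curr_reg;
} demo_bump_t;

/* Returns true on failure. new_reg == NULL simulates a failed grow. */
bool
demo_grow(demo_bump_t *bump, void *new_reg) {
	void *to_destroy = bump->curr_reg;
	bump->curr_reg = new_reg;
	if (new_reg == NULL) {
		/* Put the old region back instead of dropping it. */
		bump->curr_reg = to_destroy;
		return true;
	}
	/* Success: the old region would be retired here. */
	return false;
}
```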
When emap_try_acquire_edata_neighbor returned a non-NULL neighbor but
the size check failed, the neighbor was never released from
extent_state_merging, making it permanently invisible to future
allocation and coalescing operations.
Release the neighbor when it doesn't meet the size requirement,
matching the pattern used in extent_recycle_extract.
When called with size==0, the else branch wrote to str[size-1] which
is str[(size_t)-1], a massive out-of-bounds write. Standard vsnprintf
allows size==0 to mean "compute length only, write nothing".
Add unit test for the size==0 case.
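The size==0 contract can be shown with a small bounded writer. This is a
hypothetical helper, not the patched function; it only demonstrates
guarding the terminator write.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most size-1 bytes of src into str and NUL-terminate.
 * Always returns strlen(src). With size == 0 nothing is written,
 * so str may be NULL.
 */
size_t
demo_bounded_write(char *str, size_t size, const char *src) {
	size_t len = strlen(src);
	if (size != 0) {
		size_t n = len < size - 1 ? len : size - 1;
		memcpy(str, src, n);
		str[n] = '\0'; /* index is at most size-1, never (size_t)-1 */
	}
	return len;
}
```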
Same pattern as arenas_bin_i_index: the index check used > instead of
>=, allowing access one past the end of the bstats[] and lstats[]
arrays.
Add unit tests that verify boundary indices return ENOENT.
The second expansion attempt in large_ralloc_no_move omitted the !
before large_ralloc_no_move_expand(), inverting the return value.
On expansion failure, the function falsely reported success, making
callers believe the allocation was expanded in-place when it was not.
On expansion success, the function falsely reported failure, causing
callers to unnecessarily allocate, copy, and free.
Add unit test that verifies the return value matches actual size change.
In both the full_slabs and empty_slabs JSON sections of HPA shard
stats, "nactive_huge" was emitted twice instead of emitting
"ndirty_huge" as the second entry. This caused ndirty_huge to be
missing from the JSON output entirely.
Add a unit test that verifies both sections contain "ndirty_huge".
The index validation used > instead of >=, allowing access at index
SC_NBINS (for bins) and SC_NSIZES-SC_NBINS (for lextents), which are
one past the valid range. This caused out-of-bounds reads in bin_infos[]
and sz_index2size_unsafe().
Add unit tests that verify the boundary indices return ENOENT.
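The off-by-one is easy to see on a small table. DEMO_NBINS and the
lookup function are hypothetical; the only point is that the last valid
index is N-1, so the rejection test must be >=.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_NBINS 4

static const size_t demo_bin_sizes[DEMO_NBINS] = { 8, 16, 32, 48 };

/* Returns 0 and fills *size, or ENOENT for an out-of-range index. */
int
demo_bin_size_lookup(size_t ind, size_t *size) {
	if (ind >= DEMO_NBINS) { /* "ind > DEMO_NBINS" would admit ind == 4 */
		return ENOENT;
	}
	*size = demo_bin_sizes[ind];
	return 0;
}
```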
psset_pick_purge used max_bit-- after rejecting a time-ineligible
candidate, which caused unnecessary re-scanning of the same bitmap
(and made an assert fail in debug mode) and a size_t underflow
when the lowest-index entry was rejected. Use max_bit = ind - 1
to skip directly past the rejected index.
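A sketch of the scan-and-skip loop on a single 64-bit word. The demo_*
names and the eligibility mask are hypothetical, and the explicit
ind == 0 guard is an assumption about how the underflow is avoided; it
is not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NONE ((size_t)-1)

/* Highest set bit with index <= max_bit (max_bit <= 63), or DEMO_NONE. */
size_t
demo_fls_upto(uint64_t bitmap, size_t max_bit) {
	for (size_t i = max_bit + 1; i-- > 0;) {
		if (bitmap & ((uint64_t)1 << i)) {
			return i;
		}
	}
	return DEMO_NONE;
}

/* Pick the highest set bit that is also marked eligible. */
size_t
demo_pick(uint64_t bitmap, uint64_t eligible) {
	size_t max_bit = 63;
	for (;;) {
		size_t ind = demo_fls_upto(bitmap, max_bit);
		if (ind == DEMO_NONE) {
			return DEMO_NONE;
		}
		if (eligible & ((uint64_t)1 << ind)) {
			return ind;
		}
		if (ind == 0) {
			return DEMO_NONE; /* nothing below; avoid underflow */
		}
		max_bit = ind - 1; /* jump past the rejected index */
	}
}
```

Stepping max_bit down by one would rediscover the same rejected index
until max_bit dropped below it; assigning ind - 1 makes each rejected
candidate cost one scan.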
tsd_tcache_data_init() returns true on failure but its callers ignore
this return value, leaving the per-thread tcache in an uninitialized
state after a failure.
This change disables the tcache on an initialization failure and logs
an error message. If opt_abort is true, it will also abort.
New unit tests have been added to test tcache initialization failures.
Add a mechanism to select a single test to run from a test file. The
test harness reads the JEMALLOC_TEST_NAME environment variable and, if
it is set, runs only the subtests with that name.
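One way such a filter could look, assuming an exact string match on the
subtest name. The demo_* functions are hypothetical and are not the
harness code; only the JEMALLOC_TEST_NAME variable name comes from the
description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Pure filter: no filter set means every subtest runs. */
bool
demo_should_run(const char *filter, const char *subtest_name) {
	if (filter == NULL) {
		return true;
	}
	return strcmp(filter, subtest_name) == 0;
}

/* Harness-side wrapper that reads the environment variable. */
bool
demo_should_run_env(const char *subtest_name) {
	return demo_should_run(getenv("JEMALLOC_TEST_NAME"), subtest_name);
}
```

Under these assumptions a run would be narrowed with something like
JEMALLOC_TEST_NAME=test_foo in the environment of the test binary.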