The split managed one ordering constraint: arena_choose() had to be
defined before arena_choose_maybe_huge() but after the tsd/tcache
inlines it depends on. After the malloc_dispatch refactor moved the
heaviest tcache-pulling inlines out of arena_inlines_b.h, the
remaining arena-side inlines all belong together. The merged
arena_inlines.h explicitly includes jemalloc_internal_inlines_a.h
and tcache.h (previously pulled in transitively).
Convert the production source files in src/ (69 .c/.cpp) and
test/jemalloc_test.h.in to list the headers they actually use, then
delete the umbrella. Three consolidated headers (peak_event.h,
prof_sys.h, sz.h) also gain explicit transitive includes.
Every translation unit now declares what it uses. A missing include
now fails in the file that omits it rather than silently working
because something upstream pulled in the world.
arena_types.h + arena_structs.h + arena_externs.h merged into arena.h,
keeping the three logical sections (TYPES / STRUCTS / EXTERNS) with
explicit dividers. arena_inlines_a.h and arena_inlines_b.h stay
separate; arena_inlines_b.h now carries a comment explaining why
merging the two would reintroduce a real #include cycle through
tcache_inlines.h -> arena_choose (the asymmetric cycle-breaker).
arena_decay_constants.h (new): minimal header for
ARENA_DECAY_NTICKS_PER_UPDATE.
Both components had a four-way split (_types, _structs, _externs,
_inlines) that predates explicit per-file includes. With the edata
<-> prof_types coupling broken in the prior commit, merging _types +
_structs + _externs no longer risks an include cycle.
- prof.h replaces prof_types.h + prof_structs.h + prof_externs.h.
- tcache.h replaces tcache_types.h + tcache_structs.h + tcache_externs.h.
prof_inlines.h and tcache_inlines.h stay separate: prof_inlines.h
sits at the bottom of the dependency layering, and tcache_inlines.h's
include of arena_externs.h is the asymmetric cycle-breaker that keeps
the arena <-> tcache symbol cycle from becoming an include cycle.
Fold historical *_types/_structs/_externs/_inlines splits where the
layering is no longer load-bearing.
- large_externs.h -> large.h: rename; it was a single-purpose
function-prototype file.
- background_thread_structs.h + background_thread_externs.h ->
background_thread.h: merge. background_thread_inlines.h stays
separate because it depends on arena_inlines_a.h.
- bin_types.h -> bin.h: BIN_SHARDS_MAX / N_BIN_SHARDS_DEFAULT join the
bin_t struct and the bin_dalloc_locked_info_t type. bin_inlines.h
stays separate so TUs that only need the bin_t type don't pull in
the tcache-flush inline closure. The two bin_stats_* helpers move
from bin.h into bin_inlines.h so all bin inlines live together.
- tsd_binshards.h (new): houses tsd_binshards_t and its zero
initializer. Lets tsd_internals.h pull it in for X-macro expansion
without dragging mutex.h -- mutex.h depends on TSD machinery, which
would form a cycle through bin.h.
jemalloc_internal_includes.h drops the references to the deleted/
merged headers.
- test_hooks.h: drop the #include of jemalloc_preamble.h. preamble
pulls test_hooks.h, and test_hooks.h needs nothing from preamble.
- edata.h: drop the #include of prof_types.h in favor of forward
declarations of prof_tctx_t and prof_recent_t (used only as pointer
types). prof_structs.h can then drop its #include of edata.h,
severing the edata <-> prof_types coupling.
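The forward-declaration approach described for edata.h can be sketched
as follows. This is a minimal illustration, not the actual edata.h:
example_edata_t and example_edata_init are hypothetical names, and the
struct tags prof_tctx_s / prof_recent_s are assumed here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Forward declarations (struct tags assumed for illustration). An
 * incomplete type is enough to declare a pointer, so the header that
 * defines the full structs does not need to be included.
 */
typedef struct prof_tctx_s prof_tctx_t;
typedef struct prof_recent_s prof_recent_t;

/* Hypothetical, heavily simplified stand-in for edata_t. */
typedef struct example_edata_s {
	size_t size;
	prof_tctx_t *tctx;     /* pointer only; layout never inspected */
	prof_recent_t *recent; /* pointer only; layout never inspected */
} example_edata_t;

/* Initialize the stand-in without touching the pointed-to types. */
static void
example_edata_init(example_edata_t *edata, size_t size) {
	edata->size = size;
	edata->tctx = NULL;
	edata->recent = NULL;
}
```

Only code that dereferences the pointers needs the full definitions,
which is what lets the include edge be dropped in one direction.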
Pull the tcache-aware routing helpers out of arena into a layer that
sits directly below the public malloc interface:
arena_malloc -> malloc_dispatch_malloc
arena_palloc -> malloc_dispatch_palloc
arena_ralloc -> malloc_dispatch_ralloc
arena_dalloc* -> malloc_dispatch_dalloc*
arena_sdalloc* -> malloc_dispatch_sdalloc*
arena_dalloc_promoted -> malloc_dispatch_dalloc_promoted
The new module (malloc_dispatch.h, malloc_dispatch_inlines.h,
src/malloc_dispatch.c) owns the tcache-vs-fall-through decision; the
only consumer is jemalloc_internal_inlines_c.h. arena keeps a narrower
arena_prof_demote() for the sampled-allocation demotion path.
Zero-sized arrays are not allowed by ISO C.
C99 introduced flexible array members to express this.
Type-checking fails, because all_bins is assigned malloc'ed
storage of length > 0.
Found by GCC and Clang (-Wpedantic).
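A minimal sketch of the C99 flexible array member idiom, for
illustration only: example_bin_t, example_arena_t, and
example_arena_new are hypothetical stand-ins, not the real arena
layout.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-bin record. */
typedef struct example_bin_s {
	unsigned nregs;
} example_bin_t;

/* Hypothetical stand-in for a struct with trailing variable storage. */
typedef struct example_arena_s {
	unsigned nbins;
	/* C99 flexible array member: "[]", not the non-ISO "[0]". */
	example_bin_t all_bins[];
} example_arena_t;

/* Allocate the header plus nbins trailing elements in one block. */
static example_arena_t *
example_arena_new(unsigned nbins) {
	example_arena_t *arena = malloc(sizeof(*arena)
	    + nbins * sizeof(arena->all_bins[0]));
	if (arena == NULL) {
		return NULL;
	}
	arena->nbins = nbins;
	for (unsigned i = 0; i < nbins; i++) {
		arena->all_bins[i].nregs = i;
	}
	return arena;
}
```

sizeof(example_arena_t) excludes the flexible member, so the caller
adds the trailing storage explicitly when allocating.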
This change includes the following improvements:
- Remove the hpa_sec_batch_fill_extra parameter.
- Refactor the hpa_alloc() code and helper functions to be able to
allocate more than one extent out of a single pageslab. This way
we can amortize the per-pageslab costs (active bitmap iteration,
pageslab metadata updates) across multiple extents.
- Decide on a min and max number of extents that will be allocated
in hpa_alloc(). The code will try to allocate at least the min
and allocate up to the max as long as we can allocate additional
ones from the pageslab we already have, as additional allocations
are relatively cheap.
- Add extent allocation distribution stats.
- Amend hpa_sec_integration.c unit test.
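One possible reading of the min/max policy above, as a rough sketch.
The names example_pageslab_t and example_alloc_batch are hypothetical,
and the real hpa_alloc() selects pageslabs and tracks pages
differently; this only illustrates "reach the min, then keep going up
to the max while the current pageslab still has room".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pageslab model: just a count of free pages. */
typedef struct example_pageslab_s {
	size_t free_pages;
} example_pageslab_t;

/*
 * Carve extents of npages each. A new pageslab is only visited while
 * fewer than min extents have been allocated; within a pageslab that
 * is already selected, allocation continues up to max because the
 * extra extents are assumed to be cheap.
 */
static size_t
example_alloc_batch(example_pageslab_t *slabs, size_t nslabs,
    size_t npages, size_t min, size_t max) {
	size_t nallocs = 0;
	for (size_t i = 0; i < nslabs && nallocs < min; i++) {
		while (nallocs < max && slabs[i].free_pages >= npages) {
			slabs[i].free_pages -= npages;
			nallocs++;
		}
	}
	return nallocs;
}
```

With this shape, the per-pageslab selection cost is paid once per
visited pageslab rather than once per extent.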
The orchestrator looks up the surviving descriptor via
tcache_postfork_arena_descriptor and threads it into
arena_postfork_child, eliminating arena's call into tcache. Also reset
cache_bin_array_descriptor_ql_mtx right before the queue rebuild it
protects.
tcache.c was reaching into arena->cache_bin_array_descriptor_ql{,_mtx}
directly to register / unregister / postfork-relink its descriptor.
That queue and mutex are owned by arena, so the locking and ql_*
operations belong in arena.c.
After tcache_init runs, tcache_slow->tcache == tcache always holds, so
the tcache_t parameter to the three association helpers is derivable
from tcache_slow.
Drop the duplicate arena->tcache_ql; stats merging walks the
cache_bin_array_descriptor_ql directly. Rename the protecting mutex
from tcache_ql_mtx to cache_bin_array_descriptor_ql_mtx to match. Add
an assertion in test_thread_migrate_arena that the dissociate-time
flush zeros cache_bin->tstats.nrequests.
bin_t is an arena implementation detail; tcache should not reach into
it. Extract the slab-address lookup into bin.c as bin_current_slab_addr,
and expose it to tcache only through arena_locality_hint(tsdn, arena,
szind), which composes bin_choose + bin_current_slab_addr.
With PAI vtable dispatch replaced by direct calls in the previous
commit, the embedded pai_t member in pac_t and hpa_shard_t is dead
weight, and pai.h has no remaining users. Remove them.
Changes:
- Drop pai_t pai member (and "must be first member" comment) from
pac_t and hpa_shard_t.
- Replace #include "jemalloc/internal/pai.h" with the actually-needed
edata.h / tsd_types.h in pac.h, hpa.h, sec.h, pa.h.
- Update extent_pai_t comment in edata.h to no longer reference pai.h.
- Update three remaining test files (hpa_thp_always,
hpa_vectorized_madvise, hpa_vectorized_madvise_large_batch) to call
hpa_*(tsdn, shard, ...) directly instead of pai_*(tsdn, &shard->pai,
...).
- Delete include/jemalloc/internal/pai.h.
No behavioral changes.
The pai_t interface implements C-style polymorphism via function pointers
to abstract over PAC and HPA. This abstraction provides no real benefit:
only two implementations exist, the dispatcher already knows which one to
use, and HPA stubs 2 of 5 operations. Remove the runtime dispatch in
favor of direct calls.
This commit:
- Promotes pac_alloc/expand/shrink/dalloc/time_until_deferred_work to
external linkage and replaces the pai_t *self parameter with pac_t *pac.
- Promotes hpa_alloc/expand/shrink/dalloc/time_until_deferred_work to
external linkage and replaces pai_t *self with hpa_shard_t *shard.
- Updates hpa_dalloc_batch's signature to take hpa_shard_t * directly
and removes the hpa_from_pai container-of helper. Updates internal
callers in hpa_alloc, hpa_dalloc, and hpa_sec_flush_impl.
- Drops the vtable assignments from pac_init() and hpa_shard_init().
- Replaces pai_alloc/dalloc/etc. dispatch in pa.c with direct calls.
HPA expand and shrink (which are unconditional failure stubs) are
skipped entirely for HPA-owned extents.
- Removes the pa_get_pai() helper.
- Updates tests in test/unit/hpa.c and test/unit/hpa_sec_integration.c
to call hpa_alloc/dalloc/etc. directly.
The pai_t struct field stays as dead weight in pac_t and hpa_shard_t;
it is removed in the next commit along with pai.h itself.
No behavioral changes.
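A simplified sketch of the direct-call shape for the expand path. All
example_* names are hypothetical and the types are reduced to the
minimum; the sketch assumes the jemalloc-style convention that a
boolean return of true means failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ownership tag and extent model. */
typedef enum {
	EXAMPLE_OWNER_PAC,
	EXAMPLE_OWNER_HPA
} example_owner_t;

typedef struct example_extent_s {
	size_t size;
	example_owner_t owner;
} example_extent_t;

typedef struct example_pac_s {
	size_t limit; /* largest size an expand may reach */
} example_pac_t;

/* Direct PAC entry point; returns true on failure (assumed). */
static bool
example_pac_expand(example_pac_t *pac, example_extent_t *extent,
    size_t new_size) {
	if (new_size > pac->limit) {
		return true;
	}
	extent->size = new_size;
	return false;
}

/* Dispatcher: branches on ownership instead of a function pointer. */
static bool
example_pa_expand(example_pac_t *pac, example_extent_t *extent,
    size_t new_size) {
	if (extent->owner == EXAMPLE_OWNER_HPA) {
		/* The HPA expand was an unconditional failure stub. */
		return true;
	}
	return example_pac_expand(pac, extent, new_size);
}
```

Because the branch is on data the dispatcher already holds, no
indirect call and no container-of conversion is needed.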
Some pages (e.g., hugetlb pages) cannot be purged, and should be
prioritized for reuse. A custom extent_alloc hook signals this by
OR'ing EXTENT_ALLOC_FLAG_PINNED into the low bits of the returned
pointer; jemalloc strips the flag bits and caches pinned extents in
a dedicated ecache_pinned, separate from the dirty/muzzy decay
pipeline.
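The low-bit tagging scheme can be sketched as below. The flag value
and mask are assumptions made for illustration (EXAMPLE_FLAG_PINNED
is not the real EXTENT_ALLOC_FLAG_PINNED definition), and the helper
names are hypothetical. The scheme relies on returned addresses being
aligned so that the low bits are otherwise zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag layout for illustration: one flag in bit 0. */
#define EXAMPLE_FLAG_PINNED ((uintptr_t)0x1)
#define EXAMPLE_FLAG_MASK ((uintptr_t)0x1)

/* Hook side: OR the flag into the low bits of an aligned address. */
static void *
example_tag_pinned(void *addr) {
	return (void *)((uintptr_t)addr | EXAMPLE_FLAG_PINNED);
}

/* Allocator side: report the flag and strip it from the address. */
static void *
example_strip_flags(void *tagged, bool *pinned) {
	uintptr_t bits = (uintptr_t)tagged;
	*pinned = (bits & EXAMPLE_FLAG_PINNED) != 0;
	return (void *)(bits & ~EXAMPLE_FLAG_MASK);
}
```

An untagged pointer passes through unchanged, so hooks that never set
the flag keep their existing behavior in this sketch.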
Pinned extents do not coalesce eagerly, except for ones larger than
SC_LARGE_MINCLASS. A prefer-small policy reuses the smallest fitting
pinned extent, to avoid unnecessary split/fragmentation.
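The prefer-small selection can be illustrated with a linear scan over
candidate sizes. This is only a sketch of the policy:
example_pick_smallest_fit is a hypothetical helper, and the actual
ecache lookup is not a flat array scan.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the smallest extent that is at least `want`
 * bytes, or -1 if none fits. Choosing the tightest fit leaves larger
 * extents intact and avoids splitting them.
 */
static int
example_pick_smallest_fit(const size_t *sizes, size_t n, size_t want) {
	int best = -1;
	for (size_t i = 0; i < n; i++) {
		if (sizes[i] < want) {
			continue;
		}
		if (best < 0 || sizes[i] < sizes[(size_t)best]) {
			best = (int)i;
		}
	}
	return best;
}
```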
* Replace std::__throw_bad_alloc call with standard C++
Since December of 2025, std::__throw_bad_alloc is no longer visible
through #include <new>, causing jemalloc build failures with gcc 16.
As far as I can tell, all std::__throw_bad_alloc did was arrange to
raise a std::bad_alloc exception if exceptions are enabled. I am not
sure whether its usage was truly meaningful in jemalloc, since the call
is wrapped in a try/catch and any usage of try/catch is considered an
error when compiling with -fno-exceptions on gcc, at least.
This change adds a check to configure.ac that determines whether
exceptions are enabled by compiling a simple try/catch that raises a
std::bad_alloc exception. If that test succeeds, the macro
JEMALLOC_HAVE_CXX_EXCEPTIONS is defined, and jemalloc will raise an
exception. Otherwise, we call std::terminate() to abort.
This was tested on FreeBSD with the gcc16 port with and without exceptions
enabled.
* Replace std::set_new_handler calls with std::get_new_handler
Previously, std::set_new_handler was used as a workaround for
compilers with only partial support for C++11. Now that C++14 is a
requirement to enable C++ support, we can assume std::get_new_handler
is available.
The second expansion attempt in large_ralloc_no_move omitted the !
before large_ralloc_no_move_expand(), inverting the return value.
On expansion failure, the function falsely reported success, making
callers believe the allocation was expanded in-place when it was not.
On expansion success, the function falsely reported failure, causing
callers to unnecessarily allocate, copy, and free.
Add a unit test that verifies that the return value matches the actual
size change.
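The shape of the bug and its fix can be sketched as follows. The
example_* functions are hypothetical simplifications, not the real
large.c code; the sketch assumes the convention that true means
failure and false means the in-place resize succeeded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical expand helper: true means failure (assumed). */
static bool
example_expand(size_t *cur, size_t target, size_t limit) {
	if (target > limit) {
		return true;
	}
	*cur = target;
	return false;
}

/* Returns false only if the size was actually changed in place. */
static bool
example_ralloc_no_move(size_t *cur, size_t usize_min, size_t usize_max,
    size_t limit) {
	/* First attempt: the largest acceptable size. */
	if (!example_expand(cur, usize_max, limit)) {
		return false;
	}
	/*
	 * Second attempt: the smallest acceptable size. The '!' here is
	 * the fix; without it the result of the helper is inverted.
	 */
	if (usize_min < usize_max
	    && !example_expand(cur, usize_min, limit)) {
		return false;
	}
	return true;
}
```

A test in the spirit of the one added here checks both directions:
a reported success must coincide with a changed size, and a reported
failure must leave the size untouched.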