This change includes the following improvements:
- Remove the hpa_sec_batch_fill_extra parameter.
- Refactor hpa_alloc() and its helper functions so they can allocate
  more than one extent out of a single pageslab, amortizing the
  per-pageslab costs (active bitmap iteration, pageslab metadata
  updates) across multiple extents.
- Decide on a min and max number of extents to allocate in
  hpa_alloc(). The code tries to allocate at least the min, and keeps
  allocating up to the max as long as the pageslab it already holds
  can supply more, since those additional allocations are relatively
  cheap (see the sketch after this list).
- Add extent allocation distribution stats.
- Amend hpa_sec_integration.c unit test.
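A minimal, self-contained sketch of the batching shape, assuming
hypothetical names (BATCH_MIN, BATCH_MAX, slab_try_carve) and a toy
64-page bitmap in place of the real pageslab:

    #include <cstdint>
    #include <cstdio>

    constexpr int BATCH_MIN = 2; /* hypothetical lower bound */
    constexpr int BATCH_MAX = 8; /* hypothetical opportunistic upper bound */

    /* Toy pageslab: one bit per page, 1 = active (in use). */
    struct toy_pageslab {
        uint64_t active;
    };

    /* Carve one page-sized extent; returns a page index, or -1 if full. */
    static int
    slab_try_carve(toy_pageslab *slab) {
        if (slab->active == UINT64_MAX) {
            return -1;
        }
        int idx = __builtin_ctzll(~slab->active); /* first clear bit */
        slab->active |= (uint64_t)1 << idx;
        return idx;
    }

    int
    main() {
        toy_pageslab slab = {0};
        int got = 0;
        /* The slab lookup and locking (elided here) are paid once per
         * batch; only the bitmap scan and the metadata update repeat
         * per extent. */
        while (got < BATCH_MAX) {
            int idx = slab_try_carve(&slab);
            if (idx < 0) {
                /* Slab exhausted; below BATCH_MIN, the real code
                 * would move on to another pageslab. */
                break;
            }
            printf("extent %d -> page %d\n", got, idx);
            got++;
        }
        return got >= BATCH_MIN ? 0 : 1;
    }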
The orchestrator looks up the surviving descriptor via
tcache_postfork_arena_descriptor and threads it into
arena_postfork_child, eliminating arena's call into tcache. Also reset
cache_bin_array_descriptor_ql_mtx immediately before rebuilding the
queue it protects.
tcache.c was reaching into arena->cache_bin_array_descriptor_ql{,_mtx}
directly to register/unregister/postfork-relink its descriptor.
That queue and mutex are owned by arena, so the locking and ql_*
operations belong in arena.c.
After tcache_init runs, tcache_slow->tcache == tcache always holds, so
the tcache_t parameter to the three association helpers is derivable
from tcache_slow.
Drop the duplicate arena->tcache_ql; stats merging walks the
cache_bin_array_descriptor_ql directly. Rename the protecting mutex
from tcache_ql_mtx to cache_bin_array_descriptor_ql_mtx to match. Add
an assertion in test_thread_migrate_arena that the dissociate-time
flush zeros cache_bin->tstats.nrequests.
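A toy sketch of the resulting ownership split; toy_arena and the helper
names are illustrative (the real code is C and uses jemalloc's ql_*
macros and mutex types):

    #include <list>
    #include <mutex>

    struct toy_descriptor {
        int id;
    };

    struct toy_arena {
        /* The queue and its mutex are arena state; tcache code never
         * touches them directly, it goes through the helpers below. */
        std::mutex descriptor_ql_mtx;
        std::list<toy_descriptor *> descriptor_ql;

        void
        descriptor_register(toy_descriptor *d) {
            std::lock_guard<std::mutex> g(descriptor_ql_mtx);
            descriptor_ql.push_back(d);
        }

        void
        descriptor_unregister(toy_descriptor *d) {
            std::lock_guard<std::mutex> g(descriptor_ql_mtx);
            descriptor_ql.remove(d);
        }
    };

    int
    main() {
        toy_arena arena;
        toy_descriptor d = {42};
        arena.descriptor_register(&d);
        arena.descriptor_unregister(&d);
        return 0;
    }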
bin_t is an arena implementation detail; tcache should not reach into
it. Extract the slab-address lookup into bin.c as bin_current_slab_addr,
and expose it to tcache only through arena_locality_hint(tsdn, arena,
szind), which composes bin_choose + bin_current_slab_addr.
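A sketch of that composition with toy types and a trivial bin-choice
policy (the real helpers also take a tsdn_t *, elided here):

    struct toy_slab { void *addr; };
    struct toy_bin { toy_slab *slabcur; /* current slab, may be null */ };
    struct toy_arena { toy_bin bins[4]; };

    /* bin.c: the slab-address lookup stays behind the bin abstraction. */
    static void *
    bin_current_slab_addr(const toy_bin *bin) {
        return bin->slabcur != nullptr ? bin->slabcur->addr : nullptr;
    }

    /* arena.c: pick a bin for the size class (toy policy). */
    static toy_bin *
    bin_choose(toy_arena *arena, unsigned szind) {
        return &arena->bins[szind % 4];
    }

    /* The only entry point tcache sees. */
    void *
    arena_locality_hint(toy_arena *arena, unsigned szind) {
        return bin_current_slab_addr(bin_choose(arena, szind));
    }

    int
    main() {
        toy_arena arena = {};
        return arena_locality_hint(&arena, 3) == nullptr ? 0 : 1;
    }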
After replacing PAI vtable dispatch with direct calls in the previous
commit, the embedded pai_t member in pac_t and hpa_shard_t is dead
weight, and pai.h has no remaining users. Remove them.
Changes:
- Drop pai_t pai member (and "must be first member" comment) from
pac_t and hpa_shard_t.
- Replace #include "jemalloc/internal/pai.h" with the actually-needed
edata.h / tsd_types.h in pac.h, hpa.h, sec.h, pa.h.
- Update extent_pai_t comment in edata.h to no longer reference pai.h.
- Update three remaining test files (hpa_thp_always,
hpa_vectorized_madvise, hpa_vectorized_madvise_large_batch) to call
hpa_*(tsdn, shard, ...) directly instead of pai_*(tsdn, &shard->pai,
...).
- Delete include/jemalloc/internal/pai.h.
No behavioral changes.
The pai_t interface implements C-style polymorphism via function pointers
to abstract over PAC and HPA. This abstraction provides no real benefit:
only two implementations exist, the dispatcher already knows which one to
use, and HPA stubs 2 of 5 operations. Remove the runtime dispatch in
favor of direct calls.
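A before/after sketch of that dispatch change; pac_s here is a toy, not
the real struct, and only the shape of the signatures follows the
commit:

    #include <cstddef>
    #include <cstdio>

    /* Before: C-style vtable embedded as the first member, so the
     * callee can cast pai_t * back to the containing struct. */
    struct pai_s {
        void *(*alloc)(struct pai_s *self, size_t size);
    };

    struct pac_s {
        struct pai_s pai; /* must be first member */
        int id;
    };

    static void *
    pac_alloc_via_vtable(struct pai_s *self, size_t size) {
        struct pac_s *pac = (struct pac_s *)self; /* container-of cast */
        printf("pac %d: vtable alloc(%zu)\n", pac->id, size);
        return nullptr;
    }

    /* After: pai_t *self becomes pac_t *, no cast, direct linkage. */
    void *
    pac_alloc(struct pac_s *pac, size_t size) {
        printf("pac %d: direct alloc(%zu)\n", pac->id, size);
        return nullptr;
    }

    int
    main() {
        struct pac_s pac = {{pac_alloc_via_vtable}, 1};
        pac.pai.alloc(&pac.pai, 4096); /* old: indirect dispatch */
        pac_alloc(&pac, 4096);         /* new: direct call */
        return 0;
    }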
This commit:
- Promotes pac_alloc/expand/shrink/dalloc/time_until_deferred_work to
external linkage and replaces the pai_t *self parameter with pac_t *pac.
- Promotes hpa_alloc/expand/shrink/dalloc/time_until_deferred_work to
external linkage and replaces pai_t *self with hpa_shard_t *shard.
- Updates hpa_dalloc_batch's signature to take hpa_shard_t * directly
and removes the hpa_from_pai container-of helper. Updates internal
callers in hpa_alloc, hpa_dalloc, and hpa_sec_flush_impl.
- Drops the vtable assignments from pac_init() and hpa_shard_init().
- Replaces pai_alloc/dalloc/etc. dispatch in pa.c with direct calls.
HPA expand and shrink (which are unconditional failure stubs) are
skipped entirely for HPA-owned extents.
- Removes the pa_get_pai() helper.
- Updates tests in test/unit/hpa.c and test/unit/hpa_sec_integration.c
to call hpa_alloc/dalloc/etc. directly.
The pai_t struct field stays as dead weight in pac_t and hpa_shard_t;
it is removed in the next commit along with pai.h itself.
No behavioral changes.
Some pages (e.g., hugetlb pages) cannot be purged and should be
prioritized for reuse. A custom extent_alloc hook signals this by
OR'ing EXTENT_ALLOC_FLAG_PINNED into the low bits of the returned
pointer; jemalloc strips the flag bits and caches pinned extents in
a dedicated ecache_pinned, separate from the dirty/muzzy decay
pipeline.
Pinned extents are not eagerly coalesced, except for those larger than
SC_LARGE_MINCLASS. A prefer-small policy reuses the smallest fitting
pinned extent, avoiding unnecessary splitting and fragmentation.
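A self-contained sketch of the low-bit tagging; the flag value, the
hook signature, and aligned_alloc standing in for a hugetlb mapping are
all assumptions for illustration:

    #include <cstdint>
    #include <cstdio>
    #include <cstdlib>

    /* Alignment guarantees the low bit of a real extent address is
     * always zero, so it is free to carry the flag. */
    constexpr uintptr_t EXTENT_ALLOC_FLAG_PINNED = 0x1;

    /* Custom extent_alloc hook (toy): tags the extent as unpurgeable. */
    static void *
    my_extent_alloc(size_t size, size_t alignment) {
        void *addr = aligned_alloc(alignment, size);
        if (addr == nullptr) {
            return nullptr;
        }
        return (void *)((uintptr_t)addr | EXTENT_ALLOC_FLAG_PINNED);
    }

    int
    main() {
        void *raw = my_extent_alloc(2 << 20, 2 << 20); /* 2 MiB */
        if (raw == nullptr) {
            return 1;
        }
        bool pinned = ((uintptr_t)raw & EXTENT_ALLOC_FLAG_PINNED) != 0;
        void *addr = (void *)((uintptr_t)raw & ~EXTENT_ALLOC_FLAG_PINNED);
        /* jemalloc strips the bit like this and, when pinned, caches
         * the extent in ecache_pinned instead of the dirty/muzzy
         * decay pipeline. */
        printf("addr=%p pinned=%d\n", addr, pinned ? 1 : 0);
        free(addr);
        return 0;
    }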
Three changes to make pa_microbench easier to drive for fragmentation
experiments:
- Replace HPA_SHARD_OPTS_DEFAULT use with a single editable g_hpa_opts
global. The microbench does not consult MALLOC_CONF for HPA shard opts,
so this is the place to set the baseline configuration (slab_max_alloc,
hugification_threshold, dirty_mult, hugify_delay_ms, purge_threshold,
hugify_style, etc.).
- Add -n/--nshards N to override the shard count derived from the trace.
When set, each event is routed to (event->shard_ind % N), letting us
study the impact of arena consolidation. Without the flag the behavior
is unchanged (num_shards = max_shard_id + 1).
- Bump MAX_ALLOCATIONS from 10M to 200M so the full ~50M-event adfinder
trace (and similar) fits in the in-memory event buffer.
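A sketch of the routing rule; route_event is a hypothetical helper, and
an override of 0 stands in for the flag being absent:

    #include <cstdio>

    static unsigned
    route_event(unsigned shard_ind, unsigned max_shard_id,
        unsigned nshards_override) {
        unsigned num_shards = nshards_override != 0
            ? nshards_override
            : max_shard_id + 1; /* default: derived from the trace */
        /* In the default case shard_ind <= max_shard_id, so the modulo
         * is the identity and behavior is unchanged. */
        return shard_ind % num_shards;
    }

    int
    main() {
        printf("%u\n", route_event(7, 15, 0)); /* no flag: 7 */
        printf("%u\n", route_event(7, 15, 4)); /* -n 4:   3 */
        return 0;
    }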
* Replace std::__throw_bad_alloc call with standard C++
Since December 2025, std::__throw_bad_alloc is no longer visible
through #include <new>, causing jemalloc build failures with gcc 16.
As far as I can tell, all std::__throw_bad_alloc did was arrange to
raise a std::bad_alloc exception if exceptions are enabled. I am not
sure whether its usage was truly meaningful in jemalloc, since the
call is wrapped in a try/catch block, and any use of try/catch is an
error when compiling with -fno-exceptions, at least on gcc.
This change adds a check to configure.ac that determines whether
exceptions are enabled by compiling a simple try/catch block that
raises a std::bad_alloc exception. If that test succeeds, the macro
JEMALLOC_HAVE_CXX_EXCEPTIONS is defined, and jemalloc will raise an
exception. Otherwise, we call std::terminate() to abort.
This was tested on FreeBSD using the gcc16 port, both with and without
exceptions enabled.
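A sketch of the resulting guarded failure path; only
JEMALLOC_HAVE_CXX_EXCEPTIONS comes from the change itself, fail_alloc
is a stand-in for the real call site:

    #include <exception>
    #include <new>

    static void
    fail_alloc() {
    #ifdef JEMALLOC_HAVE_CXX_EXCEPTIONS
        throw std::bad_alloc(); /* exceptions enabled: raise */
    #else
        std::terminate();       /* -fno-exceptions: abort */
    #endif
    }

    int
    main() {
    #ifdef JEMALLOC_HAVE_CXX_EXCEPTIONS
        try {
            fail_alloc();
        } catch (const std::bad_alloc &) {
            return 0; /* raised and caught */
        }
    #else
        (void)fail_alloc; /* calling it would terminate the process */
    #endif
        return 0;
    }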
* Replace std::set_new_handler calls with std::get_new_handler
Previously, std::set_new_handler was used as a workaround for
compilers with only partial support for C++11. Now that C++14 is a
requirement to enable C++ support, we can assume std::get_new_handler
is available.
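A C++14-style sketch of reading the handler directly; the retry loop is
illustrative, not jemalloc's exact code:

    #include <cstddef>
    #include <cstdlib>
    #include <new>

    void *
    alloc_with_new_handler(std::size_t size) {
        while (true) {
            void *p = std::malloc(size);
            if (p != nullptr) {
                return p;
            }
            /* Read the installed handler directly, instead of the old
             * set_new_handler(nullptr) read-then-restore dance. */
            std::new_handler handler = std::get_new_handler();
            if (handler == nullptr) {
                return nullptr; /* no handler installed: give up */
            }
            handler(); /* may release memory, throw, or not return */
        }
    }

    int
    main() {
        void *p = alloc_with_new_handler(64);
        std::free(p);
        return p != nullptr ? 0 : 1;
    }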
pa_microbench was creating its own emap_t per shard on top of the
arena_emap_global that JET malloc initializes during jet_malloc(16)
at startup, breaking the production assumption of one rtree per
process. Fix it by reusing the existing JET emap.
* Document new mallctl interfaces added since 5.3.0
Add documentation for the following new mallctl entries:
- opt.debug_double_free_max_scan: double-free detection scan limit
- opt.prof_bt_max: max profiling backtrace depth
- opt.disable_large_size_classes: page-aligned large allocations
- opt.process_madvise_max_batch: batched process_madvise purging
- thread.tcache.max: per-thread tcache_max control
- thread.tcache.ncached_max.read_sizeclass: query ncached_max
- thread.tcache.ncached_max.write: set ncached_max per size range
- arena.<i>.name: get/set arena names
- arenas.hugepage: hugepage size
- approximate_stats.active: lightweight active bytes estimate
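For example, reading one of the new entries through the long-standing
mallctl() API (the size_t value type is an assumption here):

    #include <cstdio>
    #include <jemalloc/jemalloc.h>

    int
    main() {
        size_t tc_max;
        size_t sz = sizeof(tc_max);
        if (mallctl("thread.tcache.max", &tc_max, &sz, nullptr, 0) == 0) {
            printf("thread tcache_max: %zu\n", tc_max);
        }
        return 0;
    }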
Remove config.prof_frameptr, since it is still experimental and needs
further development.
Co-authored-by: lexprfuncall <carl.shapiro@gmail.com>