The pai_t interface implements C-style polymorphism via function pointers
to abstract over PAC and HPA. This abstraction provides no real benefit:
only two implementations exist, the dispatcher already knows which one to
use, and HPA stubs 2 of 5 operations. Remove the runtime dispatch in
favor of direct calls.

This commit:
- Promotes pac_alloc/expand/shrink/dalloc/time_until_deferred_work to
external linkage and replaces the pai_t *self parameter with pac_t *pac.
- Promotes hpa_alloc/expand/shrink/dalloc/time_until_deferred_work to
external linkage and replaces pai_t *self with hpa_shard_t *shard.
- Updates hpa_dalloc_batch's signature to take hpa_shard_t * directly
and removes the hpa_from_pai container-of helper. Updates internal
callers in hpa_alloc, hpa_dalloc, and hpa_sec_flush_impl.
- Drops the vtable assignments from pac_init() and hpa_shard_init().
- Replaces pai_alloc/dalloc/etc. dispatch in pa.c with direct calls.
HPA expand and shrink (which are unconditional failure stubs) are
skipped entirely for HPA-owned extents.
- Removes the pa_get_pai() helper.
- Updates tests in test/unit/hpa.c and test/unit/hpa_sec_integration.c
to call hpa_alloc/dalloc/etc. directly.
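
For illustration, the shape of the direct-call dispatch described in the
list above can be sketched as follows. This is a minimal sketch, not the
actual code in pa.c: every toy_* type and function is a hypothetical
stand-in with a simplified signature, and the convention that "true"
means failure is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for pac_t and hpa_shard_t; they only count calls. */
typedef struct { size_t nallocs; } toy_pac_t;
typedef struct { size_t nallocs; } toy_hpa_shard_t;

/* Direct entry points: each takes its concrete type, not a pai_t *self. */
size_t
toy_pac_alloc(toy_pac_t *pac, size_t size) {
	(void)size;
	return ++pac->nallocs;
}

size_t
toy_hpa_alloc(toy_hpa_shard_t *shard, size_t size) {
	(void)size;
	return ++shard->nallocs;
}

/* Returns true on failure (assumed convention for this sketch). */
bool
toy_pac_expand(toy_pac_t *pac, size_t old_size, size_t new_size) {
	(void)pac;
	return new_size < old_size;
}

typedef struct {
	toy_pac_t pac;
	toy_hpa_shard_t hpa;
} toy_pa_shard_t;

/* The caller already knows the owner, so it branches instead of
 * going through a function pointer. */
size_t
toy_pa_alloc(toy_pa_shard_t *shard, bool use_hpa, size_t size) {
	assert(shard != NULL);
	if (use_hpa) {
		return toy_hpa_alloc(&shard->hpa, size);
	}
	return toy_pac_alloc(&shard->pac, size);
}

/* The HPA expand stub always failed, so HPA-owned extents report
 * failure without making any call. */
bool
toy_pa_expand(toy_pa_shard_t *shard, bool hpa_owned, size_t old_size,
    size_t new_size) {
	assert(shard != NULL);
	if (hpa_owned) {
		return true;
	}
	return toy_pac_expand(&shard->pac, old_size, new_size);
}
```

The branch in toy_pa_alloc replaces the indirect call; no container-of
cast from an embedded interface struct is needed because each callee
receives its concrete type.
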
The pai_t struct field stays as dead weight in pac_t and hpa_shard_t;
it is removed in the next commit along with pai.h itself.

No behavioral changes.

Three changes to make pa_microbench easier to drive for fragmentation
experiments:
- Replace HPA_SHARD_OPTS_DEFAULT use with a single editable g_hpa_opts
global. The microbench does not consult MALLOC_CONF for HPA shard opts,
so this is the place to set the baseline configuration (slab_max_alloc,
hugification_threshold, dirty_mult, hugify_delay_ms, purge_threshold,
hugify_style, etc.).
- Add -n/--nshards N to override the shard count derived from the trace.
When set, each event is routed to (event->shard_ind % N), letting us
study the impact of arena consolidation. Without the flag the behavior
is unchanged (num_shards = max_shard_id + 1).
- Bump MAX_ALLOCATIONS from 10M to 200M so the full ~50M-event adfinder
trace (and similar) fits in the in-memory event buffer.
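
The -n/--nshards routing rule above can be sketched as follows. This is
a minimal sketch under stated assumptions: toy_event_t, toy_num_shards,
and toy_route_event are hypothetical names, and an override of 0 is
assumed to mean that the flag was not given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical trace event; only shard_ind matters for routing. */
typedef struct { uint32_t shard_ind; } toy_event_t;

/* Shard count: the override if set, else max_shard_id + 1. */
uint32_t
toy_num_shards(uint32_t max_shard_id, uint32_t nshards_override) {
	if (nshards_override != 0) {
		return nshards_override;
	}
	return max_shard_id + 1;
}

/* Target shard: shard_ind % N with an override, else shard_ind. */
uint32_t
toy_route_event(const toy_event_t *event, uint32_t nshards_override) {
	assert(event != NULL);
	if (nshards_override == 0) {
		return event->shard_ind;
	}
	return event->shard_ind % nshards_override;
}
```

With an override of 2, events recorded on shards 0, 2, 4, ... land on
shard 0 and the rest on shard 1, which is how the flag folds many
arenas into a few.
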

pa_microbench created its own emap_t per shard in addition to the
arena_emap_global that JET malloc initializes during jet_malloc(16)
at startup, which broke the production assumption of one rtree per
process. Fix this by reusing the existing JET emap.