Commit graph

225 commits

Author SHA1 Message Date
Slobodan Predolac
19bbefe136 Remove dead code: extent_commit_wrapper, large_salloc, tcache_gc_dalloc event waits
These functions had zero callers anywhere in the codebase:
- extent_commit_wrapper: wrapper never called, _impl used directly
- large_salloc: trivial wrapper never called
- tcache_gc_dalloc_new_event_wait: no header declaration, no callers
- tcache_gc_dalloc_postponed_event_wait: no header declaration, no callers
2026-04-01 17:48:19 -04:00
Carl Shapiro
a056c20d67 Handle tcache init failures gracefully
tsd_tcache_data_init() returns true on failure but its callers ignore
this return value, leaving the per-thread tcache in an uninitialized
state after a failure.

This change disables the tcache on an initialization failure and logs
an error message.  If opt_abort is true, it will also abort.

New unit tests have been added to test tcache initialization failures.
2026-03-10 18:14:33 -07:00
Carl Shapiro
1cc563f531 Move bin functions from arena.c to bin.c
This is a clean-up change that gives the bin functions implemented in
the area code a prefix of bin_ and moves them into the bin code.

To further decouple the bin code from the arena code, bin functions
that had taken an arena_t to check arena_is_auto now take an is_auto
parameter instead.
2026-03-10 18:14:33 -07:00
Shirui Cheng
6d4611197e move fill/flush pointer array out of tcache.c 2026-03-10 18:14:33 -07:00
Shirui Cheng
2114349a4e Revert PR #2608: Manually revert commits 70c94d..f9c0b5
Closes: #2707
2026-03-10 18:14:33 -07:00
guangli-dai
6200e8987f Reformat the codebase with the clang-format 18. 2026-03-10 18:14:33 -07:00
Slobodan Predolac
015b017973 [thread_event] Add support for user events in thread events when stats are enabled 2026-03-10 18:14:33 -07:00
Slobodan Predolac
e6864c6075 [thread_event] Remove macros from thread_event and replace with dynamic event objects 2026-03-10 18:14:33 -07:00
guangli-dai
c067a55c79 Introducing a new usize calculation policy
Converting size to usize is what jemalloc has been done by ceiling
size to the closest size class. However, this causes lots of memory
wastes with HPA enabled.  This commit changes how usize is calculated so
that the gap between two contiguous usize is no larger than a page.
Specifically, this commit includes the following changes:

1. Adding a build-time config option (--enable-limit-usize-gap) and a
runtime one (limit_usize_gap) to guard the changes.
When build-time
config is enabled, some minor CPU overhead is expected because usize
will be stored and accessed apart from index.  When runtime option is
also enabled (it can only be enabled with the build-time config
enabled). a new usize calculation approach wil be employed.  This new
calculation will ceil size to the closest multiple of PAGE for all sizes
larger than USIZE_GROW_SLOW_THRESHOLD instead of using the size classes.
Note when the build-time config is enabled, the runtime option is
default on.

2. Prepare tcache for size to grow by PAGE over GROUP*PAGE.
To prepare for the upcoming changes where size class grows by PAGE when
larger than NGROUP * PAGE, disable the tcache when it is larger than 2 *
NGROUP * PAGE. The threshold for tcache is set higher to prevent perf
regression as much as possible while usizes between NGROUP * PAGE and 2 *
NGROUP * PAGE happen to grow by PAGE.

3. Prepare pac and hpa psset for size to grow by PAGE over GROUP*PAGE
For PAC, to avoid having too many bins, arena bins still have the same
layout.  This means some extra search is needed for a page-level request that
is not aligned with the orginal size class: it should also search the heap
before the current index since the previous heap might also be able to
have some allocations satisfying it.  The same changes apply to HPA's
psset.
This search relies on the enumeration of the heap because not all allocs in
the previous heap are guaranteed to satisfy the request.  To balance the
memory and CPU overhead, we currently enumerate at most a fixed number
of nodes before concluding none can satisfy the request during an
enumeration.

4. Add bytes counter to arena large stats.
To prepare for the upcoming usize changes, stats collected by
multiplying alive allocations and the bin size is no longer accurate.
Thus, add separate counters to record the bytes malloced and dalloced.

5. Change structs use when freeing to avoid using index2size for large sizes.
  - Change the definition of emap_alloc_ctx_t
  - Change the read of both from edata_t.
  - Change the assignment and usage of emap_alloc_ctx_t.
  - Change other callsites of index2size.
Note for the changes in the data structure, i.e., emap_alloc_ctx_t,
will be used when the build-time config (--enable-limit-usize-gap) is
enabled but they will store the same value as index2size(szind) if the
runtime option (opt_limit_usize_gap) is not enabled.

6. Adapt hpa to the usize changes.
Change the settings in sec to limit is usage for sizes larger than
USIZE_GROW_SLOW_THRESHOLD and modify corresponding tests.

7. Modify usize calculation and corresponding tests.
Change the sz_s2u_compute. Note sz_index2size is not always safe now
while sz_size2index still works as expected.
2025-03-06 15:08:13 -08:00
Shirui Cheng
7c99686165 Better handle burst allocation on tcache_alloc_small_hard 2024-08-29 10:50:33 -07:00
Shirui Cheng
0c88be9e0a Regulate GC frequency by requiring a time interval between two consecutive GCs 2024-08-29 10:50:33 -07:00
Shirui Cheng
e2c9f3a9ce Take locality into consideration when doing GC flush 2024-08-29 10:50:33 -07:00
Shirui Cheng
14d5dc136a Allow a range for the nfill passed to arena_cache_bin_fill_small 2024-08-29 10:50:33 -07:00
Qi Wang
bd0a5b0f3b Fix static analysis warnings.
Newly reported warnings included several reserved macro identifier, and
false-positive used-uninitialized.
2024-08-28 16:03:53 -07:00
Amaury Séchet
a25b9b8ba9 Simplify the logic when bumping lg_fill_div. 2024-08-06 13:31:49 -07:00
Shirui Cheng
47c9bcd402 Use a for-loop to fulfill flush requests that are larger than CACHE_BIN_NFLUSH_BATCH_MAX items 2024-08-06 13:16:09 -07:00
David Goldblatt
f9c0b5f7f8 Bin batching: add some stats.
This lets us easily see what fraction of flush load is being taken up by the
bins, and helps guide future optimization approaches (for example: should we
prefetch during cache bin fills? It depends on how many objects the average fill
pops out of the batch).
2024-05-22 10:30:31 -07:00
David Goldblatt
fc615739cb Add batching to arena bins.
This adds a fast-path for threads freeing a small number of allocations to
bins which are not their "home-base" and which encounter lock contention in
attempting to do so. In producer-consumer workflows, such small lock hold times
can cause lock convoying that greatly increases overall bin mutex contention.
2024-05-22 10:30:31 -07:00
David Goldblatt
44d91cf243 Tcache flush: Partition by bin before locking.
This accomplishes two things:
- It avoids a full array scan (and any attendant branch prediction misses, etc.)
  while holding the bin lock.
- It allows us to know the number of items that will be flushed before flushing
  them, which will (in an upcoming commit) let us know if it's safe to use the
  batched flush (in which case we won't acquire the bin mutex).
2024-05-22 10:30:31 -07:00
David Goldblatt
6e56848850 Tcache: Split up small/large handling.
The main bits of shared code are the edata filtering and the stats flushing
logic, both of which are fairly simple to read and not so painful to duplicate.
The shared code comes at the cost of guarding all the subtle logic with
`if (small)`, which doesn't feel worth it.
2024-05-22 10:30:31 -07:00
Amaury Séchet
5afff2e44e Simplify the logic in tcache_gc_small. 2024-05-02 18:52:19 -07:00
Qi Wang
fa451de17f Fix the tcache flush sanity checking around ncached and nstashed.
When there were many items stashed, it's possible that after flushing stashed,
ncached is already lower than the remain, in which case the flush can simply
return at that point.
2024-04-12 16:01:55 -07:00
guangli-dai
eda05b3994 Fix static analysis warnings. 2024-01-03 14:18:52 -08:00
Shirui Cheng
e4817c8d89 Cleanup cache_bin_info_t* info input args 2023-10-25 10:27:31 -07:00
guangli-dai
d88fa71bbd Fix nfill = 0 bug when ncached_max is 1 2023-10-18 14:11:46 -07:00
guangli-dai
6fb3b6a8e4 Refactor the tcache initiailization
1. Pre-generate all default tcache ncached_max in tcache_boot;
2. Add getters returning default ncached_max and ncached_max_set;
3. Refactor tcache init so that it is always init with a given setting.
2023-10-18 14:11:46 -07:00
guangli-dai
8a22d10b83 Allow setting default ncached_max for each bin through malloc_conf 2023-10-18 14:11:46 -07:00
guangli-dai
630f7de952 Add mallctl to set and get ncached_max of each cache_bin.
1. `thread_tcache_ncached_max_read_sizeclass` allows users to get the
    ncached_max of the bin with the input sizeclass, passed in through
    oldp (will be upper casted if not an exact bin size is given).
2. `thread_tcache_ncached_max_write` takes in a char array
    representing the settings for bins in the tcache.
2023-10-17 14:53:23 -07:00
guangli-dai
6b197fdd46 Pre-generate ncached_max for all bins for better tcache_max tuning experience. 2023-10-17 14:53:23 -07:00
Qi Wang
72cfdce718 Allocate tcache stack from base allocator
When using metadata_thp, allocate tcache bin stacks from base0, which means they
will be placed on huge pages along with other metadata, instead of mixed with
other regular allocations.

In order to do so, modified the base allocator to support limited reuse: freed
tcached stacks (from thread termination) will be returned to base0 and made
available for reuse, but no merging will be attempted since they were bump
allocated out of base blocks. These reused base extents are managed using
separately allocated base edata_t -- they are cached in base->edata_avail when
the extent is all allocated.

One tricky part is, stats updating must be skipped for such reused extents
(since they were accounted for already, and there is no purging for base). This
requires tracking the "if is reused" state explicitly and bypass the stats
updates when allocating from them.
2023-09-18 12:18:32 -07:00
guangli-dai
a442d9b895 Enable per-tcache tcache_max
1. add tcache_max and nhbins into tcache_t so that they are per-tcache,
   with one auto tcache per thread, it's also per-thread;
2. add mallctl for each thread to set its own tcache_max (of its auto tcache);
3. store the maximum number of items in each bin instead of using a global storage;
4. add tests for the modifications above.
5. Rename `nhbins` and `tcache_maxclass` to `global_do_not_change_nhbins` and `global_do_not_change_tcache_maxclass`.
2023-09-06 10:47:14 -07:00
guangli-dai
fbca96c433 Remove unnecessary parameters for cache_bin_postincrement. 2023-09-06 10:47:14 -07:00
Kevin Svetlitski
3e82f357bb Fix all optimization-inhibiting integer-to-pointer casts
Following from PR #2481, we replace all integer-to-pointer casts [which
hide pointer provenance information (and thus inhibit
optimizations)](https://clang.llvm.org/extra/clang-tidy/checks/performance/no-int-to-ptr.html)
with equivalent operations that preserve this information. I have
enabled the corresponding clang-tidy check in our static analysis CI so
that we do not get bitten by this again in the future.
2023-07-24 14:40:42 -07:00
Qi Wang
f509703af5 Fix two conversion warnings in tcache. 2022-01-04 13:55:06 -08:00
Qi Wang
8b34a788b5 Fix an used-uninitialized warning (false positive). 2021-12-29 14:44:43 -08:00
Qi Wang
e491cef9ab Add stats for stashed bytes in tcache. 2021-12-29 14:44:43 -08:00
Qi Wang
b75822bc6e Implement use-after-free detection using junk and stash.
On deallocation, sampled pointers (specially aligned) get junked and stashed
into tcache (to prevent immediate reuse).  The expected behavior is to have
read-after-free corrupted and stopped by the junk-filling, while
write-after-free is checked when flushing the stashed pointers.
2021-12-29 14:44:43 -08:00
Qi Wang
06aac61c4b Split the core logic of tcache flush into a separate function.
The core function takes a ptr array as input (containing items to be flushed),
which will be reused to flush sanitizer-stashed items.
2021-12-29 14:44:43 -08:00
Qi Wang
041145c272 Report the correct and wrong sizes on sized dealloc bug detection. 2021-02-08 14:42:27 -08:00
Qi Wang
f3b2668b32 Report the offending pointer on sized dealloc bug detection. 2021-02-08 14:42:27 -08:00
David Goldblatt
20140629b4 Bin: Move stats closer to the mutex.
This is a slight cache locality optimization.
2021-02-04 14:10:43 -08:00
David Goldblatt
3967329813 Arena: share bin offsets in a global.
This saves us a cache miss when lookup up the arena bin offset in a remote
arena during tcache flush.  All arenas share the base offset, and so we don't
need to look it up repeatedly for each arena.  Secondarily, it shaves 288 bytes
off the arena on, e.g., x86-64.
2021-02-04 14:10:43 -08:00
David Goldblatt
2fcbd18115 Cache bin: Don't reverse flush order.
The items we pick to flush matter a lot, but the order in which they get flushed
doesn't; just use forward scans.  This simplifies the accessing code, both in
terms of the C and the generated assembly (i.e. this speeds up the flush
pathways).
2021-02-04 14:10:43 -08:00
David Goldblatt
229994a204 Tcache flush: keep common path state in registers.
By carefully force-inlining the division constants and the operation sum count,
we can eliminate redundant operations in the arena-level dalloc function.  Do
so.
2021-02-04 14:10:43 -08:00
David Goldblatt
31a629c3de Tcache flush: prefetch edata contents.
This frontloads more of the miss latency.  It also moves it to a pathway where
we have not yet acquired any locks, so that it should (hopefully) reduce hold
times.
2021-02-04 14:10:43 -08:00
David Goldblatt
9f9247a62e Tcache fluhing: increase cache miss parallelism.
In practice, many rtree_leaf_elm accesses are cache misses.  By restructuring,
we can make it more likely that these misses occur without blocking us from
starting later lookups, taking more of those misses in parallel.
2021-02-04 14:10:43 -08:00
David Goldblatt
181ba7fd4d Tcache flush: Add an emap "batch lookup" path.
For now this is a no-op; but the interface is a little more flexible for our
purposes.
2021-02-04 14:10:43 -08:00
David Goldblatt
c007c537ff Tcache flush: Unify edata lookup path. 2021-02-04 14:10:43 -08:00
David Goldblatt
a011c4c22d cache_bin: Separate out local and remote accesses.
This fixes an incorrect debug-mode assert:
- T1 starts an arena stats update and reads stack_head from another thread's
  cache bin, when that cache bin has 1 item in it.
- T2 allocates from that cache bin.  The cache_bin's stack_head now points to a
  NULL pointer, since the cache bin is empty.
- T1 Re-reads the cache_bin's stack_head to perform an assertion check (since it
  previously saw that the bin was empty, whatever stack_head points to should be
  non-NULL).
2021-01-08 14:18:08 -08:00
Qi Wang
bf72188f80 Allow opt.tcache_max to accept small size classes.
Previously all the small size classes were cached.  However this has downsides
-- particularly when page size is greater than 4K (e.g. iOS), which will result
in much higher SMALL_MAXCLASS.

This change allows tcache_max to be set to lower values, to better control
resources taken by tcache.
2020-10-24 20:43:44 -07:00