romenskiy2012/jemalloc

mirror of https://github.com/jemalloc/jemalloc.git synced 2026-05-15 01:16:23 +03:00

Author	SHA1	Message	Date
Slobodan Predolac	35d102fa32	Encapsulate cache_bin_array_descriptor queue ops behind arena helpers tcache.c was reaching into arena->cache_bin_array_descriptor_ql{,_mtx} directly to register / unregister / postfork-relink its descriptor. That queue and mutex are owned by arena, so the locking and ql_* operations belong in arena.c.	2026-05-13 17:50:41 -04:00
Slobodan Predolac	b92420d309	Replace arena->tcache_ql with cache_bin_array_descriptor_ql walks Drop the duplicate arena->tcache_ql; stats merging walks the cache_bin_array_descriptor_ql directly. Rename the protecting mutex from tcache_ql_mtx to cache_bin_array_descriptor_ql_mtx to match. Add an assertion in test_thread_migrate_arena that the dissociate-time flush zeros cache_bin->tstats.nrequests.	2026-05-13 17:50:41 -04:00
Slobodan Predolac	54ef51121b	Extract postfork-child tcache list relink into tcache_arena_postfork_child	2026-05-13 17:50:41 -04:00
Slobodan Predolac	b6cfaa4fe2	Extract large-cacheable tcache check into tcache_can_cache_large	2026-05-13 17:50:41 -04:00
Slobodan Predolac	3c1c6ae419	Hide bin slab-locality query behind arena_locality_hint bin_t is an arena implementation detail; tcache should not reach into it. Extract the slab-address lookup into bin.c as bin_current_slab_addr, and expose it to tcache only through arena_locality_hint(tsdn, arena, szind), which composes bin_choose + bin_current_slab_addr.	2026-05-13 17:50:41 -04:00
Bin Liu	be2de8ccd8	Introduce pinned extents to contain unpurgeable pages Some pages (e.g., hugetlb pages) cannot be purged, and should be prioritized for reuse. A custom extent_alloc hook signals this by OR'ing EXTENT_ALLOC_FLAG_PINNED into the low bits of the returned pointer; jemalloc strips the flag bits and caches pinned extents in a dedicated ecache_pinned, separate from the dirty/muzzy decay pipeline. Pinned extents do not coalesce eagerly, except for ones larger than SC_LARGE_MINCLASS. A prefer-small policy reuses the smallest fitting pinned extent, to avoid unnecessary split/fragmentation.	2026-05-05 10:44:28 -07:00
Slobodan Predolac	7638093c73	Improve const correctness in the repo	2026-05-04 11:56:17 -07:00
Slobodan Predolac	9ae0c1b6d3	Make huge_arena_ind static within the arena module (Follow up for PR #2904)	2026-05-01 13:24:45 -04:00
guangli-dai	ee4d7b7f9a	Update a0's oversize threshold regardless of whether huge arena is enabled or not.	2026-04-29 13:25:29 -07:00
Slobodan Predolac	6cd31c0985	Fix several typos in the comments	2026-04-22 13:35:18 -04:00
Carl Shapiro	0ac9380cf1	Move bin inline functions from arena_inlines_b.h to bin_inlines.h This is a continuation of my previous clean-up change, now focusing on the inline functions defined in header files.	2026-03-10 18:14:33 -07:00
Carl Shapiro	1cc563f531	Move bin functions from arena.c to bin.c This is a clean-up change that gives the bin functions implemented in the arena code a prefix of bin_ and moves them into the bin code. To further decouple the bin code from the arena code, bin functions that had taken an arena_t to check arena_is_auto now take an is_auto parameter instead.	2026-03-10 18:14:33 -07:00
Slobodan Predolac	6016d86c18	[SEC] Make SEC owned by hpa_shard, simplify the code, add stats, lock per bin	2026-03-10 18:14:33 -07:00
Shirui Cheng	6d4611197e	move fill/flush pointer array out of tcache.c	2026-03-10 18:14:33 -07:00
guangli-dai	2cfa41913e	Refactor init_system_thp_mode and print it in malloc stats.	2026-03-10 18:14:33 -07:00
Shirui Cheng	2114349a4e	Revert PR #2608 : Manually revert commits 70c94d..f9c0b5 Closes: #2707	2026-03-10 18:14:33 -07:00
guangli-dai	6200e8987f	Reformat the codebase with the clang-format 18.	2026-03-10 18:14:33 -07:00
Guangli Dai	01e9ecbeb2	Remove build-time configuration 'config_limit_usize_gap'	2025-05-06 14:47:35 -07:00
Shirui Cheng	3688dfb5c3	fix assertion error in huge_arena_auto_thp_switch() when b0 is deleted in unit test	2025-03-20 12:45:23 -07:00
Shirui Cheng	e1a77ec558	Support THP with Huge Arena in PAC	2025-03-17 16:06:43 -07:00
guangli-dai	c067a55c79	Introducing a new usize calculation policy Until now, jemalloc has converted size to usize by rounding size up to the closest size class. However, this wastes a lot of memory with HPA enabled. This commit changes how usize is calculated so that the gap between two contiguous usizes is no larger than a page. Specifically, this commit includes the following changes:
1. Add a build-time config option (--enable-limit-usize-gap) and a runtime one (limit_usize_gap) to guard the changes. When the build-time config is enabled, some minor CPU overhead is expected because usize will be stored and accessed apart from the index. When the runtime option is also enabled (it can only be enabled with the build-time config enabled), a new usize calculation approach will be employed. This new calculation rounds size up to the closest multiple of PAGE for all sizes larger than USIZE_GROW_SLOW_THRESHOLD instead of using the size classes. Note that when the build-time config is enabled, the runtime option is on by default.
2. Prepare tcache for size to grow by PAGE over GROUP*PAGE. To prepare for the upcoming changes where size class grows by PAGE when larger than NGROUP * PAGE, disable the tcache for sizes larger than 2 * NGROUP * PAGE. The threshold for tcache is set higher to prevent perf regression as much as possible while usizes between NGROUP * PAGE and 2 * NGROUP * PAGE happen to grow by PAGE.
3. Prepare pac and hpa psset for size to grow by PAGE over GROUP*PAGE. For PAC, to avoid having too many bins, arena bins still have the same layout. This means some extra search is needed for a page-level request that is not aligned with the original size class: it should also search the heap before the current index, since the previous heap might also hold some allocations that satisfy it. The same changes apply to HPA's psset. This search relies on the enumeration of the heap because not all allocs in the previous heap are guaranteed to satisfy the request. To balance the memory and CPU overhead, we currently enumerate at most a fixed number of nodes before concluding none can satisfy the request during an enumeration.
4. Add bytes counter to arena large stats. To prepare for the upcoming usize changes, stats collected by multiplying alive allocations and the bin size are no longer accurate. Thus, add separate counters to record the bytes malloced and dalloced.
5. Change structs used when freeing to avoid using index2size for large sizes.
- Change the definition of emap_alloc_ctx_t
- Change the read of both from edata_t.
- Change the assignment and usage of emap_alloc_ctx_t.
- Change other callsites of index2size.
Note that the changes in the data structure, i.e., emap_alloc_ctx_t, will be used when the build-time config (--enable-limit-usize-gap) is enabled, but they will store the same value as index2size(szind) if the runtime option (opt_limit_usize_gap) is not enabled.
6. Adapt hpa to the usize changes. Change the settings in sec to limit its usage for sizes larger than USIZE_GROW_SLOW_THRESHOLD and modify corresponding tests.
7. Modify usize calculation and corresponding tests. Change the sz_s2u_compute. Note sz_index2size is not always safe now while sz_size2index still works as expected.	2025-03-06 15:08:13 -08:00
Dmitry Ilvokhin	499f306859	Fix arena 0 `deferral_allowed` flag init Arena 0 has a dedicated initialization path, which differs from the initialization path of other arenas. The main difference for the purpose of this change is that we initialize arena 0 before we initialize background threads. HPA shard options have a `deferral_allowed` flag which should be equal to the `background_thread_enabled()` return value, but that wasn't the case before this change, because for arena 0 `background_thread_enabled()` was initialized correctly only after the arena 0 initialization phase had already ended. Below is the initialization sequence for arena 0 after this commit, illustrating that everything should still be initialized correctly.
* `hpa_central_init` initializes HPA Central, before we initialize every HPA shard (including arena 0's).
* `background_thread_boot1` initializes `background_thread_enabled()` return value.
* `pa_shard_enable_hpa` initializes arena 0 HPA shard.
```
                           malloc_init_hard -------------
                          /           /                  \
                         /           /                    \
                        /           /                      \
malloc_init_hard_a0_locked  background_thread_boot1  pa_shard_enable_hpa
          /                         /                          \
         /                         /                            \
        /                         /                              \
arena_boot    background_thread_enabled_seta           hpa_shard_init
     |
     |
pa_central_init
     |
     |
hpa_central_init
```
	2025-02-18 12:10:35 -08:00
Shirui Cheng	14d5dc136a	Allow a range for the nfill passed to arena_cache_bin_fill_small	2024-08-29 10:50:33 -07:00
Qi Wang	bd0a5b0f3b	Fix static analysis warnings. Newly reported warnings included several reserved macro identifiers, and false-positive used-uninitialized.	2024-08-28 16:03:53 -07:00
David Goldblatt	fc615739cb	Add batching to arena bins. This adds a fast-path for threads freeing a small number of allocations to bins which are not their "home-base" and which encounter lock contention in attempting to do so. In producer-consumer workflows, such small lock hold times can cause lock convoying that greatly increases overall bin mutex contention.	2024-05-22 10:30:31 -07:00
David Goldblatt	c085530c71	Tcache batching: Plumbing In the next commit, we'll start using the batcher to eliminate mutex traffic. To avoid cluttering up that commit with the random bits of busy-work it entails, we'll centralize them here. This commit introduces: - A batched bin type. - The ability to mix batched and unbatched bins in the arena. - Conf parsing to set batches per size and a max batched size. - mallctl access to the corresponding opt-namespace keys. - Stats output of the above.	2024-05-22 10:30:31 -07:00
Qi Wang	8d8379da44	Fix background_thread creation for the oversize_arena. Bypassing background thread creation for the oversize_arena used to be an optimization since that arena had eager purging. However #2466 changed the purging policy for the oversize_arena -- specifically it switched to the default decay time when background_thread is enabled. This issue is noticeable when the number of arenas is low: whenever the total # of arenas is <= 4 (which is the default max # of background threads), in which case the purging will be stalled since no background thread is created for the oversize_arena.	2024-05-02 14:45:18 -07:00
Shirui Cheng	5081c16bb4	Experimental calloc implementation with using memset on larger sizes	2024-04-04 15:31:56 -07:00
guangli-dai	eda05b3994	Fix static analysis warnings.	2024-01-03 14:18:52 -08:00
Shirui Cheng	e4817c8d89	Cleanup cache_bin_info_t* info input args	2023-10-25 10:27:31 -07:00
Qi Wang	3025b021b9	Optimize mutex and bin alignment / locality.	2023-10-23 20:28:26 -07:00
guangli-dai	d88fa71bbd	Fix nfill = 0 bug when ncached_max is 1	2023-10-18 14:11:46 -07:00
guangli-dai	6b197fdd46	Pre-generate ncached_max for all bins for better tcache_max tuning experience.	2023-10-17 14:53:23 -07:00
Shirui Cheng	36becb1302	metadata usage breakdowns: tracking edata and rtree usages	2023-10-11 11:56:01 -07:00
guangli-dai	a442d9b895	Enable per-tcache tcache_max 1. add tcache_max and nhbins into tcache_t so that they are per-tcache, with one auto tcache per thread, it's also per-thread; 2. add mallctl for each thread to set its own tcache_max (of its auto tcache); 3. store the maximum number of items in each bin instead of using a global storage; 4. add tests for the modifications above. 5. Rename `nhbins` and `tcache_maxclass` to `global_do_not_change_nhbins` and `global_do_not_change_tcache_maxclass`.	2023-09-06 10:47:14 -07:00
Kevin Svetlitski	424dd61d57	Issue a warning upon directly accessing an arena's bins An arena's bins should normally be accessed via the `arena_get_bin` function, which properly takes into account bin-shards. To ensure that we don't accidentally commit code which incorrectly accesses the bins directly, we mark the field with `__attribute__((deprecated))` with an appropriate warning message, and suppress the warning in the few places where directly accessing the bins is allowed.	2023-08-04 15:47:05 -07:00
Kevin Svetlitski	07a2eab3ed	Stop over-reporting memory usage from sampled small allocations @interwq noticed [while reviewing an earlier PR](https://github.com/jemalloc/jemalloc/pull/2478#discussion_r1256217261) that I missed modifying this statistics accounting in line with the rest of the changes from #2459. This is now fixed, such that sampled small allocations increment the `.nmalloc`/`.ndalloc` of their effective bin size instead of over-reporting memory usage by attributing all such allocations to `SC_LARGE_MINCLASS`.	2023-08-03 16:12:22 -07:00
Kevin Svetlitski	62648c88e5	Ensured sampled allocations are properly deallocated during `arena_reset` Sampled allocations were not being demoted before being deallocated during an `arena_reset` operation.	2023-08-01 11:35:37 -07:00
Kevin Svetlitski	3e82f357bb	Fix all optimization-inhibiting integer-to-pointer casts Following from PR #2481, we replace all integer-to-pointer casts [which hide pointer provenance information (and thus inhibit optimizations)](https://clang.llvm.org/extra/clang-tidy/checks/performance/no-int-to-ptr.html) with equivalent operations that preserve this information. I have enabled the corresponding clang-tidy check in our static analysis CI so that we do not get bitten by this again in the future.	2023-07-24 14:40:42 -07:00
Kevin Svetlitski	589c63b424	Make eligible global variables `static` and/or `const` For better or worse, Jemalloc has a significant number of global variables. Making all eligible global variables `static` and/or `const` at least makes it slightly easier to reason about them, as these qualifications communicate to the programmer restrictions on their use without having to `grep` the whole codebase.	2023-07-06 14:15:12 -07:00
Kevin Svetlitski	5a858c64d6	Reduce the memory overhead of sampled small allocations Previously, small allocations which were sampled as part of heap profiling were rounded up to `SC_LARGE_MINCLASS`. This additional memory usage becomes problematic when the page size is increased, as noted in #2358. Small allocations are now rounded up to the nearest multiple of `PAGE` instead, reducing the memory overhead by a factor of 4 in the most extreme cases.	2023-07-03 16:19:06 -07:00
Qi Wang	d131331310	Avoid eager purging on the dedicated oversize arena when using bg thds. We have observed new workload patterns (namely ML training type) that cycle through oversized allocations frequently, because 1) the dataset might be sparse which is faster to go through, and 2) GPU accelerated. As a result, the eager purging from the oversize arena becomes a bottleneck. To offer an easy solution, allow normal purging of the oversized extents when background threads are enabled.	2023-06-27 11:57:41 -07:00
Qi Wang	86eb49b478	Fix the arena selection for oversized allocations. Use the per-arena oversize_threshold, instead of the global setting.	2023-06-06 15:03:13 -07:00
Kevin Svetlitski	fc680128e0	Remove errant `assert` in `arena_extent_alloc_large` This codepath may generate deferred work when the HPA is enabled. See also [@davidtgoldblatt's relevant comment on the PR which introduced this](https://github.com/jemalloc/jemalloc/pull/2107#discussion_r699770967) which prevented a similarly incorrect `assert` from being added elsewhere.	2023-05-01 10:00:30 -07:00
Qi Wang	b6125120ac	Add an explicit name to the dedicated oversize arena.	2023-02-17 13:31:09 -08:00
Guangli Dai	ba19d2cb78	Add arena-level name. An arena-level name can help identify manual arenas.	2022-09-16 15:04:59 -07:00
Azat Khuzhin	cb578bbe01	Fix possible "nmalloc >= ndalloc" assertion In arena_stats_merge(), nmalloc was read first and ndalloc after it. With this order, however, another thread can increment ndalloc in between, so that nmalloc < ndalloc and the assertion fails, as again found by ClickHouse CI [1] (even after #2234). [1]: https://github.com/ClickHouse/ClickHouse/issues/31531 Swap the order to avoid a possible assertion failure. Cc: @interwq Follow-up for: #2234	2022-07-11 15:27:51 -07:00
Azat Khuzhin	78b58379c8	Fix possible "nmalloc >= ndalloc" assertion. It is possible that ndalloc will be updated before nmalloc in arena_large_ralloc_stats_update(); fix this by reordering those calls. It was found by ClickHouse CI, which periodically hits this assertion [1]. [1]: https://github.com/ClickHouse/ClickHouse/issues/31531 That issue contains lots of examples, with core dump and some gdb output [2]. [2]: https://s3.amazonaws.com/clickhouse-test-reports/34951/96390a9263cb5af3d6e42a84988239c9ae87ce32/stress_test__debug__actions_.html Here you can find binaries for that particular report [3]; you need the clickhouse debug build [4]. [3]: https://s3.amazonaws.com/clickhouse-builds/34951/96390a9263cb5af3d6e42a84988239c9ae87ce32/clickhouse_build_check_(actions)/report.html [4]: https://s3.amazonaws.com/clickhouse-builds/34951/96390a9263cb5af3d6e42a84988239c9ae87ce32/package_debug/clickhouse Brief info from that report: 2 0x000000002ad6dbfe in arena_stats_merge (tsdn=0x7f2399abdd20, arena=0x7f241ce01080, nthreads=0x7f24e4360958, dss=0x7f24e4360960, dirty_decay_ms=0x7f24e4360968, muzzy_decay_ms=0x7f24e4360970, nactive=0x7f24e4360978, ndirty=0x7f24e43 e4360988, astats=0x7f24e4360998, bstats=0x7f24e4363310, lstats=0x7f24e4364990, estats=0x7f24e4366e50, hpastats=0x7f24e43693a0, secstats=0x7f24e436a020) at ../contrib/jemalloc/src/arena.c:138 ndalloc = 226 nflush = 0 curlextents = 0 nmalloc = 225 nrequests = 0 Here you can see that they differ only by 1. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-03-01 12:28:28 -08:00
Qi Wang	e491cef9ab	Add stats for stashed bytes in tcache.	2021-12-29 14:44:43 -08:00
Qi Wang	b75822bc6e	Implement use-after-free detection using junk and stash. On deallocation, sampled pointers (specially aligned) get junked and stashed into tcache (to prevent immediate reuse). The expected behavior is to have read-after-free corrupted and stopped by the junk-filling, while write-after-free is checked when flushing the stashed pointers.	2021-12-29 14:44:43 -08:00
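
Commit be2de8ccd8 above describes an extent_alloc hook that ORs a flag into the low bits of the pointer it returns, with the allocator stripping those bits afterwards. The following is a minimal sketch of that low-bit tagging idea only, assuming page-aligned addresses. The page size, the flag value, and every `demo_` name are hypothetical and are not jemalloc's definitions (the real flag is EXTENT_ALLOC_FLAG_PINNED, whose value is not given in the log). The sketch stays in `uintptr_t` throughout so that it does not cast integers back to pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values for illustration; not jemalloc's actual definitions. */
#define DEMO_PAGE ((uintptr_t)4096)
#define DEMO_FLAG_PINNED ((uintptr_t)0x1)
/* A page-aligned address has all bits below the page size clear. */
#define DEMO_FLAG_MASK (DEMO_PAGE - 1)

/* Hook side: mark a page-aligned address as pinned by setting a low bit. */
static uintptr_t
demo_tag_pinned(uintptr_t addr) {
	assert((addr & DEMO_FLAG_MASK) == 0); /* low bits must be free */
	return addr | DEMO_FLAG_PINNED;
}

/* Allocator side: test the flag before stripping it. */
static bool
demo_is_pinned(uintptr_t tagged) {
	return (tagged & DEMO_FLAG_PINNED) != 0;
}

/* Allocator side: recover the original address by clearing the low bits. */
static uintptr_t
demo_strip_flags(uintptr_t tagged) {
	return tagged & ~DEMO_FLAG_MASK;
}
```

Tagging works here only because alignment guarantees the low bits are zero; an unaligned address would lose information when the flag bits are stripped, which is why the sketch asserts alignment before tagging.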


564 commits
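
Commits c067a55c79 and 5a858c64d6 in the log above both describe rounding a size up to the nearest multiple of PAGE, the former only for sizes above USIZE_GROW_SLOW_THRESHOLD. A rough sketch of that arithmetic follows, assuming a 4 KiB page. The `demo_` names, the page size, and the way the size-class result is passed in by the caller are assumptions made for illustration; this is not the code of sz_s2u_compute.

```c
#include <stddef.h>

/* Hypothetical page size for illustration; jemalloc's PAGE is set at build time. */
#define DEMO_PAGE ((size_t)4096)

/*
 * Round size up to the nearest multiple of DEMO_PAGE.
 * Relies on DEMO_PAGE being a power of two; sizes within DEMO_PAGE - 1 of
 * SIZE_MAX would wrap around and are not handled in this sketch.
 */
static size_t
demo_page_ceiling(size_t size) {
	return (size + DEMO_PAGE - 1) & ~(DEMO_PAGE - 1);
}

/*
 * Policy sketch: above the threshold, use the page ceiling; at or below it,
 * keep the usize that the size-class lookup produced (supplied by the
 * caller here, since the size-class table is out of scope).
 */
static size_t
demo_limit_gap_usize(size_t size, size_t class_usize, size_t threshold) {
	return size > threshold ? demo_page_ceiling(size) : class_usize;
}
```

For example, with a threshold of 65536, `demo_limit_gap_usize(70000, 81920, 65536)` yields 73728 (18 pages) rather than the caller-supplied 81920, which illustrates how capping the gap at one page reduces internal fragmentation for large requests.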