romenskiy2012/jemalloc

mirror of https://github.com/jemalloc/jemalloc.git synced 2026-06-03 10:44:16 +03:00

Author	SHA1	Message	Date
Slobodan Predolac	ca77aca653	Split ctl handlers by mallctl namespace Move mallctl handler implementations out of src/ctl.c into namespace-oriented ctl_* modules, while keeping the mallctl tree and dispatch machinery centralized in ctl.c. Add shared ctl mallctl helper macros and thin ctl arena/stat interfaces needed by the split modules. Wire the new sources into Makefile.in and MSVC project files. Keep behavior unchanged; this is intended as a readability and navigation refactor.	2026-05-28 19:11:30 -07:00
Slobodan Predolac	44bb61e19e	Remove utilization query mallctl	2026-05-28 10:54:42 -07:00
Slobodan Predolac	39d4b20890	Remove pactivep mallctl	2026-05-28 10:52:10 -07:00
Slobodan Predolac	c9bbaff123	Deduplicate arena create ctl	2026-05-28 08:58:06 -07:00
Slobodan Predolac	9cecc7bfa7	Remove safety check abort mallctl	2026-05-28 08:58:06 -07:00
Slobodan Predolac	bbe86b591f	Deduplicate prof hook ctl handlers	2026-05-28 08:58:06 -07:00
Slobodan Predolac	366a2cd9c0	Internalize malloc_conf_2_conf_harder ctl	2026-05-28 08:58:06 -07:00
Slobodan Predolac	533ffb67e3	Refactor arena ctl helpers	2026-05-28 08:58:06 -07:00
Slobodan Predolac	78cbeaf8a4	Remove batch_alloc mallctl	2026-05-22 23:34:10 -07:00
Tony Printezis	f008ce9fe1	Remove hpa_sec_batch_fill_extra and calculate nallocs automatically. This change includes the following improvements: - Remove the hpa_sec_batch_fill_extra parameter. - Refactor the hpa_alloc() code and helper functions to be able to allocate more than one extent out of a single pageslab. This way we can amortize the per-pageslab costs (active bitmap iteration, pageslab metadata updates) across multiple extents. - Decide on a min and max number of extents that will be allocated in hpa_alloc(). The code will try to allocate at least the min and allocate up to the max as long as we can allocate additional ones from the pageslab we already have, as additional allocations are relatively cheap. - Add extent allocation distribution stats. - Amend hpa_sec_integration.c unit test.	2026-05-14 11:00:33 -07:00
Slobodan Predolac	ff2c2548a3	Remove generic experimental hooks	2026-05-13 18:27:43 -04:00
Slobodan Predolac	b92420d309	Replace arena->tcache_ql with cache_bin_array_descriptor_ql walks Drop the duplicate arena->tcache_ql; stats merging walks the cache_bin_array_descriptor_ql directly. Rename the protecting mutex from tcache_ql_mtx to cache_bin_array_descriptor_ql_mtx to match. Add an assertion in test_thread_migrate_arena that the dissociate-time flush zeros cache_bin->tstats.nrequests.	2026-05-13 17:50:41 -04:00
Slobodan Predolac	f9c84860e0	Fold tcache reassociation into thread_migrate_arena helper	2026-05-13 17:50:41 -04:00
Christoph Grüninger	5acdcee600	Remove extra semicolon after macros There is no compatible way to consume the semicolon by the macro. Found by GCC 16 (pedantic).	2026-05-08 11:39:56 -07:00
Bin Liu	be2de8ccd8	Introduce pinned extents to contain unpurgeable pages Some pages (e.g., hugetlb pages) cannot be purged, and should be prioritized for reuse. A custom extent_alloc hook signals this by OR'ing EXTENT_ALLOC_FLAG_PINNED into the low bits of the returned pointer; jemalloc strips the flag bits and caches pinned extents in a dedicated ecache_pinned, separate from the dirty/muzzy decay pipeline. Pinned extents do not coalesce eagerly, except for ones larger than SC_LARGE_MINCLASS. A prefer-small policy reuses the smallest fitting pinned extent, to avoid unnecessary split/fragmentation.	2026-05-05 10:44:28 -07:00
Slobodan Predolac	eab2b29736	Fix off-by-one in stats_arenas_i_bins_j and stats_arenas_i_lextents_j bounds checks Same pattern as arenas_bin_i_index: used > instead of >= allowing access one past the end of bstats[] and lstats[] arrays. Add unit tests that verify boundary indices return ENOENT.	2026-04-01 23:15:19 -04:00
Slobodan Predolac	513778bcb1	Fix off-by-one in arenas_bin_i_index and arenas_lextent_i_index bounds checks The index validation used > instead of >=, allowing access at index SC_NBINS (for bins) and SC_NSIZES-SC_NBINS (for lextents), which are one past the valid range. This caused out-of-bounds reads in bin_infos[] and sz_index2size_unsafe(). Add unit tests that verify the boundary indices return ENOENT.	2026-04-01 23:15:19 -04:00
Slobodan Predolac	176ea0a801	Remove experimental.thread.activity_callback	2026-04-01 16:23:41 -07:00
Slobodan Predolac	34ace9169b	Remove prof_threshold built-in event. It is trivial to implement it as user event if needed	2026-03-10 18:14:33 -07:00
Slobodan Predolac	6016d86c18	[SEC] Make SEC owned by hpa_shard, simplify the code, add stats, lock per bin	2026-03-10 18:14:33 -07:00
Guangli Dai	0988583d7c	Add a mallctl for users to get an approximate of active bytes.	2026-03-10 18:14:33 -07:00
Slobodan Predolac	47aeff1d08	Add experimental_enforce_hugify	2026-03-10 18:14:33 -07:00
Slobodan Predolac	3678a57c10	When extracting from central, hugify_eager is different than start_as_huge	2026-03-10 18:14:33 -07:00
Slobodan Predolac	a199278f37	[HPA] Add ability to start page as huge and more flexibility for purging	2026-03-10 18:14:33 -07:00
Shirui Cheng	2114349a4e	Revert PR #2608 : Manually revert commits 70c94d..f9c0b5 Closes: #2707	2026-03-10 18:14:33 -07:00
guangli-dai	6200e8987f	Reformat the codebase with the clang-format 18.	2026-03-10 18:14:33 -07:00
Slobodan Predolac	015b017973	[thread_event] Add support for user events in thread events when stats are enabled	2026-03-10 18:14:33 -07:00
Jason Evans	27d7960cf9	Revert "Extend purging algorithm with peak demand tracking" This reverts commit `ad108d50f1`.	2025-06-02 10:44:37 -07:00
guangli-dai	8347f1045a	Renaming limit_usize_gap to disable_large_size_classes	2025-05-06 14:47:35 -07:00
Guangli Dai	01e9ecbeb2	Remove build-time configuration 'config_limit_usize_gap'	2025-05-06 14:47:35 -07:00
Shirui Cheng	e1a77ec558	Support THP with Huge Arena in PAC	2025-03-17 16:06:43 -07:00
Dmitry Ilvokhin	ad108d50f1	Extend purging algorithm with peak demand tracking Implementation inspired by idea described in "Beyond malloc efficiency to fleet efficiency: a hugepage-aware memory allocator" paper [1]. Primary idea is to track maximum number (peak) of active pages in use with sliding window and then use this number to decide how many dirty pages we would like to keep. We are trying to estimate maximum amount of active memory we'll need in the near future. We do so by projecting future active memory demand (based on peak active memory usage we observed in the past within sliding window) and adding slack on top of it (an overhead is reasonable to have in exchange of higher hugepages coverage). When peak demand tracking is off, projection of future active memory is active memory we are having right now. Estimation is essentially the same as `nactive_max * (1 + dirty_mult)`. Peak demand purging algorithm controlled by two config options. Option `hpa_peak_demand_window_ms` controls duration of sliding window we track maximum active memory usage in and option `hpa_dirty_mult` controls amount of slack we are allowed to have as a percent from maximum active memory usage. By default `hpa_peak_demand_window_ms == 0` now and we have same behaviour (ratio based purging) that we had before this commit. [1]: https://storage.googleapis.com/gweb-research2023-media/pubtools/6170.pdf	2025-03-13 10:12:22 -07:00
Qi Wang	22440a0207	Implement process_madvise support. Add opt.process_madvise_max_batch which determines if process_madvise is enabled (non-zero) and the max # of regions in each batch. Added another limiting factor which is the space to reserve on stack, which results in the max batch of 128.	2025-03-07 15:32:32 -08:00
guangli-dai	c067a55c79	Introducing a new usize calculation policy Converting size to usize is what jemalloc has been done by ceiling size to the closest size class. However, this causes lots of memory wastes with HPA enabled. This commit changes how usize is calculated so that the gap between two contiguous usize is no larger than a page. Specifically, this commit includes the following changes: 1. Adding a build-time config option (--enable-limit-usize-gap) and a runtime one (limit_usize_gap) to guard the changes. When build-time config is enabled, some minor CPU overhead is expected because usize will be stored and accessed apart from index. When runtime option is also enabled (it can only be enabled with the build-time config enabled). a new usize calculation approach wil be employed. This new calculation will ceil size to the closest multiple of PAGE for all sizes larger than USIZE_GROW_SLOW_THRESHOLD instead of using the size classes. Note when the build-time config is enabled, the runtime option is default on. 2. Prepare tcache for size to grow by PAGE over GROUPPAGE. To prepare for the upcoming changes where size class grows by PAGE when larger than NGROUP PAGE, disable the tcache when it is larger than 2 * NGROUP * PAGE. The threshold for tcache is set higher to prevent perf regression as much as possible while usizes between NGROUP * PAGE and 2 * NGROUP * PAGE happen to grow by PAGE. 3. Prepare pac and hpa psset for size to grow by PAGE over GROUP*PAGE For PAC, to avoid having too many bins, arena bins still have the same layout. This means some extra search is needed for a page-level request that is not aligned with the orginal size class: it should also search the heap before the current index since the previous heap might also be able to have some allocations satisfying it. The same changes apply to HPA's psset. This search relies on the enumeration of the heap because not all allocs in the previous heap are guaranteed to satisfy the request. To balance the memory and CPU overhead, we currently enumerate at most a fixed number of nodes before concluding none can satisfy the request during an enumeration. 4. Add bytes counter to arena large stats. To prepare for the upcoming usize changes, stats collected by multiplying alive allocations and the bin size is no longer accurate. Thus, add separate counters to record the bytes malloced and dalloced. 5. Change structs use when freeing to avoid using index2size for large sizes. - Change the definition of emap_alloc_ctx_t - Change the read of both from edata_t. - Change the assignment and usage of emap_alloc_ctx_t. - Change other callsites of index2size. Note for the changes in the data structure, i.e., emap_alloc_ctx_t, will be used when the build-time config (--enable-limit-usize-gap) is enabled but they will store the same value as index2size(szind) if the runtime option (opt_limit_usize_gap) is not enabled. 6. Adapt hpa to the usize changes. Change the settings in sec to limit is usage for sizes larger than USIZE_GROW_SLOW_THRESHOLD and modify corresponding tests. 7. Modify usize calculation and corresponding tests. Change the sz_s2u_compute. Note sz_index2size is not always safe now while sz_size2index still works as expected.	2025-03-06 15:08:13 -08:00
Shai Duvdevani	257e64b968	Unlike `prof_sample` which is supported only with profiling mode active, `prof_threshold` is intended to be an always-supported allocation callback with much less overhead. The usage of the threshold allows performance critical callers to change program execution based on the callback: e.g. drop caches when memory becomes high or to predict the program is about to OOM ahead of time using peak memory watermarks.	2025-01-29 18:55:52 -08:00
Qi Wang	607b866035	Check for 0 input when setting max_background_thread through mallctl. Reported by @nc7s.	2025-01-28 10:38:56 -08:00
Dmitry Ilvokhin	6092c980a6	Expose `psset` state stats When evaluating changes in HPA logic, it is useful to know internal `hpa_shard` state. Great deal of this state is `psset`. Some of the `psset` stats was available, but in disaggregated form, which is not very convenient. This commit exposed `psset` counters to `mallctl` and malloc stats dumps. Example of how malloc stats dump will look like after the change. HPA shard stats: Pageslabs: 14899 (4354 huge, 10545 nonhuge) Active pages: 6708166 (2228917 huge, 4479249 nonhuge) Dirty pages: 233816 (331 huge, 233485 nonhuge) Retained pages: 686306 Purge passes: 8730 (10 / sec) Purges: 127501 (146 / sec) Hugeifies: 4358 (5 / sec) Dehugifies: 4 (0 / sec) Pageslabs, active pages, dirty pages and retained pages are rows added by this change.	2024-11-21 09:23:32 -08:00
Dmitry Ilvokhin	0ce13c6fb5	Add opt `hpa_hugify_sync` to hugify synchronously Linux 6.1 introduced `MADV_COLLAPSE` flag to perform a best-effort synchronous collapse of the native pages mapped by the memory range into transparent huge pages. Synchronous hugification might be beneficial for at least two reasons: we are not relying on khugepaged anymore and get an instant feedback if range wasn't hugified. If `hpa_hugify_sync` option is on, we'll try to perform synchronously collapse and if it wasn't successful, we'll fallback to asynchronous behaviour.	2024-11-20 10:52:52 -08:00
Nathan Slingerland	edc1576f03	Add safe frame-pointer backtrace unwinder	2024-10-01 11:01:56 -07:00
Dmitry Ilvokhin	4f4fd42447	Remove `strict_min_purge_interval` option Option `experimental_hpa_strict_min_purge_interval` was expected to be temporary to simplify rollout of a bugfix. Now, when bugfix rollout is complete it is safe to remove this option.	2024-09-25 11:49:18 -07:00
Shirui Cheng	f68effe4ac	Add a runtime option opt_experimental_tcache_gc to guard the new design	2024-08-29 10:50:33 -07:00
Qi Wang	bd0a5b0f3b	Fix static analysis warnings. Newly reported warnings included several reserved macro identifier, and false-positive used-uninitialized.	2024-08-28 16:03:53 -07:00
Dmitry Ilvokhin	c7ccb8d7e9	Add `experimental` prefix to `hpa_strict_min_purge_interval` Goal is to make it obvious this option is experimental.	2024-08-20 10:02:38 -07:00
Dmitry Ilvokhin	aaa29003ab	Limit maximum number of purged slabs with option Option `experimental_hpa_max_purge_nhp` introduced for backward compatibility reasons: to make it possible to have behaviour similar to buggy `hpa_strict_min_purge_interval` implementation. When `experimental_hpa_max_purge_nhp` is set to -1, there is no limit to number of slabs we'll purge on each iteration. Otherwise, we'll purge no more than `experimental_hpa_max_purge_nhp` hugepages (slabs). This in turn means we might not purge enough dirty pages to satisfy `hpa_dirty_mult` requirement. Combination of `hpa_dirty_mult`, `experimental_hpa_max_purge_nhp` and `hpa_strict_min_purge_interval` options allows us to have steady rate of pages returned back to the system. This provides a strickier latency guarantees as number of `madvise` calls is bounded (and hence number of TLB shootdowns is limited) in exchange to weaker memory usage guarantees.	2024-08-20 10:02:38 -07:00
Shirui Cheng	48f66cf4a2	add a size check when declare a stack array to be less than 2048 bytes	2024-08-06 13:16:09 -07:00
Dmitry Ilvokhin	867c6dd7dc	Option to guard `hpa_min_purge_interval_ms` fix Change in `hpa_min_purge_interval_ms` handling logic is not backward compatible as it might increase memory usage. Now this logic guarded by `hpa_strict_min_purge_interval` option. When `hpa_strict_min_purge_interval` is true, we will purge no more than `hpa_min_purge_interval_ms`. When `hpa_strict_min_purge_interval` is false, old purging logic behaviour is preserved. Long term strategy migrate all users of hpa to new logic and then delete `hpa_strict_min_purge_interval` option.	2024-06-07 10:52:41 -07:00
Dmitry Ilvokhin	90c627edb7	Export hugepage size with `arenas.hugepage`	2024-06-05 15:37:41 -07:00
David Goldblatt	f9c0b5f7f8	Bin batching: add some stats. This lets us easily see what fraction of flush load is being taken up by the bins, and helps guide future optimization approaches (for example: should we prefetch during cache bin fills? It depends on how many objects the average fill pops out of the batch).	2024-05-22 10:30:31 -07:00
David Goldblatt	fc615739cb	Add batching to arena bins. This adds a fast-path for threads freeing a small number of allocations to bins which are not their "home-base" and which encounter lock contention in attempting to do so. In producer-consumer workflows, such small lock hold times can cause lock convoying that greatly increases overall bin mutex contention.	2024-05-22 10:30:31 -07:00
David Goldblatt	c085530c71	Tcache batching: Plumbing In the next commit, we'll start using the batcher to eliminate mutex traffic. To avoid cluttering up that commit with the random bits of busy-work it entails, we'll centralize them here. This commit introduces: - A batched bin type. - The ability to mix batched and unbatched bins in the arena. - Conf parsing to set batches per size and a max batched size. - mallctl access to the corresponding opt-namespace keys. - Stats output of the above.	2024-05-22 10:30:31 -07:00

1 2 3 4 5 ...

389 commits