Commit graph

306 commits

Author SHA1 Message Date
Bin Liu
be2de8ccd8 Introduce pinned extents to contain unpurgeable pages
Some pages (e.g., hugetlb pages) cannot be purged, and should be
prioritized for reuse.  A custom extent_alloc hook signals this by
OR'ing EXTENT_ALLOC_FLAG_PINNED into the low bits of the returned
pointer; jemalloc strips the flag bits and caches pinned extents in
a dedicated ecache_pinned, separate from the dirty/muzzy decay
pipeline.

Pinned extents do not coalesce eagerly, except for ones larger than
SC_LARGE_MINCLASS.  A prefer-small policy reuses the smallest fitting
pinned extent, to avoid unnecessary split/fragmentation.
2026-05-05 10:44:28 -07:00
Slobodan Predolac
6e0b8e6daa Improve unit test coverage for jemalloc_init, arenas_management, and jemalloc_fork modules 2026-04-30 15:17:18 -04:00
Slobodan Predolac
ba1e2fe4db Extract fork orchestration into jemalloc_fork module 2026-04-30 15:17:18 -04:00
Slobodan Predolac
ec07fc3c5f Extract initialization logic from jemalloc.c into jemalloc_init module 2026-04-30 15:17:18 -04:00
Slobodan Predolac
e0a8401533 Extract arena state and management from jemalloc.c into arenas_management module 2026-04-30 15:17:18 -04:00
lexprfuncall
1a15fe33a4
Replace std::__throw_bad_alloc call with standard C++ (#2900)
* Replace std::__throw_bad_alloc call with standard C++

Since December of 2025, std::__throw_bad_alloc is no longer visible
through #include <new> causing jemalloc build failures with gcc 16.
As far as I can tell, all std::__throw_bad_alloc did was arrange to
raise a std::bad_alloc exception if exceptions are enabled.  I am not
sure whether its usage was truly meaningful in jemalloc since the call
is wrapped in a try catch and any usage of try catch is considered an
error when compiling with -fno-exceptions on gcc, at least.

This change adds a check to configure.ac that determines whether
exceptions are enabled by compiling a simple try catch that raises a
std::bad_alloc exception.  If that test succeeds, the macro
JEMALLOC_HAVE_CXX_EXCEPTIONS is defined, and jemalloc will raise an
exception.  Otherwise, we call std::terminate() to abort.

This was tested on FreeBSD with the gcc16 port with and without exceptions
enabled.

* Replace std::set_new_handler calls with std::get_new_handler

Previously, std::set_new_handler was used as a workaround for
compilers with only partial support for C++11.  Now that C++14 is a
requirement to enable C++ support, we can assume std::get_new_handler
is available.
2026-04-27 11:50:27 -07:00
guangli-dai
bb0a6aca10 Allow spaces in prefix. 2026-04-27 09:59:59 -07:00
Integral
1f44a8b11d
Change permission modes of static libraries to 644 (#2885)
Static libraries are archives and do not need the executable bit.
2026-04-19 14:13:02 -07:00
Slobodan Predolac
a0f2bdf91d Fix missing negation in large_ralloc_no_move usize_min fallback
The second expansion attempt in large_ralloc_no_move omitted the !
before large_ralloc_no_move_expand(), inverting the return value.
On expansion failure, the function falsely reported success, making
callers believe the allocation was expanded in-place when it was not.
On expansion success, the function falsely reported failure, causing
callers to unnecessarily allocate, copy, and free.

Add unit test that verifies the return value matches actual size change.
2026-04-01 23:15:19 -04:00
Carl Shapiro
86b7219213 Add unit tests for conf parsing and its helpers 2026-03-10 18:14:33 -07:00
Carl Shapiro
ad726adf75 Separate out the configuration code from initialization 2026-03-10 18:14:33 -07:00
Carl Shapiro
a056c20d67 Handle tcache init failures gracefully
tsd_tcache_data_init() returns true on failure but its callers ignore
this return value, leaving the per-thread tcache in an uninitialized
state after a failure.

This change disables the tcache on an initialization failure and logs
an error message.  If opt_abort is true, it will also abort.

New unit tests have been added to test tcache initialization failures.
2026-03-10 18:14:33 -07:00
Carl Shapiro
a75655badf Add unit test coverage for bin interfaces 2026-03-10 18:14:33 -07:00
guangli-dai
c73ab1c2ff Add a test to check the output in JSON-based stats is consistent with mallctl results. 2026-03-10 18:14:33 -07:00
Slobodan Predolac
34ace9169b Remove prof_threshold built-in event. It is trivial to implement it as user event if needed 2026-03-10 18:14:33 -07:00
Andrei Pechkurov
4d0ffa075b Fix background thread initialization race 2026-03-10 18:14:33 -07:00
Slobodan Predolac
6016d86c18 [SEC] Make SEC owned by hpa_shard, simplify the code, add stats, lock per bin 2026-03-10 18:14:33 -07:00
Slobodan Predolac
8a06b086f3 [EASY] Extract hpa_central component from hpa source file 2026-03-10 18:14:33 -07:00
Slobodan Predolac
355774270d [EASY] Encapsulate better, do not pass hpa_shard when hooks are enough, move shard independent actions to hpa_utils 2026-03-10 18:14:33 -07:00
Slobodan Predolac
3678a57c10 When extracting from central, hugify_eager is different than start_as_huge 2026-03-10 18:14:33 -07:00
guangli-dai
261591f123 Add a page-allocator microbenchmark. 2026-03-10 18:14:33 -07:00
guangli-dai
56cdce8592 Adding trace analysis in preparation for page allocator microbenchmark. 2026-03-10 18:14:33 -07:00
Carl Shapiro
daf44173c5 Replace an instance of indentation with spaces with tabs 2026-03-10 18:14:33 -07:00
Shirui Cheng
2114349a4e Revert PR #2608: Manually revert commits 70c94d..f9c0b5
Closes: #2707
2026-03-10 18:14:33 -07:00
Slobodan Predolac
e6864c6075 [thread_event] Remove macros from thread_event and replace with dynamic event objects 2026-03-10 18:14:33 -07:00
Jason Evans
27d7960cf9 Revert "Extend purging algorithm with peak demand tracking"
This reverts commit ad108d50f1.
2025-06-02 10:44:37 -07:00
Slobodan Predolac
1956a54a43 [process_madvise] Use process_madvise across multiple huge_pages 2025-04-25 19:19:03 -07:00
Slobodan Predolac
f19f49ef3e if process_madvise is supported, call it when purging hpa 2025-04-04 13:57:42 -07:00
Dmitry Ilvokhin
ad108d50f1 Extend purging algorithm with peak demand tracking
Implementation inspired by idea described in "Beyond malloc efficiency
to fleet efficiency: a hugepage-aware memory allocator" paper [1].

Primary idea is to track maximum number (peak) of active pages in use
with sliding window and then use this number to decide how many dirty
pages we would like to keep.

We are trying to estimate maximum amount of active memory we'll need in
the near future. We do so by projecting future active memory demand
(based on peak active memory usage we observed in the past within
sliding window) and adding slack on top of it (an overhead is reasonable
to have in exchange of higher hugepages coverage). When peak demand
tracking is off, projection of future active memory is active memory we
are having right now.

Estimation is essentially the same as `nactive_max * (1 + dirty_mult)`.

Peak demand purging algorithm controlled by two config options. Option
`hpa_peak_demand_window_ms` controls duration of sliding window we track
maximum active memory usage in and option `hpa_dirty_mult` controls
amount of slack we are allowed to have as a percent from maximum active
memory usage. By default `hpa_peak_demand_window_ms == 0` now and we
have same behaviour (ratio based purging) that we had before this
commit.

[1]: https://storage.googleapis.com/gweb-research2023-media/pubtools/6170.pdf
2025-03-13 10:12:22 -07:00
Shai Duvdevani
257e64b968 Unlike prof_sample which is supported only with profiling mode active, prof_threshold is intended to be an always-supported allocation callback with much less overhead. The usage of the threshold allows performance critical callers to change program execution based on the callback: e.g. drop caches when memory becomes high or to predict the program is about to OOM ahead of time using peak memory watermarks. 2025-01-29 18:55:52 -08:00
Dmitry Ilvokhin
3820e38dc1 Remove validation for HPA ratios
Config validation was introduced at 3aae792b with main intention to fix
infinite purging loop, but it didn't actually fix the underlying
problem, just masked it. Later 47d69b4ea was merged to address the same
problem.

Options `hpa_dirty_mult` and `hpa_hugification_threshold` have different
application dimensions: `hpa_dirty_mult` applied to active memory on the
shard, but `hpa_hugification_threshold` is a threshold for single
pageslab (hugepage). It doesn't make much sense to sum them up together.

While it is true that too high value of `hpa_dirty_mult` and too low
value of `hpa_hugification_threshold` can lead to pathological
behaviour, it is true for other options as well. Poor configurations
might lead to suboptimal and sometimes completely unacceptable
behaviour and that's OK, that is exactly the reason why they are called
poor.

There are other mechanism exist to prevent extreme behaviour, when we
hugified and then immediately purged page, see
`hpa_hugify_blocked_by_ndirty` function, which exist to prevent exactly
this case.

Lastly, `hpa_dirty_mult + hpa_hugification_threshold >= 1` constraint is
too tight and prevents a lot of valid configurations.
2024-11-20 18:59:07 -08:00
Nathan Slingerland
edc1576f03 Add safe frame-pointer backtrace unwinder 2024-10-01 11:01:56 -07:00
David Goldblatt
fc615739cb Add batching to arena bins.
This adds a fast-path for threads freeing a small number of allocations to
bins which are not their "home-base" and which encounter lock contention in
attempting to do so. In producer-consumer workflows, such small lock hold times
can cause lock convoying that greatly increases overall bin mutex contention.
2024-05-22 10:30:31 -07:00
David Goldblatt
70c94d7474 Add batcher module.
This can be used to batch up simple operation commands for later use by another
thread.
2024-05-22 10:30:31 -07:00
guangli-dai
8a22d10b83 Allow setting default ncached_max for each bin through malloc_conf 2023-10-18 14:11:46 -07:00
guangli-dai
630f7de952 Add mallctl to set and get ncached_max of each cache_bin.
1. `thread_tcache_ncached_max_read_sizeclass` allows users to get the
    ncached_max of the bin with the input sizeclass, passed in through
    oldp (will be upper casted if not an exact bin size is given).
2. `thread_tcache_ncached_max_write` takes in a char array
    representing the settings for bins in the tcache.
2023-10-17 14:53:23 -07:00
Kevin Svetlitski
3aae792b10 Fix infinite purging loop in HPA
As reported in #2449, under certain circumstances it's possible to get
stuck in an infinite loop attempting to purge from the HPA. We now
handle this by validating the HPA settings at the end of
configuration parsing and either normalizing them or aborting depending on
if `abort_conf` is set.
2023-08-08 14:36:19 -07:00
Kevin Svetlitski
ebd7e99f5c Add a test-case for small profiled allocations
Validate that small allocations (i.e. those with `size <= SC_SMALL_MAXCLASS`)
which are sampled for profiling maintain the expected invariants even
though they now take up less space.
2023-07-03 16:19:06 -07:00
barracuda156
4422f88d17 Makefile.in: link with g++ when cxx enabled 2023-02-21 13:26:58 -08:00
guangli-dai
06374d2a6a Benchmark operator delete
Added the microbenchmark for operator delete.
Also modified bench.h so that it can be used in C++.
2022-11-21 11:14:05 -08:00
divanorama
3de0c24859 Disable builtin malloc in tests
With `--with-jemalloc-prefix=` and without `-fno-builtin` or `-O1` both clang and gcc may optimize out `malloc` calls
whose result is unused. Comparing result to NULL also doesn't necessarily count as being used.

This won't be a problem in most client programs as this only concerns really unused pointers, but in
tests it's important to actually execute allocations.
`-fno-builtin` should disable this optimization for both gcc and clang, and applying it only to tests code shouldn't hopefully be an issue.
Another alternative is to force "use" of result but that'd require more changes and may miss some other optimization-related issues.

This should resolve https://github.com/jemalloc/jemalloc/issues/2091
2022-10-03 10:39:13 -07:00
Alex Lapenkou
df7ad8a9b6 Revert "Echo installed files via verbose 'install' command"
This reverts commit f15d8f3b41. "install -v"
turned out to be not portable and not work on NetBSD.
2022-06-07 12:28:45 -07:00
Qi Wang
0e29ad4efa Rename zero_realloc option "strict" to "alloc".
With realloc(ptr, 0) being UB per C23, the option name "strict" makes less sense
now.  Rename to "alloc" which describes the behavior.
2022-04-20 10:27:25 -07:00
Charles
eaaa368bab Add comments and use meaningful vars in sz_psz2ind. 2022-03-24 16:56:59 -07:00
Shuduo Sang
640c3c72e6 Add support for 'make uninstall' 2022-01-19 12:28:16 -08:00
Alex Lapenkou
f15d8f3b41 Echo installed files via verbose 'install' command
It's not necessary to manually echo all install commands, similar effect
is achieved via 'install -v'
2022-01-19 12:28:16 -08:00
Qi Wang
011449f17b Fix doc build with install-suffix. 2022-01-11 21:15:24 -08:00
Qi Wang
b75822bc6e Implement use-after-free detection using junk and stash.
On deallocation, sampled pointers (specially aligned) get junked and stashed
into tcache (to prevent immediate reuse).  The expected behavior is to have
read-after-free corrupted and stopped by the junk-filling, while
write-after-free is checked when flushing the stashed pointers.
2021-12-29 14:44:43 -08:00
Alex Lapenkou
0f6da1257d San: Implement bump alloc
The new allocator will be used to allocate guarded extents used as slabs
for guarded small allocations.
2021-12-15 10:39:17 -08:00
Alex Lapenkou
62f9c54d2a San: Rename 'guard' to 'san'
This prepares the foundation for more sanitizer-related work in the
future.
2021-12-15 10:39:17 -08:00