1718 commits

Author SHA1 Message Date
David Goldblatt
70c94d7474 Add batcher module.
This can be used to batch up simple operation commands for later use by another
thread.
2024-05-22 10:30:31 -07:00
David Goldblatt
86f4851f5d Add clang static analyzer suppression macro. 2024-05-22 10:30:31 -07:00
Qi Wang
8d8379da44 Fix background_thread creation for the oversize_arena.
Bypassing background thread creation for the oversize_arena used to be an
optimization since that arena had eager purging.  However #2466 changed the
purging policy for the oversize_arena -- specifically it switched to the default
decay time when background_thread is enabled.

This issue is noticeable when the number of arenas is low: whenever the total #
of arenas is <= 4 (which is the default max # of background threads), in which
case the purging will be stalled since no background thread is created for the
oversize_arena.
2024-05-02 14:45:18 -07:00
Daniel Hodges
11038ff762 Add support for namespace pids in heap profile names
This change adds support for writing pid namespaces to the filename of a
heap profile. When running with namespaces, pids may be reused across
namespaces and if mounts are shared where profiles are written there is
not a great way to differentiate profiles between pids.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Signed-off-by: Daniel Hodges <hodgesd@fb.com>
2024-04-09 10:27:52 -07:00
Shirui Cheng
5081c16bb4 Experimental calloc implementation using memset on larger sizes 2024-04-04 15:31:56 -07:00
Dmitry Ilvokhin
b2e59a96e1 Introduce getters for page allocator shard stats
Access nactive, ndirty and nmuzzy through getters and not directly.
There is no functional change, but getters are required to propagate
HPA's statistics up to Page Allocator's statistics.
2024-04-04 12:17:30 -07:00
Amaury Séchet
92aa52c062 Reduce nesting in phn_merge_siblings using an early return. 2024-03-14 13:08:17 -07:00
Amaury Séchet
10d713151d Ensure that the root of a heap is always the best element. 2024-03-14 13:07:45 -07:00
Shirui Cheng
373884ab48 print out all malloc_conf settings in stats 2024-02-29 12:12:44 -08:00
Qi Wang
a2c5267409 HPA: Allow frequently reused allocs to bypass the slab_max_alloc limit, as long as
it's within the huge page size.  These requests do not concern internal
fragmentation with huge pages, since the entire range is expected to be
accessed.
2024-01-18 14:51:04 -08:00
guangli-dai
b1792c80d2 Add LOGs when entering and exiting free and sdallocx. 2024-01-11 14:37:20 -08:00
guangli-dai
eda05b3994 Fix static analysis warnings. 2024-01-03 14:18:52 -08:00
Shirui Cheng
e4817c8d89 Cleanup cache_bin_info_t* info input args 2023-10-25 10:27:31 -07:00
Qi Wang
3025b021b9 Optimize mutex and bin alignment / locality. 2023-10-23 20:28:26 -07:00
Qi Wang
04d1a87b78 Fix a zero-initializer warning on macOS. 2023-10-18 14:12:43 -07:00
guangli-dai
6fb3b6a8e4 Refactor the tcache initialization
1. Pre-generate all default tcache ncached_max in tcache_boot;
2. Add getters returning default ncached_max and ncached_max_set;
3. Refactor tcache init so that it is always init with a given setting.
2023-10-18 14:11:46 -07:00
guangli-dai
8a22d10b83 Allow setting default ncached_max for each bin through malloc_conf 2023-10-18 14:11:46 -07:00
guangli-dai
867eedfc58 Fix the bug in dalloc promoted allocations.
An allocation small enough will be promoted so that it does not
share an extent with others.  However, on deallocation, such allocations
may not be treated as promoted ones if nbins < SC_NBINS.  This
commit fixes the bug.
2023-10-17 14:53:23 -07:00
guangli-dai
630f7de952 Add mallctl to set and get ncached_max of each cache_bin.
1. `thread_tcache_ncached_max_read_sizeclass` allows users to get the
    ncached_max of the bin with the input sizeclass, passed in through
    oldp (rounded up if the given size is not an exact bin size).
2. `thread_tcache_ncached_max_write` takes in a char array
    representing the settings for bins in the tcache.
2023-10-17 14:53:23 -07:00
guangli-dai
6b197fdd46 Pre-generate ncached_max for all bins for better tcache_max tuning experience. 2023-10-17 14:53:23 -07:00
Shirui Cheng
36becb1302 metadata usage breakdowns: tracking edata and rtree usages 2023-10-11 11:56:01 -07:00
Qi Wang
005f20aa7f Fix comments about malloc_conf to enable logging. 2023-10-04 11:49:10 -07:00
guangli-dai
7a9e4c9073 Mark jemalloc.h as system header to resolve header conflicts. 2023-10-04 11:41:30 -07:00
Qi Wang
72cfdce718 Allocate tcache stack from base allocator
When using metadata_thp, allocate tcache bin stacks from base0, which means they
will be placed on huge pages along with other metadata, instead of mixed with
other regular allocations.

In order to do so, modified the base allocator to support limited reuse: freed
tcache stacks (from thread termination) will be returned to base0 and made
available for reuse, but no merging will be attempted since they were bump
allocated out of base blocks. These reused base extents are managed using
separately allocated base edata_t -- they are cached in base->edata_avail when
the extent is all allocated.

One tricky part is, stats updating must be skipped for such reused extents
(since they were accounted for already, and there is no purging for base). This
requires tracking the "if is reused" state explicitly and bypassing the stats
updates when allocating from them.
2023-09-18 12:18:32 -07:00
guangli-dai
a442d9b895 Enable per-tcache tcache_max
1. add tcache_max and nhbins into tcache_t so that they are per-tcache,
   with one auto tcache per thread, it's also per-thread;
2. add mallctl for each thread to set its own tcache_max (of its auto tcache);
3. store the maximum number of items in each bin instead of using a global storage;
4. add tests for the modifications above.
5. Rename `nhbins` and `tcache_maxclass` to `global_do_not_change_nhbins` and `global_do_not_change_tcache_maxclass`.
2023-09-06 10:47:14 -07:00
guangli-dai
fbca96c433 Remove unnecessary parameters for cache_bin_postincrement. 2023-09-06 10:47:14 -07:00
Qi Wang
b71da25b8a Fix reading CPU id using rdtscp.
As pointed out in #2527, the correct register containing CPU id should be ecx
instead of edx.
2023-08-28 11:46:39 -07:00
Kevin Svetlitski
da66aa391f Enable a few additional warnings for CI and fix the issues they uncovered
- `-Wmissing-prototypes` and `-Wmissing-variable-declarations` are
  helpful for finding dead code and/or things that should be `static`
  but aren't marked as such.
- `-Wunused-macros` is of similar utility, but for identifying dead macros.
- `-Wunreachable-code` and `-Wunreachable-code-aggressive` do exactly
  what they say: flag unreachable code.
2023-08-11 13:56:23 -07:00
Kevin Svetlitski
d2c9ed3d1e Ensure short read(2)s/write(2)s are properly handled by IO utilities
`read(2)` and `write(2)` may read or write fewer bytes than were
requested. In order to robustly ensure that all of the requested bytes
are read/written, these edge-cases must be handled.
2023-08-11 13:36:24 -07:00
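
The short-read/short-write handling described in this commit can be sketched as a pair of retry loops. This is a minimal illustration that uses only POSIX `read(2)` and `write(2)`; the names `example_write_all` and `example_read_all` are hypothetical and are not taken from the jemalloc sources.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling write(2) until all `len` bytes are
 * written. Returns 0 on success, -1 on error (errno is left as set by
 * write).
 */
int
example_write_all(int fd, const void *buf, size_t len) {
	const char *p = buf;
	assert(buf != NULL || len == 0);
	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR) {
				continue; /* Interrupted before writing; retry. */
			}
			return -1;
		}
		/* A short write advances the cursor and loops again. */
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Hypothetical helper: keep calling read(2) until `len` bytes are read
 * or EOF is reached. Returns the number of bytes read, or -1 on error.
 */
ssize_t
example_read_all(int fd, void *buf, size_t len) {
	char *p = buf;
	size_t total = 0;
	while (total < len) {
		ssize_t n = read(fd, p + total, len - total);
		if (n < 0) {
			if (errno == EINTR) {
				continue;
			}
			return -1;
		}
		if (n == 0) {
			break; /* EOF: return what was read so far. */
		}
		total += (size_t)n;
	}
	return (ssize_t)total;
}
```

A caller would use these in place of a single `read`/`write` call whenever a partial transfer would otherwise be silently treated as complete.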
Kevin Svetlitski
4f50f782fa Use compiler-provided assume builtins when available
There are several benefits to this:
1. It's cleaner and more reliable to use the builtin to
   inform the compiler of assumptions instead of hoping that the
   optimizer understands your intentions.
2. `clang` will warn you if any of your assumptions would produce
   side-effects (which the compiler will discard). [This blog post](https://fastcompression.blogspot.com/2019/01/compiler-checked-contracts.html)
   by Yann Collet highlights that a hazard of using the
   `unreachable()`-based method of signaling assumptions is that it
   can sometimes result in additional instructions being generated (see
   [this Godbolt link](https://godbolt.org/z/lKNMs3) from the blog post
   for an example).
2023-08-08 14:59:36 -07:00
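
One way such an assumption macro could be structured is sketched below. `EXAMPLE_ASSUME` and `example_quarter` are hypothetical names used only for illustration; the sketch assumes clang's `__builtin_assume` when it is available and falls back to the `__builtin_unreachable()`-based idiom on other GNU-compatible compilers. It is not the macro this commit added.

```c
#include <assert.h>

/* Prefer the dedicated builtin when the compiler reports having it. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_assume)
#    define EXAMPLE_ASSUME(expr) __builtin_assume(expr)
#  endif
#endif

/* Fallback: the older unreachable()-based idiom, or a no-op. */
#ifndef EXAMPLE_ASSUME
#  if defined(__GNUC__)
#    define EXAMPLE_ASSUME(expr) do {				\
	if (!(expr)) {						\
		__builtin_unreachable();			\
	}							\
} while (0)
#  else
#    define EXAMPLE_ASSUME(expr) ((void)0)
#  endif
#endif

/*
 * The caller guarantees that n is a multiple of 4; the macro passes
 * that guarantee on to the optimizer. Violating it is undefined
 * behavior.
 */
unsigned
example_quarter(unsigned n) {
	EXAMPLE_ASSUME(n % 4 == 0);
	return n / 4;
}
```

The expression passed to the macro should be free of side effects, since the builtin form does not evaluate it.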
Kevin Svetlitski
3aae792b10 Fix infinite purging loop in HPA
As reported in #2449, under certain circumstances it's possible to get
stuck in an infinite loop attempting to purge from the HPA. We now
handle this by validating the HPA settings at the end of
configuration parsing and either normalizing them or aborting depending on
whether `abort_conf` is set.
2023-08-08 14:36:19 -07:00
Kevin Svetlitski
424dd61d57 Issue a warning upon directly accessing an arena's bins
An arena's bins should normally be accessed via the `arena_get_bin`
function, which properly takes into account bin-shards. To ensure that
we don't accidentally commit code which incorrectly accesses the bins
directly, we mark the field with `__attribute__((deprecated))` with an
appropriate warning message, and suppress the warning in the few places
where directly accessing the bins is allowed.
2023-08-04 15:47:05 -07:00
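
The pattern described above, marking a field as deprecated and suppressing the warning only inside the sanctioned accessor, might look roughly like the following. All names here (`example_arena_t`, `example_arena_get_bin`, `EXAMPLE_NBINS`) are hypothetical stand-ins, the bins are plain `int`s, and bin-shards are ignored; the sketch assumes a GCC- or clang-compatible compiler.

```c
#include <assert.h>

#define EXAMPLE_NBINS 4

typedef struct example_arena_s {
	/* Direct access emits a -Wdeprecated-declarations warning. */
	int bins[EXAMPLE_NBINS]
	    __attribute__((deprecated("use example_arena_get_bin()")));
} example_arena_t;

/* The one place where touching the field directly is allowed. */
int *
example_arena_get_bin(example_arena_t *arena, unsigned ind) {
	assert(ind < EXAMPLE_NBINS);
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
	int *bin = &arena->bins[ind];
#pragma GCC diagnostic pop
	return bin;
}
```

Any other code that writes `arena->bins[i]` directly then produces a compile-time warning carrying the message given in the attribute.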
Kevin Svetlitski
120abd703a Add support for the deprecated attribute
This is useful for enforcing the usage of getter/setter functions to
access fields which are considered private or have unique access constraints.
2023-08-04 15:47:05 -07:00
Kevin Svetlitski
b01d496646 Add an override for the compile-time malloc_conf to jemalloc_internal_overrides.h 2023-07-31 14:53:15 -07:00
Kevin Svetlitski
8ff7e7d6c3 Remove errant #includes in public jemalloc.h header
In an attempt to make all headers self-contained, I inadvertently added
`#include`s which refer to intermediate, generated headers that aren't
included in the final install. Closes #2489.
2023-07-25 16:26:50 -07:00
Kevin Svetlitski
3e82f357bb Fix all optimization-inhibiting integer-to-pointer casts
Following from PR #2481, we replace all integer-to-pointer casts [which
hide pointer provenance information (and thus inhibit
optimizations)](https://clang.llvm.org/extra/clang-tidy/checks/performance/no-int-to-ptr.html)
with equivalent operations that preserve this information. I have
enabled the corresponding clang-tidy check in our static analysis CI so
that we do not get bitten by this again in the future.
2023-07-24 14:40:42 -07:00
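
As an illustration of the kind of rewrite involved, consider a pointer that carries a tag in its low bit. Clearing the tag with `(T *)((uintptr_t)p & ~1)` goes through an integer-to-pointer cast; the sketch below instead derives the result from the original pointer with byte arithmetic. The names are hypothetical, and this shows the general technique rather than the specific changes made in this commit.

```c
#include <assert.h>
#include <stdint.h>

typedef struct example_node_s {
	int value;
} example_node_t;

/*
 * Set the low bit by stepping one byte forward. This relies on
 * example_node_t having an alignment greater than 1, so the low bit
 * of a valid node address is always zero.
 */
void *
example_tag(example_node_t *node) {
	return (char *)node + 1;
}

/*
 * Clear the low bit without an integer-to-pointer cast: the pointer
 * is only converted *to* an integer to inspect the tag, and the
 * result is computed by pointer arithmetic on the original pointer.
 */
example_node_t *
example_untag(void *tagged) {
	uintptr_t bit = (uintptr_t)tagged & (uintptr_t)1;
	return (example_node_t *)((char *)tagged - bit);
}
```

Because the returned pointer is derived from `tagged` itself, the compiler can still track which object it refers to.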
Kevin Svetlitski
4827bb17bd Remove vestigial TCACHE_STATE_* macros 2023-07-24 14:40:42 -07:00
Kevin Svetlitski
7e54dd1ddb Define PROF_TCTX_SENTINEL instead of using magic numbers
This makes the code more readable on its own, and also sets the stage
for more cleanly handling the pointer provenance lints in a following
commit.
2023-07-24 14:40:42 -07:00
Kevin Svetlitski
c49c17f128 Suppress verbose frame address warnings
These warnings are not useful, and make the output of some CI jobs
enormous and difficult to read, so let's suppress them.
2023-07-24 10:44:17 -07:00
Kevin Svetlitski
cdb2c0e02f Implement C23's free_sized and free_aligned_sized
[N2699 - Sized Memory Deallocation](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2699.htm)
introduced two new functions which were incorporated into the C23
standard, `free_sized` and `free_aligned_sized`. Both already have
analogues in Jemalloc; all we are doing here is adding the appropriate
wrappers.
2023-07-20 15:06:41 -07:00
Kevin Svetlitski
41e0b857be Make headers self-contained by fixing #includes
Header files are now self-contained, which makes the relationships
between the files clearer, and crucially allows LSP tools like `clangd`
to function correctly in all of our header files. I have verified that
the headers are self-contained (aside from the various Windows shims) by
compiling them as if they were C files – in a follow-up commit I plan to
add this to CI to ensure we don't regress on this front.
2023-07-14 09:06:32 -07:00
Kevin Svetlitski
856db56f6e Move tsd implementation details into tsd_internals.h
This is a prerequisite to achieving self-contained headers. Previously,
the various tsd implementation headers (`tsd_generic.h`,
`tsd_tls.h`, `tsd_malloc_thread_cleanup.h`, and `tsd_win.h`) relied
implicitly on being included in `tsd.h` after a variety of dependencies
had been defined above them. This commit instead makes these
dependencies explicit by splitting them out into a separate file,
`tsd_internals.h`, which each of the tsd implementation headers includes
directly.
2023-07-14 09:06:32 -07:00
Kevin Svetlitski
36ca0c1b7d Stop concealing pointer provenance in phn_link_get
At least for LLVM, [casting from an integer to a pointer hides provenance information](https://clang.llvm.org/extra/clang-tidy/checks/performance/no-int-to-ptr.html)
and inhibits optimizations. Here's a [Godbolt link](https://godbolt.org/z/5bYPcKoWT)
showing how this change removes a couple unnecessary branches in
`phn_merge_siblings`, which is a very hot function. Canary profiles show
only minor improvements (since most of the cost of this function is in
cache misses), but there's no reason we shouldn't take it.
2023-07-13 15:12:31 -07:00
Kevin Svetlitski
1d9e9c2ed6 Fix inconsistent parameter names between definition/declaration pairs
For the sake of consistency, function definitions and their
corresponding declarations should use the same names for parameters.
I've enabled this check in static analysis to prevent this issue from
occurring again in the future.
2023-07-13 12:59:47 -07:00
Kevin Svetlitski
589c63b424 Make eligible global variables static and/or const
For better or worse, Jemalloc has a significant number of global
variables. Making all eligible global variables `static` and/or `const`
at least makes it slightly easier to reason about them, as these
qualifications communicate to the programmer restrictions on their use
without having to `grep` the whole codebase.
2023-07-06 14:15:12 -07:00
Qi Wang
e249d1a2a1 Remove unreachable code. 2023-07-06 12:06:06 -07:00
Qi Wang
602edd7566 Enabled -Wstrict-prototypes and fixed warnings. 2023-07-06 12:00:02 -07:00
Kevin Svetlitski
5a858c64d6 Reduce the memory overhead of sampled small allocations
Previously, small allocations which were sampled as part of heap
profiling were rounded up to `SC_LARGE_MINCLASS`. This additional memory
usage becomes problematic when the page size is increased, as noted in #2358.

Small allocations are now rounded up to the nearest multiple of `PAGE`
instead, reducing the memory overhead by a factor of 4 in the most
extreme cases.
2023-07-03 16:19:06 -07:00
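
The arithmetic behind the factor of 4 can be sketched as follows. The constants are assumptions chosen for illustration (a 4 KiB page and a smallest large size class of four pages), and the `EXAMPLE_`-prefixed names are hypothetical, not the identifiers used in jemalloc.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only. */
#define EXAMPLE_PAGE ((size_t)4096)
#define EXAMPLE_LARGE_MINCLASS (4 * EXAMPLE_PAGE)

/* Round size up to the nearest multiple of the (power-of-two) page. */
size_t
example_page_ceiling(size_t size) {
	assert(size <= (size_t)-1 - (EXAMPLE_PAGE - 1)); /* No overflow. */
	return (size + EXAMPLE_PAGE - 1) & ~(EXAMPLE_PAGE - 1);
}
```

Under these assumptions, a sampled 8-byte allocation occupies `example_page_ceiling(8)`, i.e. 4096 bytes, instead of `EXAMPLE_LARGE_MINCLASS`, i.e. 16384 bytes, which is the factor-of-4 reduction in the most extreme case.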
Kevin Svetlitski
f2e00d2fd3 Remove trailing whitespace
Additionally, added a GitHub Action to ensure no more trailing
whitespace will creep in again in the future.

I'm excluding Markdown files from this check, since trailing whitespace
is significant there, and also excluding `build-aux/install-sh` because
there is significant trailing whitespace on the line that sets
`defaultIFS`.
2023-06-23 11:58:18 -07:00
Kevin Svetlitski
bb0333e745 Fix remaining static analysis warnings
Fix or suppress the remaining warnings generated by static analysis.
This is a necessary step before we can incorporate static analysis into
CI. Where possible, I've preferred to modify the code itself instead of
just disabling the warning with a magic comment, so that if we decide to
use different static analysis tools in the future we will be covered
against them raising similar warnings.
2023-06-23 11:50:29 -07:00