romenskiy2012/jemalloc

mirror of https://github.com/jemalloc/jemalloc.git synced 2026-04-15 23:21:41 +03:00

Author	SHA1	Message	Date
guangli-dai	6200e8987f	Reformat the codebase with the clang-format 18.	2026-03-10 18:14:33 -07:00
guangli-dai	8347f1045a	Renaming limit_usize_gap to disable_large_size_classes	2025-05-06 14:47:35 -07:00
guangli-dai	c067a55c79	Introducing a new usize calculation policy. Converting size to usize is what jemalloc has done by ceiling size to the closest size class. However, this causes a lot of memory waste with HPA enabled. This commit changes how usize is calculated so that the gap between two contiguous usizes is no larger than a page. Specifically, this commit includes the following changes: 1. Adding a build-time config option (--enable-limit-usize-gap) and a runtime one (limit_usize_gap) to guard the changes. When the build-time config is enabled, some minor CPU overhead is expected because usize will be stored and accessed apart from index. When the runtime option is also enabled (it can only be enabled with the build-time config enabled), a new usize calculation approach will be employed. This new calculation will ceil size to the closest multiple of PAGE for all sizes larger than USIZE_GROW_SLOW_THRESHOLD instead of using the size classes. Note when the build-time config is enabled, the runtime option is default on. 2. Prepare tcache for size to grow by PAGE over GROUP*PAGE. To prepare for the upcoming changes where size class grows by PAGE when larger than NGROUP * PAGE, disable the tcache when it is larger than 2 * NGROUP * PAGE. The threshold for tcache is set higher to prevent perf regression as much as possible while usizes between NGROUP * PAGE and 2 * NGROUP * PAGE happen to grow by PAGE. 3. Prepare pac and hpa psset for size to grow by PAGE over GROUP*PAGE. For PAC, to avoid having too many bins, arena bins still have the same layout. This means some extra search is needed for a page-level request that is not aligned with the original size class: it should also search the heap before the current index since the previous heap might also be able to have some allocations satisfying it. The same changes apply to HPA's psset. This search relies on the enumeration of the heap because not all allocs in the previous heap are guaranteed to satisfy the request. To balance the memory and CPU overhead, we currently enumerate at most a fixed number of nodes before concluding none can satisfy the request during an enumeration. 4. Add bytes counter to arena large stats. To prepare for the upcoming usize changes, stats collected by multiplying alive allocations and the bin size are no longer accurate. Thus, add separate counters to record the bytes malloced and dalloced. 5. Change structs use when freeing to avoid using index2size for large sizes. - Change the definition of emap_alloc_ctx_t - Change the read of both from edata_t. - Change the assignment and usage of emap_alloc_ctx_t. - Change other callsites of index2size. Note that the changes in the data structure, i.e., emap_alloc_ctx_t, will be used when the build-time config (--enable-limit-usize-gap) is enabled, but they will store the same value as index2size(szind) if the runtime option (opt_limit_usize_gap) is not enabled. 6. Adapt hpa to the usize changes. Change the settings in sec to limit its usage for sizes larger than USIZE_GROW_SLOW_THRESHOLD and modify corresponding tests. 7. Modify usize calculation and corresponding tests. Change the sz_s2u_compute. Note sz_index2size is not always safe now while sz_size2index still works as expected.	2025-03-06 15:08:13 -08:00
guangli-dai	eda05b3994	Fix static analysis warnings.	2024-01-03 14:18:52 -08:00
Kevin Svetlitski	3e82f357bb	Fix all optimization-inhibiting integer-to-pointer casts. Following from PR #2481, we replace all integer-to-pointer casts [which hide pointer provenance information (and thus inhibit optimizations)](https://clang.llvm.org/extra/clang-tidy/checks/performance/no-int-to-ptr.html) with equivalent operations that preserve this information. I have enabled the corresponding clang-tidy check in our static analysis CI so that we do not get bitten by this again in the future.	2023-07-24 14:40:42 -07:00
Qi Wang	602edd7566	Enabled -Wstrict-prototypes and fixed warnings.	2023-07-06 12:00:02 -07:00
Qi Wang	ce0b7ab6c8	Inline the storage for thread name in prof_tdata_t. The previous approach managed the thread name in a separate buffer, which causes races because the thread name update (triggered by new samples) can happen at the same time as prof dumping (which reads the thread names) -- these two operations are under separate locks to avoid blocking each other. Implemented the thread name storage as part of the tdata struct, which resolves the lifetime issue and also avoids internal alloc / dalloc during prof_sample.	2023-04-05 10:03:12 -07:00
Qi Wang	5fd55837bb	Fix thread_name updating for heap profiling. The current thread name reading path updates the name every time, which requires both alloc and dalloc -- and the temporary NULL value in the middle causes races where the prof dump read path gets NULLed in the middle. Minimize the changes in this commit to isolate the bugfix testing; will also refactor the whole thread name paths later.	2023-02-15 17:49:40 -08:00
Guangli Dai	a0734fd6ee	Making jemalloc max stack depth a runtime option	2022-09-12 13:56:22 -07:00
yunxu	b798fabdf7	Add prof_leak_error option. The option makes the process exit with error code 1 if a memory leak is detected. This is useful for implementing automated tools that rely on leak detection.	2022-01-21 16:24:20 -08:00
Qi Wang	d038160f3b	Fix shadowed variable usage. Verified with EXTRA_CFLAGS=-Wshadow.	2021-12-23 10:55:08 -08:00
Yinan Zhang	20f2479ed7	Do not create size class tables for non-prof builds	2020-08-24 20:10:02 -07:00
Yinan Zhang	8efcdc3f98	Move unbias data to prof_data	2020-08-24 20:10:02 -07:00
David Goldblatt	60993697d8	Prof: Add prof_unbias. This gives more accurate attribution of bytes and counts to stack traces, without introducing backwards incompatibilities in heap-profile parsing tools. We track the ideal reported (to the end user) number of bytes more carefully inside core jemalloc. When dumping heap profiles, instead of outputting our counts directly, we output counts that will cause parsing tools to give a result close to the value we want. We retain the old version as an opt setting, to let users who are tracking values on a per-component basis keep their metrics stable until they decide to switch.	2020-08-05 18:33:55 -07:00
Yinan Zhang	c2e7a06392	No need to intercept prof_dump_header() in tests	2020-06-29 14:27:50 -07:00
Yinan Zhang	f58ebdff7a	Generalize prof_cnt_all() for testing	2020-06-29 14:27:50 -07:00
Yinan Zhang	d4259ea53b	Simplify signatures for prof dump functions	2020-06-29 14:27:50 -07:00
Yinan Zhang	5d823f3a91	Consolidate struct definitions for prof dump parameters	2020-06-29 14:27:50 -07:00
Yinan Zhang	1f5fe3a3e3	Pass write callback explicitly in prof_data	2020-06-29 14:27:50 -07:00
Yinan Zhang	4556d3c0c8	Define structures for prof dump parameters	2020-06-29 14:27:50 -07:00
Yinan Zhang	dad821bb22	Move unwind to prof_sys	2020-06-29 14:27:50 -07:00
Yinan Zhang	d128efcb6a	Relocate a few prof utilities to the right modules	2020-06-29 14:27:50 -07:00
Yinan Zhang	4736fb4fc9	Move file handling logic in prof_data to prof_sys	2020-06-29 14:27:50 -07:00
Yinan Zhang	adfd9d7b1d	Change tsdn to tsd for thread name allocation	2020-06-29 14:27:50 -07:00
Yinan Zhang	841af2b426	Move thread name handling to prof_data module	2020-06-29 14:27:50 -07:00
Yinan Zhang	c8683bee80	Unify printing for prof counts object	2020-06-29 14:27:50 -07:00
Yinan Zhang	5d292b5660	Push error handling logic out of core dumping logic	2020-06-29 14:27:50 -07:00
Yinan Zhang	354183b10d	Define prof dump buffer size centrally	2020-06-29 14:27:50 -07:00
Yinan Zhang	7455813e57	Make dump file writing replaceable in test	2020-06-29 14:27:50 -07:00
Yinan Zhang	21e44c45d9	Make maps file opening replaceable in test	2020-06-29 14:27:50 -07:00
Yinan Zhang	4bb4037dbe	Extract utility function for opening maps file	2020-06-29 14:27:50 -07:00
Yinan Zhang	f307b25804	Only replace the dump file opening function in test	2020-06-29 14:27:50 -07:00
Yinan Zhang	a795b19327	Remove beginning define in source files ``` sed -i "/^#define JEMALLOC_[A-Z_]*_C_$/d" src/*.c; ```	2020-06-19 12:15:44 -07:00
Yinan Zhang	b7858abfc0	Expose prof testing internal functions	2020-06-19 09:16:51 -07:00
Yinan Zhang	1e2524e15a	Do not reset sample wait time when re-initing tdata	2020-05-12 09:16:16 -07:00
Yinan Zhang	84b28c6a13	Properly handle tdata deletion race	2020-01-21 16:51:26 -08:00
Yinan Zhang	d331208560	Get rid of redundant logic in prof	2020-01-21 16:51:26 -08:00
Yinan Zhang	b8df719d5c	No tdata creation for backtracing on dying thread	2020-01-16 21:54:14 -08:00
Yinan Zhang	9a60cf54ec	Last-N profiling mode	2019-12-30 15:58:57 -08:00
Yinan Zhang	3fa142cf39	Remove _externs from prof internal header names	2019-12-23 11:14:15 -08:00
Yinan Zhang	ea42174d07	Refactor profiling headers	2019-12-20 17:17:48 -08:00
Yinan Zhang	7d2bac5a38	Refactor destroy code path for prof_tctx	2019-12-10 16:31:05 -08:00
Yinan Zhang	7e3671911f	Get rid of old indentation style for prof	2019-12-06 09:47:51 -08:00
Yinan Zhang	dfdd46f6c1	Refactor prof_tctx_t creation	2019-12-06 09:47:51 -08:00
Qi Wang	da50d8ce87	Refactor and optimize prof sampling initialization. Makes the prof sample prng use the tsd prng_state. This allows us to properly initialize the sample interval event, without having to create tdata. As a result, tdata will be created on demand (when a thread reaches the sample interval bytes allocated), instead of on the first allocation.	2019-11-11 10:35:37 -08:00
Yinan Zhang	66e07f986d	Suppress tdata creation in reentrancy. This change suppresses tdata initialization and prof sample threshold update in interrupting malloc calls. Interrupting calls have no need for tdata. Delaying tdata creation aligns better with our lazy tdata creation principle, and it also helps us gain control back from interrupting calls more quickly and reduces any risk of delegating tdata creation to an interrupting call.	2019-10-04 08:52:50 -07:00
Yinan Zhang	07ce2434bf	Refactor profiling. Refactored core profiling codebase into two logical parts: (a) `prof_data.c`: core internal data structure managing & dumping; (b) `prof.c`: mutexes & outward-facing APIs. Some internal functions had to be exposed, but there are not that many of them if the modularization is (hopefully) clean enough.	2019-08-07 19:48:28 -07:00

47 commits