The second expansion attempt in large_ralloc_no_move omitted the !
before large_ralloc_no_move_expand(), inverting the return value.
On expansion failure, the function falsely reported success, making
callers believe the allocation was expanded in-place when it was not.
On expansion success, the function falsely reported failure, causing
callers to unnecessarily allocate, copy, and free.
Add unit test that verifies the return value matches actual size change.
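The inversion described above can be sketched as follows. This is a minimal illustration, not the jemalloc code: try_expand, resize_no_move_buggy, resize_no_move_fixed, and the limit parameter are hypothetical names, and the only assumption carried over is the convention that true means failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the expand helper: returns true on failure. */
static bool try_expand(size_t *size, size_t new_size, size_t limit) {
    assert(size != NULL);
    if (new_size > limit) {
        return true; /* cannot grow in place */
    }
    *size = new_size;
    return false;
}

/* Buggy shape: without the '!', success is reported exactly when
 * the expansion failed, and failure when it succeeded. */
static bool resize_no_move_buggy(size_t *size, size_t new_size, size_t limit) {
    if (try_expand(size, new_size, limit)) {
        return false;
    }
    return true;
}

/* Fixed shape: false (success) only when the size really changed. */
static bool resize_no_move_fixed(size_t *size, size_t new_size, size_t limit) {
    if (!try_expand(size, new_size, limit)) {
        return false;
    }
    return true;
}
```

A test in this style compares the returned value against the observed size, which is what catches the inverted result.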
In both the full_slabs and empty_slabs JSON sections of HPA shard
stats, "nactive_huge" was emitted twice instead of emitting
"ndirty_huge" as the second entry. This caused ndirty_huge to be
missing from the JSON output entirely.
Add a unit test that verifies both sections contain "ndirty_huge".
The index validation used > instead of >=, allowing access at index
SC_NBINS (for bins) and SC_NSIZES-SC_NBINS (for lextents), which are
one past the valid range. This caused out-of-bounds reads in bin_infos[]
and sz_index2size_unsafe().
Add unit tests that verify the boundary indices return ENOENT.
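The off-by-one in the bounds check can be illustrated with a small table lookup. This is a sketch under assumptions: EXAMPLE_NBINS, example_bin_sizes, and bin_size_lookup are hypothetical and stand in for the real mallctl index handlers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_NBINS 4 /* hypothetical table size */

static const size_t example_bin_sizes[EXAMPLE_NBINS] = {8, 16, 32, 64};

/* Buggy check: i == EXAMPLE_NBINS is not rejected, so a lookup
 * guarded by it would read one element past the end of the table. */
static bool index_rejected_buggy(size_t i) {
    return i > EXAMPLE_NBINS;
}

/* Fixed check: valid indices are 0 .. EXAMPLE_NBINS - 1. */
static bool index_rejected_fixed(size_t i) {
    return i >= EXAMPLE_NBINS;
}

/* Returns 0 and stores the size, or ENOENT for an invalid index. */
static int bin_size_lookup(size_t i, size_t *out) {
    assert(out != NULL);
    if (index_rejected_fixed(i)) {
        return ENOENT;
    }
    *out = example_bin_sizes[i];
    return 0;
}
```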
These functions had zero callers anywhere in the codebase:
- extent_commit_wrapper: wrapper never called, _impl used directly
- large_salloc: trivial wrapper never called
- tcache_gc_dalloc_new_event_wait: no header declaration, no callers
- tcache_gc_dalloc_postponed_event_wait: no header declaration, no callers
The condition incorrectly used 'alloc_count || 0' which was likely a typo
for 'alloc_count != 0'. While both evaluate similarly for the zero/non-zero
case, the fix ensures consistency with bt_count and thr_count checks and
uses the correct comparison operator.
psset_pick_purge used max_bit-- after rejecting a time-ineligible
candidate, which caused unnecessary re-scanning of the same bitmap
(and made an assert fail in debug mode) and a size_t underflow
when the lowest-index entry was rejected. Use max_bit = ind - 1
to skip directly past the rejected index.
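A rough sketch of the scan pattern, assuming a single 64-bit bitmap and a second mask marking which entries are eligible. The names find_last_set, pick_candidate, and NOT_FOUND are hypothetical, and the explicit ind == 0 guard is an assumption about how the underflow is avoided rather than a copy of the actual fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NOT_FOUND SIZE_MAX

/* Highest set bit at or below max_bit, or NOT_FOUND if none. */
static size_t find_last_set(uint64_t bitmap, size_t max_bit) {
    assert(max_bit < 64);
    for (size_t i = max_bit + 1; i-- > 0;) {
        if (bitmap & ((uint64_t)1 << i)) {
            return i;
        }
    }
    return NOT_FOUND;
}

/* Scan from the top and skip past each rejected index in one step.
 * *scans counts how many bitmap searches were needed. */
static size_t pick_candidate(uint64_t bitmap, uint64_t eligible,
    size_t *scans) {
    size_t max_bit = 63;
    for (;;) {
        size_t ind = find_last_set(bitmap, max_bit);
        (*scans)++;
        if (ind == NOT_FOUND) {
            return NOT_FOUND;
        }
        if (eligible & ((uint64_t)1 << ind)) {
            return ind;
        }
        if (ind == 0) {
            return NOT_FOUND; /* ind - 1 would wrap around */
        }
        /* With max_bit-- the next search could find ind again. */
        max_bit = ind - 1;
    }
}
```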
tsd_tcache_data_init() returns true on failure but its callers ignore
this return value, leaving the per-thread tcache in an uninitialized
state after a failure.
This change disables the tcache on an initialization failure and logs
an error message. If opt_abort is true, it will also abort.
New unit tests have been added to test tcache initialization failures.
This is a clean-up change that gives the bin functions implemented in
the arena code a prefix of bin_ and moves them into the bin code.
To further decouple the bin code from the arena code, bin functions
that had taken an arena_t to check arena_is_auto now take an is_auto
parameter instead.
During mutex stats emit, derived counters are not emitted for JSON.
The array indexing counter must still be incremented to skip the
derived elements in the output, but it was not. This commit fixes it.
While undocumented, the prctl system call will set errno to ENOMEM
when passed NULL as an address. Under that condition, an assertion
that checks for EINVAL as the only possible errno value will fail. To
avoid the assertion failure, this change skips the call to os_page_id
when address is NULL. NULL can only occur after mmap fails in which
case there is no mapping to name.
The address of the local variable created_threads is a different
location than the data it points to. Incorrectly treating these
values as being the same can cause out-of-bounds writes to the stack.
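The distinction can be shown with a small helper. This is a hypothetical sketch, not the affected test code: clear_ids and its parameters are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Zero the n elements that ids points to. */
static void clear_ids(unsigned *ids, size_t n) {
    assert(ids != NULL);
    /*
     * Wrong: memset(&ids, 0, n * sizeof(*ids)) targets the pointer
     * variable itself, which occupies only sizeof(ids) bytes on the
     * stack, so a larger length writes past it.
     */
    memset(ids, 0, n * sizeof(*ids));
}
```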
Closes: facebook/jemalloc#59
This change replaces direct comparisons of Pthread thread IDs with
calls to pthread_equal. Directly comparing thread IDs is neither
portable nor reliable since a thread ID is defined as an opaque type
that can be implemented using a structure.
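A minimal sketch of the portable comparison, using the standard pthread_equal and pthread_self calls. The helper name is_current_thread is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* True if owner identifies the calling thread. */
static bool is_current_thread(pthread_t owner) {
    /* owner == pthread_self() may not compile, or may compare the
     * wrong thing, on platforms where pthread_t is a structure. */
    return pthread_equal(owner, pthread_self()) != 0;
}
```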
The static inline definition made more sense when these functions just
dispatched to a syscall wrapper. Since they acquired a retry loop, a
non-inline definition makes more sense.
Giving the advice MADV_DONTNEED to a range of virtual memory backed by
a transparent huge page already causes that range of virtual memory to
become backed by regular pages.
any future changes to the underlying data type for bin sizes
(such as upgrading from `uint16_t` to `uint32_t`) can be achieved
by modifying only the `cache_bin_sz_t` definition.
Signed-off-by: Xin Yang <yangxin.dev@bytedance.com>
in the dirty ecache has been limited. This patch was tested with real
workloads using ClickHouse (Clickbench Q35) on a system with 2x240 vCPUs.
The results showed a 2X improvement in queries per second (QPS) and
a reduction in page faults to 29% of the previous rate. Additionally,
microbenchmark testing involved 256 memory reallocations resizing
from 4KB to 16KB in one arena, which demonstrated a 5X performance
improvement.
Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>