Pull the tcache-aware routing helpers out of arena into a layer that
sits directly below the public malloc interface:

  arena_malloc          -> malloc_dispatch_malloc
  arena_palloc          -> malloc_dispatch_palloc
  arena_ralloc          -> malloc_dispatch_ralloc
  arena_dalloc*         -> malloc_dispatch_dalloc*
  arena_sdalloc*        -> malloc_dispatch_sdalloc*
  arena_dalloc_promoted -> malloc_dispatch_dalloc_promoted

The new module (malloc_dispatch.h, malloc_dispatch_inlines.h,
src/malloc_dispatch.c) owns the tcache-vs-fall-through decision; the
only consumer is jemalloc_internal_inlines_c.h. arena keeps a narrower
arena_prof_demote() for the sampled-allocation demotion path.
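
For readers unfamiliar with the pattern, the tcache-vs-fall-through
decision can be pictured with the toy model below. It is only a sketch:
every sketch_* name, the slot count, and the size cutoff are hypothetical
stand-ins chosen for illustration, and none of it is the real
malloc_dispatch code or jemalloc's actual tcache layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical constants; jemalloc's real size classes are far richer. */
#define SKETCH_CACHE_SLOTS 4
#define SKETCH_CACHE_MAXSIZE 64

/* Toy thread cache: a small stack of equally sized free blocks. */
typedef struct {
    void *slots[SKETCH_CACHE_SLOTS];
    size_t ncached;
} sketch_tcache_t;

/* Counts calls into the "arena" stand-in so the routing is observable. */
static size_t sketch_arena_calls = 0;

/* Small requests share one size class so cached blocks are interchangeable. */
static size_t sketch_usable_size(size_t size) {
    return size <= SKETCH_CACHE_MAXSIZE ? SKETCH_CACHE_MAXSIZE : size;
}

/* Stand-in for the arena allocation path: here, plain malloc(). */
static void *sketch_arena_malloc(size_t size) {
    sketch_arena_calls++;
    return malloc(sketch_usable_size(size));
}

/* Stand-in for the arena deallocation path: here, plain free(). */
static void sketch_arena_dalloc(void *ptr) {
    sketch_arena_calls++;
    free(ptr);
}

/* Dispatch layer: serve from the cache when possible, else fall through. */
static void *sketch_dispatch_malloc(sketch_tcache_t *tcache, size_t size) {
    if (tcache != NULL && size <= SKETCH_CACHE_MAXSIZE &&
        tcache->ncached > 0) {
        return tcache->slots[--tcache->ncached];
    }
    return sketch_arena_malloc(size);
}

/* Dispatch layer: stash small blocks in the cache, else fall through. */
static void sketch_dispatch_dalloc(sketch_tcache_t *tcache, void *ptr,
    size_t size) {
    assert(ptr != NULL);
    if (tcache != NULL && size <= SKETCH_CACHE_MAXSIZE &&
        tcache->ncached < SKETCH_CACHE_SLOTS) {
        tcache->slots[tcache->ncached++] = ptr;
        return;
    }
    sketch_arena_dalloc(ptr);
}
```

In this model the arena stand-in never inspects the cache; only the
dispatch functions do, which mirrors the ownership split described above.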