llama-cpp-turboquant

Neo Zhang 213c4a0b81 [SYCL] supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (#20190) * support flash-attention for fp32/fp16/Q4/Q5/Q8 * rm warining * update for JIT		2026-03-08 12:00:07 +08:00
cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094)	2025-08-07 13:45:41 +02:00
include	ggml: add GATED_DELTA_NET op (#19504)	2026-03-07 15:41:10 +08:00
src	[SYCL] supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (#20190)	2026-03-08 12:00:07 +08:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	ggml : bump version to 0.9.7 (ggml/1425)	2026-02-15 22:24:29 +02:00