thek0tyara/llama-cpp-turboquant
llama-cpp-turboquant/docs/backend @ 0f1e9d14cc
Neo Zhang, 213c4a0b81: [SYCL] supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (#20190)
* support flash-attention for fp32/fp16/Q4/Q5/Q8
* rm warining
* update for JIT
2026-03-08 12:00:07 +08:00
| Name | Last commit | Date |
| --- | --- | --- |
| snapdragon | chore : correct typos [no ci] (#20041) | 2026-03-05 08:50:21 +01:00 |
| VirtGPU | ggml-virtgpu: add backend documentation (#19354) | 2026-02-09 20:15:42 +08:00 |
| BLIS.md | make : deprecate (#10514) | 2024-12-02 21:22:53 +02:00 |
| CANN.md | chore : correct typos [no ci] (#20041) | 2026-03-05 08:50:21 +01:00 |
| CUDA-FEDORA.md | docs: update: improve the Fedoa CUDA guide (#12536) | 2025-03-24 11:02:26 +00:00 |
| OPENCL.md | docs: add linux to index (#18907) | 2026-01-18 18:03:35 +08:00 |
| SYCL.md | [SYCL] supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (#20190) | 2026-03-08 12:00:07 +08:00 |
| VirtGPU.md | ggml-virtgpu: improve the reliability of the code (#19846) | 2026-02-26 20:00:57 +08:00 |
| zDNN.md | ggml-zendnn : add ZenDNN backend for AMD CPUs (#17690) | 2025-12-07 00:13:33 +08:00 |
| ZenDNN.md | ggml-zendnn: update code for latest ZenDNN API (#19923) | 2026-02-27 08:43:41 +08:00 |