llama-cpp-turboquant/ggml/src
Piotr Wilkin (ilintar) 6fd4f95367
Fix too relaxed check on CUDA "fast copy" (can_be_transposed) condition (#17332)
* Fix too relaxed check on CUDA "fast copy" (can_be_transposed) condition

* Argh.

* Making CISC happy ;)

* Integrate CONT tests

* Use loopy loop

* Skip new tests for (B)F16 for now.
2025-11-19 10:36:33 +01:00
ggml-blas
ggml-cann CANN: fix acl_tensor_ptr usage in ASCEND_310P ROPE (#17347) 2025-11-18 16:41:52 +08:00
ggml-cpu ggml-cpu: Don't pass -mpowerpc64 when -mcpu already implies it (#17308) 2025-11-19 14:19:00 +08:00
ggml-cuda Fix too relaxed check on CUDA "fast copy" (can_be_transposed) condition (#17332) 2025-11-19 10:36:33 +01:00
ggml-hexagon hexagon: various Op fixes (#17135) 2025-11-11 15:25:04 -08:00
ggml-hip
ggml-metal metal : support I32 -> I32 copy (#17317) 2025-11-17 11:52:00 +02:00
ggml-musa
ggml-opencl opencl: fix rms_norm_mul (#17250) 2025-11-15 17:40:14 -08:00
ggml-rpc
ggml-sycl sycl : unify unary kernels with a generic implementation and enable wide operator support (#17213) 2025-11-16 00:52:42 +01:00
ggml-vulkan vulkan: force full subgroups for flash attention to fix intel subgroup crash (#17356) 2025-11-19 08:46:26 +01:00
ggml-webgpu ggml webgpu: faster matrix multiplication/matrix-vector multiplication (#17031) 2025-11-07 19:27:20 -08:00
ggml-zdnn
CMakeLists.txt cmake : add version to all shared object files (#17091) 2025-11-11 13:19:50 +02:00
ggml-alloc.c
ggml-backend-impl.h
ggml-backend-reg.cpp
ggml-backend.cpp sched : fix reserve ignoring user tensor assignments (#17232) 2025-11-13 13:14:02 +01:00
ggml-common.h
ggml-impl.h ggml : add ops SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM (#17063) 2025-11-13 20:54:47 +02:00
ggml-opt.cpp
ggml-quants.c
ggml-quants.h
ggml-threading.cpp
ggml-threading.h
ggml.c ggml : add ops SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM (#17063) 2025-11-13 20:54:47 +02:00
ggml.cpp
gguf.cpp