llama-cpp-turboquant/.github/workflows
Eve 3407364776
Q6_K AVX improvements (#10118)
* q6_k instruction reordering attempt

* better subtract method

* should be theoretically faster

small improvement with shuffle lut, likely because all loads are already done at that stage

* optimize bit fiddling

* handle -32 offset separately. bsums exists for a reason!

* use shift

* Update ggml-quants.c

* have to update ci macos version to 13 as 12 doesn't work now. 13 is still x86
2024-11-04 23:06:31 +01:00
bench.yml.disabled ggml-backend : add device and backend reg interfaces (#9707) 2024-10-03 01:49:47 +02:00
build.yml Q6_K AVX improvements (#10118) 2024-11-04 23:06:31 +01:00
close-issue.yml ci : fine-grant permission (#9710) 2024-10-04 11:47:19 +02:00
docker.yml musa: add docker image support (#9685) 2024-10-10 20:10:37 +02:00
editorconfig.yml
gguf-publish.yml
labeler.yml
nix-ci-aarch64.yml ci : fine-grant permission (#9710) 2024-10-04 11:47:19 +02:00
nix-ci.yml ci : fine-grant permission (#9710) 2024-10-04 11:47:19 +02:00
nix-flake-update.yml
nix-publish-flake.yml
python-check-requirements.yml
python-lint.yml
python-type-check.yml
server.yml