CUDA: generalized (mma) FA, add Volta support (#17505)

* CUDA: generalized (mma) FA, add Volta support

* use struct for MMA FA kernel config

---------

Co-authored-by: Aman Gupta <aman>
Johannes Gäßler 2025-12-03 16:57:05 +01:00 committed by GitHub
parent 190c4838bd
commit 2e1c9cd814
10 changed files with 966 additions and 759 deletions

@@ -2279,7 +2279,7 @@ extern "C" {
float stop,
float step);
-#define GGML_KQ_MASK_PAD 64
+#define GGML_KQ_MASK_PAD 1
// q: [n_embd_k, n_batch, n_head, ne3 ]
// k: [n_embd_k, n_kv, n_head_kv, ne3 ]