CUDA: generalized (mma) FA, add Volta support (#17505)
* CUDA: generalized (mma) FA, add Volta support * use struct for MMA FA kernel config --------- Co-authored-by: Aman Gupta <aman>
This commit is contained in:
parent
190c4838bd
commit
2e1c9cd814
10 changed files with 966 additions and 759 deletions
|
|
@ -2279,7 +2279,7 @@ extern "C" {
|
|||
float stop,
|
||||
float step);
|
||||
|
||||
#define GGML_KQ_MASK_PAD 64
|
||||
#define GGML_KQ_MASK_PAD 1
|
||||
|
||||
// q: [n_embd_k, n_batch, n_head, ne3 ]
|
||||
// k: [n_embd_k, n_kv, n_head_kv, ne3 ]
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue