models : support qwen3.5 series (#19468)
* support qwen3.5 series
* remove deepstack for now, and some code clean
* code clean
* add FULL_ATTENTION_INTERVAL metadata
* code clean
* reorder v heads for linear attention to avoid expensive interleaved repeat
parent 9a96352729
commit fc0fe40049
17 changed files with 2096 additions and 10 deletions
@@ -182,7 +182,9 @@ ggml_cgraph * clip_graph_qwen3vl::build() {
                 model.mm_1_w, model.mm_1_b,
                 ffn_op_type::FFN_GELU, -1);
 
-    embeddings = ggml_concat(ctx0, embeddings, deepstack_features, 0); // concat along the feature dimension
+    if (deepstack_features) {
+        embeddings = ggml_concat(ctx0, embeddings, deepstack_features, 0);
+    } // concat along the feature dimension
 
     // build the graph
     ggml_build_forward_expand(gf, embeddings);
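
The hunk above makes the concatenation conditional on deepstack_features being non-null. The following is a minimal, self-contained sketch of that guard pattern only. It does not use ggml: the Features alias and the concat_if_present helper are hypothetical names introduced for illustration, and a flat std::vector stands in for a tensor.

```cpp
#include <vector>

// Hypothetical stand-in for a tensor: a flat vector of feature values.
using Features = std::vector<float>;

// Return `base` with `extra` appended along the feature dimension,
// but only when `extra` is present (non-null). With a null `extra`
// the input is returned unchanged, which is the case the guard covers.
Features concat_if_present(const Features & base, const Features * extra) {
    Features out = base;
    if (extra) {
        out.insert(out.end(), extra->begin(), extra->end());
    }
    return out;
}
```

Under these assumptions, calling concat_if_present(a, nullptr) returns a copy of a, while passing a pointer to a second vector returns the two joined end to end.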