models : support qwen3.5 series (#19468)

* support qwen3.5 series

* remove deepstack for now, plus some code cleanup

* code clean

* add FULL_ATTENTION_INTERVAL metadata

* code clean

* reorder v heads for linear attention to avoid expensive interleaved repeat
JJJYmmm 2026-02-11 00:00:26 +08:00 committed by GitHub
parent 9a96352729
commit fc0fe40049
17 changed files with 2096 additions and 10 deletions

@@ -182,7 +182,9 @@ ggml_cgraph * clip_graph_qwen3vl::build() {
                 model.mm_1_w, model.mm_1_b,
                 ffn_op_type::FFN_GELU, -1);
 
-    embeddings = ggml_concat(ctx0, embeddings, deepstack_features, 0); // concat along the feature dimension
+    if (deepstack_features) {
+        embeddings = ggml_concat(ctx0, embeddings, deepstack_features, 0);
+    } // concat along the feature dimension
 
     // build the graph
     ggml_build_forward_expand(gf, embeddings);
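
The hunk above makes the concat conditional, presumably because a model without deepstack layers leaves `deepstack_features` null. The following is a minimal, self-contained sketch of that guard pattern, not the actual ggml code: `Tensor`, `concat_features`, and `finalize_embeddings` are hypothetical stand-ins invented for illustration, and the mock concat simply appends one feature vector to another.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a tensor: a flat feature vector.
struct Tensor {
    std::vector<float> data;
};

// Hypothetical stand-in for a concat along the feature dimension.
// This mock requires both inputs to be non-null.
static Tensor concat_features(const Tensor * a, const Tensor * b) {
    assert(a != nullptr && b != nullptr);
    Tensor out = *a;
    out.data.insert(out.data.end(), b->data.begin(), b->data.end());
    return out;
}

// Mirrors the guarded pattern from the diff: concatenate only when the
// optional deepstack features were actually built; otherwise pass the
// embeddings through unchanged.
static Tensor finalize_embeddings(const Tensor & embeddings,
                                  const Tensor * deepstack_features) {
    if (deepstack_features) {
        return concat_features(&embeddings, deepstack_features);
    }
    return embeddings;
}
```

Calling `finalize_embeddings(e, nullptr)` returns `e` unchanged, while passing a non-null pointer appends the extra features, so one graph builder can serve models with and without deepstack layers.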