context : reserve new scheduler when graph topology changes (#18547)

* context : reserve new scheduler when graph topology changes

* cont : fix

* cont : fix reserve

* cont : reserve only when changes occur + timing

* context : add comments

* llama : reserve on sampler changes

* common : allow null common_sampler

* server : task declares needs (embd, logits, sampling)

* server : do not init sampler if not needed

* llama : fix need_reserve when unsetting a sampler

* server : consolidate slot reset/clear logic
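
The "reserve only when changes occur" idea from the list above can be sketched as a dirty flag that is raised only when the set of attached samplers actually changes. This is a minimal illustration, not the code in this commit: `toy_context`, `set_sampler`, `reserve_if_needed` and `n_reserves` are hypothetical names, and the real scheduler reservation is replaced by a counter.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical stand-in for a context that owns a scheduler reservation.
// None of these names come from llama.cpp; they only illustrate the pattern.
struct toy_context {
    std::set<int32_t> sampler_seqs;  // sequences that currently have a sampler
    bool need_reserve = true;        // nothing has been reserved yet
    int  n_reserves   = 0;           // how many times the expensive path ran

    // Attach (has_sampler = true) or detach (false) a sampler for seq_id.
    // The flag is raised only if the set of sequences actually changes,
    // which also covers the "unsetting a sampler" case.
    void set_sampler(int32_t seq_id, bool has_sampler) {
        const bool changed = has_sampler
            ? sampler_seqs.insert(seq_id).second
            : sampler_seqs.erase(seq_id) > 0;
        if (changed) {
            need_reserve = true;
        }
    }

    // Called before each decode; returns true if a reserve was performed.
    bool reserve_if_needed() {
        if (!need_reserve) {
            return false;
        }
        ++n_reserves;          // placeholder for the real (costly) reserve
        need_reserve = false;
        return true;
    }
};
```

In this sketch, attaching the same sampler twice or detaching one that was never attached leaves the flag untouched, so the costly reserve step runs only after a real change.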
Georgi Gerganov 2026-01-15 16:39:17 +02:00 committed by GitHub
parent 5c662d21a3
commit 39173bcacb
GPG key ID: B5690EEEBB952194
9 changed files with 328 additions and 216 deletions

@@ -81,7 +81,6 @@ int main(int argc, char ** argv) {
sampler_configs.push_back({ i, smpl });
}
// TODO: temporarily gated behind a flag
if (params.sampling.backend_sampling) {
ctx_params.samplers = sampler_configs.data();
ctx_params.n_samplers = sampler_configs.size();