llama-cli as a different show-off program now. So, switch to the older
llama-completion for testing and to llama-server for versioning and
man-page generation.
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
+ mkdir /usr/src/.npm-global
+ npm config set prefix /usr/src/.npm-global
+ npm install -g @aikidosec/safe-chain
added 57 packages in 3s
5 packages are looking for funding
run `npm fund` for details
+ PATH=/usr/src/.npm-global/bin:/usr/bin:/bin:/usr/local/bin
+ cd /usr/src/build
+ rm tools/server/public/index.html.gz
+ cd tools/server/webui
+ aikido-npm ci --ignore-scripts
added 547 packages, and audited 548 packages in 6s
218 packages are looking for funding
run `npm fund` for details
7 vulnerabilities (6 low, 1 moderate)
To address issues that do not require attention, run:
npm audit fix
To address all issues (including breaking changes), run:
npm audit fix --force
Run `npm audit` for details.
✔ Safe-chain: Scanned 541 packages, no malware found.
+ aikido-npm audit --audit-level=critical fix
changed 2 packages, and audited 548 packages in 7s
218 packages are looking for funding
run `npm fund` for details
# npm audit report
cookie <0.7.0
cookie accepts cookie name, path, and domain with out of bounds characters - https://github.com/advisories/GHSA-pxg6-pf52-xh8x
fix available via `npm audit fix --force`
Will install @sveltejs/kit@0.0.30, which is a breaking change
node_modules/cookie
@sveltejs/kit >=1.0.0-next.0
Depends on vulnerable versions of cookie
node_modules/@sveltejs/kit
@sveltejs/adapter-static >=1.0.0-next.0
Depends on vulnerable versions of @sveltejs/kit
node_modules/@sveltejs/adapter-static
runed >=0.32.0
Depends on vulnerable versions of @sveltejs/kit
node_modules/bits-ui/node_modules/runed
bits-ui >=2.11.8
Depends on vulnerable versions of runed
Depends on vulnerable versions of svelte-toolbelt
node_modules/bits-ui
svelte-toolbelt >=0.10.6
Depends on vulnerable versions of runed
node_modules/bits-ui/node_modules/svelte-toolbelt
6 low severity vulnerabilities
To address issues that do not require attention, run:
npm audit fix
To address all issues (including breaking changes), run:
npm audit fix --force
✔ Safe-chain: Scanned 2 packages, no malware found.
ℹ Safe-chain: Some package versions were suppressed due to minimum age requirement.
To disable this check, use: --safe-chain-skip-minimum-package-age
+ du -sh node_modules
409M node_modules
+ npm run build
> webui@1.0.0 build
> vite build && ./scripts/post-build.sh
▲ [WARNING] Cannot find base config file "./.svelte-kit/tsconfig.json" [tsconfig.json]
tsconfig.json:2:12:
2 │ "extends": "./.svelte-kit/tsconfig.json",
╵ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
vite v7.2.2 building ssr environment for production...
transforming...
DEPRECATION WARNING [import]: Sass @import rules are deprecated and will be removed in Dart Sass 3.0.0.
More info and automated migrator: https://sass-lang.com/d/import
╷
17 │ @import 'katex/src/styles/katex.scss';
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
╵
src/styles/katex-custom.scss 17:9 root stylesheet
DEPRECATION WARNING [import]: Sass @import rules are deprecated and will be removed in Dart Sass 3.0.0.
More info and automated migrator: https://sass-lang.com/d/import
╷
2 │ @import "./fonts.scss";
│ ^^^^^^^^^^^^^^
╵
node_modules/katex/src/styles/katex.scss 2:9 @import
src/styles/katex-custom.scss 17:9 root stylesheet
DEPRECATION WARNING [global-builtin]: Global built-in functions are deprecated and will be removed in Dart Sass 3.0.0.
Use list.append instead.
More info and automated migrator: https://sass-lang.com/d/import
╷
9 │ $src: append($src, url('#{$font-folder}/KaTeX_#{$family}-#{$family-suffix}.woff2') format('woff2'), comma);
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
╵
node_modules/katex/src/styles/fonts.scss 9:15 generate-src()
node_modules/katex/src/styles/fonts.scss 42:11 font-face()
node_modules/katex/src/styles/fonts.scss 52:1 @import
node_modules/katex/src/styles/katex.scss 2:9 @import
src/styles/katex-custom.scss 17:9 root stylesheet
DEPRECATION WARNING [global-builtin]: Global built-in functions are deprecated and will be removed in Dart Sass 3.0.0.
Use list.length instead.
More info and automated migrator: https://sass-lang.com/d/import
╷
344 │ @for $from from 1 through length($sizes) {
│ ^^^^^^^^^^^^^^
╵
node_modules/katex/src/styles/katex.scss 344:35 @import
src/styles/katex-custom.scss 17:9 root stylesheet
DEPRECATION WARNING [global-builtin]: Global built-in functions are deprecated and will be removed in Dart Sass 3.0.0.
Use list.length instead.
More info and automated migrator: https://sass-lang.com/d/import
╷
345 │ @for $to from 1 through length($sizes) {
│ ^^^^^^^^^^^^^^
╵
node_modules/katex/src/styles/katex.scss 345:37 @import
src/styles/katex-custom.scss 17:9 root stylesheet
DEPRECATION WARNING [global-builtin]: Global built-in functions are deprecated and will be removed in Dart Sass 3.0.0.
Use list.nth instead.
More info and automated migrator: https://sass-lang.com/d/import
╷
348 │ font-size: calc((nth($sizes, $to) / nth($sizes, $from)) * 1em);
│ ^^^^^^^^^^^^^^^^
╵
node_modules/katex/src/styles/katex.scss 348:38 @import
src/styles/katex-custom.scss 17:9 root stylesheet
DEPRECATION WARNING [global-builtin]: Global built-in functions are deprecated and will be removed in Dart Sass 3.0.0.
Use list.nth instead.
More info and automated migrator: https://sass-lang.com/d/import
╷
348 │ font-size: calc((nth($sizes, $to) / nth($sizes, $from)) * 1em);
│ ^^^^^^^^^^^^^^^^^^
╵
node_modules/katex/src/styles/katex.scss 348:57 @import
src/styles/katex-custom.scss 17:9 root stylesheet
node_modules/@sveltejs/kit/src/runtime/client/client.js (328:15): "fork" is not exported by "node_modules/svelte/src/index-server.js", imported by "node_modules/@sveltejs/kit/src/runtime/client/client.js".
node_modules/@sveltejs/kit/src/runtime/client/client.js (333:26): "fork" is not exported by "node_modules/svelte/src/index-server.js", imported by "node_modules/@sveltejs/kit/src/runtime/client/client.js".
"default" is imported from external module "highlight.js" but never used in "src/lib/components/app/misc/SyntaxHighlightedCode.svelte".
✓ 4614 modules transformed.
Export "getJsonHeaders" of module "src/lib/utils/api-headers.ts" was reexported through module "src/lib/utils/index.ts" while both modules are dependencies of each other and will end up in different chunks by current Rollup settings. This scenario is not well supported at the moment as it will produce a circular dependency between chunks and will likely lead to broken execution order.
Either change the import in "src/lib/services/models.ts" to point directly to the exporting module or reconfigure "output.manualChunks" to ensure these modules end up in the same chunk.
Export "getJsonHeaders" of module "src/lib/utils/api-headers.ts" was reexported through module "src/lib/utils/index.ts" while both modules are dependencies of each other and will end up in different chunks by current Rollup settings. This scenario is not well supported at the moment as it will produce a circular dependency between chunks and will likely lead to broken execution order.
Either change the import in "src/lib/services/chat.ts" to point directly to the exporting module or reconfigure "output.manualChunks" to ensure these modules end up in the same chunk.
rendering chunks...
vite v7.2.2 building client environment for production...
transforming...
DEPRECATION WARNING [import]: Sass @import rules are deprecated and will be removed in Dart Sass 3.0.0.
More info and automated migrator: https://sass-lang.com/d/import
╷
17 │ @import 'katex/src/styles/katex.scss';
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
╵
src/styles/katex-custom.scss 17:9 root stylesheet
DEPRECATION WARNING [import]: Sass @import rules are deprecated and will be removed in Dart Sass 3.0.0.
More info and automated migrator: https://sass-lang.com/d/import
╷
2 │ @import "./fonts.scss";
│ ^^^^^^^^^^^^^^
╵
node_modules/katex/src/styles/katex.scss 2:9 @import
src/styles/katex-custom.scss 17:9 root stylesheet
DEPRECATION WARNING [global-builtin]: Global built-in functions are deprecated and will be removed in Dart Sass 3.0.0.
Use list.append instead.
More info and automated migrator: https://sass-lang.com/d/import
╷
9 │ $src: append($src, url('#{$font-folder}/KaTeX_#{$family}-#{$family-suffix}.woff2') format('woff2'), comma);
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
╵
node_modules/katex/src/styles/fonts.scss 9:15 generate-src()
node_modules/katex/src/styles/fonts.scss 42:11 font-face()
node_modules/katex/src/styles/fonts.scss 52:1 @import
node_modules/katex/src/styles/katex.scss 2:9 @import
src/styles/katex-custom.scss 17:9 root stylesheet
DEPRECATION WARNING [global-builtin]: Global built-in functions are deprecated and will be removed in Dart Sass 3.0.0.
Use list.length instead.
More info and automated migrator: https://sass-lang.com/d/import
╷
344 │ @for $from from 1 through length($sizes) {
│ ^^^^^^^^^^^^^^
╵
node_modules/katex/src/styles/katex.scss 344:35 @import
src/styles/katex-custom.scss 17:9 root stylesheet
DEPRECATION WARNING [global-builtin]: Global built-in functions are deprecated and will be removed in Dart Sass 3.0.0.
Use list.length instead.
More info and automated migrator: https://sass-lang.com/d/import
╷
345 │ @for $to from 1 through length($sizes) {
│ ^^^^^^^^^^^^^^
╵
node_modules/katex/src/styles/katex.scss 345:37 @import
src/styles/katex-custom.scss 17:9 root stylesheet
DEPRECATION WARNING [global-builtin]: Global built-in functions are deprecated and will be removed in Dart Sass 3.0.0.
Use list.nth instead.
More info and automated migrator: https://sass-lang.com/d/import
╷
348 │ font-size: calc((nth($sizes, $to) / nth($sizes, $from)) * 1em);
│ ^^^^^^^^^^^^^^^^
╵
node_modules/katex/src/styles/katex.scss 348:38 @import
src/styles/katex-custom.scss 17:9 root stylesheet
DEPRECATION WARNING [global-builtin]: Global built-in functions are deprecated and will be removed in Dart Sass 3.0.0.
Use list.nth instead.
More info and automated migrator: https://sass-lang.com/d/import
╷
348 │ font-size: calc((nth($sizes, $to) / nth($sizes, $from)) * 1em);
│ ^^^^^^^^^^^^^^^^^^
╵
node_modules/katex/src/styles/katex.scss 348:57 @import
src/styles/katex-custom.scss 17:9 root stylesheet
node_modules/@sveltejs/kit/src/runtime/client/client.js (327:15): "fork" is not exported by "node_modules/svelte/src/index-client.js", imported by "node_modules/@sveltejs/kit/src/runtime/client/client.js".
node_modules/@sveltejs/kit/src/runtime/client/client.js (332:26): "fork" is not exported by "node_modules/svelte/src/index-client.js", imported by "node_modules/@sveltejs/kit/src/runtime/client/client.js".
✓ 5349 modules transformed.
rendering chunks...
computing gzip size...
.svelte-kit/output/client/_app/version.json 0.03 kB │ gzip: 0.05 kB
.svelte-kit/output/client/.vite/manifest.json 0.33 kB │ gzip: 0.19 kB
.svelte-kit/output/client/_app/immutable/assets/style.N_mC9UMG.css 479.93 kB │ gzip: 285.89 kB
.svelte-kit/output/client/_app/immutable/bundle.BAOV59Mi.js 3,753.32 kB │ gzip: 1,111.67 kB
(!) Some chunks are larger than 3072 kB after minification. Consider:
- Using dynamic import() to code-split the application
- Use build.rollupOptions.output.manualChunks to improve chunking: https://rollupjs.org/configuration-options/#output-manualchunks
- Adjust chunk size limit for this warning via build.chunkSizeWarningLimit.
✓ built in 11.35s
.svelte-kit/output/server/.vite/manifest.json 5.54 kB
.svelte-kit/output/server/_app/immutable/assets/style.DlfPpIJV.css 479.65 kB
.svelte-kit/output/server/chunks/false.js 0.03 kB
.svelte-kit/output/server/chunks/environment.js 0.07 kB
.svelte-kit/output/server/chunks/api-key-validation.js 0.11 kB
.svelte-kit/output/server/entries/pages/_page.ts.js 0.17 kB
.svelte-kit/output/server/entries/pages/chat/_id_/_page.ts.js 0.18 kB
.svelte-kit/output/server/chunks/server.js 0.20 kB
.svelte-kit/output/server/internal.js 0.35 kB
.svelte-kit/output/server/chunks/utils.js 0.62 kB
.svelte-kit/output/server/entries/pages/_page.svelte.js 0.98 kB
.svelte-kit/output/server/entries/pages/chat/_id_/_page.svelte.js 1.00 kB
.svelte-kit/output/server/chunks/label.js 2.22 kB
.svelte-kit/output/server/chunks/exports.js 3.05 kB
.svelte-kit/output/server/chunks/supported-file-types.js 5.54 kB
.svelte-kit/output/server/chunks/internal.js 6.02 kB
.svelte-kit/output/server/chunks/shared.js 7.26 kB
.svelte-kit/output/server/entries/pages/_error.svelte.js 8.38 kB
.svelte-kit/output/server/remote-entry.js 8.65 kB
.svelte-kit/output/server/chunks/index.js 21.41 kB
.svelte-kit/output/server/entries/pages/_layout.svelte.js 34.76 kB
.svelte-kit/output/server/index.js 56.22 kB
.svelte-kit/output/server/chunks/SyntaxHighlightedCode.svelte_svelte_type_style_lang.js 84.65 kB
.svelte-kit/output/server/chunks/DialogConfirmation.js 107.97 kB
.svelte-kit/output/server/chunks/ServerLoadingSplash.js 201.83 kB
✓ built in 20.66s
Run npm run preview to preview your production build locally.
> Using @sveltejs/adapter-static
Overwriting ../public/index.html with fallback page. Consider using a different name for the fallback.
Wrote site to "../public"
✔ done
✓ Inlined favicon.svg as base64 data URL
✓ Created index.html.gz
Diff-After-Merge: 1 file changed, 6 insertions(+)
# gpg: Signature made Sun Dec 14 00:02:43 2025 MSK
# gpg: using RSA key B5690EEEBB952194
# gpg: Good signature from "GitHub <noreply@github.com>" [unknown]
When the number of cols is large, split each row across multiple workgroups.
There are three phases that communicate partial results through temp buffers:
(1) compute max partials
(2) take max of partials, compute sum(exp(x-max)) partials
(3) sum partials, compute scaled result
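The three phases above can be sketched on the CPU as follows. This is only an illustration of the data flow, not the shader itself: the function name `softmax_row_split`, the `num_workgroups` parameter, and the use of Python lists in place of temp buffers are all assumptions made for this example, and the row is assumed to be non-empty.

```python
import math

def softmax_row_split(row, num_workgroups):
    """Softmax of one non-empty row, split across `num_workgroups` chunks.

    Each chunk stands in for one workgroup; the two partial lists stand in
    for the temp buffers that carry partial results between phases.
    """
    n = len(row)
    chunk = (n + num_workgroups - 1) // num_workgroups  # ceil division
    chunks = [row[i:i + chunk] for i in range(0, n, chunk)]

    # Phase 1: every workgroup computes the max of its own chunk.
    max_partials = [max(c) for c in chunks]

    # Phase 2: take the max of the partials, then compute the
    # sum(exp(x - max)) partial for each chunk.
    row_max = max(max_partials)
    sum_partials = [sum(math.exp(x - row_max) for x in c) for c in chunks]

    # Phase 3: sum the partials and compute the scaled result.
    total = sum(sum_partials)
    return [math.exp(x - row_max) / total for x in row]
```

Subtracting the row maximum before exponentiating keeps every exponent at or below zero, which is why the max has to be fully reduced before any sum partial is computed.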
* model-conversion : use CONVERTED_MODEL value for converted model [no ci]
This commit updates the model verification scripts to use the
CONVERTED_MODEL environment variable instead of using the MODEL_PATH
(the original model path) as the basis for the converted model file
name.
The motivation for this is that currently, if the converted model file
name differs from the original model directory/name, the verification
scripts will look for the wrong .bin files that were generated when
running the models.
For example, the following steps were not possible:
```console
(venv) $ huggingface-cli download google/gemma-3-270m-it --local-dir ggml-org/gemma-3-270m
(venv) $ python3 convert_hf_to_gguf.py ggml-org/gemma-3-270m --outfile test-bf16.gguf --outtype bf16
(venv) $ cd examples/model-conversion/
(venv) $ export MODEL_PATH=../../ggml-org/gemma-3-270m
(venv) $ export CONVERTED_MODEL=../../test-bf16.gguf
(venv) $ make causal-verify-logits
...
Data saved to data/llamacpp-test-bf16.bin
Data saved to data/llamacpp-test-bf16.txt
Error: llama.cpp logits file not found: data/llamacpp-gemma-3-270m.bin
Please run scripts/run-converted-model.sh first to generate this file.
make: *** [Makefile:62: causal-verify-logits] Error 1
```
With the changes in this commit, the above steps will now work as
expected.
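A minimal sketch of the naming rule described above, assuming the data file name is derived from the stem of CONVERTED_MODEL. The helper names `logits_file_stem` and `llamacpp_logits_path` are hypothetical, and the fallback to the MODEL_PATH directory name is an assumption for illustration, not something this commit message states.

```python
import os
from pathlib import Path

def logits_file_stem(env=None):
    """Return the base name used for the saved logits files (hypothetical helper)."""
    env = os.environ if env is None else env
    converted = env.get("CONVERTED_MODEL")
    if converted:
        # ../../test-bf16.gguf -> test-bf16
        return Path(converted).stem
    # Assumed fallback: the original model directory name.
    return Path(env["MODEL_PATH"]).name

def llamacpp_logits_path(env=None):
    """Build the path of the llama.cpp logits file under data/."""
    return f"data/llamacpp-{logits_file_stem(env)}.bin"
```

With the environment from the console session above, this yields `data/llamacpp-test-bf16.bin`, the file that was actually written, rather than `data/llamacpp-gemma-3-270m.bin`.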
* clip: move model cgraphs into their own files
* more explicit enums
* fix linux build
* fix naming
* missing headers
* nits: add comments for contributors
* ggml-cpu: fix RISC-V Q4_0 repack select and RVV feature reporting
Signed-off-by: Wang Yang <yangwang@iscas.ac.cn>
* using the name VLEN instead of CNT
* Update ggml/include/ggml-cpu.h
---------
Signed-off-by: Wang Yang <yangwang@iscas.ac.cn>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
This commit removes the maximum difference check from the
compare-logits.py which would stop early if the difference between
the logits exceeded a threshold.
The motivation for removing this is that it can be useful to be able to
get the complete log for debugging/reporting purposes.
* enable mmf for RDNA3
* disable mmf for some shapes
* move some mmvf to mmf
* more mmvf to mmf
* 3 is good in mmvf
---------
Co-authored-by: zhang hui <you@example.com>
* webui: add search field to model selector and fix mobile viewport overflow
* webui: simplify model search style and code
* refactor: Search Input component & consistent UI for Models Selector search
* feat: Use Popover component + improve interactions
* fix: Fetching props for only loaded models in ROUTER mode
* webui: prevent models selector popover from overflowing viewport
Use Floating UI's auto-positioning with a 50dvh height limit and proper
collision detection instead of forcing top positioning. Fixes overflow
on desktop and keyboard issues on mobile.
* webui: keep search field near trigger in models selector
Place search at the 'near end' (closest to trigger) by swapping layout
with CSS flexbox order based on popover direction. Prevents the input
from moving during typing as the list shrinks.
* chore: update webui build output
---------
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* Extended TRI
* Fix whitespace
* chore: update webui build output
* Just use cuBLAS for everything...
* Merge both versions
* Remove incorrect imports causing failures for CI
* Still failing... remove all direct cublas imports and rely on common imports from "common.cuh"
* Defines for hipBlas
* Aaaand MUSA defines...
* I hate this job...
* Stupid typo...
* Update ggml/src/ggml-cuda/solve_tri.cu
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* fix test failure
* fix: correct scaling calculations in rope_cache_init
* fix: optimize element copying in rope_hex_f32 using memcpy
* fix: optimize loop boundaries in rope_hex_f32 for better performance
* feat: add profiling macros for performance measurement in operations
* clip: add support for fused qkv in build_vit
* use build_ffn whenever possible
* fix internvl
* mtmd-cli: move image to beginning
* test script: support custom args
* llama-server: recursive GGUF loading
Replace flat directory scan with recursive traversal using
std::filesystem::recursive_directory_iterator. Add support for
nested vendor/model layouts (e.g. vendor/model/*.gguf).
Model name now reflects the relative path within --models-dir
instead of just the filename. Aggregate files by parent
directory via std::map before constructing local_model.
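The traversal and grouping described above can be sketched in Python, with `Path.rglob` standing in for std::filesystem::recursive_directory_iterator and a dict standing in for std::map. The function name `discover_models` is hypothetical, and the choice to key files that sit directly in the models directory by their own file name is an assumption made for this sketch.

```python
from collections import defaultdict
from pathlib import Path

def discover_models(models_dir):
    """Group *.gguf files under models_dir by their parent directory.

    Keys are paths relative to models_dir, so nested layouts such as
    vendor/model/*.gguf keep their vendor/model prefix in the name.
    """
    root = Path(models_dir)
    by_dir = defaultdict(list)
    for path in sorted(root.rglob("*.gguf")):
        rel = path.relative_to(root)
        if rel.parent == Path("."):
            # Assumption: a top-level file is a model on its own.
            key = rel.as_posix()
        else:
            key = rel.parent.as_posix()
        by_dir[key].append(rel.as_posix())
    return dict(by_dir)
```

Grouping by parent directory keeps all files that live in one model directory together under a single entry instead of listing each file as a separate model.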
* server : router config POC (INI-based per-model settings)
* server: address review feedback from @aldehir and @ngxson
PEG parser usage improvements:
- Simplify parser instantiation (remove arena indirection)
- Optimize grammar usage (ws instead of zero_or_more, remove optional wrapping)
- Fix last line without newline bug (+ operator instead of <<)
- Remove redundant end position check
Feature scope:
- Remove auto-reload feature (will be separate PR per @ngxson)
- Keep config.ini auto-creation and template generation
- Preserve per-model customization logic
Co-authored-by: aldehir <aldehir@users.noreply.github.com>
Co-authored-by: ngxson <ngxson@users.noreply.github.com>
* server: adopt aldehir's line-oriented PEG parser
Complete rewrite of INI parser grammar and visitor:
- Use p.chars(), p.negate(), p.any() instead of p.until()
- Support end-of-line comments (key=value # comment)
- Handle EOF without trailing newline correctly
- Strict identifier validation ([a-zA-Z_][a-zA-Z0-9_.-]*)
- Simplified visitor (no pending state, no trim needed)
- Grammar handles whitespace natively via eol rule
Business validation preserved:
- Reject section names starting with LLAMA_ARG_*
- Accept only keys starting with LLAMA_ARG_*
- Require explicit section before key-value pairs
Co-authored-by: aldehir <aldehir@users.noreply.github.com>
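The validation rules listed above can be illustrated with a small line-oriented parser. Note that this sketch uses regular expressions rather than the PEG parser the commit adopts, so it only mirrors the stated rules, not the implementation. The function name `parse_presets` is hypothetical, applying the identifier pattern to keys only is an assumption, and treating every `#` as the start of a comment is a simplification.

```python
import re

SECTION_RE = re.compile(r"^\[([^\]]+)\]$")
# Identifier pattern quoted in the commit message.
KEY_RE = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_.-]*$")

def parse_presets(text):
    """Parse INI text into {section: {key: value}} under the stated rules."""
    presets = {}
    current = None
    # splitlines() copes with a last line that has no trailing newline.
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop end-of-line comments
        if not line:
            continue
        match = SECTION_RE.match(line)
        if match:
            name = match.group(1).strip()
            if name.startswith("LLAMA_ARG_"):
                raise ValueError(f"line {lineno}: invalid section name {name!r}")
            current = presets.setdefault(name, {})
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        if not sep or not KEY_RE.match(key):
            raise ValueError(f"line {lineno}: malformed entry")
        if current is None:
            raise ValueError(f"line {lineno}: key before any section")
        if not key.startswith("LLAMA_ARG_"):
            raise ValueError(f"line {lineno}: key must start with LLAMA_ARG_")
        current[key] = value.strip()
    return presets
```

For example, a `[my-model]` section followed by `LLAMA_ARG_CTX_SIZE = 4096 # context` parses to one section with one key, while the same key placed before any section header is rejected.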
* server: fix CLI/env duplication in child processes
Children now receive minimal CLI args (executable, model, port, alias)
instead of inheriting all router args. Global settings pass through
LLAMA_ARG_* environment variables only, eliminating duplicate config
warnings.
Fixes: Router args like -ngl, -fa were passed both via CLI and env,
causing 'will be overwritten' warnings on every child spawn.
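A rough sketch of the split described above: the child gets a short argument list, and global settings travel only through LLAMA_ARG_* environment variables. The function `build_child_launch` and its parameters are hypothetical, and the exact flags the router passes are an assumption based on the "executable, model, port, alias" list.

```python
def build_child_launch(server_bin, model_path, port, alias, global_settings, base_env):
    """Return (argv, env) for spawning one child server process.

    argv carries only the per-child values. Global settings are placed in
    the environment, so no option is given both on the CLI and via env.
    """
    argv = [
        server_bin,
        "--model", model_path,
        "--port", str(port),
        "--alias", alias,
    ]
    env = dict(base_env)  # copy, so the router's own environment is untouched
    for key, value in global_settings.items():
        if not key.startswith("LLAMA_ARG_"):
            raise ValueError(f"not a LLAMA_ARG_* variable: {key}")
        env[key] = str(value)
    return argv, env
```

The returned pair could be handed to `subprocess.Popen(argv, env=env)`. Keeping each setting in exactly one channel is what removes the duplicate-config warning.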
* add common/preset.cpp
* fix compile
* cont
* allow custom-path models
* add falsey check
* server: fix router model discovery and child process spawning
- Sanitize model names: replace / and \ with _ for display
- Recursive directory scan with relative path storage
- Convert relative paths to absolute when spawning children
- Filter router control args from child processes
- Refresh args after port assignment for correct port value
- Fallback preset lookup for compatibility
- Fix missing argv[0]: store server binary path before base_args parsing
* Revert "server: fix router model discovery and child process spawning"
This reverts commit e3832b42eeea7fcb108995966c7584479f745857.
* clarify about "no-" prefix
* correct render_args() to include binary path
* also remove arg LLAMA_ARG_MODELS_PRESET for child
* add co-author for ini parser code
Co-authored-by: aldehir <hello@alde.dev>
* also set LLAMA_ARG_HOST
* add CHILD_ADDR
* Remove dead code
---------
Co-authored-by: aldehir <aldehir@users.noreply.github.com>
Co-authored-by: ngxson <ngxson@users.noreply.github.com>
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
Co-authored-by: aldehir <hello@alde.dev>
* tests: update barrier test to check for race condition in active threads
* cpu: combine n_graph and n_threads into a single atomic update
* tests: add multi-graph test for test_barrier
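One common way to combine two counters into a single atomic update is to pack them into one integer word, so that a single store publishes both values together. The sketch below shows only the packing arithmetic. The 16-bit split, the constant names, and the helper functions are assumptions for illustration, and plain Python integers are not atomic, so this does not reproduce the synchronization itself.

```python
N_THREADS_BITS = 16                          # assumed field width
N_THREADS_MASK = (1 << N_THREADS_BITS) - 1   # low bits hold n_threads

def pack(n_graph, n_threads):
    """Pack the graph counter (high bits) and thread count (low bits)."""
    if not 0 <= n_threads <= N_THREADS_MASK:
        raise ValueError("n_threads does not fit in the low field")
    return (n_graph << N_THREADS_BITS) | n_threads

def unpack(word):
    """Return (n_graph, n_threads) from a packed word."""
    return word >> N_THREADS_BITS, word & N_THREADS_MASK

def next_graph(word, n_threads):
    """Advance the graph counter and set the new thread count in one value."""
    n_graph, _ = unpack(word)
    return pack(n_graph + 1, n_threads)
```

Because a reader loads one word, it cannot observe a new graph counter paired with a stale thread count, which is the kind of race the updated barrier tests are meant to catch.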