- Accurate memory calculation using ggml quantization formulas
- Support for f32, f16, bf16, q8_0, q4_0, q4_1, iq4_nl, q5_0, q5_1 quantizations
- Asymmetric context support (separate K/V cache quantization)
- Full attention interval support
- Parallel sequences multiplier
- Bilingual interface (Russian/English)
- Retro-style design with tooltips
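
As a rough illustration of the kind of calculation the features above describe, here is a minimal Python sketch of a KV cache size estimate. It is not this project's code. The block sizes follow the ggml block layouts for the listed types (for example, a q8_0 block stores 32 values in 34 bytes). The function names, the parameter names, and the way the full attention interval and the parallel sequences multiplier are applied are assumptions made for the example.

```python
# (elements per block, bytes per block) for each ggml type listed above.
GGML_TYPES = {
    "f32": (1, 4),
    "f16": (1, 2),
    "bf16": (1, 2),
    "q8_0": (32, 34),    # 2-byte scale + 32 x int8
    "q4_0": (32, 18),    # 2-byte scale + 32 x 4-bit
    "q4_1": (32, 20),    # 2-byte scale + 2-byte min + 32 x 4-bit
    "iq4_nl": (32, 18),  # 2-byte scale + 32 x 4-bit (non-linear codebook)
    "q5_0": (32, 22),    # 2-byte scale + 4 bytes of high bits + 32 x 4-bit
    "q5_1": (32, 24),    # 2-byte scale + 2-byte min + 4 bytes of high bits + 32 x 4-bit
}


def row_bytes(n_elements: int, ggml_type: str) -> int:
    """Bytes needed to store one row of n_elements values in the given type."""
    block_elems, block_bytes = GGML_TYPES[ggml_type]
    if n_elements % block_elems != 0:
        raise ValueError(f"row length {n_elements} is not a multiple of {block_elems}")
    return n_elements // block_elems * block_bytes


def kv_cache_bytes(
    n_layers: int,
    n_ctx: int,
    n_kv_heads: int,
    head_dim: int,
    type_k: str = "f16",
    type_v: str = "f16",
    full_attn_interval: int = 1,  # assumption: every Nth layer keeps a full KV cache
    n_parallel: int = 1,          # assumption: each parallel sequence gets its own n_ctx
) -> int:
    """Estimate KV cache size in bytes, with separate K and V types."""
    row_elems = n_kv_heads * head_dim
    # K and V may use different quantizations (asymmetric cache).
    per_token = row_bytes(row_elems, type_k) + row_bytes(row_elems, type_v)
    # Assumption: only full-attention layers contribute to the cache.
    attn_layers = n_layers // full_attn_interval
    return per_token * n_ctx * n_parallel * attn_layers


if __name__ == "__main__":
    size = kv_cache_bytes(32, 4096, 8, 128, type_k="q8_0", type_v="q4_0")
    print(f"{size / 2**20:.1f} MiB")
```

The sketch ignores any per-layer differences in head count or head dimension and any cache kept by non-full-attention layers, so treat its output as an estimate under the stated assumptions.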