llama.cpp (1-8659-alt1)
Published 2026-04-07 07:05:50 +03:00 by thek0tyara
Installation
# Add a repository to the list of connected repositories (choose the necessary architecture instead of "_arch_"):
apt-repo add rpm _arch_ classic
# To install the package, run the following command:
apt-get update
apt-get install llama.cpp
Repository info
Architectures: x86_64
About this package
LLM inference in C/C++
Plain C/C++ implementation of inference for many LLMs, without
dependencies. AVX, AVX2, AVX512, and AMX support for x86 architectures.
Mixed F16/F32 precision. 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and
8-bit integer quantization for faster inference and reduced memory use.
Supports CPU, GPU, and hybrid CPU+GPU inference.
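
To make the "reduced memory use" claim concrete, the following is a rough back-of-the-envelope sketch, not part of the package. It assumes weight storage is simply parameters times bits per weight divided by 8, and it ignores per-block quantization metadata, the KV cache, and runtime buffers, so real model files will be somewhat larger. The helper name estimate_weight_bytes and the 7-billion-parameter figure are illustrative assumptions.

```shell
# Hypothetical helper (not shipped with llama.cpp): estimate weight storage
# as parameters * bits-per-weight / 8, in bytes.
# Assumption: ignores quantization block metadata, KV cache, and buffers.
estimate_weight_bytes() {
    params=$1   # number of model parameters
    bits=$2     # nominal bits per weight
    echo $(( params * bits / 8 ))
}

# Illustrative 7-billion-parameter model:
estimate_weight_bytes 7000000000 16   # F16 baseline
estimate_weight_bytes 7000000000 4    # nominal 4-bit quantization
```

Under these assumptions the 4-bit estimate is one quarter of the F16 estimate, which is the main reason quantized models fit on machines with less RAM or VRAM.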
Supported models:
LLaMA models, Mistral 7B, Mixtral MoE, Falcon, Chinese LLaMA /
Alpaca and Chinese LLaMA-2 / Alpaca-2, Vigogne (French), Koala,
Baichuan 1 & 2 + derivations, Aquila 1 & 2, Starcoder models, Refact,
Persimmon 8B, MPT, Bloom, Yi models, StableLM models, Deepseek models,
Qwen models, PLaMo-13B, Phi models, GPT-2, Orion 14B, InternLM2,
CodeShell, Gemma, Mamba, Grok-1, Xverse, Command-R models, SEA-LION,
GritLM-7B + GritLM-8x7B, OLMo, GPT-NeoX + Pythia, Snowflake-Arctic
MoE, Smaug, Poro 34B, Bitnet b1.58 models, Flan T5, OpenELM models,
ChatGLM3-6b + ChatGLM4-9b + GLMEdge-1.5b + GLMEdge-4b, SmolLM,
EXAONE-3.0-7.8B-Instruct, FalconMamba Models, Jais, Bielik-11B-v2.3,
RWKV-6, QRWKV-6, GigaChat-20B-A3B, Trillion-7B-preview, Ling models,
LFM2 models, Hunyuan models, BailingMoeV2 (Ring/Ling 2.0) models
Multimodal models:
LLaVA 1.5 models, BakLLaVA, Obsidian, ShareGPT4V, MobileVLM 1.7B/3B
models, Yi-VL, MiniCPM, Moondream, Bunny, GLM-EDGE, Qwen2-VL,
LFM2-VL
NOTE:
MODELS ARE NOT PROVIDED. You'll need to download them from the original
sites (or Hugging Face Hub).
Overall this is all raw and EXPERIMENTAL, no warranty, no support.
Details
Assets (1)
llama.cpp-1-8659-alt1.x86_64.rpm (5.8 KiB)
Versions (1)
1-8659-alt1 (2026-04-07)