llama.cpp (1-8659-alt1)

Published 2026-04-07 07:05:50 +03:00 by thek0tyara

Installation

# Add the repository to the list of connected repositories (replace "_arch_" with the required architecture):
apt-repo add rpm  _arch_ classic
# To install the package, run the following command:
apt-get update
apt-get install llama.cpp

Repository info

Architectures
x86_64

About this package

LLM inference in C/C++
Plain C/C++ implementation of inference for many LLMs, without dependencies. AVX, AVX2, AVX512, and AMX support for x86 architectures. Mixed F16/F32 precision. 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for faster inference and reduced memory use. Supports CPU, GPU, and hybrid CPU+GPU inference.

Supported models: LLaMA models, Mistral 7B, Mixtral MoE, Falcon, Chinese LLaMA / Alpaca and Chinese LLaMA-2 / Alpaca-2, Vigogne (French), Koala, Baichuan 1 & 2 + derivations, Aquila 1 & 2, Starcoder models, Refact, Persimmon 8B, MPT, Bloom, Yi models, StableLM models, Deepseek models, Qwen models, PLaMo-13B, Phi models, GPT-2, Orion 14B, InternLM2, CodeShell, Gemma, Mamba, Grok-1, Xverse, Command-R models, SEA-LION, GritLM-7B + GritLM-8x7B, OLMo, GPT-NeoX + Pythia, Snowflake-Arctic MoE, Smaug, Poro 34B, Bitnet b1.58 models, Flan T5, Open Elm models, ChatGLM3-6b + ChatGLM4-9b + GLMEdge-1.5b + GLMEdge-4b, SmolLM, EXAONE-3.0-7.8B-Instruct, FalconMamba Models, Jais, Bielik-11B-v2.3, RWKV-6, QRWKV-6, GigaChat-20B-A3B, Trillion-7B-preview, Ling models, LFM2 models, Hunyuan models, BailingMoeV2 (Ring/Ling 2.0) models

Multimodal models: LLaVA 1.5 models, BakLLaVA, Obsidian, ShareGPT4V, MobileVLM 1.7B/3B models, Yi-VL, Mini CPM, Moondream, Bunny, GLM-EDGE, Qwen2-VL, LFM2-VL

NOTE: MODELS ARE NOT PROVIDED. You'll need to download them from the original sites (or Hugging Face Hub).

Overall this is all raw and EXPERIMENTAL, no warranty, no support.
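
As a rough illustration of why the quantization widths in the description reduce memory use, the sketch below estimates the size of a model's weights from its parameter count and the number of bits per weight. It is not part of the package: the function name estimate_weights_mib is hypothetical, bit widths are passed in tenths so that 1.5-bit can be written as 15, and per-block scale data, the KV cache, and activations are ignored, so real model files will be somewhat larger.

```shell
#!/bin/sh
# Rough estimate of model weight size in MiB for a given quantization width.
# Assumptions (not from the package description): the helper name is
# hypothetical, and per-block scales, KV cache and activations are ignored.
estimate_weights_mib() {
    params=$1      # number of model parameters, e.g. 7000000000
    bits_x10=$2    # bits per weight multiplied by 10, e.g. 40 for 4-bit
    # bytes = params * bits / 8; dividing by 80 also removes the x10 scaling
    echo $(( params * bits_x10 / 80 / 1048576 ))
}

# Compare several widths for a 7-billion-parameter model
# (15 = 1.5-bit, 40 = 4-bit, 80 = 8-bit, 160 = unquantized F16).
for bits_x10 in 15 40 80 160; do
    printf 'bits x10 = %s: ~%s MiB\n' "$bits_x10" \
        "$(estimate_weights_mib 7000000000 "$bits_x10")"
done
```

Under these assumptions, a 7-billion-parameter model at 4 bits per weight needs about a quarter of the weight memory of the same model stored in F16.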
Details
ALT
2026-04-07 07:05:50 +03:00
0
MIT
5.8 KiB
Assets (1)
Versions (1)
1-8659-alt1 2026-04-07