20240225-alt1

- Update to b2259 (2024-02-25).
Vitaly Chikunov 2024-02-26 01:19:27 +03:00
parent 0353295019
commit be3acdbeaa

@@ -4,7 +4,7 @@
%set_verify_elf_method strict
Name: llama.cpp
-Version: 20231019
+Version: 20240225
Release: alt1
Summary: Inference of LLaMA model in pure C/C++
License: MIT
@@ -30,28 +30,23 @@ BuildRequires: gcc-c++
%description
Plain C/C++ implementation (of inference of LLaMA model) without
dependencies. AVX, AVX2 and AVX512 support for x86 architectures.
-Mixed F16/F32 precision. 2-bit, 3-bit, 4-bit, 5-bit, 6-bit and 8-bit
-integer quantization support. Runs on the CPU. Supported models:
+Mixed F16/F32 precision. 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and
+8-bit integer quantization for faster inference and reduced memory use.
+Runs on the CPU.
-LLaMA
-LLaMA 2
-Alpaca
-GPT4All
-Chinese LLaMA / Alpaca
-Vigogne (French)
-Vicuna
-Koala
-OpenBuddy (Multilingual)
-Pygmalion / Metharme
-WizardLM
-Baichuan 1 & 2 + derivations
-Aquila 1 & 2
-Starcoder models
-Mistral AI v0.1
-Refact
-Persimmon 8B
-MPT
-Bloom
+Supported models:
+LLaMA, LLaMA 2, Mistral 7B, Mixtral MoE, Falcon, Chinese LLaMA /
+Alpaca and Chinese LLaMA-2 / Alpaca-2, Vigogne (French), Koala,
+Baichuan 1 & 2 + derivations, Aquila 1 & 2, Starcoder models, Refact,
+Persimmon 8B, MPT, Bloom, Yi models, StableLM models, Deepseek models,
+Qwen models, PLaMo-13B, Phi models, GPT-2, Orion 14B, InternLM2,
+CodeShell, Gemma
+Multimodal models:
+LLaVA 1.5 models, BakLLaVA, Obsidian, ShareGPT4V, MobileVLM 1.7B/3B
+models, Yi-VL
NOTE 1: You will need to:
@@ -65,7 +60,7 @@ NOTE 2:
For example, LLaMA downloaded via public torrent link is 220 GB.
-Overall this is all raw and experimental, no warranty, no support.
+Overall this is all raw and EXPERIMENTAL, no warranty, no support.
%prep
%setup
@@ -90,20 +85,50 @@ cp -rp grammars -t %buildroot%_datadir/%name
install -Dp examples/*.sh -t %buildroot%_datadir/%name/examples
# Install and rename binaries to have llama- prefix.
cd %_cmake__builddir/bin
-find -maxdepth 1 -type f -executable -printf '%f\0' |
+find -maxdepth 1 -type f -executable -not -name 'test-*' -printf '%f\0' |
xargs -0ti -n1 install -p {} %buildroot%_bindir/llama-{}
+mkdir -p %buildroot%_unitdir
+cat <<'EOF' >%buildroot%_unitdir/llama.service
+[Unit]
+Description=Llama.cpp server, CPU only (no GPU support in this build).
+After=syslog.target network.target local-fs.target remote-fs.target nss-lookup.target
+[Service]
+Type=simple
+DynamicUser=true
+EnvironmentFile=%_sysconfdir/sysconfig/llama
+ExecStart=%_bindir/llama-server $LLAMA_ARGS
+ExecReload=/bin/kill -HUP $MAINPID
+Restart=no
+[Install]
+WantedBy=default.target
+EOF
+mkdir -p %buildroot%_sysconfdir/sysconfig
+cat <<EOF > %buildroot%_sysconfdir/sysconfig/llama
+# Change to an accessible path with a model.
+LLAMA_ARGS="-m %_datadir/%name/ggml-model-f32.bin"
+EOF
%check
%cmake_build --target test
%files
%define _customdocdir %_docdir/%name
-%doc LICENSE README.md SHA256SUMS docs
+%doc LICENSE README.md docs
%_bindir/llama-*
%_bindir/convert-*.py
+%_unitdir/llama.service
+%_sysconfdir/sysconfig/llama
%_datadir/%name
%changelog
+* Mon Feb 26 2024 Vitaly Chikunov <vt@altlinux.org> 20240225-alt1
+- Update to b2259 (2024-02-25).
* Fri Oct 20 2023 Vitaly Chikunov <vt@altlinux.org> 20231019-alt1
- Update to b1400 (2023-10-19).
- Install experimental converters (convert- prefixed tools).
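
The find | xargs pipeline in the %install section above is terse. As an
illustrative sketch only, not part of the spec, an equivalent explicit
shell loop for the rename-and-install step would be:

# Sketch: equivalent of the find | xargs step in %install above.
# (Unlike find, the glob skips dotfiles; none are expected here.)
cd %_cmake__builddir/bin
for f in *; do
    # Keep only regular executable files, skipping test-* binaries.
    [ -f "$f" ] && [ -x "$f" ] || continue
    case "$f" in test-*) continue ;; esac
    # install -p preserves timestamps; the llama- prefix avoids name clashes.
    install -p "$f" %buildroot%_bindir/llama-"$f"
done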
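
Once the package is installed, the new llama.service unit reads its
arguments from %_sysconfdir/sysconfig/llama. A usage sketch, assuming a
model file has already been placed at a readable path (the path below is
a placeholder, not shipped by the package):

# Point the service at a real model file (placeholder path), then start it.
echo 'LLAMA_ARGS="-m /srv/llama/ggml-model-f32.bin"' > /etc/sysconfig/llama
systemctl daemon-reload
systemctl enable --now llama.service
# Check that llama-server came up.
systemctl status llama.service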