CANN: update docker images to 8.5.0 and improve CANN.md (#20801)
* cann: update docker images to 8.5.0

  - bump CANN base image from 8.3.rc2 to 8.5.0
  - bump ASCEND_VERSION from 8.1.RC1.alpha001 to 8.5.0

  Move to newer stable releases.

* cann: update CANN.md

* Update CANN.md to include BF16 support

  Added BF16 support information to the CANN documentation and corrected formatting for the installation instructions.

* Fix formatting issues in CANN.md

  Fix 234: Trailing whitespace
parent 1743d98057
commit 6861f6509a
5 changed files with 82 additions and 58 deletions
@@ -4,7 +4,7 @@
 
 # Define the CANN base image for easier version updates later
 ARG CHIP_TYPE=910b
-ARG CANN_BASE_IMAGE=quay.io/ascend/cann:8.3.rc2-${CHIP_TYPE}-openeuler24.03-py3.11
+ARG CANN_BASE_IMAGE=quay.io/ascend/cann:8.5.0-${CHIP_TYPE}-openeuler24.03-py3.11
 
 # ==============================================================================
 # BUILD STAGE
@@ -1,4 +1,4 @@
-ARG ASCEND_VERSION=8.1.RC1.alpha001-910b-openeuler22.03-py3.10
+ARG ASCEND_VERSION=8.5.0-910b-openeuler22.03-py3.10
 
 FROM ascendai/cann:$ASCEND_VERSION AS build
 
2 .github/workflows/build-cann.yml vendored
@@ -63,7 +63,7 @@ jobs:
 - name: Set container image
 id: cann-image
 run: |
-image="ascendai/cann:${{ matrix.chip_type == '910b' && '8.3.rc2-910b-openeuler24.03-py3.11' || '8.3.rc2-310p-openeuler24.03-py3.11' }}"
+image="ascendai/cann:${{ matrix.chip_type == '910b' && '8.5.0-910b-openeuler24.03-py3.11' || '8.5.0-310p-openeuler24.03-py3.11' }}"
 echo "image=${image}" >> "${GITHUB_OUTPUT}"
 
 - name: Pull container image
2 .github/workflows/release.yml vendored
@@ -907,7 +907,7 @@ jobs:
 - name: Set container image
 id: cann-image
 run: |
-image="ascendai/cann:${{ matrix.chip_type == '910b' && '8.3.rc2-910b-openeuler24.03-py3.11' || '8.3.rc2-310p-openeuler24.03-py3.11' }}"
+image="ascendai/cann:${{ matrix.chip_type == '910b' && '8.5.0-910b-openeuler24.03-py3.11' || '8.5.0-310p-openeuler24.03-py3.11' }}"
 echo "image=${image}" >> "${GITHUB_OUTPUT}"
 
 - name: Pull container image
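Both workflow hunks above pick the image tag from `matrix.chip_type` with the same inline expression. A minimal shell sketch of that selection follows. The function name `cann_image` and the `CANN_VERSION` variable are hypothetical and not part of the workflows; unlike the workflow expression, which falls back to the `310p` tag for any value other than `910b`, this sketch rejects unknown chip types.

```sh
# Hypothetical helper: build the CANN image reference for a chip type.
# CANN_VERSION is an illustrative override; it defaults to the version in this commit.
cann_image() {
    chip="${1:-910b}"
    version="${CANN_VERSION:-8.5.0}"
    case "$chip" in
        910b|310p)
            echo "ascendai/cann:${version}-${chip}-openeuler24.03-py3.11"
            ;;
        *)
            # The workflow would silently use the 310p tag here; this sketch fails instead.
            echo "unsupported chip type: $chip" >&2
            return 1
            ;;
    esac
}
```

For example, `docker pull "$(cann_image 310p)"` would pull the 310P image, assuming that tag is published.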
@@ -42,12 +42,22 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
 
 ### Ascend NPU
 
-**Verified devices**
+You can retrieve your Ascend device IDs using the following command:
 
-| Ascend NPU | Status |
-|:-----------------------------:|:-------:|
-| Atlas 300T A2 | Support |
-| Atlas 300I Duo | Support |
+```sh
+lspci -n | grep -Eo '19e5:d[0-9a-f]{3}' | cut -d: -f2
+```
+
+**Devices**
+
+| Device Id | Product Series | Product Models | Chip Model | Verified Status |
+|:---------:|----------------|----------------|:----------:|:---------------:|
+| d803 | Atlas A3 Train | | 910C | |
+| d803 | Atlas A3 Infer | | 910C | |
+| d802 | Atlas A2 Train | | 910B | |
+| d802 | Atlas A2 Infer | Atlas 300I A2 | 910B | Support |
+| d801 | Atlas Train | | 910 | |
+| d500 | Atlas Infer | Atlas 300I Duo | 310P | Support |
 
 *Notes:*
 
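The new Devices table in the hunk above pairs each PCI device ID with a chip model, so the output of the `lspci` command can be translated directly. A minimal sketch, assuming only the ID-to-chip pairs listed in that table; the function names `ascend_chip_model` and `map_ascend_ids` are hypothetical and do not appear in CANN.md.

```sh
# Hypothetical helper: map an Ascend PCI device ID to the chip model from the table.
ascend_chip_model() {
    case "$1" in
        d803) echo "910C" ;;
        d802) echo "910B" ;;
        d801) echo "910" ;;
        d500) echo "310P" ;;
        *)    echo "unknown" ;;
    esac
}

# Read device IDs from stdin, one per line, and print "id -> chip model".
map_ascend_ids() {
    while read -r id; do
        echo "$id -> $(ascend_chip_model "$id")"
    done
}
```

It could be fed from the documented command, for example `lspci -n | grep -Eo '19e5:d[0-9a-f]{3}' | cut -d: -f2 | sort -u | map_ascend_ids`.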
@@ -57,6 +67,9 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
 
 ## Model Supports
 
+<details>
+<summary>Text-only</summary>
+
 | Model Name | FP16 | Q4_0 | Q8_0 |
 |:----------------------------|:-----:|:----:|:----:|
 | Llama-2 | √ | √ | √ |
@@ -118,8 +131,11 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
 | Trillion-7B-preview | √ | √ | √ |
 | Ling models | √ | √ | √ |
 
+</details>
+
+<details>
+<summary>Multimodal</summary>
 
-**Multimodal**
 | Model Name | FP16 | Q4_0 | Q8_0 |
 |:----------------------------|:-----:|:----:|:----:|
 | LLaVA 1.5 models, LLaVA 1.6 models | x | x | x |
@@ -134,15 +150,22 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
 | GLM-EDGE | √ | √ | √ |
 | Qwen2-VL | √ | √ | √ |
 
+</details>
+
 
 
 ## DataType Supports
 
-| DataType | Status |
-|:----------------------:|:-------:|
-| FP16 | Support |
-| Q8_0 | Support |
-| Q4_0 | Support |
+| DataType | 910B | 310P |
+|:----------------------:|:-------:|:-------:|
+| FP16 | Support | Support |
+| Q8_0 | Support | Partial |
+| Q4_0 | Support | Partial |
+| BF16 | Support | |
+
+> **310P note**
+> - `Q8_0`: data transform / buffer path is implemented, and `GET_ROWS` is supported, but quantized `MUL_MAT` / `MUL_MAT_ID` are not supported.
+> - `Q4_0`: data transform / buffer path is implemented, but quantized `MUL_MAT` / `MUL_MAT_ID` are not supported.
 
 ## Docker
 
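The per-chip DataType table added in the hunk above can be read as a small lookup. A minimal sketch, assuming only the cells shown in that table; the function name `cann_dtype_status` is hypothetical, and the empty BF16 cell for 310P is reported as `Unlisted` rather than guessed.

```sh
# Hypothetical helper: report the documented status of a data type on a chip.
# Arguments: chip (910b or 310p) and data type (FP16, Q8_0, Q4_0, BF16).
cann_dtype_status() {
    case "$1:$2" in
        910b:FP16|910b:Q8_0|910b:Q4_0|910b:BF16) echo "Support" ;;
        310p:FP16)                               echo "Support" ;;
        # Partial: quantized MUL_MAT / MUL_MAT_ID are not supported on 310P.
        310p:Q8_0|310p:Q4_0)                     echo "Partial" ;;
        *)                                       echo "Unlisted" ;;
    esac
}
```

A wrapper script might call `cann_dtype_status 310p Q4_0` before choosing a quantized model for a 310P card.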
@@ -160,7 +183,20 @@ npu-smi info
 
 # Select the cards that you want to use, make sure these cards are not used by someone.
 # Following using cards of device0.
-docker run --name llamacpp --device /dev/davinci0 --device /dev/davinci_manager --device /dev/devmm_svm --device /dev/hisi_hdc -v /usr/local/dcmi:/usr/local/dcmi -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info -v /PATH_TO_YOUR_MODELS/:/app/models -it llama-cpp-cann -m /app/models/MODEL_PATH -ngl 32 -p "Building a website can be done in 10 simple steps:"
+docker run --name llamacpp \
+--device /dev/davinci0 \
+--device /dev/davinci_manager \
+--device /dev/devmm_svm \
+--device /dev/hisi_hdc \
+-v /usr/local/dcmi:/usr/local/dcmi \
+-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
+-v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
+-v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
+-v /PATH_TO_YOUR_MODELS/:/app/models \
+-it llama-cpp-cann \
+-m /app/models/MODEL_PATH \
+-ngl 32 \
+-p "Building a website can be done in 10 simple steps:"
 ```
 
 *Notes:*
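The command in the hunk above passes a single card, `/dev/davinci0`. A minimal sketch of generating the `--device` flags for several cards, assuming the other cards follow the same `/dev/davinciN` naming as device 0; the function name `davinci_device_flags` is hypothetical.

```sh
# Hypothetical helper: print one "--device /dev/davinciN" flag per card index given.
davinci_device_flags() {
    flags=""
    for idx in "$@"; do
        flags="$flags --device /dev/davinci$idx"
    done
    # Drop the leading space left by the first append.
    echo "${flags# }"
}
```

The output could replace the single `--device /dev/davinci0` line, for example `docker run --name llamacpp $(davinci_device_flags 0 1) ...`, left unquoted so the shell splits it into separate arguments.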
@@ -171,69 +207,57 @@ docker run --name llamacpp --device /dev/davinci0 --device /dev/davinci_manager
 
 ### I. Setup Environment
 
-1. **Install Ascend Driver and firmware**
+1. **Configure Ascend user and group**
 
 ```sh
-# create driver running user.
-sudo groupadd -g HwHiAiUser
+sudo groupadd HwHiAiUser
 sudo useradd -g HwHiAiUser -d /home/HwHiAiUser -m HwHiAiUser -s /bin/bash
 sudo usermod -aG HwHiAiUser $USER
-
-# download driver from https://www.hiascend.com/hardware/firmware-drivers/community according to your system
-# and install driver.
-sudo sh Ascend-hdk-910b-npu-driver_x.x.x_linux-{arch}.run --full --install-for-all
 ```
 
-Once installed, run `npu-smi info` to check whether driver is installed successfully.
+2. **Install dependencies**
+
+**Ubuntu/Debian:**
 ```sh
-+-------------------------------------------------------------------------------------------+
-| npu-smi 24.1.rc2 Version: 24.1.rc2 |
-+----------------------+---------------+----------------------------------------------------+
-| NPU Name | Health | Power(W) Temp(C) Hugepages-Usage(page)|
-| Chip | Bus-Id | AICore(%) Memory-Usage(MB) HBM-Usage(MB) |
-+======================+===============+====================================================+
-| 2 xxx | OK | 64.4 51 15 / 15 |
-| 0 | 0000:01:00.0 | 0 1873 / 15077 0 / 32768 |
-+======================+===============+====================================================+
-| 5 xxx | OK | 64.0 52 15 / 15 |
-| 0 | 0000:81:00.0 | 0 1874 / 15077 0 / 32768 |
-+======================+===============+====================================================+
-| No running processes found in NPU 2 |
-+======================+===============+====================================================+
-| No running processes found in NPU 5 |
-+======================+===============+====================================================+
+sudo apt-get update
+sudo apt-get install -y gcc python3 python3-pip linux-headers-$(uname -r)
 ```
 
-2. **Install Ascend Firmware**
+**RHEL/CentOS:**
 ```sh
-# download driver from https://www.hiascend.com/hardware/firmware-drivers/community according to your system
-# and install driver.
-sudo sh Ascend-hdk-910b-npu-firmware_x.x.x.x.X.run --full
+sudo yum makecache
+sudo yum install -y gcc python3 python3-pip kernel-headers-$(uname -r) kernel-devel-$(uname -r)
 ```
-If the following message appears, firmware is installed successfully.
+
+3. **Install CANN (driver + toolkit)**
+
+> The `Ascend-cann` package includes both the driver and toolkit.
+> `$ARCH` can be `x86_64` or `aarch64`, `$CHIP` can be `910b` or `310p`.
+
 ```sh
-Firmware package installed successfully!
+wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/CANN/CANN%208.5.T63/Ascend-cann_8.5.0_linux-$ARCH.run
+sudo bash ./Ascend-cann_8.5.0_linux-$ARCH.run --install
+
+wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/CANN/CANN%208.5.T63/Ascend-cann-$CHIP-ops_8.5.0_linux-$ARCH.run
+sudo bash ./Ascend-cann-$CHIP-ops_8.5.0_linux-$ARCH.run --install
 ```
 
+4. **Verify installation**
 
-3. **Install CANN toolkit and kernels**
-
-CANN toolkit and kernels can be obtained from the official [CANN Toolkit](https://www.hiascend.com/zh/developer/download/community/result?module=cann) page.
-
-Please download the corresponding version that satified your system. The minimum version required is 8.0.RC2.alpha002 and here is the install command.
 ```sh
-pip3 install attrs numpy decorator sympy cffi pyyaml pathlib2 psutil protobuf scipy requests absl-py wheel typing_extensions
-sh Ascend-cann-toolkit_8.0.RC2.alpha002_linux-aarch64.run --install
-sh Ascend-cann-kernels-910b_8.0.RC2.alpha002_linux.run --install
+npu-smi info
 ```
 
-Set Ascend Variables:
+If device information is displayed correctly, the driver is functioning properly.
+
 ```sh
-echo "source ~/Ascend/ascend-toolkit/set_env.sh" >> ~/.bashrc
-source ~/.bashrc
+# Set environment variables (adjust path if needed)
+source /usr/local/Ascend/cann/set_env.sh
+
+python3 -c "import acl; print(acl.get_soc_name())"
 ```
 
-Upon a successful installation, CANN is enabled for the available ascend devices.
+If the command outputs the chip model, the installation was successful.
 
 ### II. Build llama.cpp
 
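The new "Verify installation" step in the hunk above sources `set_env.sh` and calls `npu-smi`, and both fail with fairly terse errors when the install path differs. A minimal pre-check sketch, assuming the default path from the diff; the function name `check_cann_env` is hypothetical and is not part of CANN or llama.cpp.

```sh
# Hypothetical pre-check: confirm the CANN env script and npu-smi are reachable.
# Optional argument: path to set_env.sh (defaults to the path used in CANN.md).
check_cann_env() {
    env_script="${1:-/usr/local/Ascend/cann/set_env.sh}"
    if [ ! -f "$env_script" ]; then
        echo "missing env script: $env_script" >&2
        return 1
    fi
    if ! command -v npu-smi >/dev/null 2>&1; then
        echo "npu-smi not found on PATH" >&2
        return 1
    fi
    echo "ok: $env_script"
}
```

Running `check_cann_env && . /usr/local/Ascend/cann/set_env.sh` before the documented `python3 -c "import acl; ..."` check keeps a missing file from being mistaken for a broken install.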