Installation

February 25, 2026

This page explains how to install yzma.

First, install the yzma command line tool. Then use the yzma command to install the llama.cpp libraries on your machine.

Once you have installed the llama.cpp libraries, you can run your Go programs that use yzma. See the examples directory.

You can also use the yzma command to download models on your machine. See the MODELS.md page for information.

Install `yzma` command

The first step is to install the yzma command line tool. You can then use it to install the llama.cpp libraries for your platform.

go install github.com/hybridgroup/yzma@latest
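If the shell cannot find the yzma command after this step, the Go binary directory is probably not on your PATH. The following is a minimal sketch for checking that. It assumes the standard Go behavior of installing into GOBIN when it is set and into the bin directory under GOPATH (default $HOME/go) otherwise, and it assumes GOPATH holds a single directory. The helper names go_bin_dir and path_contains are illustrative and not part of yzma.

```shell
# Print the directory where `go install` places binaries.
# Assumes GOPATH, when set, contains a single directory.
go_bin_dir() {
  if [ -n "${GOBIN:-}" ]; then
    printf '%s\n' "$GOBIN"
  else
    printf '%s\n' "${GOPATH:-$HOME/go}/bin"
  fi
}

# Return 0 if the directory given as $1 is an entry in PATH.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if path_contains "$(go_bin_dir)"; then
  echo "Go bin directory is on PATH"
else
  echo "Add this to your shell profile: export PATH=\"\$PATH:$(go_bin_dir)\""
fi
```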

For more info, see the yzma command documentation.

Install `llama.cpp` libraries

Now, using the yzma command, you can install the llama.cpp libraries. Follow the instructions for your system:

macOS
Linux - CPU
Linux - CUDA
Linux - ROCm
Linux - Vulkan
Arduino UNO Q
NVIDIA Jetson Orin
Raspberry Pi 4/5
Windows - CPU
Windows - CUDA
Windows - Vulkan
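If you are unsure which Linux entry applies to your machine, one rough heuristic is to check which vendor tools are already installed. The sketch below is not yzma's own detection logic. It only assumes that nvidia-smi ships with the NVIDIA drivers, rocminfo with ROCm, and vulkaninfo with the vulkan-tools package. The function name detect_processor is hypothetical, and the script prints a suggested command rather than running it.

```shell
# Print a --processor value based on which vendor tools are on PATH.
# Heuristic only: a tool being installed does not prove the GPU works.
detect_processor() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    echo cuda
  elif command -v rocminfo >/dev/null 2>&1; then
    echo rocm
  elif command -v vulkaninfo >/dev/null 2>&1; then
    echo vulkan
  else
    echo cpu
  fi
}

# Print the command instead of running it so it can be reviewed first.
echo "yzma install --lib /path/to/lib --processor $(detect_processor)"
```

Treat the result as a starting point and confirm it against the section for your platform.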

macOS

Decide where you want to put the files for your local installation, then run the following command:

yzma install --lib /path/to/lib

To complete your installation, follow any instructions specific to your operating system that the yzma install command prints.

Now try running one of the example programs!

Linux CPU

Decide where you want to put the files for your local installation, then run the following command:

yzma install --lib /path/to/lib

To complete your installation, follow any instructions specific to your operating system that the yzma install command prints.

Linux CUDA

If you want to use a GPU with CUDA on a Linux machine, you will need to install the CUDA drivers.

See https://docs.nvidia.com/cuda/cuda-installation-guide-linux/

Once that is complete, decide where you want to put the files for your local installation, then run the following command:

yzma install --lib /path/to/lib --processor cuda

To complete your installation, follow any instructions specific to your operating system that the yzma install command prints.

Linux ROCm

If you want to use an AMD GPU with ROCm on a Linux machine, you will need to install the ROCm 7.2 drivers and runtime.

Prerequisites

An AMD GPU listed in AMD's supported GPUs table (such as AMD Instinct or supported Radeon GPUs)
A ROCm 7.2 supported Linux distribution (Ubuntu 24.04 and 22.04 are the most common choices)
A compatible AMDGPU kernel driver — see AMD's driver installation instructions if your system does not already have one

Install ROCm 7.2

Ubuntu 24.04

wget https://repo.radeon.com/amdgpu-install/7.2/ubuntu/noble/amdgpu-install_7.2.70200-1_all.deb
sudo apt install ./amdgpu-install_7.2.70200-1_all.deb
sudo apt update
sudo apt install python3-setuptools python3-wheel
sudo usermod -a -G render,video $LOGNAME
sudo apt install rocm

Ubuntu 22.04

wget https://repo.radeon.com/amdgpu-install/7.2/ubuntu/jammy/amdgpu-install_7.2.70200-1_all.deb
sudo apt install ./amdgpu-install_7.2.70200-1_all.deb
sudo apt update
sudo apt install python3-setuptools python3-wheel
sudo usermod -a -G render,video $LOGNAME
sudo apt install rocm

Reboot your system after installing ROCm to apply all settings (the render and video group membership requires at least a log out/in to take effect).

You can verify the installation by running:

rocminfo
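Because the render and video group membership only applies to sessions started after the usermod command, it can also help to confirm that your current session has picked it up. This is a minimal sketch that uses only the standard id utility. The helper name has_group is illustrative.

```shell
# Return 0 if group $1 appears in the space-separated group list $2.
has_group() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Groups of the current session, as names.
groups_now=$(id -nG)

for g in render video; do
  if has_group "$g" "$groups_now"; then
    echo "ok: current session is in group $g"
  else
    echo "missing: $g (log out and back in, or reboot)"
  fi
done
```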

For other supported Linux distributions, see the ROCm installation guide.

Install yzma with ROCm

Once ROCm is installed, decide where you want to put the files for your local installation, then run the following command:

yzma install --lib /path/to/lib --processor rocm

Note: if ROCm is already installed, yzma can auto-detect it, so you can simply run:

yzma install --lib /path/to/lib

To complete your installation, follow any instructions specific to your operating system that the yzma install command prints.

Linux Vulkan

To use Vulkan on your Linux system, you will also need to install the Vulkan drivers. For example, on Debian or Ubuntu:

sudo apt install -y mesa-vulkan-drivers vulkan-tools

Once that is complete, decide where you want to put the files for your local installation, then run the following command:

yzma install --lib /path/to/lib --processor vulkan

To complete your installation, follow any instructions specific to your operating system that the yzma install command prints.

NVIDIA Jetson Orin

To use the GPU on your NVIDIA Jetson Orin, you should install the latest version of the JetPack software for your device.

CUDA

Decide where you want to put the files for your local installation, then run the following command:

yzma install --lib /path/to/lib --processor cuda

To complete your installation, follow any instructions specific to your operating system that the yzma install command prints.

Vulkan

To use Vulkan with the GPU on your Jetson Orin, you will also need to update the GNU C++ standard library (libstdc++6):

sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install --only-upgrade libstdc++6

Once that is complete, decide where you want to put the files for your local installation, then run the following command:

yzma install --lib /path/to/lib --processor vulkan

To complete your installation, follow any instructions specific to your operating system that the yzma install command prints.

Raspberry Pi

You can run yzma on a Raspberry Pi 4 or 5.

Raspberry Pi OS (64-bit)

If you are running the latest version of Raspberry Pi OS, decide where you want to put the files for your local installation, then run the following command:

yzma install --lib /path/to/lib --processor cpu --os trixie

To complete your installation, follow any instructions specific to your operating system that the yzma install command prints.

Raspberry Pi OS (Legacy, 64-bit)

If you are running an older version of Raspberry Pi OS, decide where you want to put the files for your local installation, then run the following command:

yzma install --lib /path/to/lib --processor cpu --os bookworm

To complete your installation, follow any instructions specific to your operating system that the yzma install command prints.
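If you are not sure which Raspberry Pi OS release you are running, the VERSION_CODENAME field in /etc/os-release usually tells you. The sketch below reads that field and prints the matching command from the two variants above. It assumes the field holds trixie or bookworm on these releases. The helper name os_codename is illustrative and not part of yzma.

```shell
# Print the VERSION_CODENAME value from an os-release file.
# $1 is the file to read (default: /etc/os-release).
os_codename() {
  file=${1:-/etc/os-release}
  [ -r "$file" ] || return 1
  sed -n 's/^VERSION_CODENAME=//p' "$file" | tr -d '"'
}

codename=$(os_codename) || codename=""

# Print the suggested command rather than running it.
case "$codename" in
  trixie|bookworm)
    echo "yzma install --lib /path/to/lib --processor cpu --os $codename" ;;
  *)
    echo "Unrecognized codename '$codename'; choose --os trixie or --os bookworm manually" ;;
esac
```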

Arduino UNO Q

You can run yzma on an Arduino UNO Q board.

yzma install --lib /path/to/lib --processor cpu --os trixie

Windows CPU

Decide where you want to put the files for your local installation, then run the command that matches your hardware.

If you have an NVIDIA card, use:

yzma install --lib /path/to/lib --processor cuda

If you have an AMD card, use:

yzma install --lib /path/to/lib --processor rocm

To complete your installation, follow any instructions specific to your operating system that the yzma install command prints.

Windows CUDA

If you want to use an NVIDIA GPU with CUDA on your Windows machine, you will need to install the CUDA drivers.

See https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/

Decide where you want to put the files for your local installation, then run the following command:

yzma install --lib /path/to/lib --processor cuda

To complete your installation, follow any instructions specific to your operating system that the yzma install command prints.

Windows Vulkan

To use Vulkan on Windows, you will need to install the Vulkan SDK.

See https://vulkan.lunarg.com/doc/sdk/latest/windows/getting_started.html

Decide where you want to put the files for your local installation, then run the following command:

yzma install --lib /path/to/lib --processor vulkan

To complete your installation, follow any instructions specific to your operating system that the yzma install command prints.

Next steps

Now the installation is complete. Try running one of the example programs!

Manual installation

If you prefer a manual installation, you can obtain most of the prebuilt llama.cpp binaries from here:

https://github.com/ggml-org/llama.cpp/releases

Binaries for Ubuntu on arm64 with CUDA and Vulkan are also available here:

https://github.com/hybridgroup/llama-cpp-builder/releases

Installing the prebuilt binaries (manual)

If you do not use the yzma installer, you must download and extract the library files into a directory on your local machine.

Linux

On Linux, the llama.cpp binaries have the .so file extension. For example, libllama.so, libmtmd.so and so on.

Important Note: You currently need to set the YZMA_LIB environment variable to the directory that contains your llama.cpp library files. For example:

export YZMA_LIB=/home/ron/Development/yzma/lib

macOS

On macOS, the llama.cpp binaries have the .dylib file extension. For example, libllama.dylib, libmtmd.dylib and so on. You do not need the other downloaded files to use the llama.cpp libraries with yzma.

Important Note: You currently need to set the YZMA_LIB environment variable to the directory that contains your llama.cpp library files. For example:

export YZMA_LIB=/home/ron/Development/yzma/lib
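On Linux and macOS, a quick check can confirm that YZMA_LIB points at a directory that actually holds the llama library before you run a Go program. This is a minimal sketch that looks only for the libllama file named above and picks the extension from uname. The helper name has_llama_lib is illustrative and not part of yzma.

```shell
# Return 0 if directory $1 contains libllama with extension $2 (so or dylib).
has_llama_lib() {
  [ -f "$1/libllama.$2" ]
}

# macOS reports Darwin; everything else is treated as Linux here.
case "$(uname -s)" in
  Darwin) ext=dylib ;;
  *) ext=so ;;
esac

if [ -z "${YZMA_LIB:-}" ]; then
  echo "YZMA_LIB is not set"
elif has_llama_lib "$YZMA_LIB" "$ext"; then
  echo "found libllama.$ext in $YZMA_LIB"
else
  echo "libllama.$ext not found in $YZMA_LIB"
fi
```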

Windows

On Windows, the llama.cpp binaries have the .dll file extension. For example, llama.dll, mtmd.dll and so on.

You will also need to download the cudart files from the same location as the other llama.cpp libraries when using CUDA on Windows.

Important Note: You currently need to set the YZMA_LIB environment variable to the directory that contains your llama.cpp library files. For example:

set YZMA_LIB=C:\yzma\lib

Programmatic Installation

Want to use Go code to install the llama.cpp precompiled binaries from within your own application? We have the download package for that!

Check out the installer example code.