Setup Guide

January 27, 2026

System Requirements

  • NVIDIA GPUs with Ampere architecture (RTX 30 Series, A100) or newer
  • NVIDIA driver >=570.124.06 compatible with CUDA 12.8.1
  • Linux x86-64
  • glibc>=2.35 (e.g. Ubuntu >=22.04)
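
The requirements above can be checked before installing. The following is a minimal sketch, not part of the repository: `version_ge` is a hypothetical helper that relies on GNU `sort -V`, and the driver check is skipped when `nvidia-smi` is not on the PATH. It does not check the GPU architecture.

```shell
# Hypothetical helper (not part of the repository): succeeds when version $1 >= version $2.
# Relies on GNU `sort -V` for version-aware ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Platform: the guide lists Linux x86-64.
echo "platform: $(uname -s) $(uname -m)"

# glibc: on glibc-based systems getconf prints e.g. "glibc 2.35".
glibc_version=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}') || glibc_version=""
if [ -n "$glibc_version" ] && version_ge "$glibc_version" 2.35; then
    echo "glibc: $glibc_version (ok)"
else
    echo "glibc: ${glibc_version:-unknown} (need >=2.35)"
fi

# NVIDIA driver: skipped when nvidia-smi is not on PATH.
if command -v nvidia-smi >/dev/null 2>&1; then
    driver_version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1) || driver_version=""
    if [ -n "$driver_version" ] && version_ge "$driver_version" 570.124.06; then
        echo "driver: $driver_version (ok)"
    else
        echo "driver: ${driver_version:-unknown} (need >=570.124.06)"
    fi
else
    echo "driver: nvidia-smi not found, skipping check"
fi
```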

Installation

Install git lfs:

sudo apt install git-lfs
git lfs install

Clone the repository:

git clone git@github.com:nvidia-cosmos/<repository_name>.git
cd <repository_name>
git lfs pull

Install one of the following environments:

Virtual Environment

Install system dependencies:

sudo apt update && sudo apt -y install curl ffmpeg libx11-dev tree wget
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env

Install the package into a new environment:

uv python install
uv sync --extra=cu128
source .venv/bin/activate

Or, install the package into the active environment (e.g. conda):

uv sync --extra=cu128 --active --inexact

CUDA Variants:

CUDA Version    Arguments        Notes
CUDA 12.8       --extra cu128    NVIDIA Driver
CUDA 13.0       --extra cu130    NVIDIA Driver

For DGX Spark and Jetson AGX, you must use CUDA 13.0.
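
One way to pick the extra in a script is sketched below. `select_cuda_extra` is a hypothetical helper, not part of the repository, and it assumes that an Arm host (`aarch64` from `uname -m`) is a DGX Spark or Jetson AGX; other Arm systems may need a different choice. The sketch only prints the resulting command rather than running it.

```shell
# Hypothetical helper (not part of the repository): map a machine architecture,
# as printed by `uname -m`, to one of the CUDA extras listed above.
select_cuda_extra() {
    case "$1" in
        aarch64|arm64) echo "cu130" ;;  # assumption: Arm host is a DGX Spark or Jetson AGX
        *)             echo "cu128" ;;  # default used elsewhere in this guide
    esac
}

extra=$(select_cuda_extra "$(uname -m)")
# Print the command instead of running it, so the choice can be reviewed first.
echo "uv sync --extra=$extra"
```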

Docker Container

Make sure Docker is available on your machine and that the NVIDIA Container Toolkit is installed.

Build the container using the Dockerfile that matches your GPU architecture (run only one of the two commands):

# Ampere - Hopper
image_tag=$(docker build -f Dockerfile -q .)
# Blackwell
image_tag=$(docker build -f docker/nightly.Dockerfile -q .)

Run the container:

docker run -it --runtime=nvidia --ipc=host --rm -v .:/workspace -v /workspace/.venv -v /root/.cache:/root/.cache -e HF_TOKEN="$HF_TOKEN" $image_tag

Optional arguments:

  • --ipc=host: Use the host system's shared memory, since parallel torchrun consumes a large amount of it. If your security policy does not allow this, increase --shm-size instead (see the Docker documentation).
  • -v /root/.cache:/root/.cache: Mount host cache to avoid re-downloading cache entries.
  • -e HF_TOKEN="$HF_TOKEN": Set Hugging Face token to avoid re-authenticating.
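
The choice between --ipc=host and --shm-size can be sketched as a small wrapper. `docker_run_cmd` is a hypothetical helper, not part of the repository; `my-image` and `16g` are placeholder values, and the optional cache mount and HF_TOKEN arguments are left out for brevity. The function only prints the command so it can be inspected before use.

```shell
# Hypothetical helper (not part of the repository): print a `docker run` command
# that uses --ipc=host by default, or --shm-size=<size> when a size is given.
docker_run_cmd() {
    image="$1"
    shm_size="${2:-}"
    if [ -n "$shm_size" ]; then
        shm_flag="--shm-size=$shm_size"
    else
        shm_flag="--ipc=host"
    fi
    echo "docker run -it --runtime=nvidia $shm_flag --rm -v .:/workspace -v /workspace/.venv $image"
}

# "my-image" and "16g" are placeholders; size the shared memory for your workload.
docker_run_cmd my-image 16g
```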

If you get the error "docker: Error response from daemon: unknown or invalid runtime name: nvidia", configure Docker to use the NVIDIA runtime:

sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

Downloading Checkpoints

  1. Get a Hugging Face Access Token with Read permission.
  2. Install Hugging Face CLI: uv tool install -U "huggingface_hub[cli]"
  3. Login: hf auth login
  4. Accept the NVIDIA Open Model License Agreement.

Checkpoints are automatically downloaded during inference and post-training. To modify the checkpoint cache location, set the HF_HOME environment variable.
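
As a minimal sketch of relocating the cache, the snippet below points HF_HOME at a new directory. The path `$HOME/hf_home` is a placeholder; any directory with enough free space should work. By default, the Hugging Face Hub stores downloaded models in the `hub` subdirectory of HF_HOME.

```shell
# Placeholder location (an assumption for illustration); pick a disk with enough free space.
export HF_HOME="$HOME/hf_home"
mkdir -p "$HF_HOME"

# By default, downloaded models are stored in the hub subdirectory of HF_HOME.
echo "Checkpoints will be cached under: $HF_HOME/hub"
```

To make the setting persistent, the export line can be added to your shell profile.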