Installation
January 23, 2026 · View on GitHub
Requirements
The code is built and tested with Python 3.10, CUDA 12.8, and PyTorch 2.7.1.
Preparation
1. Clone the repository
git clone https://github.com/InternRobotics/InternVLA-A1.git
cd InternVLA-A1
2. Create Conda Environment
conda create -y -n internvla_a1 python=3.10
conda activate internvla_a1
pip install --upgrade pip
3. Install System Dependencies
We use FFmpeg for video encoding/decoding and SVT-AV1 for efficient storage.
conda install -c conda-forge ffmpeg=7.1.1 svt-av1 -y
4. Install PyTorch (CUDA 12.8)
pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 \
--index-url https://download.pytorch.org/whl/cu128
5. Install Python Dependencies
pip install torchcodec numpy scipy transformers==4.57.1 mediapy loguru pytest omegaconf
pip install -e .
6. Patch HuggingFace Transformers
We replace the default implementations of several model modules (e.g., π0, InternVLA_A1_3B, InternVLA_A1_2B) to support custom architectures for robot learning.
TRANSFORMERS_DIR=${CONDA_PREFIX}/lib/python3.10/site-packages/transformers/
cp -r src/lerobot/policies/pi0/transformers_replace/models ${TRANSFORMERS_DIR}
cp -r src/lerobot/policies/InternVLA_A1_3B/transformers_replace/models ${TRANSFORMERS_DIR}
cp -r src/lerobot/policies/InternVLA_A1_2B/transformers_replace/models ${TRANSFORMERS_DIR}
Make sure the target directory exists—otherwise create it manually.
7. Configure Environment Variables
export HF_TOKEN=your_token # for downloading hf models, tokenizers, or processors
export HF_HOME=path_to_huggingface # default: ~/.cache/huggingface
8. Link Local HuggingFace Cache
ln -s ${HF_HOME}/lerobot data
This allows the repo to access datasets via ./data/.