Conda
November 11, 2025
Official PyTorch implementation of the paper:
"Conda: Column-Normalized Adam for Training Large Language Models Faster"
Installation
git clone https://github.com/jie040109/Conda.git
cd Conda
pip install -e .
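A quick sanity check after installation can catch a wrong environment early. The sketch below assumes the package is importable as `conda_torch`, the name used in the Examples section; the helper name `check_conda_torch` is hypothetical and not part of the repository.

```shell
# Assumption: the editable install exposes a Python module named conda_torch.
# check_conda_torch is a hypothetical helper, not shipped with the repository.
check_conda_torch() {
  if python -c "import conda_torch" 2>/dev/null; then
    echo "conda_torch is importable"
  else
    echo "conda_torch not found; re-run 'pip install -e .' from the repository root"
  fi
}

check_conda_torch
```

If the check fails, confirm that the intended environment is active before reinstalling.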
Examples
This repository includes two example training setups using conda_torch:
- examples/gpt2/: GPT-2 pre-training on OpenWebText
- examples/llama/: LLaMA pre-training on C4
Below are the exact steps to reproduce both examples.
1. LLaMA
Step 1: Install dependencies
cd examples/llama
conda create -n llama python=3.10
conda activate llama
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
Step 2: Prepare the C4 dataset
bash download_c4.sh
Step 3: Conda for LLaMA pre-training
# llama-60m
bash scripts/llama_60m_conda.sh
# llama-130m
bash scripts/llama_130m_conda.sh
# llama-350m
bash scripts/llama_350m_conda.sh
# llama-1b
bash scripts/llama_1b_conda.sh
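To run all four model sizes back to back, the commands above can be wrapped in a loop. This is a minimal sketch, meant to be run from examples/llama; the helper name `llama_conda_scripts` and the `RUN` variable are illustrative and not part of the repository. By default it only prints the commands.

```shell
# Script paths are the four listed above; the helper name is hypothetical.
llama_conda_scripts() {
  for size in 60m 130m 350m 1b; do
    echo "scripts/llama_${size}_conda.sh"
  done
}

# Dry run by default: print each command. Set RUN=1 to launch training.
for script in $(llama_conda_scripts); do
  if [ "${RUN:-0}" = "1" ]; then
    bash "$script" || break   # stop at the first failed run
  else
    echo "bash $script"
  fi
done
```

Invoking it with `RUN=1` launches the runs sequentially, smallest model first.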
Step 4: Other optimizers for LLaMA pre-training
Scripts for alternative optimizers (AdamW, Muon, SOAP, Adafactor) are located in:
examples/llama/scripts/
Run them the same way, e.g.:
bash scripts/llama_60m_muon.sh
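When comparing optimizers at one model size, it helps to check which scripts are present before launching anything. The sketch below assumes the scripts follow the pattern `scripts/llama_<size>_<optimizer>.sh` with lower-case optimizer names, inferred from `llama_60m_conda.sh` and `llama_60m_muon.sh`; the other file names are not confirmed here, so the loop only reports what it finds.

```shell
# Assumption: script names follow scripts/llama_<size>_<optimizer>.sh.
# list_optimizer_scripts is a hypothetical helper, not part of the repository.
list_optimizer_scripts() {
  size="$1"
  for opt in conda adamw muon soap adafactor; do
    echo "scripts/llama_${size}_${opt}.sh"
  done
}

# Run from examples/llama. Reports which scripts exist for the 60m model;
# uncomment the bash line to launch each one that is found.
for script in $(list_optimizer_scripts 60m); do
  if [ -f "$script" ]; then
    echo "found:   $script"
    # bash "$script"
  else
    echo "missing: $script"
  fi
done
```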
2. GPT-2
Step 1: Install dependencies
cd examples/gpt2
conda env create -f environment.yml
conda activate gpt2
Step 2: Prepare the OpenWebText dataset
python data/openwebtext/prepare.py
Step 3: Conda for GPT-2 pre-training
# gpt2-125m
bash scripts/train_gpt2_125m_conda.sh
# gpt2-355m
bash scripts/train_gpt2_355m_conda.sh
Step 4: Other optimizers for GPT-2 pre-training
Scripts for alternative optimizers (AdamW, Muon, SOAP, Adafactor) are located in:
examples/gpt2/scripts/
Run them the same way, e.g.:
bash scripts/train_gpt2_125m_muon.sh