Conda

November 11, 2025

Official PyTorch implementation of the paper:
“Conda: Column-Normalized Adam for Training Large Language Models Faster”


📥 Installation

git clone https://github.com/jie040109/Conda.git
cd Conda
pip install -e .
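
To confirm that the editable install is visible to the active interpreter, a small shell helper along these lines can be used. It is only a sketch and not part of the repository: `module_available` is a hypothetical function name, and the import name `conda_torch` is assumed from the package name mentioned in the Examples section.

```shell
# Hypothetical helper (not part of this repository).
# Returns 0 if the named top-level Python module can be located, 1 if not,
# and 2 if no Python interpreter is on PATH.
module_available() {
  local py
  py="$(command -v python || command -v python3)" || return 2
  # importlib.util.find_spec returns None when the module cannot be found.
  "$py" -c 'import importlib.util, sys; sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1"
}
```

For example, `module_available conda_torch && echo "conda_torch found"` should print the message once the install above has succeeded, assuming the import name matches.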

🚀 Examples

This repository includes two example training setups using conda_torch:

  • examples/gpt2/: GPT-2 pre-training on OpenWebText
  • examples/llama/: LLaMA pre-training on C4

Below are the exact steps to reproduce both examples.


✅ 1. LLaMA

Step 1: Install dependencies

cd examples/llama
conda create -n llama python=3.10
conda activate llama
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

Step 2: Prepare the C4 dataset

bash download_c4.sh 

Step 3: Conda for LLaMA pre-training

# llama-60m
bash scripts/llama_60m_conda.sh
# llama-130m
bash scripts/llama_130m_conda.sh
# llama-350m
bash scripts/llama_350m_conda.sh
# llama-1b
bash scripts/llama_1b_conda.sh
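
To train all four sizes back to back, the commands above can be wrapped in a loop. The following is a minimal sketch meant to be run from examples/llama; `run_llama_conda` and the `DRY_RUN` variable are hypothetical helpers introduced here, not part of the repository.

```shell
# Hypothetical helper (not part of this repository).
# Runs the four Conda LLaMA scripts in order of model size.
# With DRY_RUN=1 it only prints the commands it would run.
run_llama_conda() {
  local size script
  for size in 60m 130m 350m 1b; do
    script="scripts/llama_${size}_conda.sh"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "bash ${script}"
    else
      # Stop at the first failing run instead of starting the next size.
      bash "${script}" || return 1
    fi
  done
}
```

Calling `DRY_RUN=1 run_llama_conda` prints the four commands without launching anything, which is a cheap way to check the paths before committing GPU time.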

Step 4: Other optimizers for LLaMA pre-training

Scripts for alternative optimizers (AdamW, Muon, SOAP, Adafactor) are located in:

examples/llama/scripts/

Run them in the same way, e.g.:

bash scripts/llama_60m_muon.sh
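
The exact script name for each optimizer is easiest to read off the directory itself. A helper like the following can list what is available for a given model size. `list_scripts` is a hypothetical name, and the sketch assumes only the `llama_<size>_<optimizer>.sh` pattern visible in the commands above.

```shell
# Hypothetical helper (not part of this repository).
# Prints every .sh file in directory $1 whose name starts with prefix $2.
list_scripts() {
  local dir="$1" prefix="$2" f
  for f in "${dir}/${prefix}"*.sh; do
    # An unmatched glob stays literal, so skip paths that do not exist.
    if [ -e "$f" ]; then
      echo "$f"
    fi
  done
  return 0
}
```

From examples/llama, `list_scripts scripts llama_60m_` would then show every 60M script that ships with the repository, one path per line.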

✅ 2. GPT-2

Step 1: Install dependencies

cd examples/gpt2
conda env create -f environment.yml
conda activate gpt2

Step 2: Prepare the OpenWebText dataset

python data/openwebtext/prepare.py

Step 3: Conda for GPT-2 pre-training

# gpt2-125m
bash scripts/train_gpt2_125m_conda.sh
# gpt2-355m
bash scripts/train_gpt2_355m_conda.sh

Step 4: Other optimizers for GPT-2 pre-training

Scripts for alternative optimizers (AdamW, Muon, SOAP, Adafactor) are located in:

examples/gpt2/scripts/

Run them in the same way, e.g.:

bash scripts/train_gpt2_125m_muon.sh
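
When comparing optimizers, it can help to keep each run's output in its own file. The following is a minimal sketch that assumes nothing about the training scripts beyond being runnable with bash; `run_and_log` and `LOG_DIR` are hypothetical names introduced here, not part of the repository.

```shell
# Hypothetical helper (not part of this repository).
# Runs script $1 and writes its stdout and stderr to
# ${LOG_DIR:-logs}/<script name without .sh>.log.
# The function's exit status is the script's exit status.
run_and_log() {
  local script="$1"
  local log_dir="${LOG_DIR:-logs}"
  local log_file
  log_file="${log_dir}/$(basename "$script" .sh).log"
  mkdir -p "$log_dir" || return 1
  bash "$script" > "$log_file" 2>&1
}
```

For instance, `run_and_log scripts/train_gpt2_125m_muon.sh` would write to logs/train_gpt2_125m_muon.log, which can be followed from another terminal with `tail -f`.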