SLM Lab

March 4, 2026

Modular Deep Reinforcement Learning framework in PyTorch.
Companion library of the book Foundations of Deep Reinforcement Learning.
Documentation · Benchmark Results

NOTE: v5.0 migrates to Gymnasium and uv tooling, and modernizes dependencies with ARM support - see CHANGELOG.md.

Book readers: run `git checkout v4.1.1` to get the code matching Foundations of Deep Reinforcement Learning.

[Demo GIFs: PPO on Atari (BeamRider, Breakout, KungFuMaster, MsPacman, Pong, Qbert, Seaquest, SpaceInvaders) and SAC on MuJoCo (Ant, HalfCheetah, Hopper, Humanoid, InvertedDoublePendulum, InvertedPendulum, Reacher, Walker)]

SLM Lab is a software framework for reinforcement learning (RL) research and application in PyTorch. RL trains agents to make decisions by learning from trial and error—like teaching a robot to walk or an AI to play games.

What SLM Lab Offers

| Feature | Description |
| --- | --- |
| Ready-to-use algorithms | PPO, SAC, CrossQ, DQN, A2C, REINFORCE, validated on 70+ environments |
| Easy configuration | JSON spec files fully define experiments, no code changes needed |
| Reproducibility | Every run saves its spec + git SHA for exact reproduction |
| Automatic analysis | Training curves, metrics, and TensorBoard logging out of the box |
| Cloud integration | dstack for GPU training, HuggingFace for sharing results |

Algorithms

| Algorithm | Type | Best For | Validated Environments |
| --- | --- | --- | --- |
| REINFORCE | On-policy | Learning/teaching | Classic |
| SARSA | On-policy | Tabular-like | Classic |
| DQN/DDQN+PER | Off-policy | Discrete actions | Classic, Box2D, Atari |
| A2C | On-policy | Fast iteration | Classic, Box2D, Atari |
| PPO | On-policy | General purpose | Classic, Box2D, MuJoCo (11), Atari (54) |
| SAC | Off-policy | Continuous control | Classic, Box2D, MuJoCo |
| CrossQ | Off-policy | Sample-efficient control | Classic, Box2D, MuJoCo |

See Benchmark Results for detailed performance data.

Environments

SLM Lab uses Gymnasium (the maintained fork of OpenAI Gym):

| Category | Examples | Difficulty | Docs |
| --- | --- | --- | --- |
| Classic Control | CartPole, Pendulum, Acrobot | Easy | Gymnasium Classic |
| Box2D | LunarLander, BipedalWalker | Medium | Gymnasium Box2D |
| MuJoCo | Hopper, HalfCheetah, Humanoid | Hard | Gymnasium MuJoCo |
| Atari | Breakout, MsPacman, and 54 more | Varied | ALE |

Any gymnasium-compatible environment works—just specify its name in the spec.
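To give a sense of what a spec file looks like, here is a hypothetical fragment for illustration only: the top-level key names the experiment, an agent entry selects the algorithm, and the env entry names a Gymnasium environment. The exact keys and values are assumptions, so consult the demo specs shipped with the repository for the authoritative schema.

```json
{
  "ppo_cartpole": {
    "agent": [{
      "name": "PPO"
    }],
    "env": [{
      "name": "CartPole-v1"
    }]
  }
}
```

Swapping `"CartPole-v1"` for any other registered Gymnasium environment ID is, per the note above, all it takes to retarget the experiment.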

Quick Start

# Install
uv sync
uv tool install --editable .

# Run demo (PPO CartPole)
slm-lab run                                    # PPO CartPole
slm-lab run --render                           # with visualization

# Run custom experiment
slm-lab run spec.json spec_name train          # local training
slm-lab run-remote spec.json spec_name train   # cloud training (dstack)

# Help (CLI uses Typer)
slm-lab --help                                 # list all commands
slm-lab run --help                             # options for run command

# Troubleshoot: if slm-lab not found, use uv run
uv run slm-lab run

Cloud Training (dstack)

Run experiments on cloud GPUs with automatic result sync to HuggingFace.

# Setup
cp .env.example .env  # Add HF_TOKEN
uv tool install dstack  # Install dstack CLI
# Configure dstack server - see https://dstack.ai/docs/quickstart

# Run on cloud
slm-lab run-remote spec.json spec_name train           # CPU training (default)
slm-lab run-remote spec.json spec_name search          # CPU ASHA search (default)
slm-lab run-remote --gpu spec.json spec_name train     # GPU training (for image envs)

# Sync results
slm-lab pull spec_name    # Download from HuggingFace
slm-lab list              # List available experiments

Config options in .dstack/: run-gpu-train.yml, run-gpu-search.yml, run-cpu-train.yml, run-cpu-search.yml

Minimal Install (Orchestration Only)

For a lightweight box that only dispatches dstack runs, syncs results, and generates plots (no local ML training):

uv sync --no-default-groups  # skip ML deps (torch, gymnasium, etc.)
uv tool install dstack
uv run --no-default-groups slm-lab run-remote spec.json spec_name train
uv run --no-default-groups slm-lab pull spec_name
uv run --no-default-groups slm-lab plot -f folder1,folder2

Citation

If you use SLM Lab in your research, please cite:

@misc{kenggraesser2017slmlab,
    author = {Keng, Wah Loon and Graesser, Laura},
    title = {SLM Lab},
    year = {2017},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/kengz/SLM-Lab}},
}

License

MIT