README.md
June 12, 2026
Flow Map Language Models:
One-step Language Modeling via Continuous Denoising
Chanhyuk Lee¹, Jaehoon Yoo¹, Manan Agarwal², Sheel Shah², Jerry Huang²,
Aditi Raghunathan², Seunghoon Hong¹, Nicholas M. Boffi†², Jinwoo Kim†¹
¹KAIST ²Carnegie Mellon University †Equal advising
News
- [2026-05] Added Hugging Face links for the checkpoints.
- [2026-04] We released LM1B/OpenWebText checkpoints for FLM and FMLM.
TL;DR
We introduce the Flow Language Model (FLM) and its flow-map distilled variant, the Flow Map Language Model (FMLM), enabling one-step parallel text generation through continuous denoising.
Overview
FLM brings the benefits of continuous image generation to discrete state spaces: it encodes text as one-hot vectors and uses flow matching to map noise directly to one-hot data. Unlike discrete diffusion, FLM gradually denoises all tokens in parallel with a deterministic sample-level ODE. This lets it represent a superposition of sequences and avoid per-token ancestral sampling, a fundamental bottleneck for discrete diffusion in the few-step regime. We extend this to FMLM, which learns the flow map, i.e., the direct solution operator of the flow, enabling one-step parallel language generation.
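To make the mechanics concrete, here is a toy sketch in plain Python. It assumes a linear interpolant x_t = (1 - t) x_0 + t x_1 with noise at t = 0 and one-hot data at t = 1, and it replaces the trained network with an oracle denoiser that already knows the clean sequence. Both choices are illustrative assumptions rather than the repository's implementation, and every function name is hypothetical.

```python
import random


def one_hot(token_id, vocab_size):
    """Encode one token as a one-hot vector."""
    return [1.0 if i == token_id else 0.0 for i in range(vocab_size)]


def interpolate(x0, x1, t):
    """Linear interpolant x_t = (1 - t) * x0 + t * x1 (assumed convention)."""
    return [[(1 - t) * a + t * b for a, b in zip(r0, r1)]
            for r0, r1 in zip(x0, x1)]


def decode(x):
    """Read out tokens by taking the argmax at every position."""
    return [max(range(len(row)), key=row.__getitem__) for row in x]


def euler_sample(x0, denoiser, num_steps):
    """Integrate the sample-level ODE from t=0 (noise) to t=1 (data).

    All positions are updated together at every step, so there is no
    per-token sampling. Under the linear interpolant, a denoiser estimate
    x1_hat implies the velocity v = (x1_hat - x_t) / (1 - t).
    """
    x = [row[:] for row in x0]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t < 1 at every step, so (1 - t) is never zero
        x1_hat = denoiser(x, t)
        x = [[xi + dt * (di - xi) / (1 - t) for xi, di in zip(rx, rd)]
             for rx, rd in zip(x, x1_hat)]
    return x


random.seed(0)
vocab_size, tokens = 5, [2, 0, 3]
x1 = [one_hot(tok, vocab_size) for tok in tokens]           # clean data
x0 = [[random.gauss(0.0, 1.0) for _ in range(vocab_size)]   # Gaussian noise
      for _ in tokens]

oracle = lambda x, t: x1  # stand-in for a trained denoiser (assumption)

few_step = decode(euler_sample(x0, oracle, num_steps=8))
one_step = decode(euler_sample(x0, oracle, num_steps=1))
print(few_step, one_step)
```

With the oracle, both runs decode to the input tokens, which is trivially true in this toy. A trained denoiser queried at pure noise can only return an average over plausible sequences, so a single Euler step would not yield a sharp sample; this is the gap that learning the flow map is meant to close.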
How to Run
Install Dependencies
pip install "torch>=2.3.0"
pip install -r requirements.txt
# Install flash-attn separately matching your python / torch version (see https://github.com/Dao-AILab/flash-attention/releases)
pip install flash-attn==2.8.3 --no-build-isolation
Our DiT backbone supports torch.compile with max-autotune for faster training. Enable it by setting the environment variable before running any script:
export DIT_USE_COMPILE=TRUE
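How the code base reads this variable is not shown in this README. As a rough sketch, an environment flag of this kind is often consumed as below. The helper names and the exact string comparison are assumptions; `torch.compile(..., mode="max-autotune")` is the standard PyTorch call.

```python
import os


def compile_enabled(env=None):
    """Return True when DIT_USE_COMPILE is set to TRUE (case-insensitive here)."""
    env = os.environ if env is None else env
    return env.get("DIT_USE_COMPILE", "").strip().upper() == "TRUE"


def maybe_compile(module, env=None):
    """Wrap a module with torch.compile only when the flag is on."""
    if not compile_enabled(env):
        return module
    import torch  # imported lazily so the flag check works without PyTorch

    return torch.compile(module, mode="max-autotune")
```

Because the flag is read from the environment, it has to be exported in the same shell that launches the training or evaluation script.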
With DIT_USE_COMPILE enabled, we train the OpenWebText experiments with a global batch size of 512 on 8 H100 GPUs (80 GB VRAM each), using a local batch size of 32.
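These numbers imply more than one micro-batch per optimizer step: 8 GPUs times a local batch of 32 gives 256 sequences, half of 512. The arithmetic can be sketched as follows, assuming that "local batch size" means the per-GPU micro-batch and that the difference is covered by gradient accumulation (this README does not say so explicitly). The function name is hypothetical.

```python
def grad_accum_steps(global_batch, num_gpus, local_batch):
    """Micro-batches each GPU must accumulate to reach the global batch size."""
    per_step = num_gpus * local_batch  # sequences in one forward/backward pass
    if global_batch % per_step != 0:
        raise ValueError("global batch must be a multiple of num_gpus * local_batch")
    return global_batch // per_step


steps = grad_accum_steps(global_batch=512, num_gpus=8, local_batch=32)
print(steps)  # 2
```

The same function shows how to adapt to fewer GPUs: keeping the global batch at 512 with 4 GPUs and a local batch of 32 would need 4 accumulation steps.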
Training
Before running, update data.cache_dir in the scripts to point to your dataset location. If the directory is empty, the dataset will be automatically downloaded and preprocessed.
Set algo.teacher_path to your pre-trained FLM checkpoint before running FMLM distillation.
| Model | Dataset | Script |
|---|---|---|
| FLM | LM1B | scripts/train_lm1b_flm.sh |
| FMLM | LM1B | scripts/train_lm1b_fmlm_denoiser.sh |
| FLM | OpenWebText | scripts/train_owt_flm.sh |
| FMLM | OpenWebText | scripts/train_owt_fmlm_denoiser.sh |
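Both settings are easy to get wrong before a long run. The pre-flight check below is a hypothetical helper, not part of the repository; it only mirrors the two rules stated above (an empty cache directory triggers a download, and FMLM distillation needs an existing FLM checkpoint).

```python
from pathlib import Path


def preflight(cache_dir, teacher_path=None):
    """Return human-readable notes about a planned training run.

    cache_dir    : value you intend to use for data.cache_dir
    teacher_path : value for algo.teacher_path (FMLM distillation only)
    """
    notes = []
    cache = Path(cache_dir)
    if not cache.is_dir():
        notes.append(f"{cache} does not exist yet")
    elif not any(cache.iterdir()):
        notes.append(f"{cache} is empty: the dataset will be downloaded and preprocessed")
    if teacher_path is not None and not Path(teacher_path).is_file():
        notes.append(f"teacher checkpoint not found: {teacher_path}")
    return notes
```

An empty list means nothing was flagged; for FLM training, leave `teacher_path` as `None`.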
Evaluation
Set CKPT_PATH in the script to your trained checkpoint before running.
| Model | Dataset | Script |
|---|---|---|
| FLM | LM1B | scripts/gen_ppl_lm1b_flm.sh |
| FMLM | LM1B | scripts/gen_ppl_lm1b_fmlm.sh |
| FLM | OpenWebText | scripts/gen_ppl_owt_flm.sh |
| FMLM | OpenWebText | scripts/gen_ppl_owt_fmlm.sh |
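The script names suggest that these runs report generative perplexity, i.e. how likely the generated text is under a separate reference language model. Assuming the usual definition (the reference model and the number of samples are set in the scripts and are not stated here), the metric is an exponentiated mean negative log-likelihood. The function below is an illustrative sketch that takes per-token log-probabilities as plain floats; its name and input layout are assumptions.

```python
import math


def generative_perplexity(token_logprobs):
    """exp of the mean negative log-likelihood over all generated tokens.

    token_logprobs: one list per generated sample, holding the natural-log
    probability that the reference model assigns to each token.
    """
    flat = [lp for sample in token_logprobs for lp in sample]
    if not flat:
        raise ValueError("need at least one token")
    return math.exp(-sum(flat) / len(flat))


# Two toy samples where every token has probability 1/4 under the reference model.
samples = [[math.log(0.25)] * 3, [math.log(0.25)] * 5]
ppl = generative_perplexity(samples)
print(round(ppl, 6))  # 4.0
```

Values are only comparable between runs that use the same reference model and tokenization.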
Checkpoints
Pretrained Checkpoints
Pretrained FLM and FMLM checkpoints are available on Google Drive or Hugging Face.
| Model | Dataset | Checkpoint |
|---|---|---|
| FLM | LM1B | lm1b_flm.ckpt |
| FMLM | LM1B | lm1b_fmlm.ckpt |
| FLM | OpenWebText | owt_flm.ckpt |
| FMLM | OpenWebText | owt_fmlm.ckpt |
Set eval.checkpoint_path (or algo.teacher_path for distillation) to the downloaded checkpoint path when running evaluation or distillation scripts.
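The dotted names eval.checkpoint_path and algo.teacher_path look like Hydra-style config keys. Assuming the scripts accept `key=value` overrides (an assumption; check the scripts themselves) and that the files were saved under a local directory of your choice, the mapping from the table to config values can be sketched as follows. `CHECKPOINTS`, `override_for`, and the directory argument are hypothetical, as is the choice to pair each FMLM run with the FLM teacher of the same dataset.

```python
from pathlib import Path

# File names as listed in the checkpoint table.
CHECKPOINTS = {
    ("FLM", "LM1B"): "lm1b_flm.ckpt",
    ("FMLM", "LM1B"): "lm1b_fmlm.ckpt",
    ("FLM", "OpenWebText"): "owt_flm.ckpt",
    ("FMLM", "OpenWebText"): "owt_fmlm.ckpt",
}


def override_for(download_dir, model, dataset, purpose="eval"):
    """Build a key=value string for evaluation or for FMLM distillation."""
    if purpose == "eval":
        key = "eval.checkpoint_path"
    elif purpose == "distill":
        key = "algo.teacher_path"
        model = "FLM"  # the teacher is the pre-trained FLM checkpoint
    else:
        raise ValueError(f"unknown purpose: {purpose}")
    path = Path(download_dir) / CHECKPOINTS[(model, dataset)]
    return f"{key}={path.as_posix()}"


print(override_for("/ckpts", "FMLM", "LM1B"))
print(override_for("/ckpts", "FMLM", "OpenWebText", purpose="distill"))
```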
Baseline Checkpoints
Reproduced baseline checkpoints for LM1B are available here.
For other checkpoints, mostly for OpenWebText, refer to the Duo, SDTT, RDLM, and di4c repositories.
Full results
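FLM (Undistilled)
LM1B
[Results: FLM (undistilled) on LM1B]
OpenWebText
[Results: FLM (undistilled) on OpenWebText]
FMLM (Distilled)
LM1B
[Results: FMLM (distilled) on LM1B]
OpenWebText
[Results: FMLM (distilled) on OpenWebText]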
BibTeX
@article{lee2026flow,
title={Flow Map Language Models: One-step Language Modeling via Continuous Denoising},
author={Chanhyuk Lee and Jaehoon Yoo and Manan Agarwal
and Sheel Shah and Jerry Huang
and Aditi Raghunathan and Seunghoon Hong
and Nicholas M. Boffi and Jinwoo Kim},
journal={arXiv preprint arXiv:2602.16813},
year={2026},
}
Acknowledgements
This repository is built upon the codebases of Duo and ReDi.