README.md
June 23, 2026 · View on GitHub
BrainFM
A Modality-agnostic Multi-task Foundation Model for Human Brain Imaging
[arXiv]
Contact: Peirong Liu (peirong@jhu.edu)
Department of Electrical and Computer Engineering,
Data Science and AI Institute,
Johns Hopkins University
BrainFM is a unified foundation model for multi-task brain MRI analysis, including anatomy synthesis, segmentation, registration, bias-field estimation, cortical distance maps, and more. This repository provides inference demos, a synthetic data generator, and training code.
Installation
Tested with Python 3.11.4, PyTorch 2.0.1, and CUDA 12.2.
conda create -n brainfm python=3.11
conda activate brainfm
git clone https://github.com/jhuldr/BrainFM
cd BrainFM
pip install -r requirements.txt
Optional: install the package in editable mode.
pip install -e .
Pretrained weights
Download pretrained weights from OneDrive and place them under ckp/. The inference demos expect:
ckp/brainfm_pretrained.pth
Quick start: inference
Run all commands from the repository root.
- Place test volumes in
test_data_folder/(or update the path in the script). - Run the demo:
python scripts/demo_test.py
Results are written to outs/test_results/. The script supports whole-volume and tiled inference on large volumes.
Feature extraction only
python scripts/demo_get_feature.py
Update img_path in the script to point at your input volume. This returns a 64-channel feature map from the pretrained encoder.
Synthetic data generator
Visualize or debug the training data pipeline:
python scripts/demo_generator.py
Generator configs live in cfgs/generator/. default.yaml defines shared settings; task-specific configs in train/ and test/ override those defaults.
Key generator settings:
| Setting | Description |
|---|---|
dataset_names | Datasets to sample from (paths in Generator/constants.py) |
dataset_option | default (BaseGen) or brain_id (BrainIDGen) |
mix_synth_prob | Probability of blending synthetic with real images |
task | Toggle individual training tasks on/off |
augmentation_steps | Augmentation pipeline per input type |
Dataset paths, augmentation functions, and per-modality processing steps are defined in Generator/constants.py. See BrainIDGen in Generator/datasets.py for a customized generator example.
Training
Training is launched from the repo root with a generator config and a trainer config:
python scripts/train.py \
cfgs/generator/train/brain_id.yaml \
cfgs/trainer/train/joint.yaml
SLURM helpers are provided in scripts/train.sh and scripts/test.sh. Update cluster-specific settings there before submitting jobs.
Trainer configs are under cfgs/trainer/. Defaults are in default_train.yaml and default_val.yaml; experiment configs in train/ override them.
Project structure
BrainFM/
├── assets/ # Figures for documentation
├── cfgs/ # Generator and trainer YAML configs
├── ckp/ # Pretrained checkpoints (not tracked)
├── files/ # Atlas and local test volumes (e.g. gca.mgz)
├── Generator/ # Data loading, augmentation, and synthesis
├── scripts/ # Demos, training, and evaluation entry points
├── ShapeID/ # Shape synthesis and PDE-based deformation
├── Trainer/ # Model, losses, training engine
└── utils/ # Config, logging, interpolation, and I/O helpers
Citation
@article{Liu_2025_BrainFM,
author = {Liu, Peirong and Puonti, Oula and Hu, Xiaoling and Gopinath, Karthik and Sorby-Adams, Annabel and Alexander, Daniel C. and Iglesias, Juan E.},
title = {A Modality-agnostic Multi-task Foundation Model for Human Brain Imaging},
booktitle = {arXiv preprint arXiv:2509.00549},
year = {2025},
}
Copyright
"A Modality-agnostic Multi-task Foundation Model for Human Brain Imaging" is a publication of The Johns Hopkins University and copyright © 2026 The Johns Hopkins University. All rights reserved.