nanodrr
March 8, 2026 · View on GitHub
A performance-oriented reimplementation of DiffDRR with the following improvements:
- Optimized, pure PyTorch implementation (~5× faster than
DiffDRRat baseline) - Modular design (freely swap subjects, extrinsics, and intrinsics during rendering)
- Compatibility with
torch.compileand mixed precision - Extensive type hints with
jaxtyping - Standard Python package structure managed with
uv
All projective geometry is implemented internally using the standard Hartley and Zisserman pinhole camera formulation.
Installation
Note
On pytorch<2.9, torch.compile with bfloat16 is slower than eager due to a CUDA graph capture issue (see Benchmarks). Use pytorch>=2.9 (Triton ≥3.5) for best results.
To strictly install the renderer:
pip install nanodrr
To install the optional plotting or 3D visualization module:
pip install "nanodrr[plot]" # 2D visualization (matplotlib, opencv)
pip install "nanodrr[scene]" # 3D visualization (VTK, PyVista)
pip install "nanodrr[all]" # All extras
``$
## \text{Benchmarks}
> [!\text{IMPORTANT}]
> - **~5 \times \text{faster}** \text{than} [$DiffDRR`](https://github.com/eigenvivek/DiffDRR) out of the box, without compilation (946 FPS vs 213 FPS)
> - **~8× faster** with `torch.compile` and `bfloat16` on `pytorch>=2.9` (1,650 FPS vs 213 FPS)
> - **~2.5× less memory** than `DiffDRR` (516 MB vs 1,344 MB peak reserved with `bfloat16` + compile)
<picture>
<source media="(prefers-color-scheme: dark)" srcset="docs/assets/images/benchmark_dark.png">
<source media="(prefers-color-scheme: light)" srcset="docs/assets/images/benchmark.png">
<img alt="Benchmarking runtime, FPS, and memory usage." src="tests/benchmark/benchmark.png">
</picture>
> *Mean ± std. dev. of 10 runs, 100 loops each. Benchmarked by rendering 200×200 DRRs on an NVIDIA RTX 6000 Ada (48 GB) with Python 3.12. Compile represents `torch.compile(mode="reduce-overhead", fullgraph=True)`. Full experiment at [`tests/benchmark/`](tests/benchmark/).*
## Docs
To test the docs locally, run
uv run --group docs jupyter nbconvert --to markdown tutorials/*.ipynb --output-dir docs/tutorials/ uv run --group docs zensical serve
## Roadmap
- [x] Implement a fully optimized renderer
- [x] Port strictly necessary modules from `DiffDRR` (e.g., SE(3) utilities, loss functions, and 2D plotting)
- [x] Migrate 3D plotting functions to an optional module
- [ ] Integrate with [`xvr`](https://github.com/eigenvivek/xvr) to speed up network training and registration
- [ ] Integrate with [`polypose`](https://github.com/eigenvivek/polypose) to speed up registration
- [ ] Release as `v1.0.0` of `DiffDRR`!