Diffusion Transformer Meets Random Masks: An Advanced PET Reconstruction Framework

September 23, 2025

Authors: Bin Huang, Binzhong He, Yanhan Chen, Zhili Liu, Xinyue Wang, Binxuan Li, Qiegen Liu

Deep learning has significantly advanced PET image reconstruction, achieving remarkable improvements in image quality through direct training on sinogram or image data. Traditional methods often utilize masks for inpainting tasks, but their incorporation into PET reconstruction frameworks introduces transformative potential. In this study, we propose an advanced PET reconstruction framework called Diffusion tRansformer mEets rAndom Masks (DREAM). To the best of our knowledge, this is the first work to integrate mask mechanisms into both the sinogram domain and the latent space, pioneering their role in PET reconstruction and demonstrating their ability to enhance reconstruction fidelity and efficiency. The framework employs a high-dimensional stacking approach, transforming masked data from two to three dimensions to expand the solution space and enable the model to capture richer spatial relationships. Additionally, a mask-driven latent space is designed to accelerate the diffusion process by leveraging sinogram-driven and mask-driven compact priors, which reduce computational complexity while preserving essential data characteristics. A hierarchical masking strategy is also introduced, guiding the model from focusing on fine-grained local details in the early stages to capturing broader global patterns over time. This progressive approach ensures a balance between detailed feature preservation and comprehensive context understanding. Experimental results demonstrate that DREAM not only improves the overall quality of reconstructed PET images but also preserves critical clinical details, highlighting its potential to advance PET imaging technology. By integrating compact priors and hierarchical masking, DREAM offers a promising and efficient avenue for future research and application in PET imaging.

DREAM framework

Comparison of the traditional PET reconstruction method and the proposed DREAM framework. DREAM employs a hierarchical mask algorithm to create 3D sinogram data blocks with richer spatial relationships.
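The step from a 2D sinogram to a 3D data block can be illustrated with a minimal NumPy sketch. This is not the repository's implementation: the helper name `stack_masked_copies`, the Bernoulli keep probability, and the choice of plain random masks (instead of the hierarchical mask algorithm, whose details are not given here) are assumptions made for illustration only.

```python
import numpy as np

def stack_masked_copies(sinogram, n_slices, keep_prob=0.8, seed=0):
    """Stack a 2D sinogram with randomly masked copies into a 3D block.

    Hypothetical helper: slice 0 is the unmasked sinogram, the remaining
    slices are copies whose bins are zeroed by independent binary masks.
    """
    rng = np.random.default_rng(seed)
    slices = [sinogram.astype(np.float64)]
    for _ in range(n_slices - 1):
        mask = rng.random(sinogram.shape) < keep_prob  # True keeps the bin
        slices.append(sinogram * mask)
    return np.stack(slices, axis=0)  # shape: (n_slices, n_angles, n_bins)

sinogram = np.arange(1, 13, dtype=np.float64).reshape(3, 4)  # toy 3x4 sinogram
block = stack_masked_copies(sinogram, n_slices=4)
print(block.shape)  # (4, 3, 4)
```

Each slice of the block sees a different subset of the measured bins, which is one way to read the "richer spatial relationships" mentioned in the caption.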

DREAM training procedure

The pipeline of the DREAM training procedure. DREAM mainly consists of a mask-driven latent space, a diffusion stage, and a transformer stage. The random masks and hierarchical masks are combined to form the sinogram data blocks. The SMCP of the sinogram data blocks is fed into the diffusion stage, whose prediction guides the transformer stage in reconstructing the final result.
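One possible reading of the hierarchical masking strategy is a schedule in which the masked patches start small (local detail) and grow as training progresses (global context). The sketch below illustrates that reading only. The function names `block_mask` and `scheduled_block_size`, the patch sizes, and the keep probability are all hypothetical and are not taken from the DREAM code.

```python
import numpy as np

def block_mask(shape, block_size, keep_prob, rng):
    """Binary mask whose keep/drop decision is made per square patch."""
    rows = -(-shape[0] // block_size)  # ceiling division
    cols = -(-shape[1] // block_size)
    coarse = rng.random((rows, cols)) < keep_prob
    # Expand each coarse decision to a block_size x block_size patch.
    fine = np.repeat(np.repeat(coarse, block_size, axis=0), block_size, axis=1)
    return fine[:shape[0], :shape[1]]

def scheduled_block_size(step, total_steps, sizes=(1, 2, 4, 8)):
    """Assumed schedule: patch size grows in equal-length training stages."""
    stage = min(step * len(sizes) // total_steps, len(sizes) - 1)
    return sizes[stage]

rng = np.random.default_rng(0)
for step in (0, 30, 60, 99):
    size = scheduled_block_size(step, total_steps=100)
    mask = block_mask((16, 16), size, keep_prob=0.75, rng=rng)
    print(step, size, mask.shape)
```

With a patch size of 1 the mask reduces to a pixel-wise random mask, so the same helper covers both the random and the coarser masks in this sketch.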

DREAM reconstruction procedure

The pipeline of the DREAM reconstruction procedure. Random masks are injected into (N-1) noisy sinograms, which are stacked with a single noisy sinogram to form a noisy sinogram data block. The SMCP of the noise-free sinogram data block is reconstructed through the diffusion stage. The noise-free sinogram data block is then reconstructed under the guidance of the SMCP, and its slices are combined by weighted averaging to generate the final PET sinogram. This combined PET sinogram is then processed with the MLEM algorithm to produce the final PET image.
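The last two steps of this pipeline, weighted averaging over the data block and MLEM reconstruction, can be sketched in NumPy as follows. The MLEM update is the standard multiplicative form x <- x / (A^T 1) * A^T (y / (A x)). Everything else is an assumption for illustration: the function names, the slice weights, the random stand-in for the network output, and the toy dense system matrix `A` are not taken from the DREAM repository.

```python
import numpy as np

def combine_block(block, weights):
    """Weighted average over the slice axis of a (n_slices, angles, bins) block."""
    weights = np.asarray(weights, dtype=np.float64)
    return np.tensordot(weights / weights.sum(), block, axes=(0, 0))

def mlem(system_matrix, sinogram, n_iter=50, eps=1e-12):
    """Standard MLEM iterations for a dense, non-negative system matrix."""
    y = sinogram.ravel()
    sensitivity = system_matrix.T @ np.ones_like(y)  # A^T 1
    x = np.ones(system_matrix.shape[1])              # uniform initial image
    for _ in range(n_iter):
        projection = system_matrix @ x               # forward projection A x
        ratio = y / np.maximum(projection, eps)      # measured / estimated
        x = x / np.maximum(sensitivity, eps) * (system_matrix.T @ ratio)
    return x

rng = np.random.default_rng(0)
block = rng.random((4, 6, 8))                # stand-in for the reconstructed block
sino = combine_block(block, weights=[2, 1, 1, 1])
A = rng.random((sino.size, 10))              # toy system matrix: 48 bins x 10 pixels
image = mlem(A, sino, n_iter=20)
print(sino.shape, image.shape)  # (6, 8) (10,)
```

In practice the system matrix is large and sparse, so the forward and back projections would normally be provided as operators rather than a dense array.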

Reconstruction results for PET images using different methods

Reconstruction results for PET images using different methods. (a)-(f) show the reconstruction results and residual maps for various comparison methods and DREAM. (g) presents the noise-free reconstruction. The second row depicts the residuals between the reference and reconstructed images.

Reconstruction results for PET sinograms using different methods

Reconstruction results for PET sinograms using different methods. (a)-(f) show the sinogram data generated by various comparison methods and DREAM. (g) presents the noise-free sinogram. The second row depicts the residuals between the reference and reconstructed sinograms.

Training

To pretrain DiffIR_S1, run

sh trainS1.sh

To train DiffIR_S2, run

# set 'pretrain_network_g' and 'pretrain_network_S1' in ./options/train_DiffIRS2_sino_nomin.yml to the path of DiffIR_S1's pre-trained model

sh trainS2.sh

Evaluation

  • Testing
# modify the dataset path in ./options/test_DiffIRS2.yml

sh test.sh

Other Related Projects

  • ALL-PET: A Low-resource and Low-shot PET Foundation Model in Projection Domain [Paper] [Code]

  • Diffusion Transformer Model with Compact Prior for Low-dose PET Reconstruction [Paper] [Code]

  • RED: Residual Estimation Diffusion for Low-Dose PET Sinogram Reconstruction [Paper] [Code]

  • Double-Constraint Diffusion Model with Nuclear Regularization for Ultra-low-dose PET Reconstruction [Paper] [Code]

  • Raysolution_PET_Data [Data]

  • Temporal Image Sequence Separation in Dual-tracer Dynamic PET with an Invertible Network [Paper] [Code]

  • Spatial-Temporal Guided Diffusion Transformer Probabilistic Model for Delayed Scan PET Image Prediction [Paper] [Code]

  • PET Tracer Separation using Conditional Diffusion Transformer with Multi-latent Space Learning [Paper]

  • Synthetic CT Generation via Variant Invertible Network for Brain PET Attenuation Correction [Paper] [Code]

  • A Prior-Guided Joint Diffusion Model in Projection Domain for PET Tracer Conversion [Paper] [Code]

  • Positron Emission Tomography Tracer Conversion via Variable Augmented Invertible Network [Paper]