Link to full paper (PDF): https://arxiv.org/pdf/2505.22489
This repository implements a cascaded volumetric image synthesis pipeline that combines conditional diffusion and flow-based super-resolution modules to generate realistic whole-body ¹⁸F-FDG PET/CT scans from demographic variables (height, weight, sex, age). The approach proceeds from low-resolution diffusion-based generation to high-resolution refinement, enabling anatomy-preserving, demographically conditioned volumes. Comprehensive preprocessing, training, inference, and evaluation scripts facilitate reproduction and extension.
- Demographic conditioning on height, weight, sex, and age
- Two-stage generation: low-resolution diffusion → super-resolution flow
- Modular scripts for data handling, training, inference, and evaluation
- Quantitative evaluation metrics and statistical analysis
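
The conditioning and two-stage flow listed above can be sketched roughly as follows. This is a minimal illustration and not the repository's implementation: the function names, the normalization ranges, the two-channel (PET, CT) layout, the volume shapes, and the upsampling factor are all assumptions, and both stages are placeholders (random noise and nearest-neighbour upsampling) standing in for the trained diffusion and flow models.

```python
import numpy as np

# Hypothetical normalization ranges; the actual encoding used in training may differ.
RANGES = {
    "height_cm": (140.0, 200.0),
    "weight_kg": (40.0, 140.0),
    "age_years": (18.0, 90.0),
}


def encode_demographics(height_cm, weight_kg, sex, age_years):
    """Map height, weight, sex, and age to a 4-element vector in [0, 1]."""

    def scale(value, key):
        lo, hi = RANGES[key]
        return float(np.clip((value - lo) / (hi - lo), 0.0, 1.0))

    sex_flag = {"F": 0.0, "M": 1.0}[sex.upper()]
    return np.array(
        [
            scale(height_cm, "height_cm"),
            scale(weight_kg, "weight_kg"),
            sex_flag,
            scale(age_years, "age_years"),
        ],
        dtype=np.float32,
    )


def low_res_stage(cond, shape=(2, 16, 16, 16), seed=0):
    """Placeholder for the conditional diffusion sampler.

    Returns Gaussian noise with an assumed (channel, depth, height, width)
    layout; a real sampler would denoise iteratively, guided by `cond`.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal(shape).astype(np.float32)


def super_resolution_stage(low_res, cond, factor=4):
    """Placeholder for the flow-based super-resolution module.

    Uses nearest-neighbour upsampling along the three spatial axes only.
    """
    out = low_res
    for axis in (1, 2, 3):
        out = np.repeat(out, factor, axis=axis)
    return out


cond = encode_demographics(170.0, 70.0, "F", 54.0)
low = low_res_stage(cond)
high = super_resolution_stage(low, cond)
print(cond, low.shape, high.shape)
```

The point of the sketch is the data flow: one conditioning vector is shared by both stages, and the second stage only changes spatial resolution.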
- 0_Preprocess/: Data conversion, segmentation, and intensity scaling
- 1_Trainings/: Training scripts and configs for the diffusion and flow modules
- 2_Tests/: Inference utilities and shared runtime libraries
- 3_Evaluations/: Evaluation metrics and statistical analysis
- Resource/: Pipeline diagrams and result figures
- Python 3.11 or later
- PyTorch 2.0+ with CUDA support
- NumPy, SciPy, scikit-learn, scikit-image
- imageio, imageio-ffmpeg, pyspng, pillow, nibabel, click, requests, tqdm, psutil
- See the Dockerfile under the training folder
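
A quick way to check a local environment against the list above is sketched below. It is not part of the repository; the helper name `check_environment` is hypothetical. The module names are the import names of the listed packages, which differ from the install names in a few cases (`sklearn`, `skimage`, `PIL`, `imageio_ffmpeg`).

```python
import importlib.util
import sys

# Import names for the packages listed in the requirements.
REQUIRED_MODULES = [
    "torch",
    "numpy",
    "scipy",
    "sklearn",
    "skimage",
    "imageio",
    "imageio_ffmpeg",
    "pyspng",
    "PIL",
    "nibabel",
    "click",
    "requests",
    "tqdm",
    "psutil",
]


def check_environment(min_python=(3, 11)):
    """Report whether Python is recent enough and which modules are missing."""
    return {
        "python_ok": sys.version_info[:2] >= min_python,
        # find_spec locates a top-level module without importing it.
        "missing": [
            name
            for name in REQUIRED_MODULES
            if importlib.util.find_spec(name) is None
        ],
    }


report = check_environment()
print(report)
```

If `torch` is installed, `torch.cuda.is_available()` additionally reports whether a CUDA device is usable.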

Figure 1. Overview of the cascaded diffusion and flow modules.

Figure 2. Dataset and generated cohort BMI distributions.
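
BMI in Figure 2 follows from two of the conditioning variables, height and weight. A minimal sketch of the standard formula, assuming weight in kilograms and height in centimetres (the units used by the repository's scripts are not stated here, and the function name is hypothetical):

```python
def bmi(weight_kg, height_cm):
    """Body mass index: weight in kg divided by squared height in metres."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


print(round(bmi(70.0, 175.0), 2))
```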

Figure 3. Performance comparison across model variants. Red: female; blue: male.
@article{yoon2025cascaded,
title={Cascaded 3D Diffusion Models for Whole-body 3D 18-F FDG PET/CT synthesis from Demographics},
author={Yoon, Siyeop and Song, Sifan and Jin, Pengfei and Tivnan, Matthew and Oh, Yujin and Kim, Sekeun and Wu, Dufan and Li, Xiang and Li, Quanzheng},
journal={arXiv preprint arXiv:2505.22489},
year={2025}
}

See the [LICENSE] files under the training folder.