
November 25, 2024

PhyDiff: Towards Realistic Physical Transformations in Text-to-Image Diffusion Models

Fan Wu¹, Cheng Chen¹, Zhoujie Fu¹, Xulei Yang², Yi Xu³, Guosheng Lin¹^†

¹Nanyang Technological University, ²A*STAR, ³OPPO US Research Center, ^†Corresponding author

📖 Abstract

In this paper, we address a realistic task, physical transformation image generation, in which physical concepts are freely combined with open-world objects to generate natural and meaningful images. We propose PhyDiff, a fine-tuning framework for diffusion models that takes a few images and their corresponding text prompts as input and performs realistic, meaningful physical transformations on open-world objects. PhyDiff comprises two novel regularization losses. The first, the concept decouple loss, separates the mixed but independent features that arise when multiple concepts are learned from the input data, so that the diffusion model learns a distinct representation for each concept. The second, the isometric loss, extracts the invariant features shared across objects in the physical concept data.
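The abstract names the two regularizers but does not give their formulas, so the snippet below is only an illustrative sketch of the general ideas, not the authors' implementation. It assumes (hypothetically) that decoupling can be expressed as a penalty on the cosine similarity between two concept embeddings, and that an isometric term can be expressed as a mismatch between the pairwise distances of two feature sets, for example features of the same physical concept applied to two different objects. The function names `decouple_penalty` and `isometric_penalty` are invented for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def decouple_penalty(concept_a, concept_b):
    """Hypothetical decoupling term: squared cosine similarity.

    It is 0 when the two concept embeddings are orthogonal and 1 when
    they point in the same (or opposite) direction.
    """
    return cosine(concept_a, concept_b) ** 2


def pairwise_distances(points):
    """Euclidean distances between every unordered pair of points."""
    return [
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    ]


def isometric_penalty(features_x, features_y):
    """Hypothetical isometry term: mean squared difference between the
    pairwise distances of two equally sized feature sets.

    It is 0 when the second set is a distance-preserving transform
    (translation, rotation, reflection) of the first.
    """
    dist_x = pairwise_distances(features_x)
    dist_y = pairwise_distances(features_y)
    return sum((a - b) ** 2 for a, b in zip(dist_x, dist_y)) / len(dist_x)
```

In this toy form, the decoupling term pushes two concept embeddings toward orthogonality, while the isometric term leaves translated or rotated copies of a feature set unpenalized and penalizes any change in their relative geometry. In an actual fine-tuning run, terms of this kind would be computed on model features and added to the standard diffusion training loss with weighting coefficients.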

🚀 Run

Follow the diffusers installation instructions.
Run the fine-tuning code:

python train.py

Run the inference code:

python inference.py

🌄 Results of PhyDiff

🗓️ TODO

Release code
Release datasets

🖊️ BibTeX

If you find this project useful in your research, please consider citing:

@article{wu2024phydiff,
  title={PhyDiff: Towards Realistic Physical Transformations in Text-to-Image Diffusion Models},
  author={Wu, Fan and Chen, Cheng and Fu, Zhoujie and Yang, Xulei and Xu, Yi and Lin, Guosheng},
  journal={},
  year={2024}
}

🙏 Acknowledgements

We thank Stable Diffusion for releasing its models and code, and FreeCustom for the project page.

📧 Contact

If you have any technical comments or questions, please open a new issue or contact Guosheng Lin.