November 25, 2024
PhyDiff: Towards Realistic Physical Transformations in Text-to-Image Diffusion Models
Fan Wu1, Cheng Chen1, Zhoujie Fu1, Xulei Yang2, Yi Xu3, Guosheng Lin1†
1Nanyang Technological University, 2A*STAR, 3OPPO US Research Center, † Corresponding author
Abstract
In this paper, we address a realistic task, physical transformation image generation, in which we aim to freely combine physical concepts on open-world objects to generate natural and meaningful images. We propose PhyDiff, a fine-tuning framework for diffusion models that takes a few images and their corresponding text prompts as input and performs realistic and meaningful physical transformations on open-world objects. PhyDiff comprises two novel regularization losses. The first is the concept decouple loss, which separates the mixed but independent features that come from multiple input concept data, so that the diffusion model learns a representation for each concept separately. The second is the isometric loss, which extracts the invariant features shared across cross-object physical concept data.
Run
- Follow diffusers for installation
- Run the fine-tuning code: `python train.py`
- Run the inference code: `python inference.py`
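Before launching `train.py` or `inference.py`, it can help to confirm that the Python environment has the expected libraries installed. The following is a minimal sketch and not part of the PhyDiff code: the helper name `missing_packages` and the package list (`diffusers`, `torch`, `transformers`) are assumptions based on a typical diffusers setup, and the repository's real requirements may differ.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list for a typical diffusers fine-tuning setup;
    # adjust it to match the repository's actual requirements.
    required = ["diffusers", "torch", "transformers"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed packages are importable.")
```

If the script reports missing packages, install them by following the diffusers installation guide, then run the commands above.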
Results of PhyDiff
TODO
- Release code
- Release datasets
BibTeX
If you find this project useful in your research, please consider citing:
@article{wu2024phydiff,
title={PhyDiff: Towards Realistic Physical Transformations in Text-to-Image Diffusion Models},
author={Wu, Fan and Chen, Cheng and Fu, Zhoujie and Yang, Xulei and Xu, Yi and Lin, Guosheng},
journal={},
year={2024}
}
Acknowledgements
We thank Stable Diffusion for releasing its models and code, and FreeCustom for the project page.
Contact
If you have any technical comments or questions, please open a new issue or contact Guosheng Lin.