November 25, 2024

PhyDiff: Towards Realistic Physical Transformations in Text-to-Image Diffusion Models

Fan Wu1, Cheng Chen1, Zhoujie Fu1, Xulei Yang2, Yi Xu3, Guosheng Lin1†

1Nanyang Technological University, 2A*STAR, 3OPPO US Research Center, †Corresponding author

Paper PDF | Project Page

📖 Abstract

main_flow

In this paper, we address a realistic task, physical transformation image generation, in which physical concepts are freely combined with open-world objects to generate natural and meaningful images. We propose PhyDiff, a fine-tuning framework for diffusion models that takes a few images and their corresponding text prompts as input and performs realistic, meaningful physical transformations on open-world objects. PhyDiff comprises two novel regularization losses. The first is the concept decouple loss, which disentangles the mixed features of independent concepts in the multi-concept input data, so that the diffusion model learns a separate representation for each concept. The second is the isometric loss, which helps extract the invariant features shared across objects in the physical concept data.

🚀 Run

  1. Follow diffusers for installation.

  2. Run the fine-tuning code:

python train.py

  3. Run the inference code:

python inference.py
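
Before launching the fine-tuning step, it can help to confirm that the installation worked. The snippet below is a minimal sketch, not part of the PhyDiff code: the package list (`torch`, `diffusers`, `transformers`, `accelerate`) is an assumption based on a typical diffusers setup, since the repository does not list its requirements here, and the helper names `check_packages` and `missing_packages` are hypothetical.

```python
from importlib.util import find_spec

# Assumed dependency list: the README only says "follow diffusers for
# installation", so these names are a guess, not the project's requirements.
ASSUMED_PACKAGES = ("torch", "diffusers", "transformers", "accelerate")


def check_packages(names=ASSUMED_PACKAGES):
    """Return a mapping of package name -> True if the package can be found."""
    return {name: find_spec(name) is not None for name in names}


def missing_packages(names=ASSUMED_PACKAGES):
    """List the packages from `names` that are not installed."""
    return [name for name, ok in check_packages(names).items() if not ok]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed packages are importable.")
```

If the script reports missing packages, revisit the diffusers installation instructions before running `train.py`.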

🌄 Results of PhyDiff

results_of_phydiff

๐Ÿ—“๏ธ TODO

  • Release code
  • Release datasets

๐Ÿ–Š๏ธ BibTeX

If you find this project useful in your research, please consider citing:

@article{wu2024phydiff,
  title={PhyDiff: Towards Realistic Physical Transformations in Text-to-Image Diffusion Models},
  author={Wu, Fan and Chen, Cheng and Fu, Zhoujie and Yang, Xulei and Xu, Yi and Lin, Guosheng},
  journal={},
  year={2024}
}

๐Ÿ™ Acknowledgements

We thank Stable Diffusion for releasing its models and code, and FreeCustom for the project page.

📧 Contact

If you have any technical comments or questions, please open a new issue or feel free to contact Guosheng Lin.