April 21, 2026

UniGame: Turning a Unified Multimodal Model Into Its Own Adversary

Overview | Installation | Quick Start | Citation

Please disregard this repo and refer to TorchUMM for a more unified implementation of all Unified Multimodal Models (UMMs)!

Overview

UniGame is the first self-adversarial post-training framework that improves the consistency between understanding and generation pathways in Unified Multimodal Models. By treating the generation pathway as an active adversary, UniGame enables the model to discover and correct its own inconsistencies.

Quantitative Results:

Installation

Requirements

Python >= 3.8
PyTorch >= 2.0
CUDA >= 11.8 (recommended)
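
The requirements above can be checked before installing anything else. The following is a minimal sketch, not part of the repository: `parse_version` and `check_environment` are hypothetical helper names, and the script only reports whether CUDA is visible to PyTorch rather than verifying the CUDA toolkit version.

```python
import sys

MIN_PYTHON = (3, 8)  # from the requirements list
MIN_TORCH = (2, 0)   # from the requirements list


def parse_version(text):
    """Return (major, minor) from a version string such as '2.1.0+cu118'."""
    core = text.split("+")[0]            # drop a local build tag like '+cu118'
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


def check_environment():
    """Collect readable problems; an empty list means the minimums are met."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python >= 3.8 is required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
        return problems
    if parse_version(torch.__version__) < MIN_TORCH:
        problems.append("PyTorch >= 2.0 is required")
    if not torch.cuda.is_available():
        # CUDA is only recommended, so this is a warning rather than a hard failure.
        problems.append("CUDA is not available (CUDA >= 11.8 is recommended)")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script prints one warning per unmet requirement and prints nothing when all of them are satisfied.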

Setup

# Clone the repository
git clone https://github.com/AIFrontierLab/UniGame.git
cd UniGame

# Create conda environment
conda create -n unigame python=3.11 -y
conda activate unigame

# Install dependencies
pip install -r requirements.txt

Quick Start

1. Prepare Dataset

Download the VQAv2 dataset and update the path in main.py:

LOCAL_VQAV2 = "/path/to/your/vqav2"
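
If you would rather not hard-code the path, one option is to read it from an environment variable and fail early when the directory is missing. This is a sketch under assumptions: the variable name `UNIGAME_VQAV2` and the helper `resolve_dataset_dir` are hypothetical and do not exist in main.py, and the expected layout inside the VQAv2 directory is not checked.

```python
import os
from pathlib import Path

LOCAL_VQAV2 = "/path/to/your/vqav2"  # same placeholder as in main.py


def resolve_dataset_dir(default=LOCAL_VQAV2, env_var="UNIGAME_VQAV2"):
    """Return the dataset directory, preferring the (hypothetical) env var."""
    root = Path(os.environ.get(env_var, default)).expanduser()
    if not root.is_dir():
        # Fail before any model loading or training starts.
        raise FileNotFoundError(f"VQAv2 directory not found: {root}")
    return root
```

Calling `resolve_dataset_dir()` at the top of a training script surfaces a wrong path immediately instead of partway through start-up.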

2. Training

Single GPU:

python main.py

Multi-GPU (DDP):

torchrun --nproc_per_node=4 main.py

SLURM Cluster:

srun --gres=gpu:4 --cpus-per-task=16 torchrun --nproc_per_node=4 main.py
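
The same main.py is launched in all three cases. What differs is the environment: `torchrun` exports `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` to every worker, while a plain `python main.py` run sets none of them. How main.py itself handles this is not shown here; the sketch below is a generic illustration, and `get_dist_info` is a hypothetical helper name.

```python
import os


def get_dist_info(env=None):
    """Read the process layout exported by torchrun; default to one process."""
    env = os.environ if env is None else env
    rank = int(env.get("RANK", 0))              # global index of this process
    local_rank = int(env.get("LOCAL_RANK", 0))  # index on this node, usually the GPU id
    world_size = int(env.get("WORLD_SIZE", 1))  # total number of processes
    return {
        "rank": rank,
        "local_rank": local_rank,
        "world_size": world_size,
        "distributed": world_size > 1,  # True when more than one worker is running
        "is_main": rank == 0,           # common guard for logging and checkpointing
    }


if __name__ == "__main__":
    print(get_dist_info())
```

With `--nproc_per_node=4` on a single node, each of the four workers sees a world size of 4 and a distinct local rank from 0 to 3.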

Citation

If you find this work useful, please cite:

@inproceedings{Su2025UniGameTA,
  title={UniGame: Turning a Unified Multimodal Model Into Its Own Adversary},
  author={Zhaolong Su and Wang Lu and Hao Chen and Sharon Li and Jindong Wang},
  year={2025},
  url={https://api.semanticscholar.org/CorpusID:283244819}
}

Acknowledgements

We thank Dr. Ziyue Xu from NVIDIA for his insightful discussions and valuable comments on this project. We also thank the authors of Janus-Pro and of the other open-source projects that made this work possible.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

For questions or issues, please open an issue or contact:

Zhaolong Su: zsu05@wm.edu