CoMAP: Co-Evolving World Models and Agent Policies for LLM Agents
June 3, 2026
Youwei Liu, Jian Wang†, Hanlin Wang, Wenjie Li
Overview
Figure 1. Conceptual illustration of CoMAP.
CoMAP co-evolves the world model and the agent policy in a closed loop. The world model provides lookahead states for policy improvement, while the agent policy generates on-policy interactions for world-model updating.
Figure 2. Framework overview.
At each step, the agent first drafts an action, the world model predicts the future state that action would lead to, and the policy then performs future-aware reflection on that prediction to refine the action. The resulting trajectories are used for on-policy self-distillation of the world model and for policy-side evolution.
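The draft, predict, reflect step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `rollout`, `Step`, `draft_fn`, `predict_fn`, `reflect_fn`, and `env_step` are hypothetical and do not come from the CoMAP codebase, and the LLM policy and world model are replaced by toy functions on a number line so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical toy types: a state is a position on a number line,
# an action is a step of -1 or +1. The goal is to reach position 0.
State = int
Action = int


@dataclass
class Step:
    state: State
    draft: Action          # action first proposed by the policy
    predicted_next: State  # world model's lookahead for the draft
    action: Action         # refined action actually executed
    next_state: State      # state observed from the environment


def rollout(
    state: State,
    draft_fn: Callable[[State], Action],
    predict_fn: Callable[[State, Action], State],
    reflect_fn: Callable[[State, Action, State], Action],
    env_step: Callable[[State, Action], State],
    horizon: int,
) -> List[Step]:
    """Collect one on-policy trajectory using draft -> predict -> reflect."""
    trajectory: List[Step] = []
    for _ in range(horizon):
        draft = draft_fn(state)                       # 1. draft an action
        predicted = predict_fn(state, draft)          # 2. predict its future state
        action = reflect_fn(state, draft, predicted)  # 3. refine via reflection
        next_state = env_step(state, action)          # 4. act in the environment
        trajectory.append(Step(state, draft, predicted, action, next_state))
        state = next_state
    return trajectory


# Toy stand-ins (assumptions, not the paper's models).
def toy_draft(state: State) -> Action:
    return 1  # a naive policy that always proposes +1


def toy_predict(state: State, action: Action) -> State:
    return state + action  # a world model that happens to be exact here


def toy_reflect(state: State, draft: Action, predicted: State) -> Action:
    # Flip the draft if the predicted state is farther from the goal (0).
    return -draft if abs(predicted) > abs(state) else draft


def toy_env(state: State, action: Action) -> State:
    return state + action


traj = rollout(3, toy_draft, toy_predict, toy_reflect, toy_env, horizon=3)

# (state, action, observed next state) triples: the kind of on-policy data
# that could serve as supervision when updating a world model.
wm_examples = [(s.state, s.action, s.next_state) for s in traj]
```

In this toy setting the refined actions differ from the drafts at every step, and the recorded trajectory supplies both sides of the loop: the `(state, action, next_state)` triples for the world model and the draft-versus-refined action pairs for the policy. How CoMAP actually turns such trajectories into training signals is not specified here.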
Citation
If you find this work helpful, please consider citing:
@article{liu2026comap,
  title   = {Co-Evolving World Models and Agent Policies for LLM Agents},
  author  = {Liu, Youwei and Wang, Jian and Wang, Hanlin and Li, Wenjie},
  journal = {arXiv preprint arXiv:2606.02372},
  year    = {2026},
  url     = {https://arxiv.org/abs/2606.02372}
}
Contact
For questions, please contact:
loyiv5477@gmail.com