
# COMI: Coarse-to-fine Context Compression via Marginal Information Gain

Jiwei Tang1 · Shilei Liu2 · Zhicheng Zhang1 · Yujin Yuan2 · Libin Zheng3,† · Wenbo Su2 · Bo Zheng2,†

1 Tsinghua University · 2 Future Living Lab of Alibaba · 3 Sun Yat-sen University · † Corresponding Author

ICLR 2026

This is the official implementation for our ICLR 2026 paper "COMI: Coarse-to-fine Context Compression via Marginal Information Gain". Our work introduces a context compression method that jointly optimizes semantic relevance and diversity through Marginal Information Gain (MIG), enabling effective long-context processing under high compression rates (up to 32×) while eliminating redundant information.

*Compression Process of COMI.*

*Training Process of COMI.*

## Release

- **[02/12]** Initial Release. The models and code for training and inference are coming soon!

## Motivation

Existing task-aware compression methods focus solely on relevance to the query and ignore semantic redundancy among the retained tokens. This leads to an accumulation of "relevant but redundant" content that misleads LLMs.

*Similarity among the top query-related tokens.*

We propose:

- **Marginal Information Gain (MIG)**: a novel metric defined as relevance to the query minus semantic redundancy with other units, jointly optimizing information value and diversity (see the sketch after this list)
- **Coarse-to-Fine Compression Strategy**:
  - **Coarse-Grained Group Reallocation**: dynamically assigns compression rates across context segments based on inter-group MIG
  - **Fine-Grained Token Merging**: fuses tokens within groups via intra-group MIG-weighted averaging, preserving key semantics while eliminating redundancy
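As a rough illustration of the MIG bullet above, the score for a unit $u_i$ can be read as $\mathrm{MIG}(u_i) = \mathrm{rel}(u_i, q) - \lambda \cdot \mathrm{red}(u_i)$: relevance to the query $q$ minus redundancy with the other retained units. The sketch below computes such a score from cosine similarities; the max-over-others redundancy term and the `redundancy_weight` knob are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def marginal_information_gain(units, query, redundancy_weight=1.0):
    """Toy MIG score: query relevance minus semantic redundancy with other units.

    units: (n, d) embeddings of compression units (tokens or groups).
    query: (d,) query embedding.
    redundancy_weight is a hypothetical trade-off knob, not from the paper.
    """
    units = F.normalize(units, dim=-1)
    query = F.normalize(query, dim=-1)
    relevance = units @ query                  # (n,) cosine similarity to the query
    pairwise = units @ units.T                 # (n, n) unit-to-unit cosine similarity
    pairwise.fill_diagonal_(float("-inf"))     # ignore self-similarity
    redundancy = pairwise.max(dim=-1).values   # overlap with the closest other unit
    if units.size(0) == 1:                     # a lone unit has nothing to be redundant with
        redundancy = torch.zeros_like(relevance)
    return relevance - redundancy_weight * redundancy
```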
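Building on that score, a minimal coarse-to-fine pass might split a global token budget across groups by inter-group MIG, then fuse each group's tokens with intra-group MIG weights. Everything here beyond the two-stage structure is an assumption for illustration: `allocate_and_merge`, the softmax budget split, the round-robin slot assignment, and the mean-pooled group summaries are stand-ins, not the paper's rules.

```python
import torch

def allocate_and_merge(groups, query, budget):
    """Coarse-to-fine sketch; reuses marginal_information_gain() from above.

    groups: list of (g_i, d) token-embedding tensors, one per context segment.
    budget: total number of merged tokens to keep across all groups.
    """
    # Coarse stage: score each group by inter-group MIG and split the budget.
    group_embs = torch.stack([g.mean(dim=0) for g in groups])  # (G, d) group summaries
    shares = torch.softmax(marginal_information_gain(group_embs, query), dim=0)
    budgets = torch.clamp((shares * budget).round().long(), min=1)

    # Fine stage: fuse each group's tokens into its budgeted number of slots.
    compressed = []
    for g, k in zip(groups, budgets.tolist()):
        k = min(k, g.size(0))
        token_mig = marginal_information_gain(g, query)   # (g_i,) intra-group MIG
        weights = torch.softmax(token_mig, dim=0)
        order = token_mig.argsort(descending=True)
        for i in range(k):
            idx = order[i::k]                             # round-robin slot assignment
            # MIG-weighted average of the tokens assigned to this slot
            compressed.append((weights[idx, None] * g[idx]).sum(0) / weights[idx].sum())
    return torch.stack(compressed)
```

At a 32× target rate, `budget` would be roughly the original context length divided by 32; because of rounding and the one-token-per-group floor, this sketch only approximately honors that target.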

## Main Results

COMI achieves superior performance across QA and summarization tasks under high compression rates.

## BibTeX

If you find our repo helpful, please consider leaving a star and citing our paper:

```bibtex
@misc{tang2026comicoarsetofinecontextcompression,
      title={COMI: Coarse-to-fine Context Compression via Marginal Information Gain},
      author={Jiwei Tang and Shilei Liu and Zhicheng Zhang and Yujin Yuan and Libin Zheng and Wenbo Su and Bo Zheng},
      year={2026},
      eprint={2602.01719},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2602.01719},
}
```