Benchmark Sample Scripts

April 19, 2026

Important note on paper vs open-source results

The model used in the paper was trained with a setup focused on 512 resolution.
The released open-source model is a more general checkpoint that covers resolutions from 512 to 1024.
Because of this, benchmark numbers from the open-source release are expected to differ from the paper tables.

We reran all three benchmarks with the released model. Results are listed below.

Re-tested results (released model)

LAMICBench++

Fewer Instances:

| ITC | AES | IDS | IPS | AVG |
| --- | --- | --- | --- | --- |
| 58.51 | 91.56 | 40.16 | 82.32 | 68.14 |

More Instances:

| ITC | AES | IDS | IPS | AVG |
| --- | --- | --- | --- | --- |
| 59.73 | 83.04 | 38.02 | 76.34 | 64.28 |
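
As a quick sanity check when comparing your own runs against the LAMICBench++ rows above, the reported AVG values appear consistent with an unweighted mean of ITC, AES, IDS, and IPS. That definition is an assumption inferred from the numbers, not something this README states, and the names `lamic_results` and `mean_of_metrics` are illustrative only. A minimal sketch:

```python
# Re-tested LAMICBench++ scores, copied from the rows above.
lamic_results = {
    "Fewer Instances": {"ITC": 58.51, "AES": 91.56, "IDS": 40.16, "IPS": 82.32, "AVG": 68.14},
    "More Instances": {"ITC": 59.73, "AES": 83.04, "IDS": 38.02, "IPS": 76.34, "AVG": 64.28},
}


def mean_of_metrics(row):
    """Unweighted mean of the four component metrics (assumed definition of AVG)."""
    parts = [row[key] for key in ("ITC", "AES", "IDS", "IPS")]
    return sum(parts) / len(parts)


for setting, row in lamic_results.items():
    recomputed = mean_of_metrics(row)
    # Reported values have two decimals, so allow half a unit in the last place.
    matches = abs(recomputed - row["AVG"]) < 0.005
    print(f"{setting}: recomputed={recomputed:.4f} reported={row['AVG']:.2f} match={matches}")
```

If your own evaluation reports the four component scores but an AVG that does not pass this check, the averaging convention may differ from the one assumed here.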

COCO-MIG

| SR | I-SR | mIoU | G-C | L-C |
| --- | --- | --- | --- | --- |
| 28.00 | 64.91 | 60.25 | 25.82 | 21.59 |

LayoutSAM

| Spatial | Color | Texture | Shape | CLIP | Pick |
| --- | --- | --- | --- | --- | --- |
| 94.13 | 86.41 | 88.11 | 87.57 | 27.75 | 22.88 |

Reproduction entry points

If your goal is fast reproduction, start from these three READMEs.
Each one begins with the shortest path: unpack the provided tar.gz archive and run the script with its default arguments.