HIMO: Cross-Arbitrary-Modality Image Invariant Feature Transform with Hierarchical Intrinsic Major Orientation

June 1, 2026

Paper Link: https://ieeexplore.ieee.org/document/11435911/

📈 Matching Performance

HIMO is a new image matching method built on a traditional handcrafted framework. It achieves the following results (2025.03.20):

Affine (rotation + scaling) distortion:

(Figure panels: One-stage, Two-stage)

Projective / Perspective / Homography distortion:

(Figure panels: One-stage, Two-stage)

General Cross-Modal Image Matching:

Datasets Matching Performance:

Dense-like Matching Performance:

📦 Datasets Release

The author is currently busy with graduation, which has delayed the open-source release. The datasets will be available soon.

*** Full GCZ dataset ***

Google Drive: https://drive.google.com/drive/folders/1yZo3ZPxVuUrHbXJwNKEMEuBSVmOVihzM?usp=sharing

Baidu Netdisk: https://pan.baidu.com/s/10d-xgjO15qu9sjRZVanJAQ?pwd=dgcz

*** Full WDS dataset ***

Google Drive: https://drive.google.com/drive/folders/1wxwu0ZAR2a0HA9rEfC11guZsw3cksYZu?usp=sharing

Baidu Netdisk: https://pan.baidu.com/s/1n6GrHdKKTSMmTXaMe88IBg?pwd=dwds

*** Revised MRSI [1-2] dataset labels ***

Google Drive: https://drive.google.com/file/d/1joFkfeCJnGtkL9zZp2FSR0ZTr0wBP5zj/view?usp=sharing

Baidu Netdisk: https://pan.baidu.com/s/1LwqD7OBxQDFpxatWIO56Rw?pwd=mrsi

*** Revised SRIF [3] dataset labels ***

Google Drive: https://drive.google.com/file/d/1ODZDtKcN_7KkvXk4XOQhSOqCVWGK0tp4/view?usp=sharing

Baidu Netdisk: https://pan.baidu.com/s/14GsQMeiYV_8kTP5CZiSX1w?pwd=srif

[1] J. Li, Q. Hu, and M. Ai, “RIFT: Multi-modal image matching based on radiation-variation insensitive feature transform,” IEEE Transactions on Image Processing, vol. 29, pp. 3296–3310, 2019.

[2] Y. Yao, Y. Zhang, Y. Wan, X. Liu, X. Yan, and J. Li, “Multi-modal remote sensing image matching considering co-occurrence filter,” IEEE Transactions on Image Processing, vol. 31, pp. 2584–2597, 2022.

[3] J. Li, Q. Hu, and Y. Zhang, “Multimodal image matching: A scale invariant algorithm and an open dataset,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 204, pp. 77–88, 2023.

📚 Citation

If you find our work useful in your research, please consider citing:

@article{gao2026himo,
  author={Gao, Chenzhong and Li, Wei and Weng, Desheng and Tao, Ran and Xia, Xiang-Gen and Du, Qian},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={{HIMO}: Cross-Arbitrary-Modality Image Invariant Feature Transform with Hierarchical Intrinsic Major Orientation}, 
  year={2026},
  pages={1-18},
  publisher={IEEE}
}

❓ Open Questions & Issues

We have identified several remaining problems in HIMO that merit further attention.

The phase-congruency-based keypoint detector produces strong edge effects near image borders. The current workaround directly sets the keypoint response values to zero within a preset margin of the image edges, which is rather inelegant. First, the exact extent of the edge effect is difficult to determine. Second, potentially useful texture features near the edges are wasted. Are there any good solutions available at present?
The multi-scale strategy in HIMO still essentially relies on an exhaustive scale-space search to accommodate scale differences between images. It cannot adaptively derive the scale ratio as SIFT does to achieve true “scale invariance”. The reason is that cross-modal images do not follow a unified degradation model, which makes this a challenging problem at present.
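
For readers unfamiliar with the border workaround described in the first issue, here is a minimal sketch in Python with NumPy. It is not taken from the HIMO code. The function name `suppress_border_response` and the `margin` parameter are hypothetical, and the response map is assumed to be a 2-D array of keypoint scores.

```python
import numpy as np


def suppress_border_response(response: np.ndarray, margin: int) -> np.ndarray:
    """Return a copy of `response` with a border of width `margin` set to zero.

    Hypothetical helper: `margin` stands in for the "preset range" near the
    image edges; its value is an assumption and must be chosen by the user.
    """
    if margin < 0:
        raise ValueError("margin must be non-negative")

    out = np.zeros_like(response)
    h, w = response.shape[:2]

    # Copy only the interior region; everything within `margin` pixels
    # of any border stays zero. If the margin swallows the whole image,
    # the result is all zeros.
    if 2 * margin < h and 2 * margin < w:
        out[margin:h - margin, margin:w - margin] = (
            response[margin:h - margin, margin:w - margin]
        )
    return out


if __name__ == "__main__":
    demo = np.ones((6, 8))
    cleaned = suppress_border_response(demo, margin=2)
    print(cleaned)
```

The sketch makes both drawbacks visible: `margin` is a fixed guess at the extent of the edge effect, and any genuine feature response inside the margin is discarded along with the artifacts.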