PolaFormer: Polarity-aware Linear Attention for Vision Transformers [ICLR 2025]
June 12, 2026
If you like our work, please support us with a star ⭐!
Welcome to the repo of PolaFormer!
This repo contains the official PyTorch code and pre-trained models for PolaFormer.
🔥 News
- [2026.06] 🔥 If you are interested in linear attention, you might also want to check out our new work, NalaFormer, which has been accepted to ICML 2026. [paper] [code]
- [2025.10] 🔥 The next-generation model, PolaFormer++, has been released. We warmly welcome the community to use and explore it!
- [2025.04] 🔥 A Triton implementation of PolaFormer has been released, thanks to the fbi_la library.
- [2025.01] 🔥 Our paper has been accepted to the International Conference on Learning Representations (ICLR) 2025.
Introduction
Motivation
Linear attention has emerged as a promising alternative to softmax-based attention, using kernelized feature maps to reduce complexity from $\mathcal{O}(N^2)$ to $\mathcal{O}(N)$ in the sequence length $N$. However, the non-negativity constraint on feature maps and the relaxed exponential function used in the approximation cause significant information loss relative to the original query-key dot products, producing less discriminative attention maps with higher entropy. To recover the interactions driven by negative values in query-key pairs and to reduce this entropy, we propose PolaFormer, which achieves a better balance between expressive capability and efficiency.
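To make the complexity claim concrete, the minimal sketch below (pure Python, no PyTorch) evaluates kernelized attention in two orders and produces the same output from both. It assumes a ReLU-plus-epsilon feature map as a stand-in for the kernel; this is an illustrative choice, not PolaFormer's feature map, and all helper names are hypothetical.

```python
# Illustrative sketch: kernelized attention evaluated in two orders.
# Assumption: phi is ReLU plus a small constant, NOT PolaFormer's feature map.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def phi(m):
    """Non-negative feature map (assumed): element-wise ReLU plus epsilon."""
    return [[max(x, 0.0) + 1e-6 for x in row] for row in m]

def normalize(numer, denom):
    """Divide each row of numer by the matching scalar in denom."""
    return [[x / d for x in row] for row, d in zip(numer, denom)]

def quadratic_order(q, k, v):
    """(phi(Q) phi(K)^T) V: builds the N x N attention map, O(N^2 d)."""
    scores = matmul(phi(q), transpose(phi(k)))       # N x N
    denom = [sum(row) for row in scores]
    return normalize(matmul(scores, v), denom)

def linear_order(q, k, v):
    """phi(Q) (phi(K)^T V): only d-sized intermediates, O(N d^2)."""
    fq, fk = phi(q), phi(k)
    kv = matmul(transpose(fk), v)                    # d x d_v
    k_sum = [sum(col) for col in zip(*fk)]           # length d
    denom = [sum(a * b for a, b in zip(row, k_sum)) for row in fq]
    return normalize(matmul(fq, kv), denom)

# Toy inputs: N = 3 tokens, d = 2 channels.
q = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]
k = [[1.0, 0.4], [-0.7, 0.9], [0.2, -0.5]]
v = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

out_quadratic = quadratic_order(q, k, v)
out_linear = linear_order(q, k, v)
```

Both orders agree up to floating-point error because matrix multiplication is associative; only the second avoids materializing the N x N map.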
Method
In this paper, we propose a polarity-aware linear attention mechanism that explicitly models both same-signed and opposite-signed query-key interactions, ensuring comprehensive coverage of relational information. Furthermore, to restore the spiky properties of attention maps, we prove that a class of element-wise functions (those with positive first and second derivatives) can reduce the entropy of the attention distribution. Finally, we employ a learnable power function for rescaling, allowing strong and weak attention signals to be effectively separated.
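The two ideas in this paragraph can be checked numerically. The sketch below is a pure-Python illustration, not the repository's implementation. It splits a query and a key into positive and negative parts, confirms that the same-signed terms minus the opposite-signed terms recover the original dot product, and shows that a power function with an exponent above 1 lowers the entropy of a normalized weight vector. The exponent is fixed at 3 here as an assumed stand-in for the learnable one, and all function names are hypothetical.

```python
import math

def split_polarity(x):
    """Return (x_plus, x_minus) with x = x_plus - x_minus, both non-negative."""
    pos = [max(v, 0.0) for v in x]
    neg = [max(-v, 0.0) for v in x]
    return pos, neg

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def polarity_terms(q, k):
    """Same-signed and opposite-signed parts of the dot product q . k."""
    q_pos, q_neg = split_polarity(q)
    k_pos, k_neg = split_polarity(k)
    same = dot(q_pos, k_pos) + dot(q_neg, k_neg)
    opposite = dot(q_pos, k_neg) + dot(q_neg, k_pos)
    return same, opposite

def entropy(weights):
    """Shannon entropy (in nats) of non-negative weights after normalization."""
    total = sum(weights)
    probs = [w / total for w in weights]
    return -sum(p * math.log(p) for p in probs if p > 0)

q = [0.8, -1.2, 0.5, -0.1]
k = [-0.4, -0.9, 1.1, 0.3]

# q . k == same - opposite. A feature map that keeps only the positive parts
# (for example plain ReLU on q and k) retains just the q_pos . k_pos term.
same, opposite = polarity_terms(q, k)

scores = [0.2, 0.5, 1.0, 2.0]
power = 3.0  # assumed fixed exponent; the text describes a learnable one
sharpened = [s ** power for s in scores]

h_before = entropy(scores)    # entropy of the raw normalized scores
h_after = entropy(sharpened)  # lower: the largest score dominates more
```

The function x^p with p > 1 has positive first and second derivatives on positive inputs, so it is one member of the class of rescaling functions described above.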
Results
- Comparison of different models on ImageNet-1K.
- Performance on Long Range Arena benchmark.
| Model | Text | ListOps | Retrieval | Pathfinder | Image | Average |
|---|---|---|---|---|---|---|
| | 73.06 | 37.35 | 80.50 | 70.53 | 42.15 | 60.72 |
| | 72.33 | 38.76 | 80.37 | 68.98 | 41.91 | 60.47 |
| | 71.93 | 37.60 | 81.47 | 69.09 | 42.77 | 60.57 |
Dependencies
- Python 3.9
- PyTorch == 1.11.0
- torchvision == 0.12.0
- numpy
- timm == 0.4.12
- einops
- yacs
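One way to confirm that an environment matches these pins is to compare installed versions programmatically. The sketch below uses only the standard library. The `PINNED` mapping and `check_versions` helper are hypothetical names, and the distribution names (`torch`, `torchvision`, `timm`) are assumed to be the usual PyPI names.

```python
from importlib import metadata

# Pinned versions copied from the list above; unpinned packages map to None.
PINNED = {
    "torch": "1.11.0",
    "torchvision": "0.12.0",
    "timm": "0.4.12",
    "numpy": None,
    "einops": None,
    "yacs": None,
}

def check_versions(pinned=PINNED):
    """Return {package: problem} for missing or mismatched packages."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        # Strip local build tags such as "+cu113" before comparing.
        base = installed.split("+")[0]
        if wanted is not None and base != wanted:
            problems[name] = f"found {installed}, expected {wanted}"
    return problems

if __name__ == "__main__":
    for name, problem in check_versions().items():
        print(f"{name}: {problem}")
```

An empty result means every listed package is present and every pinned version matches.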
Data preparation
The ImageNet dataset should be prepared as follows:
$ tree data
imagenet
├── train
│   ├── class1
│   │   ├── img1.jpeg
│   │   ├── img2.jpeg
│   │   └── ...
│   ├── class2
│   │   ├── img3.jpeg
│   │   └── ...
│   └── ...
└── val
    ├── class1
    │   ├── img4.jpeg
    │   ├── img5.jpeg
    │   └── ...
    ├── class2
    │   ├── img6.jpeg
    │   └── ...
    └── ...
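Before launching a run, it can help to verify that a directory follows this layout. The sketch below uses only `pathlib`. `summarize_split` and `check_imagenet_layout` are hypothetical helpers, not part of the repository, and they only count class folders and files without validating image contents.

```python
from pathlib import Path

def summarize_split(split_dir):
    """Return (num_classes, num_files) for a folder of per-class subfolders."""
    class_dirs = [p for p in Path(split_dir).iterdir() if p.is_dir()]
    num_files = sum(1 for d in class_dirs for f in d.iterdir() if f.is_file())
    return len(class_dirs), num_files

def check_imagenet_layout(root):
    """Check that root contains train/ and val/ with the same class folders."""
    root = Path(root)
    summary = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing directory: {split_dir}")
        summary[split] = summarize_split(split_dir)
    train_classes = {p.name for p in (root / "train").iterdir() if p.is_dir()}
    val_classes = {p.name for p in (root / "val").iterdir() if p.is_dir()}
    if train_classes != val_classes:
        raise ValueError("train and val contain different class folders")
    return summary
```

Calling `check_imagenet_layout("data/imagenet")` (the path is an assumption) returns a dictionary with a `(num_classes, num_files)` pair per split. For the full ImageNet-1K dataset, both splits should report 1000 classes.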
Pretrained Models
Based on different model architectures, we provide several pretrained models, as listed below.
| model | resolution | acc@1 | config |
|---|---|---|---|
| Pola-PVT-T | $224^2$ | 78.8 (+3.7) | config |
| Pola-PVT-S | $224^2$ | 81.9 (+2.1) | config |
| Pola-Swin-T | $224^2$ | 82.6 (+1.4) | config |
| Pola-Swin-S | $224^2$ | 83.6 (+0.6) | config |
| Pola-Swin-B | $224^2$ | 83.8 (+0.3) | config |
Evaluate one model on ImageNet:
python -m torch.distributed.launch --nproc_per_node=8 main.py --cfg <path-to-config-file> --data-path <imagenet-path> --output <output-path> --eval --resume <path-to-pretrained-weights>
Train Models from Scratch
- To train our model on ImageNet from scratch, see pretrain.sh and run:
bash pretrain.sh
Acknowledgements
This code is built on top of Swin Transformer and FLatten Transformer.
Citation
If you find this repo helpful, please consider citing us.
@inproceedings{
meng2025polaformer,
title={PolaFormer: Polarity-aware Linear Attention for Vision Transformers},
author={Weikang Meng and Yadan Luo and Xin Li and Dongmei Jiang and Zheng Zhang},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=kN6MFmKUSK}
}
Contact
If you have any questions, please feel free to contact the authors.
Weikang Meng: zacharymengwk@gmail.com