README.md

January 3, 2026 · View on GitHub

SpiLiFormer: Enhancing Spiking Transformers with Lateral Inhibition

Zeqi Zheng*, Yanchen Huang*, Yingchao Yu, Zizheng Zhu, Junfeng Tang, Zhaofei Yu, Yaochu Jin^†

* Equal contribution ^† Corresponding author

News

Jun. 2025 Our work has been accepted by ICCV 2025 🎉.
Jan. 2026 We release the code and model checkpoints! 🚀

SpiLiFormer (Spiking Transformer with Lateral Inhibition) is a novel brain-inspired spiking transformer architecture designed to enhance the performance and robustness of spiking neural networks (SNNs).

Inspired by the lateral inhibition mechanism in the human visual system, which helps the brain focus on salient regions by suppressing responses from neighboring neurons, SpiLiFormer introduces two new attention modules:

FF-LiDiff Attention (Feedforward-pathway Lateral Differential Inhibition): Inspired by short-range inhibition in the retina, this module reduces distraction in shallow network stages by differentially inhibiting attention responses.
FB-LiDiff Attention (Feedback-pathway Lateral Differential Inhibition): Inspired by long-range cortical inhibition, this module incorporates feedback to refine attention allocation in deeper network stages.

SpiLiFormer poster

Main Results on ImageNet-1K

Methods	Type	Architecture	Input Size	Param (M)	Power (mJ)	Time Step	Top-1 Acc (%)	Download
ViT	ANN	ViT-B/16	384	86.59	254.84	1	77.90	-
Swin Transformer	ANN	Swin Transformer-B	384	87.77	216.20	1	84.50	-
Spikformer	SNN	Spikformer-8-768	224	66.34	21.48	4	74.81	-
QKFormer	SNN	HST-10-768	384	64.96	113.64	4	85.65	-
E-SpikeFormer	SNN	E-SpikeFormer	384	173.0	-	8	86.2	-
SpiLiFormer (Ours)	SNN	SpiLiFormer-10-768	224	69.10	11.77	1	81.54	link
SpiLiFormer (Ours)	SNN	SpiLiFormer-10-768	224	69.10	44.17	4	85.82	link
SpiLiFormer (Ours)	SNN	SpiLiFormer-10-768*	288	69.10	73.52	4	86.62	link
SpiLiFormer (Ours)	SNN	SpiLiFormer-10-768**	384	69.10	129.45	4	86.66	link

SpiLiFormer demonstrates performance superior to current State-of-the-Art (SOTA) SNN models and even some ANN models on ImageNet-1K, while maintaining lower energy consumption and parameter counts.

Main Results on Other Datasets (CIFAR & Neuromorphic)

Datasets	Methods	Architecture	Param (M)	Time Step	Top-1 Acc (%)
CIFAR-10	SpiLiFormer (Ours)	SpiLiFormer-4-384	7.04	4	96.63
	QKFormer	HST-4-384	6.74	4	96.18
	Spikformer	Spikformer-4-384	9.32	4	95.51
CIFAR-100	SpiLiFormer (Ours)	SpiLiFormer-4-384	7.04	4	81.63
	QKFormer	HST-4-384	6.74	4	81.15
	Spikingformer	Spikingformer-4-384	9.32	4	79.21
CIFAR10-DVS	SpiLiFormer (Ours)	SpiLiFormer-2-256	1.57	16	86.7
	QKFormer	HST-2-256	1.50	16	84.0
	Spikformer	Spikformer-2-256	2.57	16	80.9
N-Caltech101	SpiLiFormer (Ours)	SpiLiFormer-2-256	1.57	16	89.18
	QKFormer	HST-2-256	1.50	16	87.24
	S-Transformer	S-Transformer-2-256	2.57	16	86.3

SpiLiFormer also achieves SOTA performance on static image datasets (CIFAR-10/100) and neuromorphic datasets (CIFAR10-DVS/N-Caltech101)

Quick Start

Requirements

timm==0.6.12
cupy==11.4.0
torch==1.12.1
spikingjelly==0.0.0.0.12
pyyaml
tensorboard

Data Preparation

ImageNet-1K (ILSVRC 2012): https://image-net.org/download.php
CIFAR-10: https://www.cs.toronto.edu/~kriz/cifar.html
CIFAR-100: https://www.cs.toronto.edu/~kriz/cifar.html
CIFAR10-DVS: https://figshare.com/articles/dataset/CIFAR10-DVS_New/4724671
N-Caltech101: https://data.mendeley.com/datasets/cy6cvx3ryv/1

Train on CIFAR-10

CUDA_VISIBLE_DEVICES=0 python ./cifar10/train.py \
--output ./cifar10/outputs \
--config ./cifar10/cifar10.yml \
-data-dir /your_cifar_10_dataset_filepath \
-T 4

Train on CIFAR-100

CUDA_VISIBLE_DEVICES=0 python ./cifar100/train.py \
--output ./cifar100/outputs/ \
--config ./cifar100/cifar100.yml \
-data-dir /your_cifar_100_dataset_filepath \
-T 4

Train on CIFAR10-DVS

CUDA_VISIBLE_DEVICES=0 python ./cifar10dvs/train.py \
--output-dir ./cifar10dvs/outputs/ \
--data-path /your_cifar_10_dvs_dataset_filepath \
--T 16

Train on N-Caltech101

CUDA_VISIBLE_DEVICES=0 python ./ncaltech101/train.py \
--output-dir ./ncaltech101/outputs/ \
--data-path /your_ncaltech101_dataset_filepath \
--dts_cache /your_ncaltech101_dataset_filepath/dts_cache \
--T 16

Evaluation on ImageNet-1K

SpiLiFormer-10-768, T=1, Input_size=224

CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 ./imagetnet_1k/train.py \
--output_dir ./imagetnet_1k/ouputs/ \
--log_dir ./imagetnet_1k/ouputs/ \
--data_path /your_imagenet_1k_dataset_filepath \
--model SpiLiFormer_10_768 \
--input_size 224 \
--time_step 1 \
--batch_size 64 \
--accum_iter 1 \
--resume ./your_checkpoint_filepath/spiliformer_7_768_T_1_224.pth \
--eval

SpiLiFormer-10-768, T=4, Input_size=224

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc_per_node=8 ./imagetnet_1k/train.py \
--output_dir ./imagetnet_1k/ouputs/ \
--log_dir ./imagetnet_1k/ouputs/ \
--data_path /your_imagenet_1k_dataset_filepath \
--model SpiLiFormer_10_768 \
--input_size 224 \
--time_step 4 \
--batch_size 64 \
--accum_iter 1 \
--resume ./your_checkpoint_filepath/spiliformer_7_768_T_4_224.pth \
--eval

SpiLiFormer-10-768, T=4, Input_size=288

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc_per_node=8 ./imagetnet_1k/train.py \
--output_dir ./imagetnet_1k/ouputs/ \
--log_dir ./imagetnet_1k/ouputs/ \
--data_path /your_imagenet_1k_dataset_filepath \
--model SpiLiFormer_10_768 \
--input_size 288 \
--time_step 4 \
--batch_size 64 \
--accum_iter 1 \
--resume ./your_checkpoint_filepath/spiliformer_7_768_T_4_288.pth \
--eval

SpiLiFormer-10-768, T=4, Input_size=384

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc_per_node=8 ./imagetnet_1k/train.py \
--output_dir ./imagetnet_1k/ouputs/ \
--log_dir ./imagetnet_1k/ouputs/ \
--data_path /your_imagenet_1k_dataset_filepath \
--model SpiLiFormer_10_768 \
--input_size 384 \
--time_step 4 \
--batch_size 64 \
--accum_iter 1 \
--resume ./your_checkpoint_filepath/spiliformer_7_768_T_4_384.pth \
--eval

Citation

If you use our code or data in this repo or find our work helpful, please consider giving a citation:

@inproceedings{zheng2025spiliformer,
  title={SpiLiFormer: Enhancing Spiking Transformers with Lateral Inhibition},
  author={Zheng, Zeqi and Huang, Yanchen and Yu, Yingchao and Zhu, Zizheng and Tang, Junfeng and Yu, Zhaofei and Jin, Yaochu},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={24539--24548},
  year={2025}
}