# [EMNLP Findings 2022] SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters

Shwai He, Liang Ding, Daize Dong, Miao Zhang, Dacheng Tao

Introduction • News • Why SparseAdapter • Core Setting • Quick Start • Installation • Citation
## Introduction
This is the official implementation of the paper SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters, published in Findings of EMNLP 2022.
## News
- Oct 2022: SparseAdapter accepted by Findings of EMNLP 2022.
- Oct 2022: Paper and official implementation released.
## Why SparseAdapter
Adapter tuning is efficient because it freezes most parameters of the pretrained language model (PLM), but strong adapter performance often requires larger adapter modules. SparseAdapter revisits this tradeoff with pruning:
- Keep adapter-style tuning workflow.
- Introduce sparsity into adapter parameters.
- Improve parameter-efficiency while staying competitive with dense adapters.
The paper also introduces Large-Sparse, which increases adapter capacity under the same parameter budget by combining larger adapters with sparsity.
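To make the idea of "introducing sparsity into adapter parameters" concrete, here is a minimal numpy sketch of a magnitude-based pruning mask applied to an adapter weight matrix. This is an illustration only, not the repository's implementation (the repo's `--pruner` options include `rand` and `snip`; the function name and the magnitude criterion here are assumptions for exposition):

```python
import numpy as np

def magnitude_prune_mask(weights, sparsity):
    """Return a boolean mask that zeroes out the `sparsity` fraction of
    smallest-magnitude entries in `weights`.

    Note: ties at the threshold may prune slightly more than requested;
    this sketch ignores that edge case.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of entries to prune
    if k == 0:
        return np.ones_like(weights, dtype=bool)
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.abs(weights) > threshold

# Toy adapter weight matrix: keep the top 50% of entries by magnitude.
w = np.array([[0.5, -0.1], [0.05, 2.0]])
mask = magnitude_prune_mask(w, sparsity=0.5)
sparse_w = w * mask  # pruned entries are exactly zero
```

With the mask fixed, only the surviving entries are trained, which is how sparsity buys extra capacity under the same parameter budget (the Large-Sparse setting: a bigger bottleneck, pruned back to the original budget).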
## Core Setting
SparseAdapter training in this repo is mainly controlled by:
- `--pruner`: pruning strategy (e.g., `rand`, `snip`).
- `--sparsity`: target sparsity ratio.
- `--attn_bn`, `--ffn_bn`: adapter bottleneck sizes.
- `--attn_mode`, `--ffn_mode`: adapter insertion modes.
These options are exposed in task scripts under examples/pytorch/.
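As a rough illustration of how these options might be combined (the flag names come from the list above and `run_glue_sparse.py` is the GLUE main script; the specific values and the omission of standard arguments such as model, task, and output directory are assumptions — consult the task script for the exact interface):

```shell
cd examples/pytorch/text-classification
# Hypothetical invocation: SNIP pruning at 50% sparsity,
# adapters in both attention and FFN with bottleneck 64.
python run_glue_sparse.py \
  --pruner snip \
  --sparsity 0.5 \
  --attn_mode adapter --ffn_mode adapter \
  --attn_bn 64 --ffn_bn 64
```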
## Repository Structure
- `src/`: modified Transformers source code and sparse trainer logic.
- `petl/`: adapter-related modules and utilities.
- `examples/pytorch/`: runnable scripts for GLUE, SQuAD, and summarization.
- `Figures/`: figures used in the paper and README.
- `utils/`: project utility scripts.
## Installation
- Python 3.8+
- torch==1.13.1
- transformers==4.17.0
- tokenizers==0.10.1
- nltk==3.5
```bash
conda create -n sparseadapter python=3.8 -y
conda activate sparseadapter
pip install -r requirements.txt
```
## Quick Start
### 1) Text Classification (GLUE)

```bash
cd examples/pytorch/text-classification
bash run_glue.sh
```

Main script: `examples/pytorch/text-classification/run_glue_sparse.py`
### 2) Question Answering (SQuAD)

```bash
cd examples/pytorch/question-answering
bash run_qa.sh
```

Main script: `examples/pytorch/question-answering/run_qa_sparse.py`
### 3) Summarization (XSum/CNN-DM)

```bash
cd examples/pytorch/summarization
bash run_summarization.sh
```

Main script: `examples/pytorch/summarization/run_summarization_sparse.py`
## Repro Tips
- Default `.sh` scripts are configured for multi-GPU runs (`device_ids="0 1 2 3 4 5 6 7"`). Adjust to your hardware.
- Outputs are written under `checkpoints/`.
- Start from each task script first, then tune the pruning and sparsity knobs for your setup.
## Citation
```bibtex
@inproceedings{he2022sparseadapter,
  title     = {SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters},
  author    = {He, Shwai and Ding, Liang and Dong, Daize and Zhang, Miao and Tao, Dacheng},
  booktitle = {Findings of EMNLP},
  year      = {2022},
  url       = {https://aclanthology.org/2022.findings-emnlp.160}
}
```