
[EMNLP Findings 2022] SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters

arXiv · EMNLP Findings 2022 · PyTorch 1.13.1 · Transformers 4.17.0

Shwai He, Liang Ding, Daize Dong, Miao Zhang, Dacheng Tao

📖 Introduction • 📰 News • ✨ Why • 🔍 Setting • 📦 Structure • ⚙️ Installation • 🚀 Quick Start • 🧪 Repro Tips • 📄 Citation

📖 Introduction

This is the official implementation of the paper SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters, published in Findings of EMNLP 2022.

SparseAdapter overview

📰 News

  • Oct 2022: SparseAdapter accepted by Findings of EMNLP 2022.
  • Oct 2022: Paper and official implementation released.

✨ Why SparseAdapter

Adapter tuning is parameter-efficient because it freezes the pretrained language model (PLM) and trains only small inserted modules, but strong adapter performance often requires enlarging those modules. SparseAdapter revisits this trade-off through network pruning:

  • Keep adapter-style tuning workflow.
  • Introduce sparsity into adapter parameters.
  • Improve parameter-efficiency while staying competitive with dense adapters.

The paper also introduces Large-Sparse, which first enlarges the adapter bottleneck and then prunes it, yielding higher capacity under the same parameter budget.

🔍 Core Setting

SparseAdapter training in this repo is mainly controlled by:

  • --pruner: pruning strategy (e.g., rand, snip).
  • --sparsity: sparsity ratio.
  • --attn_bn, --ffn_bn: adapter bottleneck sizes.
  • --attn_mode, --ffn_mode: adapter insertion modes.

These options are exposed in task scripts under examples/pytorch/.
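
To make the knobs concrete, here is a hedged template for a GLUE run. The flag names are the ones listed above and the script name appears in Quick Start, but the model, task, values, and remaining arguments are illustrative; compare against run_glue.sh before relying on it:

python run_glue_sparse.py \
  --model_name_or_path bert-base-uncased \
  --task_name mnli \
  --attn_mode adapter --ffn_mode adapter \
  --attn_bn 64 --ffn_bn 64 \
  --pruner snip --sparsity 0.8 \
  --output_dir checkpoints/mnli-sparse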

📦 Repository Structure

  • src/: modified Transformers source code and sparse trainer logic.
  • petl/: adapter-related modules and utilities.
  • examples/pytorch/: runnable scripts for GLUE, SQuAD, and summarization.
  • Figures/: figures used in the paper and README.
  • utils/: project utility scripts.

⚙️ Installation

  • Python 3.8+
  • torch==1.13.1
  • transformers==4.17.0
  • tokenizers==0.10.1
  • nltk==3.5

conda create -n sparseadapter python=3.8 -y
conda activate sparseadapter

pip install -r requirements.txt
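
As an optional sanity check (not part of the original instructions), confirm the pinned versions resolved correctly:

python -c "import torch, transformers; print(torch.__version__, transformers.__version__)"

This should print 1.13.1 and 4.17.0.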

🚀 Quick Start

1) Text Classification (GLUE)

cd examples/pytorch/text-classification
bash run_glue.sh

Main script: examples/pytorch/text-classification/run_glue_sparse.py

2) Question Answering (SQuAD)

cd examples/pytorch/question-answering
bash run_qa.sh

Main script: examples/pytorch/question-answering/run_qa_sparse.py

3) Summarization (XSum/CNN-DM)

cd examples/pytorch/summarization
bash run_summarization.sh

Main script: examples/pytorch/summarization/run_summarization_sparse.py
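
If you prefer calling the Python script directly instead of the wrapper, here is a hypothetical XSum template. The dataset and model arguments follow the stock Transformers summarization example, the pruning flags are those from Core Setting, and all values are illustrative:

python run_summarization_sparse.py \
  --model_name_or_path facebook/bart-base \
  --dataset_name xsum \
  --pruner snip --sparsity 0.8 \
  --output_dir checkpoints/xsum-sparse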

🧪 Repro Tips

  • Default .sh scripts are configured for multi-GPU runs (device_ids="0 1 2 3 4 5 6 7"). Adjust to your hardware; a single-GPU sketch follows this list.
  • Outputs are written under checkpoints/.
  • Start from each task script first, then tune pruning and sparsity knobs for your setup.
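
For a single-GPU run, two hedged options (the device_ids variable name is taken from the tip above; check each script for how it is actually consumed):

# Option 1: edit the script, e.g. run_glue.sh, to use only the first GPU
device_ids="0"

# Option 2: constrain GPU visibility from the shell without editing the script
CUDA_VISIBLE_DEVICES=0 bash run_glue.sh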

📄 Citation

@inproceedings{he2022sparseadapter,
  title = {SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters},
  author = {He, Shwai and Ding, Liang and Dong, Daize and Zhang, Miao and Tao, Dacheng},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2022},
  year = {2022},
  url = {https://aclanthology.org/2022.findings-emnlp.160}
}