Mask-GCG: Are All Tokens in Adversarial Suffixes Necessary for Jailbreak Attacks?

January 28, 2026 Β· View on GitHub

Ethical Considerations β€’ Quick Start β€’ Results β€’ Citation

arXiv

πŸ“’ Update (January 2026): Our paper has been accepted to ICASSP 2026! πŸŽ‰

Overview

This is the official implementation of the paper:

Mask-GCG: Are All Tokens in Adversarial Suffixes Necessary for Jailbreak Attacks?

Junjie Mu*, Zonghao Ying*, Zhekui Fan, Zonglei Jing, Yaoyuan Zhang, Zhengmin Yu, Wenxin Zhang, Quanchen Zou†, Xiangzheng Zhang†

Work done at 360 AI Security Lab

TL;DR: We propose Mask-GCG, a plug-and-play method that employs learnable token masking to identify impactful tokens within adversarial suffixes. Our approach reveals that most tokens contribute significantly to attack success while a minority exhibit redundancy. By pruning low-impact tokens, we achieve up to 7.5% suffix compression, 16.8% time reduction, and 24% perplexity reduction without compromising attack success rate.


Ethical Considerations

Due to the sensitive nature of adversarial attack research, we do not distribute harmful content datasets directly. Please download the required data from the original sources:

| Dataset | Source | Description |
|---|---|---|
| AdvBench | HuggingFace | Harmful behavior dataset |
| AmpleGCG Datasets | HuggingFace | Adversarial suffixes generated by AmpleGCG |
| I-GCG Datasets | GitHub | I-GCG initialization suffix |

Quick Start

1. Configuration

Edit config.py to set your model path:

MODEL_PATH = "/path/to/your/model"  # e.g., "/models/Llama-2-7b-chat-hf"
TEMPLATE_NAME = 'llama-2'  # Options: 'llama-2', 'vicuna_v1.1'

2. Single Attack

python run_attack.py --mode single

3. Batch Attack

python run_attack.py --mode batch --data_path /path/to/your/data --num_samples 50

4. Key Parameters

| Parameter | Description | Default |
|---|---|---|
| `LAMBDA_REG` | Sparsity regularization strength | 0.3 |
| `INITIAL_LR` | Mask optimizer learning rate | 0.05 |
| `PRUNING_THRESHOLD` | Mask probability threshold for pruning | 0.3 |
| `ATTENTION_GUIDANCE_ENABLED` | Enable attention-guided initialization | `True` |
| `SMART_PRUNING_ENABLED` | Enable smart pruning strategy | `True` |
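To illustrate how two of these parameters might interact, here is a minimal, self-contained sketch (pure Python, not the repository's actual implementation): each suffix token gets a learnable mask logit, and an L1-style penalty weighted by `LAMBDA_REG` pushes mask probabilities toward zero, marking low-impact tokens for pruning. The function names and internals here are illustrative assumptions.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def sparsity_penalty(mask_logits: list[float], lambda_reg: float = 0.3) -> float:
    """L1-style penalty on mask probabilities (weighted by LAMBDA_REG).

    A larger lambda_reg pushes more mask probabilities toward zero,
    so more suffix tokens become candidates for pruning.
    """
    probs = [sigmoid(z) for z in mask_logits]
    return lambda_reg * sum(probs) / len(probs)

# At logits of 0, every token starts with keep-probability 0.5
print(sparsity_penalty([0.0, 0.0, 0.0, 0.0]))  # 0.3 * 0.5 = 0.15
```

In the real attack this penalty would be added to the adversarial loss, so the optimizer trades attack strength against suffix sparsity.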

5. Adjusting Pruning Aggressiveness

# More aggressive pruning
LAMBDA_REG = 0.5
PRUNING_THRESHOLD = 0.4

# More conservative pruning
LAMBDA_REG = 0.2
PRUNING_THRESHOLD = 0.2
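The effect of `PRUNING_THRESHOLD` can be sketched as follows (a hypothetical example with made-up tokens and probabilities, not the repository's code): tokens whose learned mask probability falls at or below the threshold are dropped, so raising the threshold prunes more aggressively.

```python
def prune_suffix(tokens: list[str], mask_probs: list[float],
                 threshold: float = 0.3) -> list[str]:
    """Keep only tokens whose mask probability exceeds the threshold."""
    return [t for t, p in zip(tokens, mask_probs) if p > threshold]

suffix = ["!", "describing", "+", "similarly", "Now"]
probs  = [0.05, 0.92, 0.25, 0.88, 0.35]

# Conservative threshold keeps more tokens than an aggressive one
print(prune_suffix(suffix, probs, threshold=0.2))  # ['describing', '+', 'similarly', 'Now']
print(prune_suffix(suffix, probs, threshold=0.4))  # ['describing', 'similarly']
```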

Results

Suffix Compression Ratio (SCR)

| Model | GCG+Mask (L=20) | GCG+Mask (L=30) | AmpleGCG+Mask (L=20) | AmpleGCG+Mask (L=30) |
|---|---|---|---|---|
| Llama-2-7b | 5.8% | 9.9% | 2.0% | 1.7% |
| Vicuna-7b | 1.4% | 2.1% | 6.5% | 4.1% |
| Llama-2-13b | 5.2% | 10.5% | 5.1% | 4.7% |
| **Average** | 4.1% | 7.5% | 4.5% | 3.5% |
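Assuming SCR is defined as the fraction of suffix tokens removed by pruning (our reading of the metric, not a quote from the paper), the table values can be reproduced like this:

```python
def suffix_compression_ratio(original_len: int, pruned_len: int) -> float:
    """SCR = fraction of suffix tokens removed by pruning."""
    return (original_len - pruned_len) / original_len

# Pruning 3 tokens from a length-30 suffix gives 10%,
# close to the 9.9% reported for Llama-2-7b at L=30
print(round(suffix_compression_ratio(30, 27), 3))  # 0.1
```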

Attack Success Rate (ASR) (Suffix Length=30)

| Model | GCG | GCG+Mask | AmpleGCG | AmpleGCG+Mask |
|---|---|---|---|---|
| Llama-2-7b | 64% | 62% | 98% | 98% |
| Vicuna-7b | 100% | 96% | 100% | 100% |
| Llama-2-13b | 80% | 76% | 100% | 98% |
| **Average** | 81% | 78% | 99% | 99% |

Time Reduction (seconds)

| Model | GCG | Mask-GCG | Reduction |
|---|---|---|---|
| Llama-2-7b | 1285.6 | 819.3 | -36.3% |
| Vicuna-7b | 117.0 | 116.1 | -0.8% |
| Llama-2-13b | 1960.2 | 1856.2 | -5.3% |
| **Average** | 1120.9 | 930.5 | -17.0% |
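The reduction column follows directly from the two timing columns, assuming it is computed as the relative change (Mask-GCG − GCG) / GCG:

```python
def time_reduction_pct(gcg_seconds: float, mask_gcg_seconds: float) -> float:
    """Relative runtime change of Mask-GCG versus GCG, in percent."""
    return (mask_gcg_seconds - gcg_seconds) / gcg_seconds * 100

print(round(time_reduction_pct(1285.6, 819.3), 1))   # -36.3 (Llama-2-7b)
print(round(time_reduction_pct(1120.9, 930.5), 1))   # -17.0 (average)
```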

For detailed experimental results, please refer to our paper.

Project Structure

Mask-GCG-main/
β”œβ”€β”€ config.py              # Configuration file
β”œβ”€β”€ run_attack.py          # Main entry point
β”œβ”€β”€ mask_gcg_utils.py      # Core Mask-GCG utilities
β”œβ”€β”€ requirements.txt       # Dependencies
β”œβ”€β”€ data/
└── llm_attacks/           # Base attack library
    β”œβ”€β”€ base/
    β”‚   └── attack_manager.py
    └── minimal_gcg/
        └── opt_utils.py

Citation

If you find this work useful, please cite our paper:

@inproceedings{mu2026maskgcg,
  title={Mask-GCG: Are All Tokens in Adversarial Suffixes Necessary for Jailbreak Attacks?},
  author={Mu, Junjie and Ying, Zonghao and Fan, Zhekui and Jing, Zonglei and Zhang, Yaoyuan and Yu, Zhengmin and Zhang, Wenxin and Zou, Quanchen and Zhang, Xiangzheng},
  booktitle={IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  organization={IEEE},
  year={2026}
}