Jailbreaking the Non-Transferable Barrier via Test-Time Data Disguising (JailNTL)

March 20, 2026

This repository is the official implementation of our CVPR 2025 paper, "Jailbreaking the Non-Transferable Barrier via Test-Time Data Disguising".

Quickstart

1. Installation

Clone the repository and install dependencies (under Python 3.10.12):

git clone https://github.com/tmllab/2025_CVPR_JailNTL
cd 2025_CVPR_JailNTL
pip install -r requirements.txt

2. Preparing Data

We currently support CIFAR/STL and VisDA (VisDA-T, VisDA-V).

Download datasets:

mkdir ./data/
python src/data_download.py --data_dir ./data/

Pre-split the dataset into training/validation/testing sets:

python data_split.py

This command splits the datasets and writes the results to the ./data_presplit folder.
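As a rough illustration of what a pre-split step does, the sketch below shuffles sample indices with a fixed seed and partitions them into training, validation, and testing subsets. The function name `split_indices`, the 80/10/10 ratios, and the seed are assumptions made for illustration; data_split.py may use different ratios and logic.

```python
import random


def split_indices(num_samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle indices and partition them into train/val/test lists.

    The ratios and seed are hypothetical defaults, not values taken
    from the repository.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    indices = list(range(num_samples))
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_train = int(ratios[0] * num_samples)
    n_val = int(ratios[1] * num_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test
```

Fixing the seed means the same split is produced on every run, which is the main reason to pre-split once and reuse the result across experiments.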

✨ We will provide pre-split demo datasets on Google Drive. You can download them and save them to ./data_presplit/.

3. Training NTL

You can pre-train NTL models from scratch by running:

python src/ntl_pretrain.py -s <authorized-domain> -t <unauthorized-domain>

✨ We also provide model files on Google Drive that were pretrained on our demo pre-split datasets. You can save them to ./saved_models/.

💡 We use wandb to organize experiments and record results. Config files for training NTL are stored in ./config/pretrain/<domain-pair>.yml. The important arguments are described below:

task_name: tNTL or tCUTI, selecting which NTL method is used for pretraining on the source and target domains.
data_pre_split: set to True to use the pre-split data.
data_transform: set to ntl to follow the image transformation used in NTL.
NTL_network:
- VGG: vgg13, vgg19, vgg13bn, vgg19bn
- ResNet: resnet34cmi, wide_resnet50_2cmi
NTL_pretrain: whether to use ImageNet-1K pretrained weights for initialization.
NTL_train: whether to train the NTL model (set to True for pretraining or False to load a pretrained model).
NTL_epochs: number of pretraining epochs.
NTL_lr: pretraining learning rate.
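The arguments above can be sanity-checked before a long training run. The following is a minimal sketch, not part of the repository: the helper `validate_pretrain_config` and all values in `example_cfg` are hypothetical, and only the key names and the allowed task and network names come from the list above.

```python
ALLOWED_TASKS = {"tNTL", "tCUTI"}
ALLOWED_NETWORKS = {
    "vgg13", "vgg19", "vgg13bn", "vgg19bn",
    "resnet34cmi", "wide_resnet50_2cmi",
}


def validate_pretrain_config(cfg):
    """Return a list of human-readable problems found in a config dict."""
    problems = []
    if cfg.get("task_name") not in ALLOWED_TASKS:
        problems.append("task_name must be tNTL or tCUTI")
    if cfg.get("NTL_network") not in ALLOWED_NETWORKS:
        problems.append("NTL_network is not a listed architecture")
    for key in ("data_pre_split", "NTL_pretrain", "NTL_train"):
        if not isinstance(cfg.get(key), bool):
            problems.append(f"{key} must be True or False")
    epochs = cfg.get("NTL_epochs")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(epochs, bool) or not isinstance(epochs, int) or epochs <= 0:
        problems.append("NTL_epochs must be a positive integer")
    lr = cfg.get("NTL_lr")
    if isinstance(lr, bool) or not isinstance(lr, (int, float)) or lr <= 0:
        problems.append("NTL_lr must be a positive number")
    return problems


# Hypothetical values for illustration; see the shipped .yml files
# for the settings actually used in the paper.
example_cfg = {
    "task_name": "tNTL",
    "data_pre_split": True,
    "data_transform": "ntl",
    "NTL_network": "vgg13",
    "NTL_pretrain": True,
    "NTL_train": True,
    "NTL_epochs": 10,
    "NTL_lr": 0.001,
}
```

Calling `validate_pretrain_config(example_cfg)` returns an empty list when every check passes; each failed check adds one message, so a misspelled network name is caught before training starts.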

4. Attack NTL Models

Run src/jailntl.py to evaluate the performance of JailNTL against an NTL method. Select the authorized and unauthorized domains with the -s and -t arguments.

python src/jailntl.py -s <authorized-domain> -t <unauthorized-domain>
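When sweeping over several domain pairs, it can help to assemble the pretraining and attack commands in one place. The sketch below only builds argument lists from the two script paths and the -s/-t flags shown in this README; the helper name `build_commands` and the domain names `domain_a` and `domain_b` are placeholders, not identifiers used by the repository.

```python
import sys


def build_commands(source, target, python=sys.executable):
    """Build the pretraining and attack command lines for one domain pair."""
    pretrain = [python, "src/ntl_pretrain.py", "-s", source, "-t", target]
    attack = [python, "src/jailntl.py", "-s", source, "-t", target]
    return pretrain, attack


# Example usage (placeholder domain names). To actually launch the
# scripts from the repository root, pass each list to
# subprocess.run(cmd, check=True).
pretrain_cmd, attack_cmd = build_commands("domain_a", "domain_b")
```

Passing the arguments as a list rather than a single string avoids shell quoting issues if a domain name ever contains special characters.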

💡 Config files for JailNTL are stored in ./config/attack/<domain-pair>.yml. The important arguments are described below:

jailntl_shot_num: number of authorized samples used for training the disguising model.
class_balance_weight: weight of the class balance loss in the disguising model.
confidence_weight: weight of the confidence loss in the disguising model.
grad_epsilon: step size (epsilon) of the finite difference used when computing the model-guided loss.
GAN_structure: used in the ablation study to enable/disable the GAN with feedback (Eq. 3) and the bidirectional GAN (Eq. 7). Set to [True, True] by default to get the full JailNTL model.
n_epochs, n_epochs_decay: epoch settings for training the disguising model.
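To make the roles of confidence_weight, class_balance_weight, and grad_epsilon more concrete, here is a minimal pure-Python sketch of how a weighted model-guided loss and a finite-difference gradient could fit together. The entropy-based forms of the two loss terms, the central-difference scheme, the toy `black_box` model, and all numeric values are assumptions made for illustration; the definitions in the paper and in src/jailntl.py may differ.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def model_guided_loss(batch_probs, confidence_weight, class_balance_weight):
    """Weighted sum of a confidence term and a class-balance term.

    Assumed forms: confidence is the mean per-sample entropy (lower
    means more confident); class balance is the negative entropy of
    the mean prediction (lower means more evenly spread classes).
    """
    n = len(batch_probs)
    confidence = sum(entropy(p) for p in batch_probs) / n
    num_classes = len(batch_probs[0])
    mean_pred = [sum(p[c] for p in batch_probs) / n for c in range(num_classes)]
    balance = -entropy(mean_pred)
    return confidence_weight * confidence + class_balance_weight * balance


def finite_difference_grad(loss_fn, x, grad_epsilon):
    """Central finite-difference estimate of d loss / d x (x: list of floats)."""
    grad = []
    for i in range(len(x)):
        plus = list(x)
        plus[i] += grad_epsilon
        minus = list(x)
        minus[i] -= grad_epsilon
        grad.append((loss_fn(plus) - loss_fn(minus)) / (2.0 * grad_epsilon))
    return grad


def black_box(x):
    # Hypothetical stand-in for a model that only exposes its outputs.
    return softmax(x)


def loss_of_input(x):
    # Hypothetical weights; the real values live in the attack config.
    return model_guided_loss([black_box(x)], 1.0, 0.5)


grad = finite_difference_grad(loss_of_input, [0.2, -0.1, 0.4], 1e-3)
```

A finite difference only needs forward evaluations of the model, so it applies when gradients cannot be backpropagated through the model. A smaller grad_epsilon reduces truncation error but amplifies floating-point noise, so the value is a trade-off.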

The configuration files contain all parameter settings required to reproduce the reported results. These parameters can also be adjusted to explore alternative setups or run further experiments.

Citation

If you find this work useful in your research, please consider citing our paper:

@inproceedings{xiang2025jailbreaking,
  title={Jailbreaking the Non-Transferable Barrier via Test-Time Data Disguising},
  author={Xiang, Yongli and Hong, Ziming and Yao, Lina and Wang, Dadong and Liu, Tongliang},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={30671--30681},
  year={2025}
}

Acknowledgement

Parts of this project were inspired by the following projects. We thank their contributors for their excellent work: