✨SeaS✨


This is the official PyTorch implementation of "SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning" (SeaS).

Authors: Zhewei Dai¹* | Shilei Zeng¹* | Haotian Liu¹ | Xurui Li¹ | Feng Xue³ | Yu Zhou¹,²

Institutions: ¹Huazhong University of Science and Technology | ²Wuhan JingCe Electronic Group Co., LTD | ³University of Trento

🧐Arxiv

📣Updates:

08/04/2025

The generated normal images of SeaS are released.

07/23/2025

We have updated some of the code to better adapt to the VisA and MVTec 3D AD datasets. Please update your local files accordingly.

07/20/2025

We have updated the environment configuration file to address some known issues. Please update your local "requirements.txt" file accordingly.

07/16/2025

  1. The complete code of SeaS in the paper is released.
  2. The generated image-mask pairs of SeaS are released.
  3. The complete code of the downstream segmentation model LFD is released.

📖Catalogue

👇Abstract: [Back to Catalogue]

We introduce SeaS, a unified industrial generative model for automatically creating diverse anomalies, authentic normal products, and precise anomaly masks. While extensive research exists, most efforts either focus on specific tasks, i.e., anomalies or normal products only, or require separate models for each anomaly type. Consequently, prior methods either offer limited generative capability or depend on a vast array of anomaly-specific models. We demonstrate that U-Net's differentiated learning ability captures the distinct visual traits of slightly-varied normal products and diverse anomalies, enabling us to construct a unified model for all tasks. Specifically, we first introduce an Unbalanced Abnormal (UA) Text Prompt, comprising one normal token and multiple anomaly tokens. More importantly, our Decoupled Anomaly Alignment (DA) loss decouples anomaly attributes and binds them to distinct anomaly tokens of UA, enabling SeaS to create unseen anomalies by recombining these attributes. Furthermore, our Normal-image Alignment (NA) loss aligns the normal token to normal patterns, making generated normal products globally consistent and locally varied. Finally, SeaS produces accurate anomaly masks by fusing discriminative U-Net features with high-resolution VAE features. SeaS sets a new benchmark for industrial generation, significantly enhancing downstream applications, with average improvements of +8.66% pixel-level AP for synthesis-based AD approaches, +1.10% image-level AP for unsupervised AD methods, and +12.79% IoU for supervised segmentation models.

[Figure: SeaS pipeline]

🎯Setup: [Back to Catalogue]

Environment:

  • Python 3.9
  • CUDA 11.8
  • PyTorch 2.2.1

Clone the repository locally:

git clone https://github.com/HUST-SLOW/SeaS.git

Create virtual environment:

conda create --name seas python=3.9
conda activate seas

Install the required packages:

pip install torch==2.2.1 torchvision==0.17.1
pip install -r requirements.txt

Initialize an Accelerate environment with:

accelerate config

Or, to create a default Accelerate configuration without answering questions about your environment:

accelerate config default

Alternatively, use the pre-configured Accelerate settings in configs/accelerater_config.yaml.

👇Downloads: [Back to Catalogue]

Put all the datasets in ./data folder.

MVTec AD

data
|---mvtec_anomaly_detection
|-----|--- bottle
|-----|-----|--- ground_truth
|-----|-----|--- test
|-----|-----|---|--- broken_large
|-----|-----|---|--- broken_small
|-----|-----|---|--- contamination
|-----|-----|---|--- good
|-----|-----|--- train
|-----|-----|---|--- good
|-----|--- cable
|-----|--- ...
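
As a quick sanity check of the layout above, the following sketch lists every (product, anomaly type) pair found under the dataset folder. The helper name `list_anomaly_types` is hypothetical and not part of the SeaS code; it only assumes the `<product>/test/<anomaly type>` convention shown in the tree, with `good` holding normal images.

```python
from pathlib import Path


def list_anomaly_types(dataset_root):
    """Return sorted (product, anomaly_type) pairs found under dataset_root.

    Hypothetical helper, not part of the SeaS repository. It assumes the
    <product>/test/<anomaly_type> layout shown above and skips "good".
    """
    root = Path(dataset_root)
    if not root.is_dir():
        return []  # dataset not downloaded yet
    pairs = []
    for product_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        test_dir = product_dir / "test"
        if not test_dir.is_dir():
            continue  # not a product folder in the expected layout
        for anomaly_dir in sorted(p for p in test_dir.iterdir() if p.is_dir()):
            if anomaly_dir.name != "good":
                pairs.append((product_dir.name, anomaly_dir.name))
    return pairs


if __name__ == "__main__":
    for product, anomaly in list_anomaly_types("data/mvtec_anomaly_detection"):
        print(product, anomaly)
```

An empty result usually means the dataset was extracted one level too deep or too shallow relative to `./data`.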

VisA

For VisA, we split the dataset by defect category and divided the good images of the original dataset between train and test (train/good and test/good in the tree below). To download the split dataset, please refer to here.

data
|---visa
|-----|--- candle
|-----|-----|--- ground_truth
|-----|-----|--- test
|-----|-----|---|--- chunk_of_wax_missing
|-----|-----|---|--- combined
|-----|-----|---|--- good
|-----|-----|---|--- ...
|-----|-----|--- train
|-----|-----|---|--- good
|-----|--- capsules
|-----|--- ...

MVTec 3D AD

data
|---mvtec_3d_anomaly_detection
|-----|--- bagel
|-----|-----|--- test
|-----|-----|---|--- combined
|-----|-----|---|---|--- gt
|-----|-----|---|---|--- rgb
|-----|-----|---|--- contamination
|-----|-----|---|--- crack
|-----|-----|---|--- good
|-----|-----|---|--- hole
|-----|-----|--- train
|-----|-----|---|--- good
|-----|-----|---|---|--- rgb
|-----|--- cable_gland
|-----|--- ...

After downloading the original MVTec 3D AD dataset, please run this script:

python utils/change_mvtec3d_to_mvtec_type.py

This converts the original dataset into an MVTec AD-like structure:

data
|---mvtec_3d_anomaly_detection
|-----|--- bagel
|-----|-----|--- ground_truth
|-----|-----|--- test
|-----|-----|---|--- combined
|-----|-----|---|--- contamination
|-----|-----|---|--- crack
|-----|-----|---|--- good
|-----|-----|---|--- hole
|-----|-----|--- train
|-----|-----|---|--- good
|-----|--- cable_gland
|-----|--- ...

Stable-Diffusion-v1-4

If you want to download stable-diffusion-v1-4, you can use the following commands:

pip install huggingface_hub
huggingface-cli download CompVis/stable-diffusion-v1-4 --resume-download --local-dir model_hub/stable-diffusion-v1-4

You can also download the model manually and put it in the ./model_hub folder:

model_hub
|---stable-diffusion-v1-4
|-----|--- feature_extractor
|-----|--- safety_checker
|-----|--- scheduler
|-----|--- text_encoder
|-----|--- tokenizer
|-----|--- unet
|-----|--- vae
|-----|--- model_index.json
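
After a manual download it can help to confirm that nothing is missing. The sketch below compares a local folder against the entries in the tree above; `missing_model_entries` is a hypothetical helper written for this README, not part of SeaS or of `huggingface_hub`.

```python
from pathlib import Path

# Entries listed in the model_hub tree above.
EXPECTED_ENTRIES = [
    "feature_extractor",
    "safety_checker",
    "scheduler",
    "text_encoder",
    "tokenizer",
    "unet",
    "vae",
    "model_index.json",
]


def missing_model_entries(model_dir):
    """Return the expected entries that are absent from model_dir.

    Hypothetical helper, not part of the SeaS repository. It checks names
    only and does not validate the contents of each subfolder.
    """
    base = Path(model_dir)
    return [name for name in EXPECTED_ENTRIES if not (base / name).exists()]


if __name__ == "__main__":
    missing = missing_model_entries("model_hub/stable-diffusion-v1-4")
    print("missing entries:", missing if missing else "none")
```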

💎Run SeaS: [Back to Catalogue]

The SeaS implementation is divided into three stages: training the anomaly image generation model, training the mask generation model, and inference.

Training Anomaly Image Generation Model

bash scripts/train_generation.sh

The configuration in ./configs/seas.yaml takes precedence.

The key arguments of the script are as follows:

  • --output_dir: The directory to save checkpoints.
  • --instance_data_dir: The directory of defect images in datasets.
  • --mask_dir: The directory of the corresponding masks of defect images in datasets.
  • --normal_data_dir: The directory of normal images in datasets.
  • --gen_train_steps: Total number of training steps to perform, for the training of image generation model.
  • --checkpointing_steps: Save a checkpoint of the training state every X updates.

The checkpoints of the anomaly image generation model (unet, tokenizer, text_encoder) are saved in ./output_dir/checkpoints/*/generation-checkpoint.
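
To locate the saved generation checkpoints programmatically, a glob over the path pattern above is enough. This is a minimal sketch: `find_generation_checkpoints` is a hypothetical helper, and what the `*` level stands for is not assumed here.

```python
from pathlib import Path


def find_generation_checkpoints(output_dir):
    """List generation-checkpoint folders under output_dir/checkpoints/*/.

    Hypothetical helper; it only mirrors the path pattern quoted above.
    """
    root = Path(output_dir)
    if not root.is_dir():
        return []  # nothing has been trained into this folder yet
    pattern = "checkpoints/*/generation-checkpoint"
    return sorted(p for p in root.glob(pattern) if p.is_dir())


if __name__ == "__main__":
    for checkpoint in find_generation_checkpoints("./output_dir"):
        print(checkpoint)
```

Each printed folder is a candidate value for the arguments of the later stages that expect the generation checkpoints.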

Training Mask Prediction Model

bash scripts/train_mask.sh

The key arguments of the script are as follows:

  • --output_dir: The directory to save checkpoints.
  • --instance_data_dir: The directory of defect images in datasets.
  • --mask_dir: The directory of the corresponding masks of defect images in datasets.
  • --normal_data_dir: The directory of normal images in datasets.
  • --seas_trained_model_path: The directory of the SeaS-trained generation model, i.e., the directory of the checkpoints of the generation model.
  • --mask_train_steps: Total number of training steps to perform, for the training of the mask generation model.
  • --checkpointing_steps: Save a checkpoint of the training state every X updates.

The script train_mask.sh takes the path of the checkpoints of the anomaly image generation model as input. We then train the Refined Mask Prediction (RMP) model for the same product that the generation model was trained on. The checkpoints of the RMP model are saved in ./output_dir/checkpoints/*/mask-checkpoint/rmp.

Inference

bash scripts/infer.sh

The key arguments of the script are as follows:

  • --output_dir: The directory to save images.
  • --ref_data_dir: The directory of normal images in datasets.
  • --gen_model_path: The path of the weights of the trained generation model, i.e., the directory of the checkpoints of the generation model.
  • --rmp_model_path: The path of the weights of the trained Refined Mask Prediction (RMP) model, i.e., the directory of the checkpoints of the RMP model.
  • --prompt: The unbalanced abnormal text prompt.
  • --total_infer_num: The number of generated images.

The script infer.sh takes the path of checkpoints of the anomaly image generation model (--gen_model_path) and the Refined Mask Prediction (RMP) model (--rmp_model_path) as input.
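
The abstract describes the unbalanced abnormal (UA) text prompt as one normal token plus multiple anomaly tokens. The sketch below only illustrates that shape. The template string, the token names `<ob>` and `<df1>` to `<df4>`, and the helper `build_ua_prompt` are all assumptions made for this example; check scripts/infer.sh for the exact string that --prompt expects.

```python
def build_ua_prompt(normal_token, anomaly_tokens):
    """Join one normal token and several anomaly tokens into one prompt.

    The template "a {normal} with {anomalies}" is an assumption made for
    illustration; the string expected by --prompt may differ.
    """
    if not anomaly_tokens:
        raise ValueError("expected at least one anomaly token")
    return f"a {normal_token} with {', '.join(anomaly_tokens)}"


# Placeholder token names, chosen for this example only.
prompt = build_ua_prompt("<ob>", ["<df1>", "<df2>", "<df3>", "<df4>"])
print(prompt)  # a <ob> with <df1>, <df2>, <df3>, <df4>
```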

In scripts/infer.sh we provide the broken_large anomaly of the product bottle as an example. To generate the infer.sh for all anomalies of all products, please run this script:

python utils/generate_infer_sh.py

Please change the value of the guidance scale in 'configs/seas.yaml' for the VisA and MVTec 3D AD datasets, respectively.

IC-LPIPS and IS metrics

Testing IC-LPIPS metric

python utils/metrics/metrics_lpips.py

The key arguments of the script are as follows:

  • --output_dir: The directory of the generated images of one anomaly type.
  • --instance_dir: The directory of the real anomaly images of the same anomaly type.

Testing IS metric

We use torch-fidelity to calculate the IS metric. Please refer to torch-fidelity for installation. Then run the following command to calculate the Inception Score of a directory of images stored in outputs/images/*, such as outputs/images/bottle/broken_large/images.

fidelity --gpu 0 --isc --input1 outputs/images/*

For comparison with other methods, we resize all images to 256x256 before calculating the IS.
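
To score every anomaly type in one pass, the command above can be generated once per image folder. This is a sketch under one assumption: that all outputs follow the `<product>/<anomaly type>/images` layout of the example path. `build_isc_commands` is a hypothetical helper; it reuses only the `fidelity` flags shown above and does not resize the images.

```python
from pathlib import Path


def build_isc_commands(images_root, gpu=0):
    """Build one `fidelity --isc` command per <product>/<anomaly>/images folder.

    The folder layout is inferred from the example path above
    (outputs/images/bottle/broken_large/images) and is an assumption.
    """
    root = Path(images_root)
    if not root.is_dir():
        return []  # nothing generated yet
    commands = []
    for image_dir in sorted(root.glob("*/*/images")):
        if image_dir.is_dir():
            commands.append(
                ["fidelity", "--gpu", str(gpu), "--isc", "--input1", str(image_dir)]
            )
    return commands


if __name__ == "__main__":
    for cmd in build_isc_commands("outputs/images"):
        print(" ".join(cmd))  # pass cmd to subprocess.run to execute it
```

Resize the images to 256x256 before running the printed commands if the scores are meant to be compared with other methods.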

🧐Downstream Segmentation Tasks: [Back to Catalogue]

In the downstream segmentation tasks, we use BiSeNet V2, UperNet and LFD. For BiSeNet V2 and UperNet, we use the official implementation in MMSegmentation. For LFD, we use the official implementation, which can be found in LFD. Considering both model size and performance, we recommend using LFD.

🧐Downstream Anomaly Detection Tasks: [Back to Catalogue]

In the downstream unsupervised Anomaly Detection tasks, we combine SeaS-generated normal images with HVQ-Trans, PatchCore and MambaAD.

🎖️Results of anomaly image generation: [Back to Catalogue]

We run SeaS on an A100 GPU with 22G memory. All results are obtained with the default settings in our paper. To download the generated image-mask pairs, please refer to here.

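| Methods | MVTec AD IS ↑ | MVTec AD IC-LPIPS ↑ | MVTec AD KID ↓ | MVTec AD IC-LPIPS(a) ↑ | VisA IS ↑ | VisA IC-LPIPS ↑ | VisA KID ↓ | VisA IC-LPIPS(a) ↑ | MVTec 3D AD IS ↑ | MVTec 3D AD IC-LPIPS ↑ | MVTec 3D AD KID ↓ | MVTec 3D AD IC-LPIPS(a) ↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Crop&Paste | 1.51 | 0.14 | - | - | - | - | - | - | - | - | - | - |
| SDGAN | 1.71 | 0.13 | - | - | - | - | - | - | - | - | - | - |
| Defect-GAN | 1.69 | 0.15 | | | | | | | | | | |
| DFMGAN | 1.72 | 0.20 | 0.12 | 0.14 | 1.25 | 0.25 | 0.24 | 0.05 | 1.80 | 0.29 | 0.19 | 0.08 |
| AnomalyDiffusion | 1.80 | 0.32 | - | 0.12 | 1.26 | 0.25 | - | 0.04 | 1.61 | 0.22 | - | 0.07 |
| Ours | 1.88 | 0.34 | 0.04 | 0.18 | 1.27 | 0.26 | 0.02 | 0.06 | 1.95 | 0.30 | 0.06 | 0.09 |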

The generation results of anomaly images and normal images are shown in the following figures: [Figure: generation_defect] [Figure: generation_good]

🎖️Results of combining generated anomalies with synthesis-based anomaly detection methods: [Back to Catalogue]

MVTec AD

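| Methods | Image-level AUROC | Image-level AP | Image-level F1-max | Pixel-level AUROC | Pixel-level AP | Pixel-level F1-max | Pixel-level IoU |
|---|---|---|---|---|---|---|---|
| DRAEM | 98.00 | 98.45 | 96.34 | 97.90 | 67.89 | 66.04 | 60.30 |
| DRAEM+SeaS | 98.64 | 99.40 | 97.89 | 98.11 | 76.55 | 72.70 | 58.87 |
| GLASS | 99.92 | 99.98 | 99.60 | 99.27 | 74.09 | 70.42 | 57.14 |
| GLASS+SeaS | 99.97 | 99.99 | 99.81 | 99.29 | 76.82 | 72.38 | 57.45 |
| Average | 98.96 | 99.22 | 97.97 | 98.59 | 70.99 | 68.23 | 58.72 |
| Average+SeaS | 99.31 | 99.70 | 98.85 | 98.70 | 76.69 | 72.54 | 58.16 |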

VisA

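| Methods | Image-level AUROC | Image-level AP | Image-level F1-max | Pixel-level AUROC | Pixel-level AP | Pixel-level F1-max | Pixel-level IoU |
|---|---|---|---|---|---|---|---|
| DRAEM | 86.28 | 85.30 | 81.66 | 92.92 | 17.15 | 22.95 | 13.57 |
| DRAEM+SeaS | 88.12 | 87.04 | 83.04 | 98.45 | 49.05 | 48.62 | 35.00 |
| GLASS | 97.68 | 96.89 | 93.03 | 98.47 | 45.58 | 48.39 | 39.92 |
| GLASS+SeaS | 97.88 | 97.39 | 93.21 | 98.43 | 48.06 | 49.32 | 40.00 |
| Average | 91.98 | 91.10 | 87.35 | 95.70 | 31.37 | 35.67 | 26.75 |
| Average+SeaS | 93.00 | 92.22 | 88.13 | 98.44 | 48.56 | 48.97 | 37.50 |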

MVTec 3D AD

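| Methods | Image-level AUROC | Image-level AP | Image-level F1-max | Pixel-level AUROC | Pixel-level AP | Pixel-level F1-max | Pixel-level IoU |
|---|---|---|---|---|---|---|---|
| DRAEM | 79.16 | 90.90 | 89.78 | 86.73 | 14.02 | 17.00 | 12.42 |
| DRAEM+SeaS | 85.45 | 93.58 | 90.85 | 95.43 | 20.09 | 26.10 | 17.07 |
| GLASS | 92.34 | 96.85 | 93.37 | 98.46 | 48.46 | 49.13 | 45.03 |
| GLASS+SeaS | 92.95 | 97.38 | 93.21 | 98.73 | 48.55 | 49.28 | 46.02 |
| Average | 85.75 | 93.88 | 91.58 | 92.60 | 31.24 | 33.07 | 28.73 |
| Average+SeaS | 89.20 | 95.48 | 92.03 | 97.08 | 34.32 | 37.69 | 31.55 |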

🎖️Results of combining generated normal images with unsupervised anomaly detection methods: [Back to Catalogue]

MVTec AD

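| Methods | Image-level AUROC | Image-level AP | Image-level F1-max | Pixel-level AUROC | Pixel-level AP | Pixel-level F1-max | Pixel-level IoU |
|---|---|---|---|---|---|---|---|
| HVQ-Trans | 96.38 | 98.09 | 95.30 | 97.60 | 47.95 | 53.32 | 45.03 |
| HVQ-Trans + SeaS | 97.25 | 98.48 | 95.78 | 97.58 | 48.53 | 53.84 | 44.61 |
| PatchCore | 98.63 | 99.47 | 98.18 | 98.37 | 56.13 | 58.83 | 49.45 |
| PatchCore + SeaS | 98.64 | 99.48 | 98.22 | 98.37 | 63.98 | 64.07 | 55.43 |
| MambaAD | 98.54 | 99.52 | 97.77 | 97.67 | 56.23 | 59.34 | 51.31 |
| MambaAD + SeaS | 98.80 | 99.64 | 98.40 | 97.66 | 56.86 | 59.70 | 51.51 |
| Average | 97.85 | 99.03 | 97.08 | 97.88 | 53.44 | 57.16 | 48.60 |
| Average (+ SeaS) | 98.23 | 99.20 | 97.47 | 97.87 | 56.46 | 59.20 | 50.52 |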

VisA

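| Methods | Image-level AUROC | Image-level AP | Image-level F1-max | Pixel-level AUROC | Pixel-level AP | Pixel-level F1-max | Pixel-level IoU |
|---|---|---|---|---|---|---|---|
| HVQ-Trans | 90.11 | 88.18 | 84.08 | 98.10 | 28.67 | 35.05 | 24.03 |
| HVQ-Trans + SeaS | 92.12 | 90.35 | 86.23 | 98.15 | 29.52 | 36.00 | 23.60 |
| PatchCore | 94.84 | 95.98 | 91.69 | 98.38 | 48.58 | 49.69 | 42.44 |
| PatchCore + SeaS | 94.97 | 96.06 | 91.81 | 98.41 | 48.60 | 49.72 | 42.46 |
| MambaAD | 94.19 | 94.44 | 89.55 | 98.49 | 39.27 | 44.18 | 37.68 |
| MambaAD + SeaS | 94.23 | 94.65 | 89.93 | 98.70 | 39.33 | 43.99 | 36.62 |
| Average | 93.05 | 92.87 | 88.44 | 98.32 | 38.84 | 42.97 | 34.72 |
| Average (+ SeaS) | 93.77 | 93.69 | 89.32 | 98.42 | 39.15 | 43.24 | 34.23 |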

MVTec 3D AD

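| Methods | Image-level AUROC | Image-level AP | Image-level F1-max | Pixel-level AUROC | Pixel-level AP | Pixel-level F1-max | Pixel-level IoU |
|---|---|---|---|---|---|---|---|
| HVQ-Trans | 68.15 | 84.38 | 85.20 | 96.40 | 17.23 | 24.59 | 20.51 |
| HVQ-Trans + SeaS | 71.26 | 90.35 | 89.23 | 96.56 | 19.34 | 26.40 | 20.47 |
| PatchCore | 83.44 | 94.89 | 92.24 | 98.55 | 34.52 | 39.09 | 39.29 |
| PatchCore + SeaS | 83.88 | 94.97 | 92.32 | 98.56 | 34.65 | 39.41 | 39.43 |
| MambaAD | 85.92 | 95.69 | 92.51 | 98.57 | 37.30 | 41.08 | 39.44 |
| MambaAD + SeaS | 88.67 | 96.60 | 93.41 | 98.74 | 35.46 | 39.59 | 39.51 |
| Average | 79.17 | 91.65 | 89.98 | 97.84 | 29.68 | 34.92 | 33.08 |
| Average (+ SeaS) | 81.27 | 93.97 | 91.65 | 97.95 | 29.82 | 35.13 | 33.14 |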
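
The two anomaly-detection averages quoted in the abstract (+8.66% pixel-level AP and +1.10% image-level AP) can be reproduced from the "Average" rows of the tables above. The sketch below assumes an unweighted mean of the per-dataset gains over the three datasets; that rule is inferred from the numbers rather than stated in this README.

```python
# "Average" rows copied from the tables above, one
# (baseline, with SeaS) pair per dataset.
pixel_ap_synthesis = {  # DRAEM and GLASS, pixel-level AP
    "MVTec AD": (70.99, 76.69),
    "VisA": (31.37, 48.56),
    "MVTec 3D AD": (31.24, 34.32),
}
image_ap_unsupervised = {  # HVQ-Trans, PatchCore and MambaAD, image-level AP
    "MVTec AD": (99.03, 99.20),
    "VisA": (92.87, 93.69),
    "MVTec 3D AD": (91.65, 93.97),
}


def mean_gain(rows):
    """Average the (with SeaS - baseline) difference over the datasets."""
    gains = [with_seas - baseline for baseline, with_seas in rows.values()]
    return sum(gains) / len(gains)


print(f"{mean_gain(pixel_ap_synthesis):.2f}")     # 8.66
print(f"{mean_gain(image_ap_unsupervised):.2f}")  # 1.10
```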

🎖️Results of training supervised segmentation models for anomaly detection and segmentation: [Back to Catalogue]

MVTec AD

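| Segmentation Models | Generative Models | Image-level AUROC | Image-level AP | Image-level F1-max | Pixel-level AUROC | Pixel-level AP | Pixel-level F1-max | Pixel-level IoU |
|---|---|---|---|---|---|---|---|---|
| BiSeNet V2 | DFMGAN | 90.90 | 94.43 | 90.33 | 94.57 | 60.42 | 60.54 | 45.83 |
| BiSeNet V2 | AnomalyDiffusion | 90.08 | 94.84 | 91.84 | 96.27 | 64.50 | 62.27 | 42.89 |
| BiSeNet V2 | SeaS | 96.00 | 98.14 | 95.43 | 97.21 | 69.21 | 66.37 | 55.28 |
| UperNet | DFMGAN | 90.74 | 94.43 | 90.37 | 92.33 | 57.01 | 56.91 | 46.64 |
| UperNet | AnomalyDiffusion | 96.62 | 98.61 | 96.21 | 96.87 | 69.92 | 66.95 | 50.80 |
| UperNet | SeaS | 98.29 | 99.20 | 97.34 | 97.87 | 74.42 | 70.70 | 61.24 |
| LFD | DFMGAN | 91.08 | 95.40 | 90.58 | 94.91 | 67.06 | 65.09 | 45.49 |
| LFD | AnomalyDiffusion | 95.15 | 97.78 | 94.66 | 96.30 | 69.77 | 66.99 | 45.77 |
| LFD | SeaS | 95.88 | 97.89 | 95.15 | 98.09 | 77.15 | 72.52 | 56.47 |
| Average | DFMGAN | 90.91 | 94.75 | 90.43 | 93.94 | 61.50 | 60.85 | 45.99 |
| Average | AnomalyDiffusion | 93.95 | 97.08 | 94.24 | 96.48 | 68.06 | 65.40 | 46.49 |
| Average | SeaS | 96.72 | 98.41 | 95.97 | 97.72 | 73.59 | 69.86 | 57.66 |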

VisA

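| Segmentation Models | Generative Models | Image-level AUROC | Image-level AP | Image-level F1-max | Pixel-level AUROC | Pixel-level AP | Pixel-level F1-max | Pixel-level IoU |
|---|---|---|---|---|---|---|---|---|
| BiSeNet V2 | DFMGAN | 63.07 | 62.63 | 66.48 | 75.91 | 9.17 | 15.00 | 9.66 |
| BiSeNet V2 | AnomalyDiffusion | 76.11 | 77.74 | 73.13 | 89.29 | 34.16 | 37.93 | 15.93 |
| BiSeNet V2 | SeaS | 85.61 | 86.64 | 80.49 | 96.03 | 42.80 | 45.41 | 25.93 |
| UperNet | DFMGAN | 71.69 | 71.64 | 70.70 | 75.09 | 12.42 | 18.52 | 15.47 |
| UperNet | AnomalyDiffusion | 83.18 | 84.08 | 78.88 | 95.00 | 39.92 | 45.37 | 20.53 |
| UperNet | SeaS | 90.34 | 90.73 | 84.33 | 97.01 | 55.46 | 55.99 | 35.91 |
| LFD | DFMGAN | 65.38 | 62.25 | 66.59 | 81.21 | 15.14 | 18.70 | 6.44 |
| LFD | AnomalyDiffusion | 81.97 | 82.36 | 77.35 | 88.00 | 30.86 | 38.56 | 16.61 |
| LFD | SeaS | 83.07 | 82.88 | 77.24 | 92.91 | 43.87 | 46.46 | 26.37 |
| Average | DFMGAN | 66.71 | 65.51 | 67.92 | 77.40 | 12.24 | 17.41 | 10.52 |
| Average | AnomalyDiffusion | 80.42 | 81.39 | 76.45 | 90.76 | 34.98 | 40.62 | 17.69 |
| Average | SeaS | 86.34 | 86.75 | 80.69 | 95.32 | 47.38 | 49.29 | 29.40 |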

MVTec 3D AD

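| Segmentation Models | Generative Models | Image-level AUROC | Image-level AP | Image-level F1-max | Pixel-level AUROC | Pixel-level AP | Pixel-level F1-max | Pixel-level IoU |
|---|---|---|---|---|---|---|---|---|
| BiSeNet V2 | DFMGAN | 61.88 | 81.80 | 84.44 | 75.89 | 15.02 | 21.73 | 15.68 |
| BiSeNet V2 | AnomalyDiffusion | 61.49 | 81.35 | 85.36 | 92.39 | 15.15 | 20.09 | 14.70 |
| BiSeNet V2 | SeaS | 73.60 | 87.75 | 85.82 | 90.41 | 26.04 | 32.61 | 28.55 |
| UperNet | DFMGAN | 67.56 | 84.53 | 84.99 | 75.12 | 19.54 | 26.04 | 18.78 |
| UperNet | AnomalyDiffusion | 76.56 | 90.42 | 87.35 | 88.48 | 28.95 | 35.81 | 25.04 |
| UperNet | SeaS | 82.57 | 92.59 | 88.72 | 91.93 | 38.51 | 43.53 | 38.56 |
| LFD | DFMGAN | 62.23 | 82.17 | 85.38 | 72.15 | 9.54 | 14.29 | 14.81 |
| LFD | AnomalyDiffusion | 77.06 | 89.44 | 87.20 | 92.68 | 24.29 | 32.74 | 19.90 |
| LFD | SeaS | 78.96 | 91.22 | 87.28 | 91.61 | 40.25 | 43.47 | 39.00 |
| Average | DFMGAN | 63.89 | 83.83 | 84.94 | 74.39 | 14.70 | 20.69 | 16.42 |
| Average | AnomalyDiffusion | 71.70 | 87.07 | 86.64 | 91.18 | 22.80 | 29.55 | 19.88 |
| Average | SeaS | 78.38 | 90.52 | 87.27 | 91.32 | 34.93 | 39.87 | 35.37 |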
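
The +12.79% IoU figure in the abstract is consistent with the three tables above when SeaS is compared against AnomalyDiffusion. The sketch below recomputes each dataset's "Average" IoU from the per-model rows and then averages the gains over the datasets. Both the choice of AnomalyDiffusion as the baseline and the unweighted averaging are inferred from the numbers, not stated in this README.

```python
# Pixel-level IoU copied from the tables above.
# Each entry: segmentation model -> (AnomalyDiffusion, SeaS).
iou = {
    "MVTec AD": {
        "BiSeNet V2": (42.89, 55.28),
        "UperNet": (50.80, 61.24),
        "LFD": (45.77, 56.47),
    },
    "VisA": {
        "BiSeNet V2": (15.93, 25.93),
        "UperNet": (20.53, 35.91),
        "LFD": (16.61, 26.37),
    },
    "MVTec 3D AD": {
        "BiSeNet V2": (14.70, 28.55),
        "UperNet": (25.04, 38.56),
        "LFD": (19.90, 39.00),
    },
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


gains = []
for dataset, models in iou.items():
    baseline_avg = mean(baseline for baseline, _ in models.values())
    seas_avg = mean(seas for _, seas in models.values())
    gains.append(seas_avg - baseline_avg)
    # These two values should match the "Average" rows of each table.
    print(f"{dataset}: {baseline_avg:.2f} -> {seas_avg:.2f}")

average_iou_gain = mean(gains)
print(f"{average_iou_gain:.2f}")  # 12.79
```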

Citation: [Back to Catalogue]

@inproceedings{dai2025SeaS,
  title={SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning},
  author={Zhewei Dai and Shilei Zeng and Haotian Liu and Xurui Li and Feng Xue and Yu Zhou},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2025}
}

Acknowledgement: [Back to Catalogue]

Our repo is built on Diffusers. Thanks to its authors for their clear and elegant code!

License: [Back to Catalogue]

SeaS is released under the MIT License. It is fully open for academic research and also allows free commercial usage. To apply for a commercial license, please contact yuzhou@hust.edu.cn.