Data Pre-Processing Document
June 12, 2024
This document mainly illustrates how to use several off-the-shelf toolkits to automatically perform data pre-processing.
- 1. Overview
- 2. Prepare Datasets
- 3. Generate Canny Edge
- 4. Generate HED Edge
- 5. Generate Color Stroke
- 6. Generate Image Palette
- 7. Generate Saliency Mask
Overview
We use the following toolkits to perform data pre-processing of LaCon:
| Condition | Script | Model Weights |
|---|---|---|
| Canny Edge | generate-canny.py | - |
| HED Edge | bdcn-edge-detection/generate-bdcn-edge.py | data-preprocessing/bdcn.pth |
| Color Stroke | generate-stroke.py | - |
| Image Palette | generate-palette.py | - |
| Saliency Mask | u2net-saliency-detection/generate-saliency-mask.py | data-preprocessing/u2net.pth |
Before running the toolkits above, download the model weights (bdcn.pth and u2net.pth) from our Hugging Face repo and place them in bdcn-edge-detection/checkpoints and u2net-saliency-detection/checkpoints, respectively. Both directories are under data-preprocessing/.
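A quick check that the weights are in place before launching the HED and saliency steps can save a failed run. The following is a minimal sketch, not part of the toolkit: `missing_checkpoints` is a hypothetical helper, the file names are taken from the table above, and it assumes the weights keep those names inside the checkpoint folders under data-preprocessing/.

```python
from pathlib import Path

# Assumption: the weights keep the file names listed in the table above and
# the checkpoint folders live under data-preprocessing/.
EXPECTED_WEIGHTS = {
    "bdcn-edge-detection/checkpoints": "bdcn.pth",
    "u2net-saliency-detection/checkpoints": "u2net.pth",
}


def missing_checkpoints(toolkit_root="data-preprocessing"):
    """Return the expected weight files that are not present on disk."""
    root = Path(toolkit_root)
    missing = []
    for folder, filename in EXPECTED_WEIGHTS.items():
        path = root / folder / filename
        if not path.is_file():
            missing.append(path)
    return missing
```

Calling `missing_checkpoints()` from the repository root returns an empty list when both files are found, and the paths that still need to be downloaded otherwise.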
Prepare Datasets
Before you start to prepare the conditions, you need to download the source images of different datasets. You can refer to the following links:
Generate Canny Edge
Execute the following command line to extract Canny edge maps from images in a folder:
python data-preprocessing/generate-canny.py --indir IMAGE_PATH --outdir CANNY_PATH --threshold1 CANNY_THRESHOLD_ONE --threshold2 CANNY_THRESHOLD_TWO
The extracted Canny edge maps will be saved in CANNY_PATH.
You can refer to this example command line:
python data-preprocessing/generate-canny.py --indir data/coco2017val/images --outdir data/coco2017val/canny-edges --threshold1 200 --threshold2 225
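The flag names --threshold1 and --threshold2 match the parameters of OpenCV's `cv2.Canny`, where the smaller value is used for edge linking and the larger one for finding strong edge segments. Whether generate-canny.py wraps `cv2.Canny` is an assumption; the source does not show the script. Under that assumption, the one-dimensional sketch below illustrates how the two thresholds interact. `hysteresis_1d` is a hypothetical helper written for this illustration only.

```python
def hysteresis_1d(magnitudes, threshold1, threshold2):
    """Illustrate double-threshold (hysteresis) selection on a 1-D signal.

    A run of consecutive values above the lower threshold is kept only if
    at least one value in the run also exceeds the higher threshold.
    This is a simplified illustration, not the toolkit's code.
    """
    low, high = sorted((threshold1, threshold2))
    keep = [False] * len(magnitudes)
    start = None
    # The trailing sentinel closes a run that reaches the end of the signal.
    for i, value in enumerate(list(magnitudes) + [float("-inf")]):
        if value > low:
            if start is None:
                start = i
        elif start is not None:
            run = range(start, i)
            if any(magnitudes[j] > high for j in run):
                for j in run:
                    keep[j] = True
            start = None
    return keep
```

With thresholds 200 and 225, a weak response such as 210 survives only when it touches a response above 225, so raising the lower threshold removes faint edges while raising the higher one removes whole weak contours.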
Generate HED Edge
- Download the model weights of the BDCN edge extractor from this link, and place the weights in data-preprocessing/bdcn-edge-detection/checkpoints.
- Execute the following command line to extract HED edge maps from images in a folder:
python data-preprocessing/bdcn-edge-detection/generate-bdcn-edge.py --indir IMAGE_PATH --outdir HED_EDGE_PATH
The extracted HED edge maps will be saved in HED_EDGE_PATH.
You can refer to this example command line:
python data-preprocessing/bdcn-edge-detection/generate-bdcn-edge.py --indir data/coco2017val/images --outdir data/coco2017val/bdcn-edges
Generate Color Stroke
Execute the following command line to extract color strokes from images in a folder:
python data-preprocessing/generate-stroke.py --indir IMAGE_PATH --outdir OUTPUT_PATH --kmeans_center K_MEANS_CENTER_NUMBER
You can refer to this example command line:
python data-preprocessing/generate-stroke.py --indir data/coco2017val/images --outdir data/coco2017val/color-strokes --kmeans_center 16
By changing --kmeans_center, you can control the number of colors in the generated strokes. Turning on --visualize_intermediate saves the intermediate results (i.e., source image, filtered image, and generated strokes) to OUTPUT_PATH/intermediates.
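To give a feel for what the number of k-means centers controls, here is a small pure-Python sketch of k-means color quantization on a list of RGB tuples. It is an illustration only: the source does not show how generate-stroke.py is implemented, the filtering step mentioned above is omitted, and the function names and the evenly spaced initialization are assumptions made for this example.

```python
def nearest_center(pixel, centers):
    """Index of the center closest to pixel (squared Euclidean distance in RGB)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, centers[i])),
    )


def kmeans_colors(pixels, k, iterations=10):
    """Cluster RGB tuples into at most k colors with plain Lloyd iterations."""
    # Deterministic start: take k evenly spaced pixels as the initial centers.
    step = max(1, len(pixels) // k)
    centers = [pixels[min(i * step, len(pixels) - 1)] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: group every pixel with its nearest center.
        groups = [[] for _ in range(k)]
        for pixel in pixels:
            groups[nearest_center(pixel, centers)].append(pixel)
        # Update step: move each center to the mean color of its group.
        for i, group in enumerate(groups):
            if group:  # keep the old center if no pixel was assigned to it
                centers[i] = tuple(
                    sum(p[ch] for p in group) / len(group) for ch in range(3)
                )
    return centers


def quantize(pixels, k):
    """Replace every pixel with the (rounded) color of its nearest center."""
    centers = kmeans_colors(pixels, k)
    return [
        tuple(round(c) for c in centers[nearest_center(p, centers)])
        for p in pixels
    ]
```

A larger `k` keeps more distinct colors, while a smaller `k` merges similar shades into broader flat regions.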
Generate Image Palette
Execute the following command line to extract image palettes from images in a folder:
python data-preprocessing/generate-palette.py --indir IMAGE_PATH --outdir OUTPUT_PATH --size BICUBIC_SIZE --image_resolution IMAGE_RESOLUTION
You can refer to this example command line:
python data-preprocessing/generate-palette.py --indir data/coco2017val/images --outdir data/coco2017val/image-palette --size 8 --image_resolution 512
By changing --size, you can control the size of the Bicubic down-sampled image; by changing --image_resolution, you can set the original image resolution. Turning on --visualize_intermediate saves the intermediate results (i.e., source image, Bicubic down-sampled image, and generated palette) to OUTPUT_PATH/intermediates.
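The sketch below illustrates how a down-sampling size and an image resolution relate. It is not the toolkit's code. It uses block averaging as a stand-in for Bicubic interpolation, assumes a square image whose side is a multiple of `size`, and assumes the small image is enlarged back to the original resolution, none of which the source confirms. All function names are hypothetical.

```python
def block_average(image, size):
    """Down-sample a square image (rows of RGB tuples) to size x size.

    Stand-in for Bicubic down-sampling: each output pixel is the mean
    color of one square block of the input.
    """
    resolution = len(image)
    block = resolution // size
    small = []
    for by in range(size):
        row = []
        for bx in range(size):
            cells = [
                image[y][x]
                for y in range(by * block, (by + 1) * block)
                for x in range(bx * block, (bx + 1) * block)
            ]
            row.append(
                tuple(sum(c[ch] for c in cells) // len(cells) for ch in range(3))
            )
        small.append(row)
    return small


def upsample_nearest(small, resolution):
    """Enlarge the small image to resolution x resolution by repeating pixels."""
    block = resolution // len(small)
    return [
        [small[y // block][x // block] for x in range(resolution)]
        for y in range(resolution)
    ]
```

Under these assumptions, the example values --size 8 and --image_resolution 512 would give an 8 x 8 grid in which each cell covers 64 x 64 pixels.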
Generate Saliency Mask
- Download the model weights of U2-Net from this link, and place the weights in u2net-saliency-detection/checkpoints.
- Execute the following command line to extract saliency masks from images in a folder:
python data-preprocessing/u2net-saliency-detection/generate-saliency-mask.py --indir IMAGE_PATH --outdir OUTPUT_PATH
You can refer to this example command line:
python data-preprocessing/u2net-saliency-detection/generate-saliency-mask.py --indir data/coco2017val/images --outdir data/coco2017val/saliency-masks
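The example commands in this document can be chained into one driver so that all five conditions are generated for a dataset in a single run. The sketch below uses only the scripts, flags, example values, and output folder names shown above. The helper names are hypothetical, and it assumes the driver is started from the repository root with the model weights already in place.

```python
import subprocess

TOOLKIT = "data-preprocessing"


def _command(script, indir, outdir, *extra):
    """Build one argument list in the form used throughout this document."""
    return ["python", f"{TOOLKIT}/{script}",
            "--indir", indir, "--outdir", outdir, *extra]


def preprocessing_commands(indir, outroot):
    """Map each condition to its command, using the example values above."""
    return {
        "canny": _command("generate-canny.py", indir, f"{outroot}/canny-edges",
                          "--threshold1", "200", "--threshold2", "225"),
        "hed": _command("bdcn-edge-detection/generate-bdcn-edge.py",
                        indir, f"{outroot}/bdcn-edges"),
        "stroke": _command("generate-stroke.py", indir,
                           f"{outroot}/color-strokes", "--kmeans_center", "16"),
        "palette": _command("generate-palette.py", indir,
                            f"{outroot}/image-palette",
                            "--size", "8", "--image_resolution", "512"),
        "saliency": _command("u2net-saliency-detection/generate-saliency-mask.py",
                             indir, f"{outroot}/saliency-masks"),
    }


def run_preprocessing(indir, outroot):
    """Run every step in order; stop at the first step that fails."""
    for name, command in preprocessing_commands(indir, outroot).items():
        print(f"[{name}]", " ".join(command))
        subprocess.run(command, check=True)  # raises CalledProcessError on failure
```

For instance, `run_preprocessing("data/coco2017val/images", "data/coco2017val")` issues the same five example commands listed in the sections above.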