NetsPresso tutorial for YOLOv7 compression
January 24, 2025
Order of the tutorial
0. Sign up
1. Install
2. Training
3. Transfer learning
4. Convert YOLOv7 to yolov7_fx.pt 1
5. Model compression with NetsPresso Python Package
6. Restore the compressed model to the original model structure
7. Retrain the compressed model
8. NetsPresso Re-parameterization
9. Convert YOLOv7 to yolov7_fx.pt 2
0. Sign up
To get started with the NetsPresso Python package, you will need to sign up at NetsPresso.
1. Install
Clone repo and install requirements.txt in a Python>=3.7.0 environment, including PyTorch >= 1.11, < 2.0.
git clone https://github.com/Nota-NetsPresso/ModelZoo-YOLOv7.git # clone
cd ModelZoo-YOLOv7
pip install -r requirements.txt
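The version constraints above (Python >= 3.7.0, PyTorch >= 1.11 and < 2.0) can be checked before training starts. The following is a minimal sketch, not part of the repository; the helper names `parse_version`, `python_supported`, and `torch_supported` are hypothetical.

```python
import re
import sys


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints."""
    # "1.12.0+cu113" -> (1, 12, 0); the local build tag after "+" is ignored.
    core = text.split("+", 1)[0]
    parts = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def python_supported(version_info=None):
    """True if the interpreter satisfies Python >= 3.7.0."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:3]) >= (3, 7, 0)


def torch_supported(version_text):
    """True if a PyTorch version string satisfies >= 1.11 and < 2.0."""
    version = parse_version(version_text)
    return (1, 11) <= version[:2] < (2, 0)
```

After `import torch`, calling `torch_supported(torch.__version__)` should report whether the installed build falls inside the range this tutorial asks for.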
2. Training
Data preparation
bash scripts/get_coco.sh
- Download MS COCO dataset images (train, val, test) and labels. If you have previously used a different version of YOLO, we strongly recommend that you delete the train2017.cache and val2017.cache files, and redownload the labels.
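Deleting the stale cache files can be scripted. This is a minimal sketch that assumes the two cache files sit somewhere under your dataset directory; the exact location is not stated here, so the helper (a hypothetical name, `remove_stale_label_caches`) searches recursively and defaults to a dry run.

```python
from pathlib import Path

# File names taken from the note above.
STALE_CACHE_NAMES = ("train2017.cache", "val2017.cache")


def remove_stale_label_caches(dataset_root, dry_run=True):
    """Find, and optionally delete, stale label cache files under dataset_root."""
    found = []
    for name in STALE_CACHE_NAMES:
        # Search every subdirectory because the cache location is an assumption.
        found.extend(sorted(Path(dataset_root).rglob(name)))
    if not dry_run:
        for path in found:
            path.unlink()
    return found
```

Run it once with `dry_run=True` to review the returned paths, then again with `dry_run=False` to delete them before redownloading the labels.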
Single GPU training
# train p5 models
python train.py --workers 8 --device 0 --batch-size 32 --data data/coco.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7 --hyp data/hyp.scratch.p5.yaml
# train p6 models
python train_aux.py --workers 8 --device 0 --batch-size 16 --data data/coco.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6.yaml --weights '' --name yolov7-w6 --hyp data/hyp.scratch.p6.yaml
Multiple GPU training
# train p5 models
python -m torch.distributed.launch --nproc_per_node 4 --master_port 9527 train.py --workers 8 --device 0,1,2,3 --sync-bn --batch-size 128 --data data/coco.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7 --hyp data/hyp.scratch.p5.yaml
# train p6 models
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train_aux.py --workers 8 --device 0,1,2,3,4,5,6,7 --sync-bn --batch-size 128 --data data/coco.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6.yaml --weights '' --name yolov7-w6 --hyp data/hyp.scratch.p6.yaml
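The multi-GPU commands pass a larger `--batch-size` than the single-GPU ones. The sketch below assumes, as in the upstream YOLOv5-style training script, that `--batch-size` is the total across all processes and is split evenly between them; that behavior is not stated in this tutorial, and `per_gpu_batch_size` is a hypothetical helper.

```python
def per_gpu_batch_size(total_batch_size, nproc_per_node):
    """Split a total batch size evenly across GPUs (assumed launcher behavior)."""
    if nproc_per_node < 1:
        raise ValueError("nproc_per_node must be at least 1")
    if total_batch_size % nproc_per_node != 0:
        raise ValueError("total batch size must be a multiple of the GPU count")
    return total_batch_size // nproc_per_node
```

Under that assumption, the p5 command (128 over 4 GPUs) gives 32 images per GPU and the p6 command (128 over 8 GPUs) gives 16, the same per-device sizes used in the single-GPU commands.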
3. Transfer learning
yolov7_training.pt yolov7x_training.pt yolov7-w6_training.pt yolov7-e6_training.pt yolov7-d6_training.pt yolov7-e6e_training.pt
Single GPU finetuning for custom dataset
# finetune p5 models
python train.py --workers 8 --device 0 --batch-size 32 --data data/custom.yaml --img 640 640 --cfg cfg/training/yolov7-custom.yaml --weights 'yolov7_training.pt' --name yolov7-custom --hyp data/hyp.scratch.custom.yaml
# finetune p6 models
python train_aux.py --workers 8 --device 0 --batch-size 16 --data data/custom.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6-custom.yaml --weights 'yolov7-w6_training.pt' --name yolov7-w6-custom --hyp data/hyp.scratch.custom.yaml
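The finetuning commands reference `data/custom.yaml`, which this tutorial does not show. The sketch below renders such a file, assuming it uses the same keys as `data/coco.yaml` (`train`, `val`, `nc`, `names`); the paths and class names in the usage note are placeholders, and `render_dataset_yaml` is a hypothetical helper.

```python
def render_dataset_yaml(train_path, val_path, class_names):
    """Render a dataset description using the keys assumed from data/coco.yaml."""
    # Class names are wrapped in single quotes, so they must not contain quotes.
    names = ", ".join("'{}'".format(name) for name in class_names)
    lines = [
        "train: {}".format(train_path),
        "val: {}".format(val_path),
        "nc: {}".format(len(class_names)),  # number of classes
        "names: [{}]".format(names),
    ]
    return "\n".join(lines) + "\n"
```

For example, `render_dataset_yaml("./custom/train.txt", "./custom/val.txt", ["cat", "dog"])` yields a four-line file with `nc: 2`. The class count in the model config passed to `--cfg` most likely needs to match it.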
4. Convert YOLOv7 to yolov7_fx.pt 1
python export_netspresso.py --weights yolov7_training.pt --data data/coco.yaml
Executing this code will create 'yolov7_fx.pt'.
5. Model compression with NetsPresso Python Package
Upload & compress your 'yolov7_fx.pt' by using NetsPresso Python Package
5_1. Install NetsPresso Python Package
pip install netspresso
5_2. Upload & compress
First, import the package and log in with your NetsPresso email and password.
from netspresso import NetsPresso
netspresso = NetsPresso(email="YOUR_EMAIL", password="YOUR_PASSWORD")
Second, upload and compress 'yolov7_fx.pt', the model converted to torch.fx in step 4, with the following code.
# 1. Declare compressor
compressor = netspresso.compressor_v2()
# 2. Run automatic compression
compression_result = compressor.automatic_compression(
    input_shapes=[{"batch": 1, "channel": 3, "dimension": [640, 640]}],
    input_model_path="./yolov7_fx.pt",
    output_dir="./yolov7_L206.pt",
    compression_ratio=0.5,
)
More commands can be found in the official NetsPresso Python Package docs: https://nota-netspresso.github.io/PyNetsPresso-docs
Alternatively, you can do the same as above through the GUI on our website: https://console.netspresso.ai/models
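The compression call above hardcodes the credentials and arguments. The sketch below gathers them in two small helpers, so the password can stay out of the script. The environment variable names, the helper names, and the check that `compression_ratio` lies strictly between 0 and 1 are assumptions, not part of the NetsPresso package.

```python
import os


def load_credentials(env=None):
    """Read NetsPresso credentials from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    email = env.get("NETSPRESSO_EMAIL")
    password = env.get("NETSPRESSO_PASSWORD")
    if not email or not password:
        raise RuntimeError("Set NETSPRESSO_EMAIL and NETSPRESSO_PASSWORD first")
    return {"email": email, "password": password}


def build_compression_kwargs(input_model_path, output_dir,
                             compression_ratio=0.5, img_size=640):
    """Assemble the keyword arguments used in the automatic_compression call above."""
    # Assumed valid range; check the package docs for the exact limits.
    if not 0.0 < compression_ratio < 1.0:
        raise ValueError("compression_ratio is assumed to lie between 0 and 1")
    return {
        "input_shapes": [{"batch": 1, "channel": 3, "dimension": [img_size, img_size]}],
        "input_model_path": input_model_path,
        "output_dir": output_dir,
        "compression_ratio": compression_ratio,
    }


# Usage with the calls shown in this step (requires `pip install netspresso`):
# from netspresso import NetsPresso
# netspresso = NetsPresso(**load_credentials())
# compressor = netspresso.compressor_v2()
# compressor.automatic_compression(
#     **build_compression_kwargs("./yolov7_fx.pt", "./yolov7_L206.pt"))
```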
6. Restore the compressed model to the original model structure
The compressed model is restored to the original model structure through the following code. This will create a fx2p_complete.pt file.
python yolov7_fx2p.py --original yolov7_training.pt --compressed yolov7_L206.pt --detect 105
7. Retrain the compressed model
Retrain the restored model (fx2p_complete.pt) with the following command. The --netspresso flag tells train.py that the weights come from a NetsPresso-compressed model.
python train.py --netspresso --workers 8 --device 0 --batch-size 32 --data data/coco.yaml --img 640 640 --weights fx2p_complete.pt --name yolov7 --hyp data/hyp.scratch.p5.yaml
8. NetsPresso Re-parameterization
See netspresso_reparameterization.ipynb
9. Convert YOLOv7 to yolov7_fx.pt 2
If you want to compress the already compressed model further, start by converting it to the torch.fx format again:
python export_netspresso.py --netspresso --weights fx2p_complete.pt --data data/coco.yaml
Then repeat steps 5, 6, and 7.
Now you can use the compressed model however you like!
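One round of the workflow (export, compress, restore, retrain) can be listed as data. This is a sketch built only from the commands shown in this tutorial, and `compression_round` is a hypothetical helper. The tutorial does not say which file to pass as `--original` in later rounds, so the caller supplies it, and step 5 appears as `None` because it runs through the Python package rather than a shell command.

```python
def compression_round(export_weights, original_weights, compressed_model,
                      already_compressed):
    """Return the ordered (description, argv) steps for one compression round."""
    export_cmd = ["python", "export_netspresso.py"]
    if already_compressed:
        # Later rounds add --netspresso, as in step 9.
        export_cmd.append("--netspresso")
    export_cmd += ["--weights", export_weights, "--data", "data/coco.yaml"]

    restore_cmd = ["python", "yolov7_fx2p.py", "--original", original_weights,
                   "--compressed", compressed_model, "--detect", "105"]

    retrain_cmd = ["python", "train.py", "--netspresso", "--workers", "8",
                   "--device", "0", "--batch-size", "32",
                   "--data", "data/coco.yaml", "--img", "640", "640",
                   "--weights", "fx2p_complete.pt", "--name", "yolov7",
                   "--hyp", "data/hyp.scratch.p5.yaml"]

    return [
        ("export to torch.fx (step 4 or 9)", export_cmd),
        ("compress with the NetsPresso Python package (step 5)", None),
        ("restore the original structure (step 6)", restore_cmd),
        ("retrain (step 7)", retrain_cmd),
    ]
```

Each `argv` list can be passed to `subprocess.run(argv, check=True)` from the repository root, or joined with spaces to print the command for review.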
Contact
Join our Discussion Forum to give feedback or share your use cases. If you want to talk more with Nota, please contact us here.
You can also reach us by email (contact@nota.ai) or phone (+82 2-555-8659).
Official YOLOv7
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Web Demo
- Integrated into Huggingface Spaces 🤗 using Gradio. Try out the Web Demo
Performance
MS COCO
| Model | Test Size | AP (test) | AP50 (test) | AP75 (test) | batch 1 fps | batch 32 average time |
|---|---|---|---|---|---|---|
| YOLOv7 | 640 | 51.4% | 69.7% | 55.9% | 161 fps | 2.8 ms |
| YOLOv7-X | 640 | 53.1% | 71.2% | 57.8% | 114 fps | 4.3 ms |
| YOLOv7-W6 | 1280 | 54.9% | 72.6% | 60.1% | 84 fps | 7.6 ms |
| YOLOv7-E6 | 1280 | 56.0% | 73.5% | 61.2% | 56 fps | 12.3 ms |
| YOLOv7-D6 | 1280 | 56.6% | 74.0% | 61.8% | 44 fps | 15.0 ms |
| YOLOv7-E6E | 1280 | 56.8% | 74.4% | 62.1% | 36 fps | 18.7 ms |
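The two speed columns use different units. The sketch below converts between them, assuming "batch 32 average time" is the average time per image at batch size 32; the table does not say so explicitly. The helper names are hypothetical.

```python
def latency_ms_from_fps(fps):
    """Per-image latency in milliseconds implied by a frames-per-second figure."""
    return 1000.0 / fps


def images_per_second_from_ms(ms_per_image):
    """Throughput in images per second implied by a per-image time in milliseconds."""
    return 1000.0 / ms_per_image


# (model, batch 1 fps, batch 32 average time in ms), copied from the table above.
ROWS = [
    ("YOLOv7", 161, 2.8),
    ("YOLOv7-X", 114, 4.3),
    ("YOLOv7-W6", 84, 7.6),
]

for model, fps, ms in ROWS:
    print("{}: {:.1f} ms at batch 1, {:.0f} img/s at batch 32".format(
        model, latency_ms_from_fps(fps), images_per_second_from_ms(ms)))
```

Under that reading, YOLOv7 takes about 6.2 ms per image at batch 1 and processes roughly 357 images per second at batch 32.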
Installation
Docker environment (recommended)
# create the Docker container; you can increase the shared memory size if you have more available
nvidia-docker run --name yolov7 -it -v your_coco_path/:/coco/ -v your_code_path/:/yolov7 --shm-size=64g nvcr.io/nvidia/pytorch:21.08-py3
# apt install required packages
apt update
apt install -y zip htop screen libgl1-mesa-glx
# pip install required packages
pip install seaborn thop
# go to code folder
cd /yolov7
Testing
yolov7.pt yolov7x.pt yolov7-w6.pt yolov7-e6.pt yolov7-d6.pt yolov7-e6e.pt
python test.py --data data/coco.yaml --img 640 --batch 32 --conf 0.001 --iou 0.65 --device 0 --weights yolov7.pt --name yolov7_640_val
You will get the results:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.51206
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.69730
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.55521
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.35247
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.55937
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.66693
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.38453
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.63765
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.68772
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.53766
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.73549
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.83868
To measure accuracy, download the COCO annotations for pycocotools to ./coco/annotations/instances_val2017.json
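The summary printed by the test script can be turned into a dictionary for logging or comparison between the original and compressed models. This is a minimal sketch with a whitespace-tolerant regular expression written against the lines shown above; `parse_coco_summary` is a hypothetical helper, not part of the repository.

```python
import re

# Matches lines such as:
# Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.51206
SUMMARY_LINE = re.compile(
    r"Average\s+(?:Precision|Recall)\s+\((AP|AR)\)\s+@\[\s*IoU=([\d.:]+)\s*\|"
    r"\s*area=\s*(\w+)\s*\|\s*maxDets=\s*(\d+)\s*\]\s*=\s*(-?[\d.]+)"
)


def parse_coco_summary(text):
    """Map (metric, IoU range, area, maxDets) to the reported value."""
    results = {}
    for match in SUMMARY_LINE.finditer(text):
        metric, iou, area, max_dets, value = match.groups()
        results[(metric, iou, area, int(max_dets))] = float(value)
    return results
```

For the output above, `parse_coco_summary(log)[("AP", "0.50:0.95", "all", 100)]` returns the headline AP value, where `log` holds the printed summary.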
Training
Data preparation
bash scripts/get_coco.sh
- Download MS COCO dataset images (train, val, test) and labels. If you have previously used a different version of YOLO, we strongly recommend that you delete the train2017.cache and val2017.cache files, and redownload the labels.
Single GPU training
# train p5 models
python train.py --workers 8 --device 0 --batch-size 32 --data data/coco.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7 --hyp data/hyp.scratch.p5.yaml
# train p6 models
python train_aux.py --workers 8 --device 0 --batch-size 16 --data data/coco.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6.yaml --weights '' --name yolov7-w6 --hyp data/hyp.scratch.p6.yaml
Multiple GPU training
# train p5 models
python -m torch.distributed.launch --nproc_per_node 4 --master_port 9527 train.py --workers 8 --device 0,1,2,3 --sync-bn --batch-size 128 --data data/coco.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7 --hyp data/hyp.scratch.p5.yaml
# train p6 models
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train_aux.py --workers 8 --device 0,1,2,3,4,5,6,7 --sync-bn --batch-size 128 --data data/coco.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6.yaml --weights '' --name yolov7-w6 --hyp data/hyp.scratch.p6.yaml
Transfer learning
yolov7_training.pt yolov7x_training.pt yolov7-w6_training.pt yolov7-e6_training.pt yolov7-d6_training.pt yolov7-e6e_training.pt
Single GPU finetuning for custom dataset
# finetune p5 models
python train.py --workers 8 --device 0 --batch-size 32 --data data/custom.yaml --img 640 640 --cfg cfg/training/yolov7-custom.yaml --weights 'yolov7_training.pt' --name yolov7-custom --hyp data/hyp.scratch.custom.yaml
# finetune p6 models
python train_aux.py --workers 8 --device 0 --batch-size 16 --data data/custom.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6-custom.yaml --weights 'yolov7-w6_training.pt' --name yolov7-w6-custom --hyp data/hyp.scratch.custom.yaml
Re-parameterization
Inference
On video:
python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source yourvideo.mp4
On image:
python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source inference/images/horses.jpg
Export
PyTorch to CoreML (and inference on macOS/iOS)
PyTorch to ONNX with NMS (and inference)
python export.py --weights yolov7-tiny.pt --grid --end2end --simplify \
--topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640 --max-wh 640
PyTorch to TensorRT with NMS (and inference)
wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-tiny.pt
python export.py --weights ./yolov7-tiny.pt --grid --end2end --simplify --topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640
git clone https://github.com/Linaom1214/tensorrt-python.git
python ./tensorrt-python/export.py -o yolov7-tiny.onnx -e yolov7-tiny-nms.trt -p fp16
PyTorch to TensorRT another way
wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-tiny.pt
python export.py --weights yolov7-tiny.pt --grid --include-nms
git clone https://github.com/Linaom1214/tensorrt-python.git
python ./tensorrt-python/export.py -o yolov7-tiny.onnx -e yolov7-tiny-nms.trt -p fp16
# Or use trtexec to convert ONNX to TensorRT engine
/usr/src/tensorrt/bin/trtexec --onnx=yolov7-tiny.onnx --saveEngine=yolov7-tiny-nms.trt --fp16
Tested with: Python 3.7.13, PyTorch 1.12.0+cu113
Pose estimation
See keypoint.ipynb.
Instance segmentation (with NTU)
See instance.ipynb.
Instance segmentation
YOLOv7 for instance segmentation (YOLOR + YOLOv5 + YOLACT)
| Model | Test Size | AP (box) | AP50 (box) | AP75 (box) | AP (mask) | AP50 (mask) | AP75 (mask) |
|---|---|---|---|---|---|---|---|
| YOLOv7-seg | 640 | 51.4% | 69.4% | 55.8% | 41.5% | 65.5% | 43.7% |
Anchor free detection head
YOLOv7 with decoupled TAL head (YOLOR + YOLOv5 + YOLOv6)
| Model | Test Size | AP (val) | AP50 (val) | AP75 (val) |
|---|---|---|---|---|
| YOLOv7-u6 | 640 | 52.6% | 69.7% | 57.3% |
Citation
@article{wang2022yolov7,
  title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
  author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
  journal={arXiv preprint arXiv:2207.02696},
  year={2022}
}
@article{wang2022designing,
  title={Designing Network Design Strategies Through Gradient Path Analysis},
  author={Wang, Chien-Yao and Liao, Hong-Yuan Mark and Yeh, I-Hau},
  journal={arXiv preprint arXiv:2211.04800},
  year={2022}
}
Teaser
YOLOv7-semantic & YOLOv7-panoptic & YOLOv7-caption
YOLOv7-semantic & YOLOv7-detection & YOLOv7-depth (with NTUT)
YOLOv7-3d-detection & YOLOv7-lidar & YOLOv7-road (with NTUT)
Acknowledgements
- https://github.com/AlexeyAB/darknet
- https://github.com/WongKinYiu/yolor
- https://github.com/WongKinYiu/PyTorch_YOLOv4
- https://github.com/WongKinYiu/ScaledYOLOv4
- https://github.com/Megvii-BaseDetection/YOLOX
- https://github.com/ultralytics/yolov3
- https://github.com/ultralytics/yolov5
- https://github.com/DingXiaoH/RepVGG
- https://github.com/JUGGHM/OREPA_CVPR2022
- https://github.com/TexasInstruments/edgeai-yolov5/tree/yolo-pose
