July 30, 2025
How2Compress: Scalable and Efficient Edge Video Analytics via Adaptive Granular Video Compression
Yuheng Wu, Thanh-Tung Nguyen, Lucas Liebe, Nhat-Quang Tau, Pablo Espinosa Campos, Jinghan Cheng, Dongman Lee
@inproceedings{wu2025how2compress,
title={How2Compress: Scalable and Efficient Edge Video Analytics via Adaptive Granular Video Compression},
author={Wu, Yuheng and Nguyen, Thanh-Tung and Lucas Liebe and Tau, Nhat-Quang and Pablo Espinosa Campos and Cheng, Jinghan and Lee, Dongman},
booktitle={Proceedings of the 33rd ACM International Conference on Multimedia},
year={2025}
}
This repository contains the official implementation of the paper: "How2Compress: Scalable and Efficient Edge Video Analytics via Adaptive Granular Video Compression"
Env Setup
We provide two options for setting up the environment: using a pre-built Docker image or building from scratch.
[option1] Docker Image (Recommended)
We provide a Docker image to facilitate quick reproduction of our experiments and results. The Docker image is available at the following link: https://hub.docker.com/r/wuyuheng/how2compress
We strongly recommend using the Docker image to avoid the complexity of manual environment configuration. For some components, such as the NVIDIA Video Codec SDK, please refer to the implementation.
[option2] Build From Scratch
- Python Environment Setup
# Using pip
pip install -r how2compress-requirements.txt
# Or using conda
conda env create -f how2compress-env.yaml
- Install Advanced Video Codecs
wget https://ffmpeg.org/releases/ffmpeg-7.1.1.tar.xz
tar -xf ffmpeg-7.1.1.tar.xz
cd ffmpeg-7.1.1
# Compile with the x265, VP9, VVC, and AV1 codecs used in our paper; make sure the corresponding codec libraries are installed beforehand
./configure \
--enable-shared \
--enable-libx264 \
--enable-libx265 \
--enable-libmp3lame \
--enable-libopus \
--enable-libvpx \
--enable-libaom \
--enable-libvvenc \
--enable-gpl \
--enable-nonfree \
--disable-x86asm
make -j$(nproc)
sudo make install
Ensure you have installed all required codec libraries (e.g., libx265, libvpx, libaom, etc.) before building.
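After installation, it can help to confirm that the build actually exposes the encoders enabled above. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, and the encoder names in `REQUIRED_ENCODERS` are the ones FFmpeg normally registers for these libraries, so adjust the list if your build differs.

```python
import subprocess

# Encoder names as FFmpeg usually registers them (assumption: adjust to your build).
REQUIRED_ENCODERS = ["libx264", "libx265", "libvpx-vp9", "libaom-av1", "libvvenc"]


def parse_encoder_names(encoders_output: str) -> set[str]:
    """Collect the second whitespace-separated field of every line.

    In `ffmpeg -encoders` output that field is the encoder name; header
    lines contribute harmless extra tokens that never match a real name.
    """
    names = set()
    for line in encoders_output.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            names.add(fields[1])
    return names


def missing_encoders(encoders_output: str, required=REQUIRED_ENCODERS) -> list[str]:
    """Return the required encoders that do not appear in the listing."""
    available = parse_encoder_names(encoders_output)
    return [name for name in required if name not in available]


def check_installed_ffmpeg() -> list[str]:
    """Run the ffmpeg found on PATH and report which required encoders are missing."""
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-encoders"],
        capture_output=True,
        text=True,
        check=True,
    )
    return missing_encoders(result.stdout)
```

Calling `check_installed_ffmpeg()` returns an empty list when every encoder is present; any name it returns points to a codec library that was not found at configure time.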
- Build the Modified H.264 Codec
Navigate to the myh264 directory and execute:
cd myh264/
bash build.sh
If you encounter errors related to missing .so files, set the appropriate library paths as follows (replace paths with actual locations):
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/myh264/ffmpeg-3.4.8/libavcodec:/myh264/ffmpeg-3.4.8/libavdevice:/myh264/ffmpeg-3.4.8/libavfilter:/myh264/ffmpeg-3.4.8/libavformat:/myh264/ffmpeg-3.4.8/libavresample:/myh264/ffmpeg-3.4.8/libavutil:/myh264/ffmpeg-3.4.8/libpostproc:/myh264/ffmpeg-3.4.8/libswresample:/myh264/ffmpeg-3.4.8/libswscale:/myh264/x264
export PATH=$PATH:/myh264/x264
export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/myh264/x264
- Compile NVENC for Fast Training-Side Encoding
cmake .. \
-DCUVID_LIB=/how2compress/nvenc_codec_sdk_11.0.10/Lib/linux/stubs/x86_64/libnvcuvid.so \
-DNVENCODEAPI_LIB=/how2compress/nvenc_codec_sdk_11.0.10/Lib/linux/stubs/x86_64/libnvidia-encode.so \
-DCMAKE_INCLUDE_PATH=/how2compress/nvenc_codec_sdk_11.0.10/Interface
Note: You must comply with NVIDIA's Video Codec SDK licensing terms. Our implementation modifies NVENC to support the emphasis map feature described in https://docs.nvidia.com/video-technologies/video-codec-sdk/11.1/nvenc-video-encoder-api-prog-guide/index.html#emphasis-map
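To make the emphasis map idea concrete, here is a rough sketch of laying out per-macroblock emphasis levels as a flat buffer. It rests on assumptions drawn from our reading of the linked guide rather than from this repository: 16x16 macroblocks, integer levels from 0 to 5, and one byte per macroblock in raster-scan order. The function names are hypothetical, and how the modified NVENC in this repository consumes the map is not shown here, so check the SDK headers before relying on the layout.

```python
MB_SIZE = 16   # assumed macroblock size in pixels (H.264)
MAX_LEVEL = 5  # assumed highest emphasis level


def macroblock_grid(width: int, height: int, mb_size: int = MB_SIZE) -> tuple[int, int]:
    """Return (columns, rows) of macroblocks needed to cover the frame."""
    cols = (width + mb_size - 1) // mb_size
    rows = (height + mb_size - 1) // mb_size
    return cols, rows


def flatten_emphasis_map(levels: list[list[int]], width: int, height: int) -> bytes:
    """Flatten a rows-by-columns grid of levels into raster-scan bytes."""
    cols, rows = macroblock_grid(width, height)
    if len(levels) != rows or any(len(row) != cols for row in levels):
        raise ValueError(f"expected a {rows}x{cols} grid of emphasis levels")
    flat = bytearray()
    for row in levels:
        for level in row:
            if not 0 <= level <= MAX_LEVEL:
                raise ValueError(f"emphasis level {level} is outside 0..{MAX_LEVEL}")
            flat.append(level)
    return bytes(flat)
```

For a 1920x1088 frame this gives a grid of 120 columns by 68 rows, that is, 8160 entries per frame.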
Train
Before training, we pre-encode the frames and convert them into video chunks. Please refer to prepare_dataset/ and convert the data format yourself. Because of the large volume of the dataset and the anonymity requirement, we have not yet found an appropriate way to share it; we will release it later as a Hugging Face dataset.
All training parameters are configurable in the config/ directory.
How2Compress (Ours):
# MOT dataset
torchrun --nproc_per_node=3 train_mb_det.py --config configs/mot1702.yaml
# AICITY dataset
torchrun --nproc_per_node=3 train_mb_det_aicity.py --config configs/aicity.yaml
To reproduce baselines:
Where2Compress (i.e., AccMPEG):
# Tune the parameters inside each script
# MOT dataset
torchrun --nproc_per_node=3 train_mb_det_accmpeg.py
# AICITY dataset
torchrun --nproc_per_node=3 train_mb_det_accmpeg_aicity.py
For When2Compress (CASVA, ILCAS): Since our focus is on quality adjustment (not resolution or frame rate), we perform an exhaustive search of frame-level QP values in the range [28, 35].
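The exhaustive frame-level QP search can be sketched as below. This is only an illustration under stated assumptions: the `evaluate` callback, the accuracy target, and the selection rule (smallest encoded size among the QPs that meet the target) are hypothetical and are not taken from the repository.

```python
from typing import Callable, Iterable, Optional


def search_frame_qp(
    evaluate: Callable[[int], tuple[float, float]],
    accuracy_target: float,
    qp_values: Iterable[int] = range(28, 36),  # inclusive range [28, 35]
) -> Optional[int]:
    """Try every QP and keep the one with the smallest size that meets the target.

    `evaluate(qp)` is a hypothetical callback that encodes the chunk at the
    given frame-level QP and returns (accuracy, encoded_size).
    Returns None when no QP reaches the accuracy target.
    """
    best_qp = None
    best_size = float("inf")
    for qp in qp_values:
        accuracy, size = evaluate(qp)
        if accuracy >= accuracy_target and size < best_size:
            best_qp, best_size = qp, size
    return best_qp
```

Because the range holds only eight values, a full sweep is cheap compared with training, which is why no smarter search strategy is sketched here.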
Reproducing Our Results
Evaluation
To reproduce results from our paper:
- After training, ensure all checkpoints are saved in the pretrained/ directory.
- Use the scripts under eval/ to re-run the experiments that produce each table and figure in the paper.
- After evaluation, you will obtain all compressed video outputs categorized accordingly.
- Use the script reproduce/compute_result to compute the statistics used for validation.
Visualization
To reproduce the figures from the paper, refer to the scripts in reproduce/draw/, which contain all relevant figure generation source code.
Tips
The current codebase is somewhat disorganized, but we are working to clean and update it as soon as possible. To reproduce our results, follow these general steps:
Data Preparation: Convert your video data or frames into the YUV format. Ensure that the frame width and height are divisible by 16 to maintain compatibility with standard codecs. For guidance on YUV generation, refer to the scripts whose names begin with prepare_dataset.
Script Integration: Once the YUV files are ready, align the video paths with the training script accordingly.
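For the data preparation step, the divisibility requirement and the resulting raw frame size can be checked with a few lines of arithmetic. The sketch below assumes 8-bit YUV 4:2:0 planar frames and rounds dimensions up to the next multiple of 16; both choices are assumptions, the helper names are hypothetical, and the repository's own scripts may crop or resize instead.

```python
def align_up(value: int, multiple: int = 16) -> int:
    """Round value up to the nearest multiple."""
    return (value + multiple - 1) // multiple * multiple


def padded_size(width: int, height: int, multiple: int = 16) -> tuple[int, int]:
    """Return the (width, height) after rounding both up to the multiple."""
    return align_up(width, multiple), align_up(height, multiple)


def yuv420p_frame_bytes(width: int, height: int) -> int:
    """Bytes in one 8-bit YUV 4:2:0 planar frame.

    The Y plane has width * height samples; the U and V planes each have a
    quarter of that, giving width * height * 3 / 2 bytes in total.
    """
    if width % 2 or height % 2:
        raise ValueError("4:2:0 subsampling needs even width and height")
    return width * height * 3 // 2
```

For example, a 1920x1080 source becomes 1920x1088 after rounding up, and each raw frame then occupies 3,133,440 bytes, which is a handy sanity check against the size of a generated .yuv file.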
Note: Due to the inherent non-determinism of our algorithm, the compression rate you obtain may not exactly match the results reported in our paper. However, the method consistently outperforms coarse-grained baselines across multiple runs. We observed minor variations in compression rate across trials, which is expected.
In some cases, the naive codec Adaptive Quantization (AQ) may outperform our method, as reported in the main table of our paper. This is primarily because naive AQ can generate more skip-mode macroblocks (MBs), which are highly efficient for compression.
Acknowledgements
We would like to express our sincere gratitude to the authors of AccMPEG for their outstanding work. Our research builds significantly upon their open-source contributions, without which this work would not have been possible.
Please refer to AccMPEG at: https://github.com/KuntaiDu/AccMPEG