December 19, 2022
SCARF: Capturing and Animation of Body and Clothing from Monocular Video
cropped image, subject segmentation, clothing segmentation, SMPL-X estimation
Getting Started
Environment
SCARF needs an input image, a subject mask, a clothing mask, and an initial SMPL-X estimate for training. Specifically, we use
- FasterRCNN to detect the subject and crop image
- RobustVideoMatting to remove background
- cloth-segmentation to segment clothing
- PIXIE to estimate SMPL-X parameters
To use the processing script, you must agree to the license terms of these tools and cite them properly in your work.
- Clone submodule repositories:
git submodule update --init --recursive
- Download the data they need:
bash fetch_asset_data.sh
If the script fails, check each project's website and download the models manually.
Process Video Data
List your data in ./lists/subject_list.txt. Each entry can be a video path or an image folder.
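As a minimal sketch, the list file could be generated from a data directory like this. The one-entry-per-line layout, the set of video suffixes, and the helper names `collect_subjects` and `write_subject_list` are assumptions for illustration and are not part of the SCARF repository; check process_video.py for the exact format it expects.

```python
from pathlib import Path

# Assumed set of video file suffixes; adjust to match your data.
VIDEO_SUFFIXES = {".mp4", ".mov", ".avi"}


def collect_subjects(data_root):
    """Return sorted paths of video files and image folders directly under data_root."""
    entries = []
    for path in sorted(Path(data_root).iterdir()):
        # Keep image folders and files that look like videos; skip everything else.
        if path.is_dir() or path.suffix.lower() in VIDEO_SUFFIXES:
            entries.append(str(path))
    return entries


def write_subject_list(data_root, list_path="./lists/subject_list.txt"):
    """Write one entry per line (assumed format) and return the entries written."""
    entries = collect_subjects(data_root)
    list_file = Path(list_path)
    list_file.parent.mkdir(parents=True, exist_ok=True)
    list_file.write_text("".join(entry + "\n" for entry in entries))
    return entries
```

For example, `write_subject_list("/path/to/my_videos")` would write every video file and subfolder found there into ./lists/subject_list.txt.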
Then run
python process_video.py --crop --ignore_existing
Processing time depends on the number of frames and the size of the video. An mpiis-scarf video (400 frames at 1028x1920 resolution) takes around 12 minutes.
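For rough planning, the reference figure above can be scaled to other frame counts. This sketch assumes that processing time grows linearly with the number of frames at a similar resolution and on similar hardware, which the README does not state; `estimate_minutes` is a hypothetical helper.

```python
def estimate_minutes(num_frames, ref_frames=400, ref_minutes=12.0):
    """Scale the reference timing (400 frames in about 12 minutes) linearly.

    Linear scaling is an assumption; real timings depend on resolution and hardware.
    """
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    return ref_minutes * num_frames / ref_frames
```

Under that assumption, the reference video works out to about 1.8 seconds per frame.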
Video Data
The script has been verified to work on the following datasets:
a. mpiis-scarf (recorded video for this paper)
b. People Snapshot Dataset (https://graphics.tu-bs.de/people-snapshot)
c. SelfRecon dataset (https://jby1993.github.io/SelfRecon/)
d. iPER dataset (https://svip-lab.github.io/dataset/iPER_dataset.html)
For the best results on your own video, capture it with settings similar to those of the datasets above.
This means keeping the camera static, recording the subject from many views, and using uniform lighting. It is also better to use fewer than 1000 frames for training. For more information, please refer to the limitations section of SCARF.
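If a recording is longer than that, one option is to keep an evenly spaced subset of frames before training. The sketch below only picks which frame indices to keep; the helper `select_frame_indices` and the uniform-subsampling strategy are illustrative assumptions, not something the SCARF scripts provide.

```python
def select_frame_indices(total_frames, max_frames=999):
    """Pick at most max_frames evenly spaced frame indices from a video.

    max_frames defaults to 999 to stay under the suggested 1000-frame limit.
    """
    if max_frames <= 0:
        raise ValueError("max_frames must be positive")
    if total_frames <= max_frames:
        # Short videos are kept in full.
        return list(range(total_frames))
    # Integer arithmetic keeps the indices exact and strictly increasing.
    return [i * total_frames // max_frames for i in range(max_frames)]
```

The returned indices can then be used to copy or extract only those frames into the image folder that is listed for processing.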