videoobjectsegmenting_mapper
February 4, 2026
Performs text-guided semantic segmentation of valid objects throughout a video, using YOLOE and SAM2.
Type: mapper
Tags: gpu, hf, video
🔧 Parameter Configuration
| name | type | default | desc |
|---|---|---|---|
| `sam2_hf_model` | `<class 'str'>` | `'facebook/sam2.1-hiera-tiny'` | |
| `yoloe_path` | `<class 'str'>` | `'yoloe-11l-seg.pt'` | The path to the YOLOE model. |
| `yoloe_conf` | `<class 'float'>` | `0.5` | Confidence threshold for YOLOE object detection. |
| `torch_dtype` | `<class 'str'>` | `'bf16'` | The floating-point type used for model inference. One of `['fp32', 'fp16', 'bf16']`. |
| `if_binarize` | `<class 'bool'>` | `True` | Whether to binarize the final mask. If `if_save_visualization` is set to `True`, `if_binarize` is automatically set to `True`. |
| `if_save_visualization` | `<class 'bool'>` | `False` | Whether to save visualization results. |
| `save_visualization_dir` | `<class 'str'>` | `DATA_JUICER_ASSETS_CACHE` | The directory in which visualization results are saved. |
| `args` | | `''` | |
| `kwargs` | | `''` | |
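The interaction between `if_save_visualization` and `if_binarize`, together with the allowed `torch_dtype` values, can be illustrated with a minimal sketch. `SegmentingConfig` and `resolve_config` are hypothetical names introduced here for illustration only and are not part of the operator's API. The `[0, 1]` range check on `yoloe_conf` is also an assumption rather than documented behavior.

```python
from dataclasses import dataclass, replace

# Documented torch_dtype strings mapped to the names of the matching
# PyTorch dtypes (torch.float32, torch.float16, torch.bfloat16).
TORCH_DTYPE_NAMES = {"fp32": "float32", "fp16": "float16", "bf16": "bfloat16"}


@dataclass(frozen=True)
class SegmentingConfig:
    """Hypothetical container mirroring the documented parameters and defaults."""

    sam2_hf_model: str = "facebook/sam2.1-hiera-tiny"
    yoloe_path: str = "yoloe-11l-seg.pt"
    yoloe_conf: float = 0.5
    torch_dtype: str = "bf16"
    if_binarize: bool = True
    if_save_visualization: bool = False


def resolve_config(cfg: SegmentingConfig) -> SegmentingConfig:
    """Validate the values and apply the documented if_binarize adjustment."""
    if cfg.torch_dtype not in TORCH_DTYPE_NAMES:
        allowed = sorted(TORCH_DTYPE_NAMES)
        raise ValueError(f"torch_dtype must be one of {allowed}, got {cfg.torch_dtype!r}")
    # Assumption: a confidence threshold is only meaningful within [0, 1].
    if not 0.0 <= cfg.yoloe_conf <= 1.0:
        raise ValueError(f"yoloe_conf must be in [0, 1], got {cfg.yoloe_conf}")
    # Documented rule: saving visualizations forces binarized masks.
    if cfg.if_save_visualization and not cfg.if_binarize:
        cfg = replace(cfg, if_binarize=True)
    return cfg


if __name__ == "__main__":
    requested = SegmentingConfig(if_binarize=False, if_save_visualization=True)
    resolved = resolve_config(requested)
    print(resolved.if_binarize)  # True
```

Returning a new frozen instance rather than mutating the input keeps the requested and effective settings separately inspectable, which can help when debugging why a mask came out binarized.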
📊 Effect demonstration
test
VideoObjectSegmentingMapper(sam2_hf_model='facebook/sam2.1-hiera-tiny', yoloe_path='yoloe-11l-seg.pt', yoloe_conf=0.2, torch_dtype='bf16', if_binarize=True, if_save_visualization=False)
📥 input data
Sample 1: 1 video
video4.mp4:
Sample 2: 1 video
video3.mp4:
📤 output data
Sample 1: empty
Sample 2: empty