Instance Segmentation of Videos and Live Camera Feeds in PyTorch

September 24, 2021

PixelLib makes it possible to perform real-time object segmentation in live camera feeds and video files.

Code for Video Segmentation

import pixellib
from pixellib.torchbackend.instance import instanceSegmentation

ins = instanceSegmentation()
ins.load_model("pointrend_resnet50.pkl")
ins.process_video("sample_video.mp4", show_bboxes=True, frames_per_second=3, output_video_name="output_video.mp4")

Lines 1–4: We imported the PixelLib package and the class instanceSegmentation from the module pixellib.torchbackend.instance (the instance segmentation class of the PyTorch backend). We then created an instance of the class and loaded the PointRend model. Download the model from here.

Line 5: We called the function process_video to segment the objects in a video. The function takes the following parameters:

video_path: This is the path to the video to be segmented.

show_bboxes: This is an optional parameter to display the segmented objects in the results with bounding boxes.

frames_per_second: This is the parameter that will set the number of frames per second for the saved video.

output_video_name: This is the name of the saved segmented video.
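To make the effect of frames_per_second concrete, the sketch below works through the arithmetic. It assumes every frame of the input video is written to the saved video; the helper saved_video_duration is a hypothetical name used for illustration and is not part of PixelLib.

```python
def saved_video_duration(frame_count, frames_per_second):
    """Return the playback length in seconds of a video that stores
    frame_count frames and plays them at frames_per_second."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return frame_count / frames_per_second


# A 3-second clip recorded at 30 frames per second holds 90 frames.
source_frames = 3 * 30

# Saved with frames_per_second=3, those 90 frames play back over 30 seconds.
print(saved_video_duration(source_frames, 3))   # 30.0
# Saved with frames_per_second=30, playback matches the original 3 seconds.
print(saved_video_duration(source_frames, 30))  # 3.0
```

Under that assumption, a low frames_per_second value gives a slower saved video and a higher value gives a faster one.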


Code for Object Extraction in Videos

import pixellib
from pixellib.torchbackend.instance import instanceSegmentation

ins = instanceSegmentation()
ins.load_model("pointrend_resnet50.pkl")
ins.process_video("sample.mp4", show_bboxes=True, extract_segmented_objects=True,
save_extracted_objects=True, frames_per_second=3, output_video_name="output_video.mp4")

ins.process_video("sample_video.mp4", show_bboxes=True, extract_segmented_objects=True, save_extracted_objects=True, frames_per_second=5, output_video_name="output_video.mp4")

The process_video function has two new parameters, extract_segmented_objects and save_extracted_objects, which extract and save the segmented objects respectively.
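Conceptually, extracting a segmented object means keeping the pixels covered by the object's mask and blanking out everything else. The sketch below shows the idea on a tiny grayscale grid built from plain Python lists. The function extract_with_mask and the data layout are assumptions made for illustration; they do not describe PixelLib's internals.

```python
def extract_with_mask(image, mask, background=0):
    """Keep pixels where mask is True; replace all others with background."""
    return [
        [pixel if keep else background for pixel, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]

# True marks the pixels that belong to the segmented object.
mask = [
    [False, True, False],
    [True, True, True],
    [False, True, False],
]

for row in extract_with_mask(image, mask):
    print(row)
# [0, 20, 0]
# [40, 50, 60]
# [0, 80, 0]
```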

Extraction from Bounding Box Coordinates in Videos

Modified Code for Extraction

import pixellib
from pixellib.torchbackend.instance import instanceSegmentation

ins = instanceSegmentation()
ins.load_model("pointrend_resnet50.pkl")
ins.process_video("sample.mp4", show_bboxes=True, extract_segmented_objects=True, extract_from_box=True,
save_extracted_objects=True, frames_per_second=5, output_video_name="output_video.mp4")

ins.process_video("sample.mp4", show_bboxes=True, extract_segmented_objects=True, extract_from_box=True, save_extracted_objects=True, frames_per_second=5, output_video_name="output_video.mp4")

The parameter extract_from_box was added to the function to extract the segmented objects using their bounding box coordinates.
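Extracting from box coordinates amounts to cropping the rectangle that encloses the object, so any background pixels inside the box are kept. The sketch below is a minimal illustration, assuming a box given as (x1, y1, x2, y2) with an exclusive lower-right corner. The helper crop_box and this box convention are hypothetical and are not PixelLib's documented format.

```python
def crop_box(image, box):
    """Return the rectangular region of image described by box.

    box is (x1, y1, x2, y2); x2 and y2 are exclusive (assumed convention).
    """
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

# Crop the 2x2 block in the middle of the 4x4 grid.
print(crop_box(image, (1, 1, 3, 3)))  # [[6, 7], [10, 11]]
```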

Custom Object Segmentation in Videos

PixelLib makes it possible to perform custom object segmentation in videos, filtering out unused detections and segmenting only the target classes.

Code for Custom Detection

import pixellib
from pixellib.torchbackend.instance import instanceSegmentation

ins = instanceSegmentation()
ins.load_model("pointrend_resnet50.pkl")
target_classes = ins.select_target_classes(person=True, bicycle=True)
ins.process_video("sample_video.mp4", show_bboxes=True, segment_target_classes=target_classes,
frames_per_second=5, output_video_name="output_video.mp4")
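The filtering idea behind target classes can be sketched in a few lines. The detection records below, with the keys class_name and score, and the helper filter_detections are hypothetical; they only illustrate how unused detections are dropped and do not show what select_target_classes returns or how PixelLib stores its results.

```python
def filter_detections(detections, **selected):
    """Keep only detections whose class was switched on, e.g. person=True."""
    wanted = {name for name, enabled in selected.items() if enabled}
    return [d for d in detections if d["class_name"] in wanted]


# Hypothetical detections for a single frame.
detections = [
    {"class_name": "person", "score": 0.98},
    {"class_name": "car", "score": 0.91},
    {"class_name": "bicycle", "score": 0.87},
]

kept = filter_detections(detections, person=True, bicycle=True)
print([d["class_name"] for d in kept])  # ['person', 'bicycle']
```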

Code for Fast Mode Detection in Video Segmentation

import pixellib
from pixellib.torchbackend.instance import instanceSegmentation

ins = instanceSegmentation()
ins.load_model("pointrend_resnet50.pkl", detection_speed="fast")
ins.process_video("sample_video.mp4", show_bboxes=True, frames_per_second=5, output_video_name="output_video.mp4")

Code for Rapid Mode Detection in Video Segmentation

import pixellib
from pixellib.torchbackend.instance import instanceSegmentation

ins = instanceSegmentation()
ins.load_model("pointrend_resnet50.pkl", detection_speed="rapid")
ins.process_video("sample_video.mp4", show_bboxes=True, frames_per_second=5, output_video_name="output_video.mp4")

Segmentation of Objects in Live Camera Feeds

PixelLib provides support for real-time segmentation of live camera feeds.

Code for Segmentation of Live Camera Feeds

import pixellib
from pixellib.torchbackend.instance import instanceSegmentation
import cv2

capture = cv2.VideoCapture(0)

segment_video = instanceSegmentation()
segment_video.load_model("pointrend_resnet50.pkl")
segment_video.process_camera(capture, show_bboxes=True, frames_per_second=5, check_fps=True, show_frames=True,
frame_name="frame", output_video_name="output_video.mp4")

import cv2
capture = cv2.VideoCapture(0)

We imported cv2 and added the code that captures the camera's frames.
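Before handing capture to the segmentation call, it can help to confirm that the camera actually opened. The sketch below relies only on the isOpened and release methods that cv2.VideoCapture provides. The helper require_open is a hypothetical name, and FakeCapture is a stand-in so the sketch runs on a machine without a camera or OpenCV.

```python
def require_open(capture):
    """Return capture if it is open; otherwise release it and raise."""
    if not capture.isOpened():
        capture.release()
        raise RuntimeError("Could not open the camera")
    return capture


class FakeCapture:
    """Stand-in for cv2.VideoCapture, used only to demonstrate the check."""

    def __init__(self, opened):
        self.opened = opened
        self.released = False

    def isOpened(self):
        return self.opened

    def release(self):
        self.released = True


# With OpenCV installed, this would read:
#     capture = require_open(cv2.VideoCapture(0))
capture = require_open(FakeCapture(opened=True))

try:
    require_open(FakeCapture(opened=False))
except RuntimeError as error:
    print(error)  # Could not open the camera
```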

segment_video.process_camera(capture, show_bboxes=True, frames_per_second=5, check_fps=True, show_frames=True, frame_name="frame", output_video_name="output_video.mp4")

In the code for performing segmentation, we replaced the video's file path with capture, i.e., we are processing a stream of frames captured by the camera. We added extra parameters for showing the camera's frames:

show_frames: This is the parameter that controls whether the segmented camera frames are displayed.

frame_name: This is the name given to the displayed camera frame.

check_fps: This is the parameter that prints the frames per second when the camera feed processing ends.

show_bboxes: This is an optional parameter that shows segmented objects with bounding boxes.

frames_per_second: This is the parameter that sets the number of frames per second for the saved video file. In this case it is set to 5, i.e., the saved video file will have 5 frames per second.

output_video_name: This is the name of the saved segmented video.
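A frames-per-second figure like the one check_fps reports can be obtained by dividing the number of processed frames by the elapsed time. The sketch below is one way to compute it, not PixelLib's implementation. The helper measure_fps and its clock argument are hypothetical names, and the sleep call merely simulates the cost of segmenting a frame.

```python
import time


def measure_fps(frames, process_frame, clock=time.perf_counter):
    """Process every frame and return the average frames per second."""
    start = clock()
    count = 0
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = clock() - start
    return count / elapsed if elapsed > 0 else float("inf")


# Simulate five frames that each take roughly 10 ms to process.
fps = measure_fps(range(5), lambda frame: time.sleep(0.01))
print(f"{fps:.1f} frames per second")
```

Passing the clock as an argument keeps the function easy to check with fixed timestamps.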

Speed Adjustments for Live Camera Feeds Processing

The default speed mode reaches 4 fps, the fast speed mode reaches 6 fps, and the rapid speed mode reaches 9 fps. These figures were measured on a 4 GB Nvidia GPU.
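These rates translate directly into a time budget per frame. The sketch below works through the arithmetic for the three reported figures; the dictionary keys are labels chosen here for illustration, and the numbers are the ones quoted above rather than new measurements.

```python
# Frames per second reported above for each speed mode.
reported_fps = {"default": 4, "fast": 6, "rapid": 9}


def frame_latency_ms(fps):
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


for mode, fps in reported_fps.items():
    speedup = fps / reported_fps["default"]
    print(f"{mode}: {frame_latency_ms(fps):.1f} ms per frame, "
          f"{speedup:.2f}x the default throughput")
# default: 250.0 ms per frame, 1.00x the default throughput
# fast: 166.7 ms per frame, 1.50x the default throughput
# rapid: 111.1 ms per frame, 2.25x the default throughput
```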

Code for Fast Mode Detection in Camera Feeds

import pixellib
from pixellib.torchbackend.instance import instanceSegmentation
import cv2

capture = cv2.VideoCapture(0)

segment_video = instanceSegmentation()
segment_video.load_model("pointrend_resnet50.pkl", detection_speed="fast")
segment_video.process_camera(capture, show_bboxes=True, frames_per_second=5, check_fps=True, show_frames=True,
frame_name="frame", output_video_name="output_video.mp4")

Code for Rapid Mode Detection in Camera Feeds

import pixellib
from pixellib.torchbackend.instance import instanceSegmentation
import cv2

capture = cv2.VideoCapture(0)

segment_video = instanceSegmentation()
segment_video.load_model("pointrend_resnet50.pkl", detection_speed="rapid")
segment_video.process_camera(capture, show_bboxes=True, frames_per_second=5, check_fps=True, show_frames=True,
frame_name="frame", output_video_name="output_video.mp4")

Code for Custom Object Segmentation in Live Camera Feeds

import pixellib
from pixellib.torchbackend.instance import instanceSegmentation
import cv2

capture = cv2.VideoCapture(0)

segment_video = instanceSegmentation()
segment_video.load_model("pointrend_resnet50.pkl")
target_classes = segment_video.select_target_classes(person=True)
segment_video.process_camera(capture, show_bboxes=True, frames_per_second=5, segment_target_classes=target_classes,
show_frames=True, frame_name="frame", output_video_name="output_video.mp4")

Code for Object Extraction in Live Camera Feeds

import pixellib
from pixellib.torchbackend.instance import instanceSegmentation
import cv2

capture = cv2.VideoCapture(0)

segment_video = instanceSegmentation()
segment_video.load_model("pointrend_resnet50.pkl")
segment_video.process_camera(capture, show_bboxes=True, frames_per_second=5, extract_segmented_objects=True, save_extracted_objects=True,
show_frames=True, frame_name="frame", output_video_name="output_video.mp4")

Code for Object Extraction from Box Coordinates in Live Camera Feeds

import pixellib
from pixellib.torchbackend.instance import instanceSegmentation
import cv2

capture = cv2.VideoCapture(0)

segment_video = instanceSegmentation()
segment_video.load_model("pointrend_resnet50.pkl")
segment_video.process_camera(capture, show_bboxes=True, frames_per_second=5, extract_segmented_objects=True, extract_from_box=True,
save_extracted_objects=True, show_frames=True, frame_name="frame", output_video_name="output_video.mp4")