VideoObjectTracking
VideoObjectTracking[video]
detects objects of interest in video and tracks them over video frames.
VideoObjectTracking[objects]
tracks the given precomputed objects, assuming they come from successive video frames.
VideoObjectTracking[…detector]
uses detector to find objects of interest in the input.
Details and Options
- VideoObjectTracking, also known as visual object tracking, tracks unique objects across the frames of a video, handling occlusions where possible. Tracked objects are also known as tracklets.
- Tracking can either detect objects in each frame automatically or be performed on a precomputed set of objects.
- The result, returned as an ObjectTrackingData object, includes times, labels and various other properties for each tracklet.
- Possible settings for objects and their corresponding outputs are:
  {{pos11,pos12,…},…}            tracking points as k→posij
  {{bbox11,bbox12,…},…}          tracking boxes as k→bboxij
  {label1→{bbox11,bbox12,…},…}   tracking boxes as {labeli,j}→bbox
  {lmat1,…}                      relabeling segments in label matrices lmati
  {t1→obj1,…}                    a list of times and objects
- By default, objects are detected using ImageBoundingBoxes. Possible settings for detector include:
  f                       a detector function that returns supported objects
  "concept"               named concept, as used in "Concept" entities
  "word"                  English word, as used in WordData
  wordspec                word sense specification, as used in WordData
  Entity[…]               any appropriate entity
  category1|category2|…   any of the categoryi
- Using VideoObjectTracking[{image1,image2,…}] is similar to tracking objects across frames of a video.
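For example, a minimal sketch of the precomputed-boxes form, using two synthetic frames (the boxes are illustrative stand-ins for real detections):

frames = {{Rectangle[{0, 0}, {2, 2}]}, {Rectangle[{1, 0}, {3, 2}]}};  (* one list of boxes per frame *)
VideoObjectTracking[frames]  (* returns an ObjectTrackingData object *)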
- The following options can be given:
  Method         Automatic   tracking method to use
  TargetDevice   Automatic   the target device on which to perform detection
- The possible values for the Method option are:
  "OCSort"          observation-centric SORT (simple, online, real-time) tracking; predicts object trajectories using Kalman estimators
  "RunningBuffer"   offline method; associates objects by comparing a buffer of frames
- When tracking label matrices, occlusions are not handled; label matrices can be tracked with Method→"RunningBuffer", as sketched below.
- With Method->{"OCSort",subopt}, the following suboptions can be specified:
-
"IOUThreshold" 0.2 intersection over union threshold between bounding boxes "OcclusionThreshold" 8 number of frames for which history of a tracklet is maintained before expiration "ORUHistory" 3 length of tracklet history to step back for tracklet re-update "OCMWeight" 0.2 observation-centric motion weight that accounts for the directionality of moving bounding boxes - With Method->{"RunningBuffer",subopt}, the following suboptions can be specified:
- With Method→{"RunningBuffer",subopt}, the following suboptions can be specified:
  "MaxCentroidDistance"   Automatic   maximum distance between the centroids for adjacent frames
  "OcclusionThreshold"    8           number of frames for which the history of a tracklet is maintained before expiration
- Additional "RunningBuffer" suboptions specifying contributions to the cost matrix are:
  "CentroidWeight"   0.5         centroid distance between components or bounding boxes
  "OverlapWeight"    1           overlap of components or bounding boxes
  "SizeWeight"       Automatic   size of components or bounding boxes
Examples
Basic Examples (2)
Scope (5)
Data (4)
Options (5)
Method (4)
"OCSort" (3)
In OCSort, motion is predicted using Kalman estimators. Higher values for "OCMWeight" increase the cost when boxes move away from the predicted positions.
Set up a problem with two sets of moving boxes:
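For instance, with synthetic data (the coordinates are illustrative):

boxes = Table[
   {Rectangle[{t, 0}, {t + 2, 2}], Rectangle[{20 - t, 0}, {22 - t, 2}]},  (* two boxes crossing paths *)
   {t, 0, 20}];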
By default, the direction of trajectories is very flexible. Notice that in the region of large intersection between the blue and red boxes, tracking may change direction suddenly:
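A sketch using the boxes above:

tracked = VideoObjectTracking[boxes, Method -> "OCSort"]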
Increasing the "WeightOCM" decreases chances of sudden direction changes:
The "IOUThreshold" suboption specifies a threshold for intersection over union between boxes in order to consider them as potentially the same object.
Set up a problem with a set of moving bounding boxes with a gap:
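For instance, with synthetic data (the jump size is illustrative):

boxes = Join[
   Table[{Rectangle[{t, 0}, {t + 2, 2}]}, {t, 0., 9.}],
   Table[{Rectangle[{t + 0.5, 0}, {t + 2.5, 2}]}, {t, 10., 20.}]];  (* the box jumps at t = 10 *)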
A higher threshold for intersection splits the object trajectory into two parts:
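A sketch with a stricter threshold (the value is illustrative):

VideoObjectTracking[boxes, Method -> {"OCSort", "IOUThreshold" -> 0.3}]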
A lower "IOUThreshold" merges the trajectories:
The "OcclusionThreshold" suboption deals with objects that disappear for some time (due to poor detection or occlusion).
Set up a problem with a moving bounding box and remove a couple of frames from the trajectory:
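For instance, with synthetic data:

boxes = Table[{Rectangle[{t, 0}, {t + 2, 2}]}, {t, 0, 20}];
occluded = ReplacePart[boxes, {10 -> {}, 11 -> {}, 12 -> {}}];  (* drop three frames of detections *)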
Without an occlusion threshold, the object is not re-associated with its tracklet once it re-emerges:
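A sketch, assuming "OcclusionThreshold" -> 0 disables the history:

VideoObjectTracking[occluded, Method -> {"OCSort", "OcclusionThreshold" -> 0}]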
With a higher occlusion threshold defined, the trajectory is linked back:
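For example (the value is illustrative):

VideoObjectTracking[occluded, Method -> {"OCSort", "OcclusionThreshold" -> 8}]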
TargetDevice (1)
By default, if no detection function is specified, detection is performed on the CPU:
Set the TargetDevice option to "GPU" to perform the detection on the GPU:
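A sketch (the file path is hypothetical):

video = Video["path/to/video.mp4"];  (* hypothetical path *)
VideoObjectTracking[video, TargetDevice -> "GPU"]  (* requires a supported GPU *)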
Applications (12)
Basic Uses (3)
Detect and track objects in a video:
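A minimal sketch (the file path is hypothetical):

video = Video["path/to/video.mp4"];
tracked = VideoObjectTracking[video]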
Highlight the objects in the video; notice that all are labeled with their detected classes:
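A sketch, assuming HighlightVideo accepts the ObjectTrackingData result directly:

HighlightVideo[video, tracked]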
Highlight the tracked objects with their corresponding indices:
Track labeled components from matrices:
Define a segmentation function that works on each frame:
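For instance, a simple binarization-based segmentation (illustrative; any function returning a label matrix per frame works):

segment[frame_Image] := MorphologicalComponents[ColorNegate@Binarize[frame]]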
Segment video frames and show components:
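A sketch, assuming VideoFrameList[video, All] returns every frame:

lmats = segment /@ VideoFrameList[video, All];
Colorize /@ lmats  (* show the labeled components *)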
Track the components across frames and show tracked components:
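A sketch using the label matrices above:

VideoObjectTracking[lmats, Method -> "RunningBuffer"]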
Use the YOLO V8 network from the Wolfram Neural Net Repository to perform the detection:
Retrieve the network and its evaluation function:
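A heavily hedged sketch: the model name is assumed from the Wolfram Neural Net Repository, and postprocess stands in for the evaluation function provided on the model's repository page:

net = NetModel["YOLO V8 Detect Trained on MS-COCO Data"];  (* name assumed *)
detect[frame_Image] := postprocess[net[frame]];  (* postprocess is hypothetical *)
VideoObjectTracking[video, detect]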
Count Objects (3)
Extract Tracked Objects (1)
Visualize Motion Trajectories (1)
Analyze Wildlife Videos (3)