VideoObjectTracking

VideoObjectTracking[video]

detects objects of interest in video and tracks them over video frames.

VideoObjectTracking[objects]

associates and tracks objects, assuming they come from video frames.

VideoObjectTracking[detector]

uses detector to find objects of interest in the input.

Details and Options

  • VideoObjectTracking, also known as visual object tracking, tracks unique objects across the frames of a video, handling occlusions when possible. Tracked objects are also known as tracklets.
  • Tracking can either detect objects in frames automatically or be performed on a precomputed set of objects.
  • The result, returned as an ObjectTrackingData object, includes times, labels and various other properties for each tracklet.
  • Possible settings for objects and their corresponding outputs are (see the combined sketch at the end of this section):
  • {{pos11,pos12,…},…}   tracking points posij
    {{bbox11,bbox12,…},…}   tracking boxes bboxij
    {label1->{bbox11,bbox12,…},…}   tracking boxes bboxij with labels labeli
    {lmat1,…}   relabeling segments in label matrices lmati
    {t1->obj1,…}   a list of times and objects
  • By default, objects are detected using ImageBoundingBoxes. Possible settings for detector include:
  • f   a detector function that returns supported objects
    "concept"   named concept, as used in "Concept" entities
    "word"   English word, as used in WordData
    wordspec   word sense specification, as used in WordData
    Entity[…]   any appropriate entity
    category1|category2|…   any of the categoryi
  • Using VideoObjectTracking[{image1,image2,…}] is similar to tracking objects across frames of a video.
  • The following options can be given:
  • Method   Automatic   tracking method to use
    TargetDevice   Automatic   the target device on which to perform the detection
  • The possible values for the Method option are:
  • "OCSort"observation-centric SORT (simple, online, real-time) tracking; predicts object trajectories using Kalman estimators
    "RunningBuffer"offline method, associates objects by comparing a buffer of frames
  • When tracking label matrices, occlusions are not handled by default; they can be tracked with Method->"RunningBuffer".
  • With Method->{"OCSort",subopt}, the following suboptions can be specified:
  • "IOUThreshold"0.2intersection over union threshold between bounding boxes
    "OcclusionThreshold"8number of frames for which history of a tracklet is maintained before expiration
    "ORUHistory"3length of tracklet history to step back for tracklet re-update
    "OCMWeight"0.2observation-centric motion weight that accounts for the directionality of moving bounding boxes
  • With Method->{"RunningBuffer",subopt}, the following suboptions can be specified:
  • "MaxCentroidDistance"Automaticmaximum distance between the centroids for adjacent frames
    "OcclusionThreshold"8number of frames for which the history of a tracklet is maintained before expiration
  • Additional "RunningBuffer" suboptions to specify the contribution to the cost matrix are:
  • "CentroidWeight"0.5centroid distance between components or bounding boxes
    "OverlapWeight"1overlap of components or bounding boxes
    "SizeWeight"Automaticsize of components or bounding boxes

Examples


Basic Examples  (2)

Detect and track objects in a video:
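
A minimal sketch of this call (the file path is hypothetical):

    video = Video["path/to/video.mp4"];
    tracks = VideoObjectTracking[video]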

Detect and track faces in a video:

Extract the first frame from each sub-video:

Scope  (5)

Data  (4)

Detect and track objects in a video:

Detect and track objects in a list of images:

Track a list of bounding boxes:

Track a list of points:

Detectors  (1)

Automatically detect objects and track them:

Specify a detector function to find objects:

Specify the category of object to detect and track:

Options  (5)

Method  (4)

"OCSort"  (3)

In "OCSort", motion is predicted using Kalman estimators. Higher values of "OCMWeight" increase the cost when boxes move away from their predicted positions.

Set up a problem with two sets of moving boxes:

By default, the direction of trajectories is very flexible. Notice that in the region of large intersection between the blue and red boxes, the tracking may change direction suddenly:

Increasing the "WeightOCM" decreases chances of sudden direction changes:

The "IOUThreshold" suboption specifies a threshold for intersection over union between boxes in order to consider them as potentially the same object.

Set up a problem with a set of moving bounding boxes with a gap:

A higher threshold for intersection splits the object trajectory into two parts:

A lower "IOUThreshold" merges the trajectories:

The "OcclusionThreshold" suboption deals with objects that disappear for some time (due to poor detection or occlusion).

Set up a problem with a moving bounding box and remove a couple of frames from the trajectory:

Without an occlusion threshold, the object is not re-associated with the tracklet once it re-emerges:

With a higher occlusion threshold, the trajectory is linked back together:
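
A sketch of the comparison, assuming boxes is the trajectory with removed frames set up above (the threshold values are illustrative):

    VideoObjectTracking[boxes, Method -> {"OCSort", "OcclusionThreshold" -> 0}]  (* no re-association *)
    VideoObjectTracking[boxes, Method -> {"OCSort", "OcclusionThreshold" -> 20}]  (* trajectory linked back *)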

"RunningBuffer"  (1)

The "RunningBuffer" method can typically better track objects whose trajectories have a jump due to occlusion or fast movement:

The "OCSort" method results in different instances of the same hummingbird:

"RunningBuffer" links the trajectories together to track the bird as one:

TargetDevice  (1)

By default, if no detection function is specified, detection is performed on the CPU:

Set the TargetDevice option to "GPU" to perform the detection on the GPU:
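
A sketch of the two settings, assuming video is a Video object:

    VideoObjectTracking[video]  (* detection on the CPU by default *)
    VideoObjectTracking[video, TargetDevice -> "GPU"]  (* detection on the GPU *)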

Applications  (12)

Basic Uses  (3)

Detect and track objects in a video:

Highlight the objects in the video; notice that all are labeled with their detected classes:

Track the detected objects:

Highlight tracked detected objects with their corresponding indices:

Track labeled components from matrices:

Define a segmentation function that works on each frame:

Segment video frames and show components:

Track the components across frames and show tracked components:
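
A sketch of this workflow; the segmentation based on Binarize and MorphologicalComponents is an assumed stand-in for the function used here:

    frames = VideoFrameList[video, All];
    lmats = MorphologicalComponents[Binarize[#]] & /@ frames;  (* one label matrix per frame *)
    tracks = VideoObjectTracking[lmats, Method -> "RunningBuffer"]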

Use the YOLO V8 network from the Wolfram Neural Net Repository to perform the detection:

Retrieve the network and its evaluation function:
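
A sketch of the retrieval step only (this is the model's name in the Wolfram Neural Net Repository; the evaluation function wrapping it is not shown):

    net = NetModel["YOLO V8 Detect Trained on MS-COCO Data"]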

Detect and track the object using the YOLO V8 network:

Highlight the tracked detected objects:

Count Objects  (3)

Count the number of detected objects in a video:

Track objects and find unique instances:

Get the final counts:
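
A sketch of the counting workflow; the "Labels" property of ObjectTrackingData is an assumed name:

    tracks = VideoObjectTracking[video];
    Counts[tracks["Labels"]]  (* hypothetical property giving one label per tracklet *)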

Count occurrences of a specific object:

Track objects and find unique instances:

Get the final counts:

Count the number of elephants in a video:

Extract Tracked Objects  (1)

Detect and track the contents of a video:

Extract the first of the detected labels:

Extract the sub-video corresponding to the first tracked object:

Visualize Motion Trajectories  (1)

Track pedestrians in a railway station:

Detect the bounding boxes and show them over the original video:

Track the boxes:

Plot the trajectories of the centroids of the boxes:
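
A sketch of the plotting step, assuming trackedBoxes is a hypothetical per-tracklet list of Rectangle boxes extracted from the tracking result:

    trajectories = Map[RegionCentroid, trackedBoxes, {2}];  (* centroid of each box *)
    ListLinePlot[trajectories]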

Overlay the trajectories onto the original video:

Analyze Wildlife Videos  (3)

Track a herd of migrating elephants:

Highlight frames with the tracked elephants:

Track a herd of galloping horses:

Track a flock of sheep entering a barn:

Analyze Human Videos  (1)

Estimate age from the face of each person in a video:

Detect and track faces:

Find the tracked faces with the longest duration in the video:

Construct a time series of selected labels:

Compute facial emotions for each tracked face:

Compute median estimated age for each face:

Text

Wolfram Research (2025), VideoObjectTracking, Wolfram Language function, https://reference.wolfram.com/language/ref/VideoObjectTracking.html.

CMS

Wolfram Language. 2025. "VideoObjectTracking." Wolfram Language & System Documentation Center. Wolfram Research. https://reference.wolfram.com/language/ref/VideoObjectTracking.html.

APA

Wolfram Language. (2025). VideoObjectTracking. Wolfram Language & System Documentation Center. Retrieved from https://reference.wolfram.com/language/ref/VideoObjectTracking.html

BibTeX

@misc{reference.wolfram_2024_videoobjecttracking, author="Wolfram Research", title="{VideoObjectTracking}", year="2025", howpublished="\url{https://reference.wolfram.com/language/ref/VideoObjectTracking.html}", note={Accessed: 15-January-2025}}

BibLaTeX

@online{reference.wolfram_2024_videoobjecttracking, organization={Wolfram Research}, title={VideoObjectTracking}, year={2025}, url={https://reference.wolfram.com/language/ref/VideoObjectTracking.html}, note={Accessed: 15-January-2025}}