VideoObjectTracking
VideoObjectTracking[video]
detects objects of interest in video and tracks them over video frames.
VideoObjectTracking[objects]
tracks the given precomputed objects, assuming they come from successive video frames.
VideoObjectTracking[…detector]
uses detector to find objects of interest in the input.
Details and Options
- VideoObjectTracking, also known as visual object tracking, tracks unique objects across the frames of a video, handling occlusions where possible. Tracked objects are also known as tracklets.
- Tracking can either detect objects in each frame automatically or be performed on a precomputed set of objects.
- The result, returned as an ObjectTrackingData object, includes times, labels and various other properties for each tracklet.
- Possible settings for objects and their corresponding outputs are:
  {{pos11,pos12,…},…}            tracking points as k→posij
  {{bbox11,bbox12,…},…}          tracking boxes as k→bboxij
  {label1→{bbox11,bbox12,…},…}   tracking boxes as {labeli,j}→bbox
  {lmat1,…}                      relabeling segments in label matrices lmati
  {t1→obj1,…}                    a list of times and objects
- By default, objects are detected using ImageBoundingBoxes. Possible settings for detector include:
  f                       a detector function that returns supported objects
  "concept"               named concept, as used in "Concept" entities
  "word"                  English word, as used in WordData
  wordspec                word sense specification, as used in WordData
  Entity[…]               any appropriate entity
  category1|category2|…   any of the categoryi
- Using VideoObjectTracking[{image1,image2,…}] is similar to tracking objects across frames of a video.
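For example, a minimal sketch of the precomputed-boxes form, using two synthetic frames (the boxes are illustrative stand-ins for real detections):

frames = {{Rectangle[{0, 0}, {2, 2}]}, {Rectangle[{1, 0}, {3, 2}]}};  (* one list of boxes per frame *)
VideoObjectTracking[frames]  (* returns an ObjectTrackingData object *)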
- The following options can be given:
  Method         Automatic   tracking method to use
  TargetDevice   Automatic   the target device on which to perform detection
- The possible values for the Method option are:
  "OCSort"          observation-centric SORT (simple, online, real-time) tracking; predicts object trajectories using Kalman estimators
  "RunningBuffer"   offline method; associates objects by comparing a buffer of frames
- When tracking label matrices, occlusions are not handled; label matrices can be tracked with Method→"RunningBuffer", as sketched below.
- With Method->{"OCSort",subopt}, the following suboptions can be specified:
-
"IOUThreshold" 0.2 intersection over union threshold between bounding boxes "OcclusionThreshold" 8 number of frames for which history of a tracklet is maintained before expiration "ORUHistory" 3 length of tracklet history to step back for tracklet re-update "OCMWeight" 0.2 observation-centric motion weight that accounts for the directionality of moving bounding boxes - With Method->{"RunningBuffer",subopt}, the following suboptions can be specified:
- With Method→{"RunningBuffer",subopt}, the following suboptions can be specified:
  "MaxCentroidDistance"   Automatic   maximum distance between the centroids for adjacent frames
  "OcclusionThreshold"    8           number of frames for which the history of a tracklet is maintained before expiration
- Additional "RunningBuffer" suboptions specifying contributions to the cost matrix are:
  "CentroidWeight"   0.5         centroid distance between components or bounding boxes
  "OverlapWeight"    1           overlap of components or bounding boxes
  "SizeWeight"       Automatic   size of components or bounding boxes
Examples
Basic Examples (2)
Scope (5)
Data (4)
Options (5)
Method (4)
"OCSort" (3)
In OCSort, motion is predicted using Kalman estimators. Higher values for "OCMWeight" increase the cost when boxes move away from the predicted positions.
Set up a problem with two sets of moving boxes:
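For instance, with synthetic data (the coordinates are illustrative):

boxes = Table[
   {Rectangle[{t, 0}, {t + 2, 2}], Rectangle[{20 - t, 0}, {22 - t, 2}]},  (* two boxes crossing paths *)
   {t, 0, 20}];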
By default, the direction of trajectories is very flexible. Notice that in the region of large intersection between the blue and red boxes, tracking may change direction suddenly:
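A sketch using the boxes above:

tracked = VideoObjectTracking[boxes, Method -> "OCSort"]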
Increasing the "WeightOCM" decreases chances of sudden direction changes:
The "IOUThreshold" suboption specifies a threshold for intersection over union between boxes in order to consider them as potentially the same object.
Set up a problem with a set of moving bounding boxes with a gap:
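For instance, with synthetic data (the jump size is illustrative):

boxes = Join[
   Table[{Rectangle[{t, 0}, {t + 2, 2}]}, {t, 0., 9.}],
   Table[{Rectangle[{t + 0.5, 0}, {t + 2.5, 2}]}, {t, 10., 20.}]];  (* the box jumps at t = 10 *)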
A higher threshold for intersection splits the object trajectory into two parts:
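A sketch with a stricter threshold (the value is illustrative):

VideoObjectTracking[boxes, Method -> {"OCSort", "IOUThreshold" -> 0.3}]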
A lower "IOUThreshold" merges the trajectories:
The "OcclusionThreshold" suboption deals with objects that disappear for some time (due to poor detection or occlusion).
Set up a problem with a moving bounding box and remove a couple of frames from the trajectory:
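For instance, with synthetic data:

boxes = Table[{Rectangle[{t, 0}, {t + 2, 2}]}, {t, 0, 20}];
occluded = ReplacePart[boxes, {10 -> {}, 11 -> {}, 12 -> {}}];  (* drop three frames of detections *)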
Without an occlusion threshold, the object is not re-associated with its tracklet once it re-emerges:
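A sketch, assuming "OcclusionThreshold" -> 0 disables the history:

VideoObjectTracking[occluded, Method -> {"OCSort", "OcclusionThreshold" -> 0}]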
With a higher occlusion threshold defined, the trajectory is linked back:
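For example (the value is illustrative):

VideoObjectTracking[occluded, Method -> {"OCSort", "OcclusionThreshold" -> 8}]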
TargetDevice (1)
By default, if no detection function is specified, detection is performed on the CPU:
Set the TargetDevice option to "GPU" to perform the detection on the GPU:
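A sketch (the file path is hypothetical):

video = Video["path/to/video.mp4"];  (* hypothetical path *)
VideoObjectTracking[video, TargetDevice -> "GPU"]  (* requires a supported GPU *)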
Applications (12)
Basic Uses (3)
Detect and track objects in a video:
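A minimal sketch (the file path is hypothetical):

video = Video["path/to/video.mp4"];
tracked = VideoObjectTracking[video]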
Highlight the objects in the video; notice that all are labeled with their detected classes:
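A sketch, assuming HighlightVideo accepts the ObjectTrackingData result directly:

HighlightVideo[video, tracked]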
Highlight the tracked objects with their corresponding indices:
Track labeled components from matrices:
Define a segmentation function that works on each frame:
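For instance, a simple binarization-based segmentation (illustrative; any function returning a label matrix per frame works):

segment[frame_Image] := MorphologicalComponents[ColorNegate@Binarize[frame]]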
Segment video frames and show components:
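A sketch, assuming VideoFrameList[video, All] returns every frame:

lmats = segment /@ VideoFrameList[video, All];
Colorize /@ lmats  (* show the labeled components *)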
Track the components across frames and show tracked components:
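A sketch using the label matrices above:

VideoObjectTracking[lmats, Method -> "RunningBuffer"]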
Use the YOLO V8 network from the Wolfram Neural Net Repository to perform the detection:
Retrieve the network and its evaluation function:
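A heavily hedged sketch: the model name is assumed from the Wolfram Neural Net Repository, and postprocess stands in for the evaluation function provided on the model's repository page:

net = NetModel["YOLO V8 Detect Trained on MS-COCO Data"];  (* name assumed *)
detect[frame_Image] := postprocess[net[frame]];  (* postprocess is hypothetical *)
VideoObjectTracking[video, detect]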
Count Objects (3)
Extract Tracked Objects (1)
Visualize Motion Trajectories (1)
Analyze Wildlife Videos (3)