AudioIntervals

AudioIntervals[audio]

returns audible intervals of audio.

AudioIntervals[audio,crit]

returns intervals of audio for which the criterion crit is satisfied.

AudioIntervals[audio,crit,mindur]

returns only intervals longer than the given duration mindur.

AudioIntervals[video,…]

returns only intervals from the first audio track in video.

Details and Options

AudioIntervals can be used to detect parts of an audio signal that have specific characteristics.
The criterion crit can be either a string specifying a high-level objective or a pure function using local audio properties.
High-level string settings for crit can be one of the following:

	"Audible"	audible intervals, RMS amplitude above 0.01
	"Inaudible"	inaudible intervals, RMS amplitude less than or equal to 0.01
	"Loud"	louder intervals, data-dependent threshold
	"Quiet"	quieter intervals, data-dependent threshold
	"VoiceActivity"	intervals with detected speech
	"VoiceInactivity"	intervals with no detected speech

The criterion crit can also be a pure function that refers to a local property "prop" through the named argument #prop. The function is evaluated on each partition and should return True or False.
The following properties can be used for interval selections.
Basic histogram properties:

	"MaxAbs"	maximum absolute value
	"Max"	maximum value
	"StandardDeviation"	standard deviation of values

Intensity properties:

	"Power"	mean of the squared values
	"RMSAmplitude"	root mean square of the values
	"Loudness"	the loudness using Stevens's power law
	"LoudnessEBU"	the loudness according to EBU momentary standard

Time domain properties:

	"CrestFactor"	maximum divided by the root mean square
	"Entropy"	entropy of values
	"PeakToAveragePowerRatio"	maximum power divided by the average power
	"ZeroCrossingRate"	rate of zero crossings
	"ZeroCrossings"	number of zero crossings

Frequency domain properties:

	"FundamentalFrequency"	estimated fundamental frequency
	"ModifiedKullbackLeibler"	modified Kullback–Leibler distance between spectra of consecutive partitions
	"SpectralCentroid"	centroid of the power spectrum
	"SpectralCrest"	maximum divided by the mean of the power spectrum
	"SpectralFlatness"	geometric mean divided by the mean of the power spectrum
	"SpectralKurtosis"	kurtosis of the magnitude spectrum
	"SpectralRollOff"	frequency below which most of the energy is concentrated
	"SpectralSkewness"	skewness of the magnitude spectrum
	"SpectralSlope"	estimated slope of the magnitude spectrum
	"SpectralSpread"	measure of the bandwidth of the power spectrum
	"SpeechFundamentalFrequency"	fundamental frequency optimized for speech signals
	"VoiceActivity"	detected voice activity for speech signals

The minimum duration mindur can be a non-negative real number in seconds, a time quantity, or a samples quantity.
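
The three forms of mindur can be sketched as follows. This is a minimal illustration on a synthetic signal, not an example from the original page; the test signal, the 0.5-second threshold and the explicit 44100 Hz sample rate are assumptions chosen so that the three calls describe the same duration.

```wolfram
(* Assumed test signal: tone, 0.2 s of silence, tone, 1 s of silence, tone *)
tone = AudioGenerator["Sin", 1, SampleRate -> 44100];
a = AudioJoin[{tone, AudioPad[tone, {0.2, 0}], AudioPad[tone, {1, 0}]}];

(* Minimum duration as a plain number of seconds *)
bySeconds = AudioIntervals[a, "Inaudible", 0.5]

(* The same minimum duration as a time quantity *)
byTime = AudioIntervals[a, "Inaudible", Quantity[500, "Milliseconds"]]

(* The same minimum duration as a samples quantity (22050 samples at 44100 Hz) *)
bySamples = AudioIntervals[a, "Inaudible", Quantity[22050, "Samples"]]
```

With this signal, the short 0.2-second gap should be dropped by all three calls, while the 1-second gap is long enough to be kept.
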
The following options can be given:

	Alignment	Automatic	alignment of the time stamps with partitions
	FourierParameters	{-1,1}	Fourier parameters
	PartitionGranularity	Automatic	audio partitioning specification

By default, measurements are returned at the center of each partition. Using the Alignment option, measurements can be returned at the beginning (Left) or end (Right) of each partition.
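
The effect of the Alignment option can be explored with a sketch like the one below. The synthetic signal is an assumption made for illustration; the interval boundaries are expected to shift slightly between the three settings, by an amount that depends on the partition size.

```wolfram
(* Assumed test signal: 1 s tone, 1 s of silence, 1 s tone *)
tone = AudioGenerator["Sin", 1];
a = AudioJoin[{tone, AudioPad[tone, {1, 0}]}];

(* Time stamps at the center, beginning and end of each partition *)
centered = AudioIntervals[a, "Inaudible", Alignment -> Center]
left = AudioIntervals[a, "Inaudible", Alignment -> Left]
right = AudioIntervals[a, "Inaudible", Alignment -> Right]
```
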

Examples


Basic Examples (2)

Compute silent intervals of audio:

Find intervals where the RMS amplitude is less than 0.01:
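
The input for this example is not preserved here. A minimal sketch, assuming a synthetic signal with one second of silence between two tones:

```wolfram
(* Assumed test signal: 1 s tone, 1 s of silence, 1 s tone *)
tone = AudioGenerator["Sin", 1];
a = AudioJoin[{tone, AudioPad[tone, {1, 0}]}];

(* Each partition is kept when its local RMS amplitude is below 0.01 *)
silent = AudioIntervals[a, #RMSAmplitude < 0.01 &]
```

The result is a list of {start, end} pairs in seconds.
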

Visualize silent intervals:
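
The original graphic is not preserved here. As a simple stand-in, the detected intervals can be drawn on a number line; the use of NumberLinePlot and the synthetic signal are assumptions, not the visualization used on the original page.

```wolfram
(* Assumed test signal: 1 s tone, 1 s of silence, 1 s tone *)
tone = AudioGenerator["Sin", 1];
a = AudioJoin[{tone, AudioPad[tone, {1, 0}]}];
silent = AudioIntervals[a, #RMSAmplitude < 0.01 &];

(* Convert the {start, end} pairs to a single Interval and plot it against time in seconds *)
plot = NumberLinePlot[Interval @@ silent]
```
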

Find intervals with low RMS amplitudes:

Visualize the resulting intervals:

Scope (4)

Find quiet intervals using a data-dependent threshold:

By default, intervals of any length are returned:

Compute the interval durations:
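
A minimal sketch of the duration computation, assuming a synthetic signal with a short and a long silent gap:

```wolfram
(* Assumed test signal: tone, 0.2 s of silence, tone, 1 s of silence, tone *)
tone = AudioGenerator["Sin", 1];
a = AudioJoin[{tone, AudioPad[tone, {0.2, 0}], AudioPad[tone, {1, 0}]}];
int = AudioIntervals[a, "Inaudible"];

(* Each interval is a {start, end} pair, so its duration is end minus start *)
durations = (#2 - #1 &) @@@ int
```
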

Find only intervals longer than a specified threshold:

Test multiple properties at once:
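
Several properties can be combined in one criterion with logical operators. The sketch below assumes a synthetic signal made of a 440 Hz tone and a 2000 Hz tone separated by silence; the thresholds 0.1 and 1000 are illustrative choices.

```wolfram
(* Assumed test signal: 440 Hz tone, 1 s of silence, 2000 Hz tone *)
low = AudioGenerator[{"Sin", 440}, 1];
high = AudioGenerator[{"Sin", 2000}, 1];
a = AudioJoin[{low, AudioPad[high, {1, 0}]}];

(* Keep partitions that are both loud and concentrated at low frequencies *)
int = AudioIntervals[a, #RMSAmplitude > 0.1 && #SpectralCentroid < 1000 &]
```

Because && stops at the first False operand, the spectral test is only evaluated on partitions that already pass the amplitude test.
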

Analyze the audio track of a video:

Options (2)

PartitionGranularity (2)

Specify a partition size of 100 ms:

Use an offset of 10 ms:

Use a smoothing window:
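
The three partitioning specifications above can be sketched as follows. The synthetic signal and the choice of HannWindow as the smoothing window are assumptions made for illustration.

```wolfram
(* Assumed test signal: 1 s tone, 1 s of silence, 1 s tone *)
tone = AudioGenerator["Sin", 1];
a = AudioJoin[{tone, AudioPad[tone, {1, 0}]}];

size = Quantity[100, "Milliseconds"];
offset = Quantity[10, "Milliseconds"];

(* Partition size only *)
r1 = AudioIntervals[a, "Inaudible", PartitionGranularity -> size]

(* Partition size and offset between consecutive partitions *)
r2 = AudioIntervals[a, "Inaudible", PartitionGranularity -> {size, offset}]

(* Partition size, offset and a smoothing window applied to each partition *)
r3 = AudioIntervals[a, "Inaudible", PartitionGranularity -> {size, offset, HannWindow}]
```
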

Using different partitioning specifications will give different results:

A coarse partitioning will result in a faster computation:

Applications (4)

Delete silent intervals of audio:

Find the intervals where the RMS amplitude is larger than a threshold:

Join the extracted intervals:
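
A minimal sketch of the whole workflow, assuming a synthetic signal and an RMS threshold of 0.01: the audible intervals are found, extracted with AudioTrim and concatenated with AudioJoin.

```wolfram
(* Assumed test signal: 1 s tone, 1 s of silence, 1 s tone *)
tone = AudioGenerator["Sin", 1];
a = AudioJoin[{tone, AudioPad[tone, {1, 0}]}];

(* Intervals where the local RMS amplitude exceeds the threshold *)
int = AudioIntervals[a, #RMSAmplitude > 0.01 &];

(* AudioTrim with a list of intervals gives a list of audio objects; join them into one *)
trimmed = AudioJoin[AudioTrim[a, int]]
```

The resulting audio should be shorter than the input by roughly the length of the removed silence.
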

It is also possible to find silent intervals using a momentary loudness definition from the EBU standard:

Use the "VoiceActivity" property to detect voiced intervals in a speech signal:
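
The speech recording used on the original page is not preserved here. The sketch below assumes a synthesized utterance padded with silence, and uses the string criteria "VoiceActivity" and "VoiceInactivity" from the table in Details rather than a property function. Speech synthesis depends on the platform, and voice detection may need to download resources on first use.

```wolfram
(* Assumed test signal: synthesized speech with 1 s of silence on each side *)
speech = AudioPad[SpeechSynthesize["Hello. This is a short test sentence."], {1, 1}];

(* Intervals with and without detected speech *)
voiced = AudioIntervals[speech, "VoiceActivity"]
unvoiced = AudioIntervals[speech, "VoiceInactivity"]
```
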

Visualize the detected intervals:

Combine other properties such as RMS amplitude and spectral flatness to find unvoiced audio segments:

Visualize the detected intervals:

Detect unvoiced segments and attenuate them:

Use the "VoiceActivity" property to detect unvoiced intervals:

Visualize the detected intervals:

Attenuate the detected intervals:

Possible Issues (1)

The criterion function will fail if the return value is not a Boolean:

Some properties, such as "FundamentalFrequency", can have non-numeric values, so extra care is needed:
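
One way to guard against non-numeric property values is to wrap the comparison in TrueQ, which gives False for anything that is not explicitly True. This is a sketch under assumed inputs; the synthetic signal and the 400 Hz threshold are illustrative.

```wolfram
(* Assumed test signal: 440 Hz tone, 1 s of silence, 440 Hz tone *)
tone = AudioGenerator[{"Sin", 440}, 1];
a = AudioJoin[{tone, AudioPad[tone, {1, 0}]}];

(* A comparison with a non-numeric value stays unevaluated; TrueQ turns it into False *)
pitched = AudioIntervals[a, TrueQ[#FundamentalFrequency > 400] &]
```
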
