AudioIntervals can be used to detect parts of an audio signal that have specific characteristics.
The criterion crit can be either a string specifying a high-level objective or a pure function of local audio properties.
The high-level string setting for crit can be one of the following:

	"Audible"	audible intervals, RMS amplitude above 0.01
	"Inaudible"	inaudible intervals, RMS amplitude less than or equal to 0.01
	"Loud"	louder intervals, data-dependent threshold
	"Quiet"	quieter intervals, data-dependent threshold
	"VoiceActivity"	intervals with detected speech
	"VoiceInactivity"	intervals with no detected speech

The criterion crit can also be a pure function that takes #prop arguments, where #prop refers to the local property "prop" computed for each partition.
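
As a minimal sketch of how a property-based criterion relates to local measurements, the per-partition values that a slot such as #RMSAmplitude refers to can be inspected with AudioLocalMeasurements before choosing a threshold. The synthetic test signal and the 0.01 threshold are assumptions chosen for illustration, and the default partitioning of the two functions is not guaranteed to be identical:

```wolfram
(* Illustrative signal (an assumption): 1 s tone, 1 s silence, 1 s tone *)
a = AudioJoin[{
    AudioGenerator[{"Sin", 440}, 1],
    AudioGenerator["Silence", 1],
    AudioGenerator[{"Sin", 440}, 1]}];

(* Inspect the local RMS amplitude as a time series to help pick a threshold *)
rms = AudioLocalMeasurements[a, "RMSAmplitude"];
ListLinePlot[rms, PlotRange -> All]

(* Select the intervals whose local RMS amplitude is below the threshold *)
AudioIntervals[a, #RMSAmplitude < 0.01 &]
```

The pure function receives one set of property values per partition and must return True or False for that partition.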
The following properties can be used for interval selections.
Basic histogram properties:

	"MaxAbs"	maximum absolute value
	"Max"	maximum value
	"StandardDeviation"	standard deviation of values

Intensity properties:

	"Power"	mean of the squared values
	"RMSAmplitude"	root mean square of the values
	"Loudness"	the loudness using Stevens' power law
	"LoudnessEBU"	the loudness according to EBU momentary standard

Time domain properties:

	"CrestFactor"	maximum divided by the root mean square
	"Entropy"	entropy of values
	"PeakToAveragePowerRatio"	maximum power divided by the average power
	"ZeroCrossingRate"	rate of zero crossings
	"ZeroCrossings"	number of zero crossings

Frequency domain properties:

	"FundamentalFrequency"	estimated fundamental frequency
	"ModifiedKullbackLeibler"	modified Kullback–Leibler distance between spectra of consecutive partitions
	"SpectralCentroid"	centroid of the power spectrum
	"SpectralCrest"	maximum divided by the mean of the power spectrum
	"SpectralFlatness"	geometric mean divided by the mean of the power spectrum
	"SpectralKurtosis"	kurtosis of the magnitude spectrum
	"SpectralRollOff"	frequency below which most of the energy is concentrated
	"SpectralSkewness"	skewness of the magnitude spectrum
	"SpectralSlope"	estimated slope of the magnitude spectrum
	"SpectralSpread"	measure of the bandwidth of the power spectrum
	"SpeechFundamentalFrequency"	fundamental frequency optimized for speech signals
	"VoiceActivity"	detected voice activity for speech signals

The minimum duration mindur can be a non-negative real number in seconds, a time quantity, or a samples quantity.
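
The three forms of mindur can be sketched as follows. The synthetic signal and the half-second threshold are assumptions for illustration, and the "Samples" unit is assumed to be accepted here in the same way as in other audio functions:

```wolfram
(* Illustrative signal (an assumption): 1 s tone, 1 s silence, 1 s tone *)
a = AudioJoin[{
    AudioGenerator[{"Sin", 440}, 1],
    AudioGenerator["Silence", 1],
    AudioGenerator[{"Sin", 440}, 1]}];
crit = #RMSAmplitude < 0.01 &;

(* Minimum duration as a real number of seconds *)
AudioIntervals[a, crit, 0.5]

(* Minimum duration as a time quantity *)
AudioIntervals[a, crit, Quantity[500, "Milliseconds"]]

(* Minimum duration as a samples quantity: half a second at the audio's sample rate *)
n = Round[0.5 QuantityMagnitude[AudioSampleRate[a]]];
AudioIntervals[a, crit, Quantity[n, "Samples"]]
```
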
The following options can be given:

Alignment	Automatic	alignment of the time stamps with partitions
FourierParameters	{-1,1}	Fourier parameters
PartitionGranularity	Automatic	audio partitioning specification

By default, measurements are returned at the center of each partition. Using the Alignment option, measurements can be returned at the beginning (Left) or end (Right) of each partition.
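
A minimal sketch of the Alignment option, assuming a synthetic signal and a 0.01 threshold chosen only for illustration. Comparing the two results shows how the choice of alignment affects the reported interval boundaries:

```wolfram
(* Illustrative signal (an assumption): 1 s tone, 1 s silence, 1 s tone *)
a = AudioJoin[{
    AudioGenerator[{"Sin", 440}, 1],
    AudioGenerator["Silence", 1],
    AudioGenerator[{"Sin", 440}, 1]}];

(* Time stamps taken at the beginning of each partition *)
left = AudioIntervals[a, #RMSAmplitude < 0.01 &, Alignment -> Left]

(* Time stamps taken at the end of each partition *)
right = AudioIntervals[a, #RMSAmplitude < 0.01 &, Alignment -> Right]
```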

Examples

Basic Examples (2)

Compute silent intervals of audio:

Wolfram Language code: a = ExampleData[{"Sound", "AltoFluteScale"}, "Audio"]

Find intervals where the RMS amplitude is less than 0.01:

Wolfram Language code: int = AudioIntervals[a, "Inaudible"]

Visualize silent intervals:

Wolfram Language code: AudioPlot[a, Epilog -> {RGBColor[1, 0, 0, .5], Rectangle[{#[[1]], -1}, {#[[2]], 1}]& /@ int}]

Find intervals with low RMS amplitudes:

Wolfram Language code: a = ExampleData[{"Sound", "AltoFluteScale"}, "Audio"]

Wolfram Language code: int = AudioIntervals[a, #RMSAmplitude < .01&]

Visualize the resulting intervals:

Wolfram Language code: AudioPlot[a, Epilog -> {RGBColor[1, 0, 0, .5], Rectangle[{#[[1]], -1}, {#[[2]], 1}]& /@ int}]

Scope (4)

Find quiet intervals using a data-dependent threshold:

Wolfram Language code:

a = Import["ExampleData/rule30.wav"];
AudioIntervals[a, "Quiet"]

Wolfram Language code: AudioPlot[a, Epilog -> {RGBColor[1, 0, 0, .5], Rectangle[{#[[1]], -1}, {#[[2]], 1}]& /@ %}]

By default, intervals of any length are returned:

Wolfram Language code: a = ExampleData[{"Sound", "AltoFluteScale"}, "Audio"]

Wolfram Language code: int = AudioIntervals[a, #RMSAmplitude < .01&]

Compute the interval durations:

Wolfram Language code: Differences /@ int//Flatten

Find only intervals longer than a specified threshold:

Wolfram Language code: AudioIntervals[a, #RMSAmplitude < .01&, 0.2]

Test multiple properties at once:

Wolfram Language code: a = ExampleData[{"Audio", "CelloScale"}, "Audio"];

Wolfram Language code: AudioIntervals[a, #RMSAmplitude > .03 && (#SpectralCentroid < 500 || #SpectralCentroid > 800)&]

Analyze the audio track of a video:

Wolfram Language code: AudioIntervals[Video["ExampleData/fish.mp4"]]

Options (2)

PartitionGranularity (2)

Specify a partition size of 100 ms:

Wolfram Language code: a = Import["ExampleData/rule30.wav"];

Wolfram Language code: AudioIntervals[a, #Power > .001&, PartitionGranularity -> Quantity[100, "Milliseconds"]]

Use an offset of 10 ms:

Wolfram Language code: AudioIntervals[a, #Power > .001&, PartitionGranularity -> {Quantity[100, "Milliseconds"], Quantity[10, "Milliseconds"]}]

Use a smoothing window:

Wolfram Language code:

AudioIntervals[a, #Power > .001&, PartitionGranularity -> {Quantity[100, "Milliseconds"], Quantity[10, "Milliseconds"], HannWindow}]

Using different partitioning specifications will give different results:

Wolfram Language code: a = ExampleData[{"Sound", "Viola"}, "Audio"];

Wolfram Language code: AudioIntervals[a, #RMSAmplitude < .01&, PartitionGranularity -> {.005, .001}]

A coarse partitioning will result in a faster computation:

Wolfram Language code: AudioIntervals[a, #RMSAmplitude < .01&, PartitionGranularity -> {.1, .1}]

Applications (4)

Delete silent intervals of audio:

Wolfram Language code: a = ExampleData[{"Sound", "Apollo11SmallStep"}, "Audio"]

Find the intervals where the RMS amplitude is smaller than a threshold:

Wolfram Language code: silentIntervals = AudioIntervals[a, #RMSAmplitude < .04&, 0.001]

Delete the detected intervals and compare the result with the original:

Wolfram Language code: AudioDelete[a, silentIntervals]

Wolfram Language code: AudioPlot[{a, %}]

It is also possible to find silent intervals using a momentary loudness definition from the EBU standard:

Wolfram Language code: silentIntervals = AudioIntervals[a, #LoudnessEBU < -23&, 0.001]

Wolfram Language code: AudioDelete[a, silentIntervals]

Wolfram Language code: AudioPlot[{a, %}]

Use the "VoiceActivity" property to detect voiced intervals in a speech signal:

Wolfram Language code: a = ExampleData[{"Audio", "MaleVoice"}, "Audio"]

Wolfram Language code: voiced = AudioIntervals[a, #VoiceActivity == 1&, .1, PartitionGranularity -> {.06, .01}]

Visualize the detected intervals:

Wolfram Language code:

AudioPlot[a, PlotLayout -> "Averaged", Epilog -> {RGBColor[1, 0, 0, 0.3], Rectangle[{#[[1]], -1}, {#[[2]], 1}]& /@ voiced}]

Combine other properties such as RMS amplitude and spectral flatness to find unvoiced audio segments:

Wolfram Language code: a = ExampleData[{"Audio", "NoisyTalk"}, "Audio"]

Wolfram Language code: unvoiced = AudioIntervals[a, #RMSAmplitude < .03 && #SpectralFlatness > .0001&, .1, PartitionGranularity -> {.06, .01}]

Visualize the detected intervals:

Wolfram Language code: AudioPlot[a, Epilog -> {RGBColor[1, 0, 0, .3], Rectangle[{#[[1]], -1}, {#[[2]], 1}]& /@ unvoiced}, ImageSize -> Medium]

Detect unvoiced segments and attenuate them:

Wolfram Language code: a = ExampleData[{"Audio", "NoisyTalk"}, "Audio"]

Use the "VoiceActivity" property to detect unvoiced intervals:

Wolfram Language code: nonVoicedIntervals = AudioIntervals[a, #VoiceActivity == 0&, .1, PartitionGranularity -> {.02, .01}]

Visualize the detected intervals:

Wolfram Language code:

AudioPlot[a, Epilog -> {RGBColor[1, 0, 0, .3], Rectangle[{#[[1]], -1}, {#[[2]], 1}]& /@ nonVoicedIntervals}, ImageSize -> Medium]

Attenuate the detected intervals:

Wolfram Language code: AudioJoin[Riffle[AudioFade /@ AudioTrim[a, Except@nonVoicedIntervals], 0.3×AudioTrim[a, nonVoicedIntervals]]]

Possible Issues (1)

The criterion function will fail if the return value is not a Boolean:

Wolfram Language code:

a = ExampleData[{"Audio", "PianoScale"}, "Audio"];
AudioIntervals[a, Red&]//Head

Some properties, such as "FundamentalFrequency", can have non-numeric values, so extra care is needed:

Wolfram Language code: AudioIntervals[a, #FundamentalFrequency > 260&]//Head

Wolfram Language code: AudioIntervals[a, TrueQ[#FundamentalFrequency > 260]&]
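
An alternative guard, sketched here as one possible approach, is to test explicitly that the value is numeric before comparing. Because && evaluates its arguments in order and stops at the first False, the comparison is only reached for numeric values:

```wolfram
a = ExampleData[{"Audio", "PianoScale"}, "Audio"];

(* NumericQ gives False for non-numeric values, so the criterion always returns a Boolean *)
AudioIntervals[a, NumericQ[#FundamentalFrequency] && #FundamentalFrequency > 260 &]
```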
