AudioLocalMeasurements

AudioLocalMeasurements[audio,"prop"]

computes the property "prop" locally for partitions of audio.

AudioLocalMeasurements[audio,{"prop1","prop2",…}]

computes several properties "propi".

AudioLocalMeasurements[audio,"prop",format]

returns the measurements in the specified output format.

Details and Options

  • AudioLocalMeasurements are also known as audio features or descriptors.
  • AudioLocalMeasurements returns a TimeSeries containing one measurement for each partition.
  • Measurements are computed on the average channel values.
  • Basic histogram properties:
    "Max"                  maximum value
    "MaxAbs"               maximum absolute value
    "Min"                  minimum value
    "MinAbs"               minimum absolute value
    "MinMax"               minimum and maximum values
    "MinMaxAbs"            minimum and maximum absolute values
    "Mean"                 mean value
    "Median"               median value
    "StandardDeviation"    standard deviation of values
    "Total"                sum of values
  • Intensity properties:
    "Power"           mean of the squared values
    "RMSAmplitude"    root mean square of the values
    "Loudness"        an estimated loudness measure
  • The loudness property uses Stevens's power law, computed using .
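The "Power" and "RMSAmplitude" definitions above can be illustrated with a short Python sketch. It is not the Wolfram implementation: the `partition` helper, the non-overlapping 100-sample partitions, and the test signal are all assumptions made for illustration, and padding is ignored.

```python
import math

def partition(samples, size, hop):
    """Split samples into partitions of `size` samples, advancing by `hop` (hypothetical helper)."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, hop)]

def power(frame):
    """Mean of the squared values of one partition."""
    return sum(x * x for x in frame) / len(frame)

def rms_amplitude(frame):
    """Root mean square of the values of one partition."""
    return math.sqrt(power(frame))

# Assumed input: a 100 Hz sine sampled at 1000 Hz whose amplitude doubles halfway through.
rate = 1000
signal = [(1.0 if n < 500 else 2.0) * math.sin(2 * math.pi * 100 * n / rate)
          for n in range(1000)]

frames = partition(signal, size=100, hop=100)
rms_values = [rms_amplitude(f) for f in frames]
```

Each partition covers a whole number of periods, so the RMS values come out as the amplitude divided by the square root of 2: roughly 0.707 for the first half of the signal and 1.414 for the second.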
  • Time domain properties:
    "CrestFactor"                maximum divided by the root mean square
    "Entropy"                    entropy of values
    "LPC"                        linear prediction coefficients
    "PeakToAveragePowerRatio"    maximum power divided by the average power
    "TemporalCentroid"           temporal centroid of values
    "ZeroCrossingRate"           rate of zero crossings
    "ZeroCrossings"              number of zero crossings for the partition
  • The "LPC" property returns 12 coefficients that are estimated using linear predictive coding. Using {"LPC",n}, n coefficients are returned.
  • LPC coefficients are commonly used in the analysis and encoding of speech signals.
  • The temporal centroid property gives the center of gravity of the energy of each partition. A temporal centroid of 0.5 means the center of the partition, while 0 and 1 correspond to the beginning and end of the partition.
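A few of the time domain properties can be sketched in Python from their textual definitions. This is an illustration under stated assumptions, not the Wolfram implementation: the crest factor uses the peak absolute value, a zero crossing is counted as a sign change between adjacent samples, and the temporal centroid is the energy-weighted mean sample position normalized to the range 0 to 1.

```python
import math

def crest_factor(frame):
    """Peak absolute value divided by the root mean square (assumed definition)."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return max(abs(x) for x in frame) / rms

def zero_crossings(frame):
    """Number of sign changes between adjacent samples (assumed definition)."""
    return sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)

def temporal_centroid(frame):
    """Energy-weighted mean position: 0 is the start of the partition, 1 is the end."""
    energy = [x * x for x in frame]
    weighted = sum(i * e for i, e in enumerate(energy))
    return weighted / (sum(energy) * (len(frame) - 1))

# Energy concentrated at the start, middle and end of a five-sample partition.
centroids = [temporal_centroid(f) for f in ([1, 0, 0, 0, 0], [0, 0, 1, 0, 0], [0, 0, 0, 0, 1])]
```

The three example partitions give centroids of 0, 0.5 and 1, matching the description of the temporal centroid above. A completely silent partition has zero energy and would need a guard before either division.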
  • Frequency domain properties:
    "FundamentalFrequency"    estimated fundamental frequency
    "Formants"                frequencies of the formants of the signal
    "HighFrequencyContent"    average of the linearly weighted power spectrum
    "MFCC"                    mel-frequency cepstral coefficients
    "SpectralCentroid"        centroid of the power spectrum
    "SpectralCrest"           maximum divided by the mean of the power spectrum
    "SpectralFlatness"        geometric mean divided by the mean of the power spectrum
    "SpectralKurtosis"        kurtosis of the magnitude spectrum
    "SpectralRollOff"         frequency below which most of the energy is concentrated
    "SpectralSkewness"        skewness of the magnitude spectrum
    "SpectralSlope"           estimated slope of the magnitude spectrum
    "SpectralSpread"          measure of the bandwidth of the power spectrum
  • Using {"FundamentalFrequency",t,minfreq,maxfreq}, only frequencies detected with a confidence of t or higher in the frequency range between minfreq and maxfreq are returned. The default values are optimized for signals including speech and instruments.
  • Using {"Formants",n,m}, n formants are returned using m LPC coefficients. By default, and m depends on the input sample rate.
  • The MFCC property returns 13 coefficients. Using {"MFCC",n,m,minfreq,maxfreq}, n coefficients are returned using m filters in the frequency range between minfreq and maxfreq.
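The spectral centroid, crest and flatness definitions listed among the frequency domain properties can be sketched in Python for a single partition. The sketch assumes an unnormalized one-sided DFT with no windowing, so absolute power values may differ by a constant scale from what AudioLocalMeasurements computes; the helper names are hypothetical.

```python
import cmath
import math

def power_spectrum(frame):
    """One-sided power spectrum from a naive, unnormalized DFT (assumed convention)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def spectral_centroid(power, rate, n):
    """Power-weighted mean frequency in hertz; bin k sits at k * rate / n."""
    freqs = [k * rate / n for k in range(len(power))]
    return sum(f * p for f, p in zip(freqs, power)) / sum(power)

def spectral_crest(power):
    """Maximum divided by the mean of the power spectrum."""
    return max(power) / (sum(power) / len(power))

def spectral_flatness(power):
    """Geometric mean divided by the mean of the power spectrum."""
    if min(power) <= 0:
        return 0.0  # a zero bin makes the geometric mean zero
    geometric_mean = math.exp(sum(math.log(p) for p in power) / len(power))
    return geometric_mean / (sum(power) / len(power))

# Assumed inputs: a 2 Hz sine and a unit impulse, 16 samples at 16 Hz.
rate, n = 16, 16
sine = [math.sin(2 * math.pi * 2 * t / rate) for t in range(n)]
impulse = [1.0] + [0.0] * (n - 1)
centroid = spectral_centroid(power_spectrum(sine), rate, n)
flatness = spectral_flatness(power_spectrum(impulse))
```

The pure sine concentrates its power in one bin, so its centroid sits at the sine frequency and its crest is high. The impulse has a perfectly flat spectrum, giving a flatness and a crest of 1.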
  • Frequency domain properties computed on consecutive partitions:
    "ComplexDomainDistance"      distance between predicted and measured Fourier
    "ModifiedKullbackLeibler"    modified Kullback-Leibler distance between spectra
    "Novelty"                    estimated measure for significant changes
    "PhaseDeviation"             phase difference between predicted and measured Fourier
    "SpectralFlux"               norm of the difference between consecutive spectra
  • By default, a list of property values is returned. Other format specifications include:
    Automatic        determine the output automatically
    "Association"    format the result as an Association
    "Dataset"        format the result as a Dataset
    "List"           format the result as a List
    "RuleList"       format the result as a list of Rule expressions
  • The following options can be given:
    Alignment               Center       alignment of the time stamps with partitions
    FourierParameters       {-1,1}       Fourier parameters
    Padding                 Automatic    padding scheme
    PaddingSize             Automatic    amount of padding
    PartitionGranularity    Automatic    audio partitioning specification
    MetaInformation         None         include additional meta-information
    MissingDataMethod       None         method to use for missing values
    ResamplingMethod        Automatic    the method to use for resampling paths
  • By default, measurements are returned at the center of each partition. Using the Alignment option, measurements can be returned at the beginning (Left) or end (Right) of each partition.
  • By default, the signal is padded by half of the partition size at both ends with silence. For possible settings for Padding, see the reference page for AudioPad.
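The interaction between padding and time stamp alignment can be sketched in Python. This is a guess at the bookkeeping implied by the two notes above, not the Wolfram implementation: the helper names, the fixed hop size, and the convention that time 0 is the first sample of the unpadded signal are all assumptions.

```python
def padded_partitions(samples, size, hop):
    """Pad with silence by half a partition at both ends, then partition.

    Returns (start, frame) pairs, where start is the index of the first
    sample of the partition relative to the unpadded signal.
    """
    half = size // 2
    padded = [0.0] * half + list(samples) + [0.0] * half
    return [(start - half, padded[start:start + size])
            for start in range(0, len(padded) - size + 1, hop)]

def time_stamp(start, size, rate, alignment="Center"):
    """Time in seconds reported for a partition under the given alignment."""
    offset = {"Left": 0, "Center": size / 2, "Right": size}[alignment]
    return (start + offset) / rate

# Assumed input: eight samples at 2 Hz, partitions of four samples, hop of two.
parts = padded_partitions([1.0] * 8, size=4, hop=2)
first_start, first_frame = parts[0]
stamps = [time_stamp(first_start, 4, 2, a) for a in ("Left", "Center", "Right")]
```

Under these assumptions the first partition is half silence, and its center falls on time 0, the start of the original signal. Choosing Left or Right shifts the same measurement to -1 or 1 second in this example.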

Examples

Basic Examples  (2)

Compute the RMS amplitude of an audio object:

In[3]:=
Out[3]=

Plot the measurement:

In[4]:=
Out[4]=

Compute multiple measurements:

In[1]:=
Out[1]=

Plot the measurements:

In[2]:=
Out[2]=

Scope  (20)

Options  (5)

Applications  (2)

Possible Issues  (1)

Neat Examples  (2)

See Also

AudioPartition  AudioBlockMap  AudioMeasurements  AudioLoudness  AudioIntervals  ImageMeasurements

Introduced in 2016 (11.0) | Updated in 2017 (11.1)