returns a distance measure between audio1 and audio2.

Details and Options

  • AudioDistance computes a dissimilarity measure between audio objects. The measure can compare waveforms or other features of the signals, using a choice of distance functions.
  • If audio1 and audio2 have different durations, the distance is by default computed after trimming both signals to the shorter duration.
  • The following options can be specified:
    DistanceFunction        Automatic    the distance function to use
    Masking                 Automatic    the audio intervals to use for comparison
    PartitionGranularity    Automatic    audio partitioning specification
    SampleRate              Automatic    sample rate for conforming audioi
  • By default, using DistanceFunction->Automatic, the EuclideanDistance of audio waveforms is computed. Compute other measures using different distance functions or different features.
  • The following distance functions are computed from the Fourier transform of audioi:
    "SpectralEuclidean"                   Euclidean applied to the power spectra (default)
    "SpectralItakuraSaito"                maximum likelihood of LPC-derived spectral envelopes
    "SpectralMagnitudePhaseDistortion"    the average of magnitude and phase spectral distances
    "SpectralRMSLog"                      Euclidean applied to the log of power spectra
    "SpectralFirstOrderDifferential"      distance between first-order derivatives of power spectra
    "SpectralSecondOrderDifferential"     distance between second-order derivatives of power spectra
    "Cepstral"                            Euclidean applied to the power cepstra
  • Additional DistanceFunction settings are also available and can work on different audio features:
    EuclideanDistance                     Euclidean distance
    SquaredEuclideanDistance              squared Euclidean distance
    NormalizedSquaredEuclideanDistance    normalized squared Euclidean distance
    RootMeanSquare                        root mean square distance
    ManhattanDistance                     Manhattan or "city block" distance
    CosineDistance                        angular cosine distance
    CorrelationDistance                   correlation coefficient distance
    WarpingDistance                       dynamic time warping (DTW) distance
    f                                     an arbitrary function f
  • By default, WarpingDistance is computed from the "MFCC" features and all other distances are computed from "AudioData".
  • Using DistanceFunction->{method,FeatureExtractor->f}, a different feature extractor can be specified.
  • Possible settings for FeatureExtractor include:
    "AudioData"         audio data
    "Formants"          frequencies of the formants of the signal
    "LPC"               linear prediction coefficients
    "MelSpectrogram"    mel-scale audio spectrogram
    "MFCC"              sequence of mel-frequency cepstral coefficient vectors
    "Novelty"           estimated measure for significant changes
  • By default, AudioDistance trims both signals to the shorter duration before computing the distance.
  • Use the Masking option to compute the distance measure on different intervals. Possible settings include:
    Automatic                  trim to the shorter duration (default)
    All                        pad to the longer duration
    {t1,t2}                    compare the signals between times t1 and t2
    {{t11,t12},{t21,t22}}      t11 to t12 from audio1 compared to t21 to t22 from audio2
  • Using Masking->{{t11,t12},{t21,t22}}, the two intervals should have the same duration.
  • PartitionGranularity is only used with features that work on partitioned audio, like "MFCC", and ignored otherwise.
  • By default, SampleRate->Automatic uses the highest sample rate among all audioi.
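
The Masking and SampleRate notes above can be sketched as follows. This is an illustrative sketch rather than an example from the original page: the variable names shortTone and longTone, the AudioGenerator test signals, and the specific times and rates are assumptions, as is giving the Masking times as plain numbers of seconds.

```wolfram
(* Illustrative inputs with different durations and sample rates *)
shortTone = AudioGenerator[{"Sin", 440}, 1, SampleRate -> 22050];
longTone = AudioGenerator[{"Sin", 440}, 2, SampleRate -> 44100];

(* Default (Masking -> Automatic): trim to the shorter duration *)
AudioDistance[shortTone, longTone]

(* Pad to the longer duration instead *)
AudioDistance[shortTone, longTone, Masking -> All]

(* Compare the same interval, 0.25 s to 0.75 s, in both signals *)
AudioDistance[shortTone, longTone, Masking -> {0.25, 0.75}]

(* Compare 0 s to 0.5 s of the first signal with 1 s to 1.5 s of the second;
   both intervals have the same 0.5 s duration *)
AudioDistance[shortTone, longTone, Masking -> {{0, 0.5}, {1, 1.5}}]

(* Conform both signals to 22050 Hz rather than the highest sample rate *)
AudioDistance[shortTone, longTone, SampleRate -> 22050]
```

Each call returns a single distance value, so the effect of a setting can be checked by comparing the results side by side.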



Basic Examples  (1)

Distance between two audio objects:
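
A minimal sketch of such a call, assuming two synthetic test tones built with AudioGenerator; the variable names and the 440 Hz and 445 Hz frequencies are illustrative choices, not the inputs of the original example:

```wolfram
(* Two one-second sine tones with slightly different frequencies (illustrative inputs) *)
audio1 = AudioGenerator[{"Sin", 440}, 1];
audio2 = AudioGenerator[{"Sin", 445}, 1];

(* Default settings, i.e. DistanceFunction -> Automatic *)
AudioDistance[audio1, audio2]
```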


Scope  (1)

Options  (13)
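
As a hedged illustration of the DistanceFunction settings listed under Details and Options, the sketch below compares a sine tone with a square wave. The test signals and the names a1 and a2 are assumptions made for illustration; the setting names and the {method, FeatureExtractor -> f} form are taken from the notes above, but the particular pairing of EuclideanDistance with "MFCC" is only one possible choice.

```wolfram
(* Illustrative inputs: a sine tone and a square wave at the same frequency *)
a1 = AudioGenerator[{"Sin", 440}, 1];
a2 = AudioGenerator[{"Square", 440}, 1];

(* A distance computed from the power spectra *)
AudioDistance[a1, a2, DistanceFunction -> "SpectralEuclidean"]

(* A distance computed from the power cepstra *)
AudioDistance[a1, a2, DistanceFunction -> "Cepstral"]

(* Dynamic time warping; per the notes above, this uses "MFCC" features by default *)
AudioDistance[a1, a2, DistanceFunction -> WarpingDistance]

(* Pair a distance function with an explicitly chosen feature extractor *)
AudioDistance[a1, a2,
  DistanceFunction -> {EuclideanDistance, FeatureExtractor -> "MFCC"}]
```

The values returned by different settings are on different scales, so they are best compared across pairs of signals under the same setting rather than against each other.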

Applications  (1)

Introduced in 2018