AudioIdentify

AudioIdentify[audio]

yields the result of attempting to identify what audio is a recording of.

AudioIdentify[audio,category]

restricts the identification to the specified category.

AudioIdentify[audio,category,n]

gives a list of up to n possible identifications.

AudioIdentify[audio,category,n,"prop"]

gives the specified property for each identification.

Details and Options

  • Audio identification, also known as audio classification, attempts to identify the sounds in an audio recording.
  • AudioIdentify[{audio1,audio2,…},…] can be used to identify the sounds in multiple audio objects.
  • In AudioIdentify[audio,category], possible forms for category include:
      "class"                  named sound class, as used in "Sound" entities
      Entity[…]                any appropriate entity
      category1|category2|…    any of the categoryi
  • By default, AudioIdentify[audio] returns objects of the form Entity["Sound",…].
  • The property "prop" can be one of the following:
      "Probability"    an association of concepts and probabilities
      "Sound"          a sound entity object
      "prop"           a property supported by "Sound" entities
      {prop1,…}        a list of property specifications
  • The following options can be given:
      AcceptanceThreshold    Automatic           minimum probability to consider acceptable
      Masking                All                 interval of interest
      PerformanceGoal        $PerformanceGoal    what to optimize in the identification
      SpecificityGoal        Automatic           what specificity of object type to seek
      TargetDevice           "CPU"               the target device on which to evaluate
  • Possible settings for PerformanceGoal include "Speed" and "Quality".
  • Possible settings for SpecificityGoal include:
      "Low"     favor general categories of objects
      "High"    favor specific kinds of objects
      s         specificity between 0 (lowest) and 1 (highest)
  • When no identification is found at an acceptable level, as specified by AcceptanceThreshold, AudioIdentify returns Missing["Unidentified"].
  • AudioIdentify uses machine learning. Its methods, training sets and biases included therein may change and yield varied results in different versions of the Wolfram Language.
  • AudioIdentify may download resources that will be stored in your local object store at $LocalBase and can be listed using LocalObjects[] and removed using ResourceRemove.
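
The list form and the Missing["Unidentified"] result described above can be combined as in the following sketch. The ExampleData names {"Audio","Bird"} and {"Audio","Bee"} are assumptions about what is available on your system; any Audio objects can be substituted. Results depend on the model version, so no output is shown.

```wolfram
(* sample recordings; these ExampleData names are assumptions *)
recordings = {ExampleData[{"Audio", "Bird"}], ExampleData[{"Audio", "Bee"}]};

(* a list input gives one result per recording *)
results = AudioIdentify[recordings, AcceptanceThreshold -> 0.5];

(* replace any unidentified result with a readable label *)
Replace[results, _Missing -> "unidentified", {1}]
```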

Examples


Basic Examples  (2)

Identify the sound in a recording:
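
A minimal sketch of this call, assuming the ExampleData entry {"Audio","Bird"} is available (any Audio object can be substituted). The result depends on the trained model, so no output is shown.

```wolfram
(* load a sample recording; the ExampleData name is an assumption *)
bird = ExampleData[{"Audio", "Bird"}];

(* returns a "Sound" entity, or Missing["Unidentified"] if nothing is acceptable *)
result = AudioIdentify[bird]
```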

Return a list of identifications:

Scope  (3)

Identify the sound class in a recording:

Make an identification in a specific category of sounds:

Make an identification in any of several categories of sounds:
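
The two category forms above might look as follows. This is a sketch only: the class names "Animal" and "Music" are assumed to be valid sound classes, and the ExampleData entry is assumed to exist.

```wolfram
(* sample recording; the ExampleData name is an assumption *)
bird = ExampleData[{"Audio", "Bird"}];

(* restrict to a single class; "Animal" is an assumed class name *)
single = AudioIdentify[bird, "Animal"];

(* allow any of several classes; "Music" is likewise assumed *)
several = AudioIdentify[bird, "Animal" | "Music"];

{single, several}
```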

Get several identifications:

Returned identifications are the lesser of the number of positive identifications and the requested number:

Get probabilities for each identification:

Class probabilities are independent, since multiple sources may be present in a single recording.
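
One way to see this, sketched under the assumptions that "Animal" is a valid class name and that the ExampleData entry exists: request the "Probability" property and total the values. With independent class probabilities the total is not constrained to equal 1.

```wolfram
(* sample recording; the ExampleData name is an assumption *)
bird = ExampleData[{"Audio", "Bird"}];

(* up to 5 identifications with their probabilities; "Animal" is an assumed class name *)
probs = AudioIdentify[bird, "Animal", 5, "Probability"];

(* total the probabilities if an association was returned *)
If[AssociationQ[probs], Total[Values[probs]], probs]
```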

Options  (4)

AcceptanceThreshold  (2)

Use AcceptanceThreshold to control the confidence of the returned result:

Increase the threshold to get only high-probability identifications:

AcceptanceThreshold is also used when getting multiple identifications:

Lower the threshold to get more results:
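
A sketch contrasting a high and a low threshold when several identifications are requested. The threshold values are arbitrary, and the class name "Animal" and the ExampleData entry are assumptions.

```wolfram
(* sample recording; the ExampleData name is an assumption *)
bird = ExampleData[{"Audio", "Bird"}];

(* keep only high-probability identifications; "Animal" is an assumed class name *)
strict = AudioIdentify[bird, "Animal", 10, AcceptanceThreshold -> 0.8];

(* accept low-probability identifications as well *)
loose = AudioIdentify[bird, "Animal", 10, AcceptanceThreshold -> 0.05];

{strict, loose}
```

The lower threshold is expected to return at least as many identifications as the higher one, up to the requested maximum of 10.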

Masking  (1)

Audio recordings that contain various sounds can result in confusing identifications:

Use the masking option to identify only specific regions in a signal:

SpecificityGoal  (1)

Use the SpecificityGoal option to control the generality of the result:
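
A sketch that runs the same recording through the three kinds of setting listed in the details above. The ExampleData entry is an assumption, and 0.5 is an arbitrary intermediate value.

```wolfram
(* sample recording; the ExampleData name is an assumption *)
bird = ExampleData[{"Audio", "Bird"}];

(* map each SpecificityGoal setting to the identification it produces *)
bySpecificity = AssociationMap[
  AudioIdentify[bird, SpecificityGoal -> #] &,
  {"Low", 0.5, "High"}
]
```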

Applications  (3)

Identify all the sounds in the ExampleData collection:

Get multiple identifications and probabilities on a signal containing various sounds:

Make identifications over 1-second intervals using AudioBlockMap:

Merge the intervals with the same identifications:
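
The merging step can be sketched independently of the identification itself. The labels below are hypothetical stand-ins for per-second AudioIdentify results, not real output; consecutive seconds with the same label are collapsed into one interval.

```wolfram
(* hypothetical per-second labels standing in for AudioIdentify results *)
labels = {"Speech", "Speech", "Music", "Music", "Music", "Speech"};

(* pair each label with the start time of its 1-second block *)
timed = Transpose[{Range[0, Length[labels] - 1], labels}];

(* group runs of adjacent blocks that share a label *)
runs = Split[timed, Last[#1] === Last[#2] &];

(* one interval per run: from the first block's start to the last block's end *)
merged = {Interval[{#[[1, 1]], #[[-1, 1]] + 1}], #[[1, 2]]} & /@ runs
```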

Plot the result:

Using WebAudioSearch, construct a small database of animal sounds:

Use FeatureSpacePlot to visualize the signals embedded in a semantically significant 2D space:

Define a function to identify the signals exclusively as animal sounds:

Generate a WordCloud with the results of the recognitions:

Properties & Relations  (1)

The neural net used by AudioIdentify can be accessed using NetModel:

Introduced in 2019 (12.0)