AudioIdentify

AudioIdentify[audio]

yields the result of attempting to identify what audio is a recording of.

AudioIdentify[audio,category]

restricts the identification to the specified category.

AudioIdentify[audio,category,n]

gives a list of up to n possible identifications.

AudioIdentify[audio,category,n,"prop"]

gives the specified property for each identification.

Details and Options

  • Audio identification, also known as audio classification, attempts to identify the sounds in an audio recording.
  • AudioIdentify[{audio1,audio2,…},…] can be used to identify the sounds in multiple audio objects.
  • In AudioIdentify[audio,category], possible forms for category include:
      "class"                    named sound class, as used in "Sound" entities
      Entity[…]                  any appropriate entity
      category1|category2|…      any of the categoryi
  • By default, AudioIdentify[audio] returns objects of the form Entity["Sound",…].
  • The property "prop" can be one of the following:
      "Probability"      an association of concepts and probabilities
      "Sound"            a sound entity object
      "prop"             a property supported by "Sound" entities
      {prop1,…}          a list of property specifications
  • The following options can be given:
      AcceptanceThreshold    Automatic           minimum probability to consider acceptable
      Masking                All                 interval of interest
      PerformanceGoal        $PerformanceGoal    what to optimize in the identification
      SpecificityGoal        Automatic           what specificity of object type to seek
      TargetDevice           "CPU"               the target device on which to evaluate
  • Possible settings for PerformanceGoal include "Speed" and "Quality".
  • Possible settings for SpecificityGoal include:
      "Low"      favor general categories of objects
      "High"     favor specific kinds of objects
      s          specificity between 0 (lowest) and 1 (highest)
  • When no identification is found at an acceptable level, as specified by AcceptanceThreshold, AudioIdentify returns Missing["Unidentified"].
  • AudioIdentify uses machine learning. Its methods, training sets and biases included therein may change and yield varied results in different versions of the Wolfram Language.
  • AudioIdentify may download resources that will be stored in your local object store at $LocalBase and can be listed using LocalObjects[] and removed using ResourceRemove.
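
The list form and the Missing["Unidentified"] result described above can be combined in a short sketch. The sample names "Bird" and "Bee" are assumed to be available in the ExampleData audio collection; any other Audio objects could be substituted.

```wolfram
(* Assumed samples: "Bird" and "Bee" from the ExampleData audio collection *)
recordings = {ExampleData[{"Audio", "Bird"}], ExampleData[{"Audio", "Bee"}]};

(* A list of audio objects yields one result per recording *)
results = AudioIdentify[recordings];

(* Drop any Missing["Unidentified"] entries, keeping only accepted identifications *)
DeleteMissing[results]
```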

Examples


Basic Examples  (2)

Identify the sound in a recording:
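
A minimal sketch of this call, assuming the "Bird" sample from the ExampleData audio collection (the choice of recording is an illustration, not part of the original page):

```wolfram
(* Assumed sample recording; replace with any Audio object *)
audio = ExampleData[{"Audio", "Bird"}];

(* With a single argument, the result is an Entity["Sound", …] or Missing["Unidentified"] *)
AudioIdentify[audio]
```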

Return a list of identifications:
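
One way this might look, assuming that All is accepted in the category position to leave the category unrestricted (as it is in the related function ImageIdentify) and again using an assumed ExampleData sample:

```wolfram
(* Assumed sample recording *)
audio = ExampleData[{"Audio", "Bird"}];

(* Request up to 5 identifications; All as the category is an assumption *)
AudioIdentify[audio, All, 5]
```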

Scope  (3)

Identify the sound class in a recording:

Make an identification in a specific category of sounds:

Make an identification in any of several categories of sounds:

Get several identifications:

The number of returned identifications is the lesser of the number of positive identifications and the requested number:

Get probabilities for each identification:

Class probabilities are independent, since multiple sources may be present in a single recording.
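
A hedged sketch of the "Probability" property follows. It assumes the same ExampleData sample and the All category as above. Because the class probabilities are independent, the total computed at the end is not expected to equal 1.

```wolfram
(* Assumed sample recording *)
audio = ExampleData[{"Audio", "Bird"}];

(* Association of identifications and their probabilities; All is an assumed category *)
probs = AudioIdentify[audio, All, 5, "Probability"];

(* The probabilities are independent, so their sum need not be 1 *)
Total[Values[probs]]
```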

Options  (4)

AcceptanceThreshold  (2)

Use AcceptanceThreshold to control the confidence of the returned result:

Increase the threshold to get only high-probability identifications:

AcceptanceThreshold is also used when getting multiple identifications:

Lower the threshold to get more results:
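
The effect of the threshold can be sketched by running the same request twice. The threshold values 0.5 and 0.01, the sample recording and the All category are illustrative assumptions:

```wolfram
(* Assumed sample recording *)
audio = ExampleData[{"Audio", "Bird"}];

(* A high threshold keeps only confident identifications *)
strict = AudioIdentify[audio, All, 10, "Probability", AcceptanceThreshold -> 0.5];

(* A low threshold admits weaker identifications, up to the requested 10 *)
lenient = AudioIdentify[audio, All, 10, "Probability", AcceptanceThreshold -> 0.01];

{strict, lenient}
```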

Masking  (1)

Audio recordings that contain various sounds can result in confusing identifications:

Use the masking option to identify only specific regions in a signal:

SpecificityGoal  (1)

Use the SpecificityGoal option to control the generality of the result:
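
A minimal comparison of the two named settings, using an assumed ExampleData sample:

```wolfram
(* Assumed sample recording *)
audio = ExampleData[{"Audio", "Bird"}];

(* "Low" favors general categories, "High" favors specific kinds of sounds *)
general = AudioIdentify[audio, SpecificityGoal -> "Low"];
specific = AudioIdentify[audio, SpecificityGoal -> "High"];

{general, specific}
```

A numeric setting between 0 and 1 can be used in place of the named values to choose an intermediate specificity.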

Applications  (3)

Identify all the sounds in the ExampleData collection:

Get multiple identifications and probabilities on a signal containing various sounds:

Make identifications over 1-second intervals using AudioBlockMap:
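
A sketch of this step, assuming that AudioBlockMap accepts the block duration as a Quantity and using an assumed ExampleData sample in place of the original recording:

```wolfram
(* Assumed sample recording; a longer signal with several sounds is more informative *)
audio = ExampleData[{"Audio", "Bird"}];

(* Apply AudioIdentify to consecutive non-overlapping 1-second blocks *)
blocks = AudioBlockMap[AudioIdentify, audio, Quantity[1, "Seconds"]]
```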

Merge the intervals with the same identifications:

Plot the result:

Using WebAudioSearch, construct a small database of animal sounds:

Use FeatureSpacePlot to visualize the signals embedded in a semantically significant 2D space:

Define a function to identify the signals exclusively as animal sounds:

Generate a WordCloud with the results of the recognitions:

Properties & Relations  (1)

The neural net used by AudioIdentify can be accessed using NetModel:
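
A hedged sketch of the lookup. The model name below is an assumption about the Wolfram Neural Net Repository entry and should be checked against the repository before use:

```wolfram
(* Assumed repository name of the model behind AudioIdentify *)
net = NetModel["Wolfram AudioIdentify V1 Trained on AudioSet Data"]
```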

Wolfram Research (2019), AudioIdentify, Wolfram Language function, https://reference.wolfram.com/language/ref/AudioIdentify.html.
