AudioIdentify
AudioIdentify[audio]
yields the result of attempting to identify what audio is a recording of.
AudioIdentify[audio,category]
restricts the identification to the specified category.
AudioIdentify[audio,category,n]
gives a list of up to n possible identifications.
AudioIdentify[audio,category,n,"prop"]
gives the specified property for each identification.
Details and Options
- Audio identification, also known as audio classification, attempts to identify the sounds in an audio recording.
- AudioIdentify[{audio1,audio2,…},…] can be used to identify the sounds in multiple audio objects.
- In AudioIdentify[audio,category], possible forms for category include:
    "class"                    named sound class, as used in "Sound" entities
    Entity[…]                  any appropriate entity
    category1|category2|…      any of the categoryi
- By default, AudioIdentify[audio] returns objects of the form Entity["Sound",…].
- The property "prop" can be one of the following:
    "Probability"    an association of concepts and probabilities
    "Sound"          a sound entity object
    "prop"           a property supported by "Sound" entities
    {prop1,…}        a list of property specifications
- The following options can be given:
    AcceptanceThreshold    Automatic           minimum probability to consider acceptable
    Masking                All                 interval of interest
    PerformanceGoal        $PerformanceGoal    what to optimize in the identification
    SpecificityGoal        Automatic           what specificity of object type to seek
    TargetDevice           "CPU"               the target device on which to evaluate
- Possible settings for PerformanceGoal include "Speed" and "Quality".
- Possible settings for SpecificityGoal include:
    "Low"     favor general categories of objects
    "High"    favor specific kinds of objects
    s         specificity between 0 (lowest) and 1 (highest)
- When no identification is found at an acceptable level, as specified by AcceptanceThreshold, AudioIdentify returns Missing["Unidentified"].
- AudioIdentify uses machine learning. Its methods, training sets and biases included therein may change and yield varied results in different versions of the Wolfram Language.
- AudioIdentify may download resources that will be stored in your local object store at $LocalBase and can be listed using LocalObjects[] and removed using ResourceRemove.
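As a rough sketch of the list form and the Missing["Unidentified"] result described in the details above: the sample name "Apollo11SmallStep" and the threshold value are assumptions chosen for illustration, and any Audio object can be substituted.

```wolfram
(* Assumed sample recording; replace with any Audio object *)
speech = ExampleData[{"Audio", "Apollo11SmallStep"}];

(* One second of white noise, which may or may not be identified *)
noise = AudioGenerator["White", 1];

(* Identify several audio objects in one call *)
results = AudioIdentify[{speech, noise}, AcceptanceThreshold -> 0.5];

(* Replace any Missing["Unidentified"] entry with a readable label *)
Replace[results, _Missing -> "no acceptable identification", {1}]
```

Results depend on the model version, so no particular output should be expected.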
Examples
Scope (3)
Identify the sound class in a recording:
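The original input cell is not preserved here; a minimal sketch, assuming the ExampleData sample "Apollo11SmallStep" is available, could look like this:

```wolfram
(* Assumed sample recording; any Audio object works here *)
audio = ExampleData[{"Audio", "Apollo11SmallStep"}];

(* Returns an Entity["Sound", …] or Missing["Unidentified"] *)
AudioIdentify[audio]
```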
Make an identification in a specific category of sounds:
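A possible form of this call is sketched below. The class name "speech" is an assumption used for illustration and is not confirmed by this page; substitute a class name or entity known to the "Sound" entity type.

```wolfram
(* Assumed sample recording *)
audio = ExampleData[{"Audio", "Apollo11SmallStep"}];

(* "speech" is a hypothetical class name, used only for illustration *)
AudioIdentify[audio, "speech"]
```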
Make an identification in any of several categories of sounds:
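Several categories can be combined with the alternatives operator |. In this sketch both class names are assumptions chosen for illustration:

```wolfram
(* Assumed sample recording *)
audio = ExampleData[{"Audio", "Apollo11SmallStep"}];

(* "speech" and "music" are hypothetical class names *)
AudioIdentify[audio, "speech" | "music"]
```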
The number of returned identifications is the lesser of the number of positive identifications and the requested number:
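A sketch of requesting several identifications follows. It assumes that All is accepted as an unrestricted category, as it is in ImageIdentify; this page does not list it explicitly.

```wolfram
(* Assumed sample recording *)
audio = ExampleData[{"Audio", "Apollo11SmallStep"}];

(* Ask for up to 10 identifications; fewer may be returned *)
ids = AudioIdentify[audio, All, 10];

(* The length is at most the requested number *)
Length[ids]
```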
Get probabilities for each identification:
Class probabilities are independent, since multiple sources may be present in a single recording.
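To illustrate the independence of class probabilities, the following sketch requests the "Probability" property and sums the values. The sample name and the use of All as the category are assumptions.

```wolfram
(* Assumed sample recording *)
audio = ExampleData[{"Audio", "Apollo11SmallStep"}];

(* Association of identified concepts and their probabilities *)
probs = AudioIdentify[audio, All, 5, "Probability"];

(* Probabilities are independent, so the total need not equal 1 *)
Total[probs]
```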
Options (4)
AcceptanceThreshold (2)
Use AcceptanceThreshold to control the confidence of the returned result:
Increase the threshold to get only high-probability identifications:
AcceptanceThreshold is also used when getting multiple identifications:
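The three AcceptanceThreshold cases above might be written as follows. The threshold values are arbitrary illustrations, and the sample name and the All category are assumptions.

```wolfram
(* Assumed sample recording *)
audio = ExampleData[{"Audio", "Apollo11SmallStep"}];

(* Permissive threshold: accept low-probability identifications *)
AudioIdentify[audio, AcceptanceThreshold -> 0.1]

(* Strict threshold: may return Missing["Unidentified"] *)
AudioIdentify[audio, AcceptanceThreshold -> 0.9]

(* The threshold also filters a list of multiple identifications *)
AudioIdentify[audio, All, 5, AcceptanceThreshold -> 0.5]
```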
Masking (1)
SpecificityGoal (1)
Use the SpecificityGoal option to control the generality of the result:
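A sketch comparing the two named settings on the same recording (the sample name is an assumption):

```wolfram
(* Assumed sample recording *)
audio = ExampleData[{"Audio", "Apollo11SmallStep"}];

(* Compare general and specific identifications side by side *)
<|
  "Low" -> AudioIdentify[audio, SpecificityGoal -> "Low"],
  "High" -> AudioIdentify[audio, SpecificityGoal -> "High"]
|>
```

A numeric setting between 0 and 1 can be used in place of the named values.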
Applications (3)
Identify all the sounds in the ExampleData collection:
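One way this could be done is sketched below, assuming ExampleData["Audio"] returns the list of {"Audio", name} specifications for the collection. Evaluating it downloads every sample, which may take some time.

```wolfram
(* All item specifications in the audio collection *)
names = ExampleData["Audio"];

(* Map each specification to its identification *)
AssociationMap[AudioIdentify[ExampleData[#]] &, names]
```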
Get multiple identifications and probabilities on a signal containing various sounds:
Make identifications over 1-second intervals using AudioBlockMap:
Merge the intervals with the same identifications:
Using WebAudioSearch, construct a small database of animal sounds:
Use FeatureSpacePlot to visualize the signals embedded in a semantically significant 2D space:
Define a function to identify the signals exclusively as animal sounds:
Generate a WordCloud with the results of the recognitions:
Properties & Relations (1)
The neural net used by AudioIdentify can be accessed using NetModel:
Wolfram Research (2019), AudioIdentify, Wolfram Language function, https://reference.wolfram.com/language/ref/AudioIdentify.html.