SpeechRecognize

SpeechRecognize[audio]

recognizes speech in audio and returns it as a string.

Details and Options

  • Speech recognition aims to convert a spoken audio signal to text. It is typically used in voice-enabled human-machine interactions and digital personal assistants.
  • SpeechRecognize[audio] returns all recognized speech in audio as a single string.
  • The following options can be given:
      Masking         All      interval of interest
      TargetDevice    "CPU"    the device on which to perform recognition
  • By default, speech in the whole signal is recognized. Use Masking->{int_1,int_2,…} to limit the recognition to the intervals int_i.
  • SpeechRecognize only works for English speech.
  • SpeechRecognize uses machine learning, and its training set and methods may change in different versions of the Wolfram Language, yielding different results.
  • SpeechRecognize may download resources that will be stored in your local object store at $LocalBase, and can be listed using LocalObjects[] and removed using ResourceRemove.
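
The options described above can be sketched as follows. This is a minimal illustration, not a definitive usage pattern: the synthesized sentence is an arbitrary choice, the interval form {tmin, tmax} in seconds is an assumption about how each int_i is written, and the "GPU" setting assumes a supported GPU is present.

```wolfram
(* Synthesize a short English utterance to use as input.
   The sentence is arbitrary and only serves as an illustration. *)
audio = SpeechSynthesize["the quick brown fox jumps over the lazy dog"];

(* Default behavior: recognize speech in the whole signal *)
SpeechRecognize[audio]

(* Limit recognition to one interval of interest.
   Assumption: each interval is given as {tmin, tmax} in seconds. *)
SpeechRecognize[audio, Masking -> {{0, 1}}]

(* Perform recognition on a GPU instead of the default "CPU".
   Assumption: a supported GPU is available on this machine. *)
SpeechRecognize[audio, TargetDevice -> "GPU"]

(* List local objects, which include any resources the function downloaded *)
LocalObjects[]
```

Because the underlying model may change between versions, the recognized strings are not shown here; they can differ from one release to another.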

Examples


Basic Examples  (2)

Recognize speech in an audio signal:

In[1]:=
Out[1]=

Recognize speech from a recording:

In[1]:=
Out[1]=
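
A minimal sketch of what these two basic calls might look like, under stated assumptions: the sample clip ExampleData[{"Audio", "MaleVoice"}] is assumed to be available in your installation, and "recording.wav" is a hypothetical file name standing in for your own recording.

```wolfram
(* Recognize speech in an audio signal.
   Assumption: this sample clip is available through ExampleData. *)
sample = ExampleData[{"Audio", "MaleVoice"}];
SpeechRecognize[sample]

(* Recognize speech from a recording stored on disk.
   "recording.wav" is a hypothetical path; replace it with a real file. *)
recording = Audio["recording.wav"];
SpeechRecognize[recording]
```

Each call returns all recognized speech as a single string.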

Options  (1)

Applications  (4)

Introduced in 2019 (12.0)