SpeechInterpreter

SpeechInterpreter[form]

represents an interpreter object that can be applied to a speech input to try to interpret it as an object of the specified form.

SpeechInterpreter[form,test]

returns the interpreted object only if applying test to it yields True; otherwise, it returns a Failure object.

SpeechInterpreter[form,test,fail]

returns the result of applying the function fail if the test fails.

Details and Options

  • SpeechInterpreter[…][audio] applies the interpreter to a particular audio recording.
  • Possible form specifications include:
    "SemanticExpression"                expression derived semantically from free-form input
    "SemanticNumber"                    number derived semantically (e.g. "half")
    "SemanticInteger"                   integer derived semantically (e.g. "six")
    "Boolean"                           Boolean value (true/false, 1/0, etc. giving True/False)
    "String"                            pure string
    "TextArea"                          text of any length (rendered in forms as a text area)
    "TextLine"                          single line of text
    "SemanticURL"                       URL derived semantically (e.g. from company name)
    "Date"                              date in any standard format
    "StructuredDate"                    date obtained from a picker
    "DateTime"                          date and time
    "Time"                              time of day
    "ComputedDate", etc.                date derived by computation (e.g. "next tuesday")
    "Location"                          anything that yields a geo location
    "StreetAddress"                     any standard street address
    "Country"                           country or country-like territory
    "AdministrativeDivision"            state, province, county, etc.
    "USState"                           US state
    "USCounty"                          US county
    "Quantity"                          quantity with units
    "ComputedQuantity"                  quantity derived by computation
    "PhysicalQuantity"                  physical quantity (e.g. "mass")
    "CurrencyAmount"                    currency amount (e.g. "$7.50")
    "CurrencyName"                      name of a currency (e.g. "US dollars")
    "Company"                           company
    "TickerSymbol"                      financial instrument ticker symbol
    "Color"                             color in any standard format
    "entity"                            any Wolfram Language entity type (e.g. "City")
    "entityclass"                       a class of entities (e.g. "CityClass")
    Restricted[form,spec]               a form restricted in the specified way
    DelimitedSequence[form,…]           a delimited sequence of forms returned as a list
    form1|form2|…                       several possible forms, tried in order
    {c1,c2,…}                           a literal set of choices ci
    {lab1->c1,lab2->c2,…}               choices ci with labels labi
    AnySubset[{c1,c2,…}]                any subset of the ci
    CompoundElement[{form1,…}]          a list of elements specified by the formi
    CompoundElement[<|key1->form1,…|>]  an association of elements specified by the formi
    RepeatingElement[form,…]            a list of elements all specified by form
    CloudObject[…]                      a deployed GrammarRules object
    QuantityVariable["pq"]              a quantity compatible with the physical quantity pq
  • $InterpreterTypes gives a complete list of possible interpreter types.
  • In the case of "entity", any domain supported by EntityValue can be used.
  • SpeechInterpreter[…][audio] returns an interpreted value, or Missing["NoInput"] if no speech is recognized from audio.
  • SpeechInterpreter[choices] allows a list of rules or an association for choices. A pure list of values can also be used when there is no ambiguity.
  • SpeechInterpreter[form,test][input] applies test to the result of interpreting input using the specified form.
  • If the result of applying test is True, then the interpretation of input is returned.
  • If the result of applying test is a Failure object, this object is immediately returned.
  • If the result of applying test is False or anything else, then in SpeechInterpreter[form,test,fail][input] the result of applying fail to the interpretation of input is returned. If no fail is given, then a Failure object is returned.
  • If SpeechInterpreter directly generates a Failure object, the following tags are used:
    "InterpretationFailure"   the string given could not be interpreted in the form specified
    "RestrictionFailure"      interpretation succeeded, but a restriction failed
    "ConditionFailure"        interpretation and restrictions succeeded, but explicit test failed
    "ConnectionFailure"       required cloud connection could not be made
  • SpeechInterpreter supports the following options:
    AmbiguityFunction   Automatic     function to apply to ambiguous semantic results
    GeoLocation         $GeoLocation  geo location to assume for semantic interpretation
    Masking             All           interval of interest
    TargetDevice        "CPU"         the device on which to perform recognition
    TimeZone            $TimeZone     time zone to assume for semantic interpretation
  • SpeechInterpreter[spec][{input1,input2,…}] maps interpretation over all inputi and is equivalent to {SpeechInterpreter[spec][input1],SpeechInterpreter[spec][input2],…}, except insofar as spec contains constructs such as CompoundElement or RepeatingElement that directly interpret the structure given.
  • SpeechInterpreter[form][audio1|audio2|…] yields as a result the interpretation of the first of the audioi that can be interpreted using the specified form.
  • SpeechInterpreter uses machine learning. Its methods, training sets and biases included therein may change and yield varied results in different versions of the Wolfram Language.
  • SpeechInterpreter may download resources that will be stored in your local object store at $LocalBase, and that can be listed using LocalObjects[] and removed using ResourceRemove.
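
The rules above for combining an interpreted value with test and fail can be sketched as a small pure function. This is only an illustrative model, not how SpeechInterpreter is implemented: applyTest is a hypothetical helper name, and the sketch takes an already interpreted value rather than audio.

```wolfram
(* Hypothetical helper, not part of the Wolfram Language: models the
   documented rules for test and fail on an already interpreted value *)
applyTest[value_, test_, fail_] :=
  With[{r = test[value]},
    Which[
      r === True, value,        (* test yields True: return the interpretation *)
      MatchQ[r, _Failure], r,   (* test yields a Failure object: return it as is *)
      True, fail[value]         (* False or anything else: apply fail *)
    ]
  ]

(* The test passes, so the value itself is returned *)
applyTest[4, EvenQ, Missing["NotEven", #] &]

(* The test gives False, so the fail function is applied to the value *)
applyTest[7, EvenQ, Missing["NotEven", #] &]

(* The test itself returns a Failure object, which is passed through *)
applyTest[7, Failure["Rejected", <||>] &, Missing["NotEven", #] &]
```

When no fail is given, the documented behavior is to return a Failure object instead of calling fail; the sketch leaves that case out.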

Examples

Basic Examples  (2)

Interpret a date, generating a DateObject:
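
A minimal sketch of this case, assuming a clip produced with SpeechSynthesize stands in for a real recording (the spoken phrase is made up for illustration). The result depends on the speech recognition model, so no output is shown.

```wolfram
(* Assumption: a synthesized clip stands in for a real recording *)
audio = SpeechSynthesize["January first, two thousand twenty"];

(* Apply the "Date" interpreter to the audio *)
SpeechInterpreter["Date"][audio]
```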

Interpret a country, generating an Entity object:

Scope  (19)

Basic Uses  (3)

Interpret an integer:

Perform a test on the interpreted value, returning a Failure if the test result is not True:
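
One way this might look, assuming a synthesized clip in place of a real recording and EvenQ as an arbitrary test chosen for illustration:

```wolfram
(* Assumption: a synthesized clip stands in for a real recording *)
audio = SpeechSynthesize["seven"];

(* EvenQ is applied to the interpreted integer; if the clip is interpreted
   as 7, the test is not True and a Failure object is returned *)
SpeechInterpreter["SemanticInteger", EvenQ][audio]
```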

Specify a custom failure function to evaluate:
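
A sketch of the three-argument form, again assuming a synthesized clip as input. The failure function and its "NotEven" tag are invented for this example.

```wolfram
(* Assumption: a synthesized clip stands in for a real recording *)
audio = SpeechSynthesize["seven"];

(* If EvenQ does not give True, the third argument is applied to the
   interpreted value instead of returning a Failure object *)
SpeechInterpreter["SemanticInteger", EvenQ, (Missing["NotEven", #] &)][audio]
```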

Input Specification  (4)

Interpret integers in a single recording:

Interpret integers in multiple recordings:

Return the interpretation of the first of the recordings that can be interpreted as an integer:
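
A possible illustration, assuming two synthesized clips where only the second is likely to be read as an integer:

```wolfram
(* Assumption: synthesized clips stand in for real recordings *)
a1 = SpeechSynthesize["good morning"];
a2 = SpeechSynthesize["forty two"];

(* The recordings are given as alternatives; the result is the interpretation
   of the first one that can be interpreted as an integer *)
SpeechInterpreter["SemanticInteger"][a1 | a2]
```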

Interpret a compound element and return a list of interpretations:

Form Specification  (5)

Use a single interpretation type:

Interpretation fails if the recording is not of the specified type:

Use a list of alternative interpretation types. The first interpretation that succeeds is returned:

Return one of a literal set of choices matched by the speech in a recording:

Return the value associated with the literal choice matched by the speech in a recording:

Interpreter Types  (7)

Interpret a university, returning an Entity object:

Interpret a location, returning a GeoPosition object:

Interpret input as a currency amount, returning a Quantity object:

Perform a computation on interpreted input:

Interpret many types of entities:

Interpret a spoken free-form expression:

Interpret a sequence of colors:

Options  (1)

Masking  (1)

By default, the entire recording is used for interpretation:

Specify an interval of interest to be interpreted:

Applications  (1)

Interpret an audio recording as an airline:

Properties & Relations  (1)

SpeechInterpreter effectively calls Interpreter on the result of SpeechRecognize:

Compare with direct speech interpretation:
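
The comparison could be sketched as follows, assuming a synthesized clip as input. Whether the two results agree depends on the recognized text, so no outputs are claimed here.

```wolfram
(* Assumption: a synthesized clip stands in for a real recording *)
audio = SpeechSynthesize["forty two"];

(* Two-step form: recognize the speech as text, then interpret the text *)
text = SpeechRecognize[audio];
Interpreter["SemanticInteger"][text]

(* Direct form: interpret the audio in one step *)
SpeechInterpreter["SemanticInteger"][audio]
```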

Introduced in 2020 (12.1)