SequencePredict

SequencePredict[{seq1,seq2,…}]

generates a SequencePredictorFunction[…] based on the sequences given.

SequencePredict[training,seq]

attempts to predict the next element in the sequence seq from the training sequences given.

SequencePredict[training,{seq1,seq2,…}]

gives predictions for each of the sequences seqi.

SequencePredict["name",seq]

uses the built-in sequence predictor represented by "name".

SequencePredict[…,seq,prop]

gives the specified property of the prediction associated with seq.
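
The calling forms above can be sketched as follows. The training strings and the query "ab" are made-up illustration data, not taken from the original examples, and no outputs are shown because they depend on the trained model.

```wolfram
(* Hypothetical training data: a few short strings *)
training = {"abcab", "bcabc", "cabca", "abcbc"};

(* One-argument form: returns a reusable SequencePredictorFunction[…] *)
sp = SequencePredict[training];

(* Two-argument form: train and predict the next element of "ab" in one call *)
SequencePredict[training, "ab"]

(* A list of sequences gives one prediction per sequence *)
SequencePredict[training, {"ab", "bc"}]

(* Three-argument form: request a property instead of the default prediction *)
SequencePredict[training, "ab", "Probabilities"]

(* Built-in predictor referred to by name *)
SequencePredict["English", "hello worl"]
```

Storing the result of the one-argument form in a variable (here the hypothetical name sp) avoids retraining when several sequences are queried one after another.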

Details and Options

  • The sequences seqi can either be lists of tokens or strings.
  • Sequences seqi are assumed to be unordered subsequences of an underlying infinite sequence.
  • In SequencePredict[…,seq,prop], properties are as given in SequencePredictorFunction[…]; they include:
    "NextElement"               most likely next element
    "NextElement"->n            individually most likely next n elements
    "NextSequence"->n           most likely next length-n sequence of elements
    "RandomNextElement"         random sample from the next-element distribution
    "RandomNextElement"->n      random sample from the next-sequence distribution
    "Probabilities"             association of probabilities for all possible next elements
    "SequenceProbability"       probability for the predictor to generate the given sequence
    "SequenceLogProbability"    log probability for the predictor to generate the sequence
    "Properties"                list of all properties available
  • Examples of built-in sequence predictors include:
    "Chinese"       character-based Chinese-language text
    "English"       character-based English-language text
    "French"        character-based French-language text
    "German"        character-based German-language text
    "Portuguese"    character-based Portuguese-language text
    "Russian"       character-based Russian-language text
    "Spanish"       character-based Spanish-language text
  • The following options can be given:
    FeatureExtractor    Automatic    how to preprocess sequences
    Method              Automatic    which prediction algorithm to use
    PerformanceGoal     Automatic    aspects of performance to try to optimize
  • Typical settings for FeatureExtractor for strings include:
    "SegmentedCharacters"    string interpreted as a sequence of characters (default)
    "SegmentedWords"         string interpreted as a sequence of words
  • Possible settings for PerformanceGoal include:
    "Memory"           minimize storage requirements of the predictor
    "Quality"          maximize accuracy of the predictor
    "Speed"            maximize speed of the predictor
    "TrainingSpeed"    minimize time spent producing the predictor
    Automatic          automatic tradeoff among speed, accuracy and memory
  • PerformanceGoal->{goal1,goal2,…} will automatically combine goal1, goal2, etc.
  • Possible settings for Method include:
    "Markov"    Markov model
  • In SequencePredict[…,Method->{"Markov","Order"->order}], order corresponds to the memory size of the Markov process.
  • In SequencePredict[…,"SequenceProbability"], some probability mass is kept for unknown elements.
  • In SequencePredict[training,{},prop], {} is interpreted as an empty list of sequences rather than an empty sequence.
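
The options listed above might be combined with training data along these lines. This is a minimal sketch: the sentences and the variable names are invented for illustration, and the choice of order 2 is arbitrary.

```wolfram
(* Hypothetical training sentences *)
text = {"the cat sat on the mat", "the dog sat on the log", "the cat saw the dog"};

(* Treat each string as a sequence of words rather than characters *)
spWords = SequencePredict[text, FeatureExtractor -> "SegmentedWords"];
spWords["the cat sat on"]

(* Markov model with a memory size of 2 *)
spOrder2 = SequencePredict[text, Method -> {"Markov", "Order" -> 2}];

(* Favor accuracy over speed and memory *)
spQuality = SequencePredict[text, PerformanceGoal -> "Quality"];

(* Combine several goals in one setting *)
spSmallFast = SequencePredict[text, PerformanceGoal -> {"Memory", "Speed"}];
```

With so little training data the differences between these predictors are unlikely to be meaningful; the sketch only shows where each option goes.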

Examples


Basic Examples  (1)

Train a sequence predictor on a set of sequences:

In[1]:=
Out[1]=

Predict the next element of a new sequence:

In[2]:=
Out[2]=
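
The original inputs for these two steps are not preserved here. A sketch of what training and a first prediction could look like, using made-up strings and the hypothetical variable name sp:

```wolfram
(* Hypothetical training sequences; each string is read as a sequence of characters *)
sp = SequencePredict[{"abcabc", "bcabca", "cabcab", "abcbca"}];

(* Apply the resulting SequencePredictorFunction[…] to a new sequence *)
sp["ab"]
```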

Obtain the probabilities of the next element given the sequence:

In[3]:=
Out[3]=

Obtain a random next element according to the preceding distribution:

In[4]:=
Out[4]=
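
The probability and sampling steps above correspond to the "Probabilities" and "RandomNextElement" properties. A sketch with made-up training data (the name sp and the strings are assumptions, not the original inputs):

```wolfram
(* Hypothetical training sequences *)
sp = SequencePredict[{"abcabc", "bcabca", "cabcab", "abcbca"}];

(* Association of probabilities for every possible next element after "ab" *)
sp["ab", "Probabilities"]

(* Draw one next element at random from that distribution *)
sp["ab", "RandomNextElement"]
```

Repeated evaluation of the last line can give different results, since it samples rather than returning the most likely element.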

Obtain multiple predictions at a time:

In[5]:=
Out[5]=

Predict the most likely next element and reuse this intermediate guess to predict the following element:

In[6]:=
Out[6]=
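
The original inputs for the last two steps are not preserved. One plausible reading, sketched with made-up data, is that the first passes a list of sequences and the second uses "NextElement"->n, which is described above as giving the individually most likely next n elements:

```wolfram
(* Hypothetical training sequences *)
sp = SequencePredict[{"abcabc", "bcabca", "cabcab", "abcbca"}];

(* A list of sequences yields one prediction per sequence *)
sp[{"ab", "bc"}]

(* Next two elements, each chosen as the individually most likely one *)
sp["ab", "NextElement" -> 2]
```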

Predict the most likely following sequence:

In[7]:=
Out[7]=

Compare the probabilities for the preceding sequences:

In[8]:=
Out[8]=
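
Choosing elements one at a time does not necessarily produce the most probable sequence overall, which is why "NextSequence" exists alongside "NextElement". A sketch of that comparison, with made-up training data and candidate sequences chosen only for illustration:

```wolfram
(* Hypothetical training sequences *)
sp = SequencePredict[{"abcabc", "bcabca", "cabcab", "abcbca"}];

(* Most likely length-2 continuation of "ab", considered as a whole *)
sp["ab", "NextSequence" -> 2]

(* Probability of the predictor generating each of two full candidate sequences *)
sp[{"abca", "abcb"}, "SequenceProbability"]

(* Same comparison on a log scale *)
sp[{"abca", "abcb"}, "SequenceLogProbability"]
```

In practice the candidates would be the sequences returned by the two preceding steps rather than fixed strings.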

See Also

Predict  Classify  SequencePredictorFunction  TimeSeriesModelFit  TimeSeriesForecast  EstimatedProcess

Introduced in 2017 (11.1)