SequencePredict
SequencePredict[{seq1,seq2,…}]
generates a SequencePredictorFunction[…] based on the sequences given.
SequencePredict[training,seq]
attempts to predict the next element in the sequence seq from the training sequences given.
SequencePredict[training,{seq1,seq2,…}]
gives predictions for each of the sequences seqi.
SequencePredict["name",seq]
uses the built-in sequence predictor represented by "name".
SequencePredict[…,seq,prop]
gives the specified property of the prediction associated with seq.
Details and Options
- The sequences seqi can be either lists of tokens or strings.
- Sequences seqi are assumed to be unordered subsequences of an underlying infinite sequence.
- In SequencePredict[…,seq,prop], properties are as given in SequencePredictorFunction[…]; they include:
  - "NextElement": most likely next element
  - "NextElement"->n: individually most likely next n elements
  - "NextSequence"->n: most likely next length-n sequence of elements
  - "RandomNextElement": random sample from the next-element distribution
  - "RandomNextElement"->n: random sample from the next-sequence distribution
  - "Probabilities": association of probabilities for all possible next elements
  - "SequenceProbability": probability for the predictor to generate the given sequence
  - "SequenceLogProbability": log probability for the predictor to generate the sequence
  - "Properties": list of all properties available
- Examples of built-in sequence predictors include:
  - "Chinese": character-based Chinese-language text
  - "English": character-based English-language text
  - "French": character-based French-language text
  - "German": character-based German-language text
  - "Portuguese": character-based Portuguese-language text
  - "Russian": character-based Russian-language text
  - "Spanish": character-based Spanish-language text
- The following options can be given:
  - FeatureExtractor (default Automatic): how to preprocess sequences
  - Method (default Automatic): which prediction algorithm to use
  - PerformanceGoal (default Automatic): aspects of performance to try to optimize
  - RandomSeeding (default 1234): what seeding of pseudorandom generators should be done internally
- Typical settings for FeatureExtractor for strings include:
  - "SegmentedCharacters": string interpreted as a sequence of characters (default)
  - "SegmentedWords": string interpreted as a sequence of words
- Possible settings for PerformanceGoal include:
  - "Memory": minimize storage requirements of the predictor
  - "Quality": maximize accuracy of the predictor
  - "Speed": maximize speed of the predictor
  - "TrainingSpeed": minimize time spent producing the predictor
  - Automatic: automatic tradeoff among speed, accuracy and memory
- PerformanceGoal->{goal1,goal2,…} will automatically combine goal1, goal2, etc.
- Possible settings for RandomSeeding include:
  - Automatic: automatically reseed every time the function is called
  - Inherited: use externally seeded random numbers
  - seed: use an explicit integer or string as a seed
- Possible settings for Method include:
  - "Markov": Markov model
- In SequencePredict[…,Method->{"Markov","Order"->order}], order corresponds to the Markov process memory size.
- In SequencePredict[…,"SequenceProbability"], some probability mass is kept for unknown elements.
- In SequencePredict[training,{},prop], {} is interpreted as an empty list of sequences rather than an empty sequence.
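For instance, a low-order Markov predictor can be requested explicitly; the training strings below are arbitrary illustrative data:

```wolfram
(* Markov model that remembers only the last 2 elements *)
sp = SequencePredict[{"abcabcabc", "abcabcab"},
  Method -> {"Markov", "Order" -> 2}];

(* predict the next character using that 2-element memory *)
sp["abcab"]
```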
Examples
Basic Examples (1)
Train a sequence predictor on a set of sequences:
Predict the next element of a new sequence:
Obtain the probabilities of the next element given the sequence:
Obtain a random next element according to the preceding distribution:
Obtain multiple predictions at a time:
Predict the most likely next element and reuse this intermediate guess to predict the following element:
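The steps above can be sketched in one session; the training sequences are arbitrary illustrative data, so the actual predictions will depend on the trained model:

```wolfram
(* train a predictor on a set of token sequences (illustrative data) *)
sp = SequencePredict[{{1, 2, 3, 4}, {1, 2, 3, 4}, {2, 3, 4, 1}, {3, 4, 1, 2}}];

(* predict the next element of a new sequence *)
sp[{1, 2, 3}]

(* probabilities of the next element given the sequence *)
sp[{1, 2, 3}, "Probabilities"]

(* a random next element drawn from the preceding distribution *)
sp[{1, 2, 3}, "RandomNextElement"]

(* several predictions at once *)
sp[{{1, 2, 3}, {2, 3, 4}}]

(* predict the next element, then feed it back to predict the following one *)
next = sp[{1, 2, 3}];
sp[Append[{1, 2, 3}, next]]
```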
Scope (4)
Custom Sequence Predictors (3)
Train a sequence predictor on a list of strings:
Predict the next character following a given string:
Predict the next four characters:
Obtain the probabilities for each character to follow the given string:
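A possible session covering these steps, with arbitrary training strings (each string is treated as a character sequence by default):

```wolfram
(* train a predictor on a list of strings *)
sp = SequencePredict[{"hello world", "hello there", "help wanted"}];

(* next character following a given string *)
sp["hel"]

(* the next four characters *)
sp["hel", "NextSequence" -> 4]

(* probabilities for each character to follow the given string *)
sp["hel", "Probabilities"]
```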
Train a sequence predictor on the list of common English words, each word treated as a sequence of characters:
Predict the most likely next character from a given sequence:
In the previous example, each word is considered as a subsequence of an infinite sequence. Use a dedicated boundary character to mark the boundaries between words:
Build a new sequence predictor aware of word boundaries:
Generate the beginning of an English-like word:
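A sketch of this workflow; the choice of "|" as the boundary character is arbitrary, and any token not occurring in the words would do:

```wolfram
(* train on common English words, each treated as a character sequence *)
words = WordList["CommonWords"];
sp = SequencePredict[words];

(* most likely next character after a given sequence *)
sp["th"]

(* wrap each word in a boundary character so the predictor
   can learn where words start and end *)
sp2 = SequencePredict[("|" <> # <> "|") & /@ words];

(* generate the beginning of an English-like word, starting from a boundary *)
sp2["|", "RandomNextElement" -> 5]
```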
Load a book from ExampleData:
Train a sequence predictor on this book:
Sample a random string in the book style:
Train another sequence predictor, interpreting strings as word sequences rather than character sequences:
Complete the preceding string with 10 consecutive words (spaces and punctuation marks are considered as words):
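A possible session for this example; the particular book and prompt strings are illustrative choices:

```wolfram
(* load a book from ExampleData *)
book = ExampleData[{"Text", "AliceInWonderland"}];

(* train a character-based sequence predictor on the book *)
sp = SequencePredict[{book}];

(* sample a random string in the book's style *)
sp["Alice ", "RandomNextElement" -> 100]

(* train another predictor that treats strings as word sequences *)
spWords = SequencePredict[{book}, FeatureExtractor -> "SegmentedWords"];

(* complete a string with 10 consecutive words
   (spaces and punctuation marks count as words) *)
spWords["Alice was ", "NextSequence" -> 10]
```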
Options (5)
FeatureExtractor (2)
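A minimal sketch of the FeatureExtractor option, using arbitrary training strings:

```wolfram
(* interpret training strings as word sequences instead of character sequences *)
sp = SequencePredict[{"the cat sat down", "the cat ran away"},
  FeatureExtractor -> "SegmentedWords"];

(* predict the next word rather than the next character *)
sp["the cat"]
```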
PerformanceGoal (2)
Train a predictor with an emphasis on the resulting model's memory footprint:
Compare with the automatically generated model size:
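These two steps can be sketched as follows; the training strings are arbitrary, and ByteCount is used here as a rough measure of model size:

```wolfram
training = {"hello world", "hello there", "hollow words", "yellow word"};

(* emphasize a small memory footprint for the resulting model *)
spMemory = SequencePredict[training, PerformanceGoal -> "Memory"];

(* compare with the size of the automatically generated model *)
spAuto = SequencePredict[training];
ByteCount /@ {spMemory, spAuto}
```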
Tune the computation time and precision when exploring the full sequence probability space:
Favor fast and approximated exploration:
Favor more in-depth exploration taking longer computation time:
Text
Wolfram Research (2017), SequencePredict, Wolfram Language function, https://reference.wolfram.com/language/ref/SequencePredict.html (updated 2017).
CMS
Wolfram Language. 2017. "SequencePredict." Wolfram Language & System Documentation Center. Wolfram Research. Last Modified 2017. https://reference.wolfram.com/language/ref/SequencePredict.html.
APA
Wolfram Language. (2017). SequencePredict. Wolfram Language & System Documentation Center. Retrieved from https://reference.wolfram.com/language/ref/SequencePredict.html