"Markov" (Machine Learning Method)

Details & Suboptions

  • In a Markov model, at training time, an n-gram language model is computed for each class. At test time, the probability for each class is computed according to Bayes's theorem, P(class|text) ∝ P(text|class) P(class), where P(text|class) is given by the language model of the given class and P(class) is the class prior.
  • The following options can be given:
  • "AdditiveSmoothing" .1the smoothing parameter to use
    "MinimumTokenCount"Automaticminimum count for an n-gram to to be considered
    "Order" Automaticn-gram length
  • When "Order"n, the method partitions sequences in (n+1)-grams.
  • When "Order"0, the method uses unigrams (single tokens). The model can then be called a unigram model or naive Bayes model.
  • The value of "AdditiveSmoothing" is added to all n-gram counts and serves to regularize the language model, as illustrated in the sketch after this list.
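
As a rough illustration, the following is a minimal sketch of the order-0 (naive Bayes) computation; the class names, tokens, counts, and smoothing value here are all hypothetical:

    (* hypothetical per-class unigram counts and a uniform class prior *)
    counts = <|"A" -> <|"x" -> 3, "y" -> 1|>, "B" -> <|"x" -> 1, "y" -> 3|>|>;
    prior = <|"A" -> 1/2, "B" -> 1/2|>;
    smoothing = 0.1;
    vocabulary = {"x", "y"};
    (* smoothed token probability under the language model of class c *)
    pToken[t_, c_] := (counts[c][t] + smoothing)/(Total[counts[c]] + smoothing*Length[vocabulary])
    (* unnormalized P(class|text), proportional to P(text|class) P(class) *)
    score[tokens_, c_] := prior[c]*(Times @@ (pToken[#, c] & /@ tokens))
    scores = AssociationMap[score[{"x", "x", "y"}, #] &, Keys[counts]];
    scores/Total[scores]  (* normalized class probabilities *)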

Examples


Basic Examples  (1)

Train a classifier function on labeled examples:
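For instance (a minimal sketch; the labeled texts below are hypothetical):

    c = Classify[{
        "great movie, loved it" -> "positive",
        "a wonderful and moving film" -> "positive",
        "terrible plot and bad acting" -> "negative",
        "I hated every minute of it" -> "negative"},
      Method -> "Markov"]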

Obtain information about the classifier:
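For instance, assuming the classifier c from the previous step:

    Information[c]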

Classify a new example:
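For instance, applying c to an unseen text:

    c["a truly wonderful movie"]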

Options  (4)

"AdditiveSmoothing"  (2)

Train a classifier using the "AdditiveSmoothing" suboption:
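A minimal sketch with hypothetical labeled texts, passing the suboption inside Method:

    c = Classify[{
        "great movie" -> "positive", "wonderful film" -> "positive",
        "terrible plot" -> "negative", "bad acting" -> "negative"},
      Method -> {"Markov", "AdditiveSmoothing" -> 0.5}]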

Train two classifiers on an imbalanced dataset by varying the value of "AdditiveSmoothing":
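One possible sketch, with hypothetical data in which class 1 is heavily overrepresented:

    imbalanced = Join[Table["a b c" -> 1, 20], {"x y z" -> 2}];
    c1 = Classify[imbalanced, Method -> {"Markov", "AdditiveSmoothing" -> 0.01}];
    c2 = Classify[imbalanced, Method -> {"Markov", "AdditiveSmoothing" -> 10.}];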

Look at the corresponding probabilities for the imbalanced element:
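For instance, using the "Probabilities" property on the rare-class example from the sketch above:

    c1["x y z", "Probabilities"]
    c2["x y z", "Probabilities"]

One would expect the stronger smoothing in c2 to flatten the per-class language models, so its prediction is driven more by the class priors, which favor the overrepresented class.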

"Order"  (2)

Train a classifier by specifying the "Order":
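A minimal sketch with hypothetical data, specifying the suboption inside Method:

    c = Classify[{"a b c d" -> 1, "d c b a" -> 2, "a b d c" -> 1, "c d a b" -> 2},
      Method -> {"Markov", "Order" -> 2}]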

Generate a dataset of real words and random strings:
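One way to construct such a dataset (a sketch; WordList supplies English words, and the random strings are lowercase sequences of arbitrary length):

    words = RandomSample[WordList[], 1000];
    strings = Table[
       StringJoin[RandomChoice[CharacterRange["a", "z"], RandomInteger[{3, 10}]]],
       1000];
    data = Join[Thread[words -> "word"], Thread[strings -> "random"]];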

Generate classifiers using different values for the "Order":
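For instance, orders 0 through 2, assuming the dataset data from the previous step:

    classifiers = Table[
       Classify[data, Method -> {"Markov", "Order" -> n}], {n, 0, 2}];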

Compare the probabilities of these classifiers on a new real word:
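For instance, with "jubilant" as a hypothetical unseen word:

    Map[#["jubilant", "Probabilities"] &, classifiers]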