FeatureTypes

FeatureTypes is an option for machine learning functions such as Classify or Predict that specifies what feature types to assume for the elements of the input data.

Details

  • Possible settings for FeatureTypes include:
    Automatic                    automatically detect types of all features
    type                         interpret the unique feature as type
    {t1,t2,…}                    interpret the i-th feature as type ti
    <|i->ti,j->tj,…|>            interpret the i-th feature as type ti, etc.
    <|{i,j,…}->t,…|>             interpret the i-th, j-th, etc. features as type t
    <|"n1"->t1,"n2"->t2,…|>      interpret the feature named "ni" as type ti
    <|{"n1","n2",…}->t,…|>       interpret the features named "n1", "n2", etc. as type t
  • Possible feature types include:
    Automatic                    automatically detected type
    "Audio"                      acoustic signal
    "Boolean"                    Boolean value
    "BooleanTensor"              fixed-dimension array of Boolean values
    "BooleanVector"              fixed-length vector of Boolean values
    "Color"                      color
    "Complex"                    complex value
    "ComplexTensor"              fixed-dimension array of complex values
    "ComplexVector"              fixed-length vector of complex values
    "Date"                       date as a string or DateObject
    "Image"                      2D image
    "Image3D"                    3D image
    "Nominal"                    discrete value specified by a name
    "NominalBag"                 collection of nominal values
    "NominalSequence"            ordered collection of nominal values
    "NominalTensor"              fixed-dimension array of nominal values
    "NominalVector"              fixed-length vector of nominal values
    "Numerical"                  continuous numerical real value
    "NumericalBag"               collection of numerical values
    "NumericalSequence"          ordered collection of numerical values
    "NumericalTensor"            fixed-dimension array of numerical values
    "NumericalVector"            fixed-length vector of numerical values
    "NumericalVectorSequence"    sequence of numerical vectors
    "NumericalTensorSequence"    sequence of numerical tensors
    "Text"                       natural language string
    "Time"                       time as a string or TimeObject
  • When the type of a feature is not specified, or is specified as Missing[], it is treated as Automatic.
  • The value of the option FeatureTypes supersedes the value of the option NominalVariables, except when FeatureTypes->Automatic.
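
As an illustration of the setting forms listed above, the following sketch passes type information to Classify in three ways. The dataset data is made up for the example and is not taken from this page, and outputs are deliberately not shown.

```wolfram
(* Hypothetical data: feature 1 is a category code, feature 2 is a measurement *)
data = {{1, 2.5} -> "a", {2, 1.0} -> "b", {1, 4.2} -> "a",
   {3, 3.3} -> "b", {2, 0.4} -> "b", {3, 5.1} -> "a"};

(* List form: one type per feature, in order *)
Classify[data, FeatureTypes -> {"Nominal", "Numerical"}]

(* Association form by position: feature 2 is left unspecified,
   so it is treated as Automatic *)
Classify[data, FeatureTypes -> <|1 -> "Nominal"|>]

(* Association form with a list of positions that share one type *)
Classify[data, FeatureTypes -> <|{1, 2} -> "Numerical"|>]
```

The association forms are convenient when only a few features need an explicit type and the rest can be left to automatic detection.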

Examples

Basic Examples  (4)

Train a predictor without specifying feature types:

The features are assumed to be numerical:

Specify that the first feature should be interpreted as a nominal variable, while the type of the second should be determined automatically:
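
A minimal sketch of the steps in this example, assuming a small made-up training set (the names trainingset, p and p2 are hypothetical). It uses PredictorInformation to query the detected types, as was usual in the versions this page covers; in more recent versions, Information[p, "FeatureTypes"] is expected to play the same role.

```wolfram
(* Hypothetical data: {category code, measurement} -> numerical target *)
trainingset = {{1, 2.5} -> 3.1, {2, 1.0} -> 0.7, {1, 4.2} -> 5.0,
   {3, 3.3} -> 1.9, {2, 0.4} -> 0.2, {3, 5.1} -> 3.0};

(* No FeatureTypes option: the type of each feature is detected automatically *)
p = Predict[trainingset];
PredictorInformation[p, "FeatureTypes"]

(* First feature forced to nominal, second left to automatic detection *)
p2 = Predict[trainingset, FeatureTypes -> {"Nominal", Automatic}];
PredictorInformation[p2, "FeatureTypes"]

(* Predict for a new example *)
p2[{2, 3.0}]
```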

Train a classifier on data where the feature is intended to be a sequence of tokens:

Classify wrongly assumed that examples contained two different text features:

The following classification will output an error message:

Force Classify to interpret the feature as a "NominalSequence":

Classify a new example:
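
The token-sequence example can be sketched as follows, assuming a made-up dataset in which every training example happens to contain exactly two tokens (data, c and c2 are hypothetical names). Outputs are not shown because they depend on the version and on the data.

```wolfram
(* Hypothetical data: each example is meant to be ONE feature, a sequence of tokens *)
data = {{"the", "cat"} -> "animal", {"a", "dog"} -> "animal",
   {"the", "car"} -> "vehicle", {"a", "truck"} -> "vehicle"};

(* Without FeatureTypes, a list of two strings is read as two separate features *)
c = Classify[data];
ClassifierInformation[c, "FeatureTypes"]

(* c expects exactly two features, so a three-token input does not match *)
c[{"the", "big", "dog"}]

(* Declare the unique feature to be a sequence of nominal values *)
c2 = Classify[data, FeatureTypes -> "NominalSequence"];
ClassifierInformation[c2, "FeatureTypes"]

(* Sequences of a different length are now accepted *)
c2[{"the", "big", "dog"}]
```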

Train a predictor on nominal data:

The feature has been wrongly interpreted as text:

Specify that the feature should be considered nominal:

Predict an example:
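
A sketch of the nominal-data example, assuming made-up string labels as the single feature (trainingset and p are hypothetical names). Whether automatic detection treats such strings as text depends on the data and the version, so only the explicit setting is shown.

```wolfram
(* Hypothetical data: one string-valued feature -> numerical target *)
trainingset = {"very small" -> 1.0, "small" -> 2.1, "medium" -> 3.0,
   "large" -> 4.2, "very large" -> 5.1, "small" -> 1.9, "large" -> 3.8};

(* Treat each distinct string as a category rather than as natural language *)
p = Predict[trainingset, FeatureTypes -> "Nominal"];
PredictorInformation[p, "FeatureTypes"]

(* Predict for one of the categories *)
p["medium"]
```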

Train a classifier with named features:

Both features have been considered numerical:

Specify that the feature "gender" should be considered nominal:
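
The named-feature example can be sketched as follows, assuming examples given as associations with the hypothetical keys "age" and "gender" (the data and the names data, c and c2 are made up for illustration).

```wolfram
(* Hypothetical data: named features, with "gender" stored as an integer code *)
data = {<|"age" -> 32, "gender" -> 1|> -> True,
   <|"age" -> 45, "gender" -> 2|> -> False,
   <|"age" -> 27, "gender" -> 2|> -> True,
   <|"age" -> 51, "gender" -> 1|> -> False,
   <|"age" -> 38, "gender" -> 1|> -> True,
   <|"age" -> 60, "gender" -> 2|> -> False};

(* Automatic detection of the feature types *)
c = Classify[data];
ClassifierInformation[c, "FeatureTypes"]

(* Force "gender" to be nominal by name; "age" stays Automatic *)
c2 = Classify[data, FeatureTypes -> <|"gender" -> "Nominal"|>];
ClassifierInformation[c2, "FeatureTypes"]

(* Classify a new example given as an association *)
c2[<|"age" -> 40, "gender" -> 1|>]
```

Specifying the type by name keeps the setting valid even if the order of the features in the data changes.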

See Also

FeatureNames  NominalVariables  Classify  Predict

Introduced in 2015 (10.1) | Updated in 2017 (11.2)