FeatureExtract

FeatureExtract[{example₁,example₂,…}]

extracts features for each of the exampleᵢ using a feature extractor trained on all the exampleᵢ.

FeatureExtract[examples,extractor]

extracts features using the specified feature extractor method.

FeatureExtract[examples,{extractor₁,extractor₂,…}]

extracts features by applying the extractorᵢ in sequence.

FeatureExtract[examples,spec->ext]

uses the extractor methods specified by ext on parts of examples specified by spec.

FeatureExtract[examples,{spec₁->ext₁,spec₂->ext₂,…}]

uses the extractor methods extᵢ on parts of examples specified by the specᵢ.

Details and Options

  • FeatureExtract can be used on many types of data, including numerical, textual, sounds and images, and combinations of these.
  • Each exampleᵢ can be a single data element, a list of data elements, an association of data elements, or a Dataset object.
  • Possible feature extractor methods include:
    Automatic                    automatic extraction
    Identity                     give data unchanged
    "ConformedData"              conformed images, colors, dates, etc.
    "NumericVector"              numeric vector from any data
    f                            applies function f to each example
    {extractor₁,extractor₂,…}    use a sequence of extractors in turn
  • Additional feature extractor methods can also be used for each data type.
  • Numeric data:
    "DiscretizedVector"         discretized numerical data
    "DimensionReducedVector"    reduced-dimension numeric vectors
    "IndicatorVector"           nominal data "one-hot encoded" with indicator vectors
    "IntegerVector"             nominal data encoded with integers
    "MissingImputed"            data with missing values imputed
    "StandardizedVector"        numeric data processed with Standardize
  • Nominal data:
    "IndicatorVector"           nominal data "one-hot encoded" with indicator vectors
  • Text:
    "LowerCasedText"         text with each character lowercase
    "SegmentedCharacters"    text segmented into characters
    "SegmentedWords"         text segmented into words
    "TFIDF"                  term frequency-inverse document frequency vector
    "WordVectors"            sequence of semantic vectors from a text (English only)
  • Images:
    "FaceFeatures"     semantic vector from an image of a human face
    "ImageFeatures"    semantic vector from an image
    "PixelVector"      vector of pixel values from an image
  • Audio objects:
    "AudioFeatures"     sequence of semantic vectors from an audio object
    "LPC"               audio linear prediction coefficients
    "MelSpectrogram"    audio spectrogram with logarithmic frequency bins
    "MFCC"              sequence of audio mel-frequency cepstral coefficient vectors
    "Spectrogram"       audio spectrogram
  • Feature extractor methods are applied only to data elements whose types they are compatible with. Other data elements are returned unchanged.
  • FeatureExtract[examples] is typically equivalent to FeatureExtract[examples,"NumericVector"].
  • In FeatureExtract[examples,spec->ext] or FeatureExtract[examples,{spec₁->ext₁,…}], possible forms for spec and the specᵢ include:
    All                        all parts of each example
    i                          i-th part of each example
    {i₁,i₂,…}                  parts i₁, i₂, … of each example
    "name"                     part with the specified name in each example
    {"name₁","name₂",…}        parts with names "nameᵢ" in each example
  • Parts not mentioned in spec or the specᵢ are dropped for the purpose of extracting features.
  • In FeatureExtract[examples,{spec₁->ext₁,…}], the extᵢ are all applied separately to examples.
  • The following options can be given:
    FeatureNames      Automatic    names to assign to elements of the exampleᵢ
    FeatureTypes      Automatic    feature types to assume for elements of the exampleᵢ
    RandomSeeding     1234         what seeding of pseudorandom generators should be done internally
  • Possible settings for RandomSeeding include:
    Automatic    automatically reseed every time the function is called
    Inherited    use externally seeded random numbers
    seed         use an explicit integer or string as a seed
  • FeatureExtract[…] is equivalent to FeatureExtraction[…,"ExtractedFeatures"].
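
The part specifications, options and the FeatureExtraction equivalence described above can be sketched as follows. This is a minimal illustration, not an excerpt from the reference page: the dataset, its values and the part names "size" and "class" are all made up for the purpose of the example.

```wolfram
(* made-up examples: each one is {numeric value, nominal label} *)
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* apply a different extractor to each part using spec -> ext rules *)
FeatureExtract[data, {1 -> "StandardizedVector", 2 -> "IndicatorVector"}]

(* mention only part 1: part 2 is dropped before features are extracted *)
FeatureExtract[data, 1 -> "StandardizedVector"]

(* with associations, parts can be addressed by name *)
named = {<|"size" -> 1.4, "class" -> "A"|>, <|"size" -> 5.4, "class" -> "B"|>};
FeatureExtract[named, "size" -> "StandardizedVector"]

(* state the feature types explicitly and fix the random seed *)
FeatureExtract[data, "NumericVector",
  FeatureTypes -> {"Numerical", "Nominal"}, RandomSeeding -> 1234]

(* the same kind of result requested through FeatureExtraction *)
FeatureExtraction[{1.2, 2.5, 3.1}, "StandardizedVector", "ExtractedFeatures"]
```

Using rules keeps the choice of extractor next to the part it applies to, which tends to be easier to read than relying on automatic type detection when examples mix numeric and nominal elements.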

Examples


Basic Examples  (4)

Extract features from a simple dataset:
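
A minimal sketch of such a call, assuming a small made-up dataset that mixes numeric and nominal elements (the values are illustrative only):

```wolfram
(* made-up examples: each one is {numeric value, nominal label} *)
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* train an automatic extractor on all examples and apply it to each of them *)
FeatureExtract[data]
```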

Extract features from images:
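
One way this might look, assuming the standard test images available through ExampleData are used as input (the choice of images is an assumption made for illustration, and automatic extraction from images may need to download a pretrained network on first use):

```wolfram
(* three standard test images, chosen here only for illustration *)
images = ExampleData /@ {{"TestImage", "House"},
    {"TestImage", "Mandrill"}, {"TestImage", "Peppers"}};

(* automatic extraction gives one feature vector per image *)
FeatureExtract[images]
```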

Standardize numerical values using the "StandardizedVector" extractor method:
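
A minimal sketch with made-up numbers; the call to Standardize is included only so the two results can be compared side by side:

```wolfram
(* made-up numeric examples *)
values = {1.2, 2.5, 3.1, 4.8, 6.0};

(* standardize through the feature extractor method *)
FeatureExtract[values, "StandardizedVector"]

(* for comparison: the built-in Standardize applied to the same list *)
Standardize[values]
```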

Extract TFIDF vectors on characters by chaining the extractor methods "SegmentedCharacters" and "TFIDF":
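
A minimal sketch of the chained form, assuming a handful of made-up strings as input; the result is displayed with MatrixForm, one row per string:

```wolfram
(* made-up text examples *)
texts = {"a cat", "a dog", "the bird", "the cat"};

(* first split each string into characters, then compute TFIDF vectors on them *)
FeatureExtract[texts, {"SegmentedCharacters", "TFIDF"}] // MatrixForm
```

Because the extractors run in sequence, "TFIDF" receives character lists rather than raw strings, so the resulting terms are characters rather than words.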

Scope  (8)

Options  (2)

Applications  (1)

Introduced in 2016 (11.0) | Updated in 2017 (11.2)