FeatureExtraction[examples]
generates a FeatureExtractorFunction[…] trained from the examples given.
FeatureExtraction[examples,spec]
uses the specified feature extractor method spec.
FeatureExtraction[examples,spec,props]
gives the feature extraction properties specified by props.


FeatureExtraction
Details and Options
- FeatureExtraction is typically used to define a function that processes raw data into usable features (e.g. for training a machine learning algorithm).
- FeatureExtraction can be used on many types of data, including numerical, textual, sounds, images, graphs and time series, as well as combinations of these.
- Possible values of examples are:
    {example1,…}    a list of training examples
    Dataset[…]      a Dataset object
    Tabular[…]      a Tabular object
    None            no training examples
- Each example_i can be a single data element, a list of data elements or an association of data elements.
- Possible values for spec include:
    extractor                use the specified extractor method
    part->extractor          apply the extractor to the specific example part
    {part1->extractor1,…}    specify extractors for specific parts
- Possible feature extractor methods extractor include:
    Automatic                    automatic extraction
    Identity                     give data unchanged
    "ConformedData"              conformed images, colors, dates, etc.
    "NumericVector"              numeric vector from any data
    "name"                       a named extractor method
    f                            applies function f to each example
    {extractor1,extractor2,…}    use a sequence of extractors in turn
- Possible forms of part are:
    All                  all parts of each example
    i                    the i-th part of each example
    {i1,i2,…}            parts i1, i2, … of each example
    "key"                part with the specified key in each example
    {"key1","key2",…}    parts with names "key_i" in each example
- When explicitly specifying parts, any unmentioned parts are dropped when extracting features.
- FeatureExtraction[examples] is equivalent to FeatureExtraction[examples,Automatic], which is typically equivalent to FeatureExtraction[examples,"NumericVector"].
- The "NumericVector" method will typically convert examples to numeric vectors, impute missing data and reduce the dimension using DimensionReduction.
- Feature extractor methods specific to a single data type are applied only to data elements of a compatible type. Other data elements are returned unchanged.
- Not all specific feature extractors are available when examples is None.
- The specific extractors are:
- Numeric data:
    "DiscretizedVector"         discretized numerical data
    "DimensionReducedVector"    reduced-dimension numeric vectors
    "MissingImputed"            data with missing values imputed
    "StandardizedVector"        numeric data processed with Standardize
- Nominal data:
    "IndicatorVector"    nominal data "one-hot encoded" with indicator vectors
    "IntegerVector"      nominal data encoded with integers
- Text:
    "LowerCasedText"         text with each character lowercase
    "SegmentedCharacters"    text segmented into characters
    "SegmentedWords"         text segmented into words
    "SentenceVector"         semantic vector from a text
    "TFIDF"                  term frequency-inverse document frequency vector
    "WordVectors"            sequence of semantic vectors from a text (English only)
- Images:
    "FaceFeatures"     semantic vector from an image of a human face
    "ImageFeatures"    semantic vector from an image
    "PixelVector"      vector of pixel values from an image
- Audio objects:
    "AudioFeatures"           sequence of semantic vectors from an audio object
    "AudioFeatureVector"      semantic vector from an audio object
    "LPC"                     audio linear prediction coefficients
    "MelSpectrogram"          audio spectrogram with logarithmic frequency bins
    "MFCC"                    sequence of audio mel-frequency cepstral coefficient vectors
    "SpeakerFeatures"         sequence of semantic speaker vectors
    "SpeakerFeatureVector"    semantic vector for a speaker
    "Spectrogram"             audio spectrogram
- Video objects:
    "VideoFeatures"         sequence of semantic vectors from a video object
    "VideoFeatureVector"    semantic vector from a video object
- Graphs:
    "GraphFeatures"    numeric vector summarizing graph properties
- Molecules:
    "AtomPairs"                       Boolean vector from pairs of atoms and the path lengths between them
    "MoleculeExtendedConnectivity"    Boolean vector from enumerated molecule subgraphs
    "MoleculeFeatures"                numeric vector summarizing molecule properties
    "MoleculeTopologicalFeatures"     Boolean vector from circular atom neighborhoods
- In FeatureExtraction[examples,extractors,props], props can be a single property or a list of properties. Possible properties include:
    "ExtractorFunction"    FeatureExtractorFunction[…] (default)
    "ExtractedFeatures"    examples after feature extraction
    "ReconstructedData"    examples after extraction and inverse extraction
    "FeatureDistance"      FeatureDistance[…] generated from the extractor
- The "ExtractedFeatures" and "ReconstructedData" properties are not available when examples is None.
- The "ReconstructedData" property can be computed only when every specified extractor is invertible.
- The following options can be given:
    FeatureNames     Automatic    names to assign to elements of the example_i
    FeatureTypes     Automatic    feature types to assume for elements of the example_i
    RandomSeeding    1234         what seeding of pseudorandom generators should be done internally
- Possible settings for RandomSeeding include:
    Automatic    automatically reseed every time the function is called
    Inherited    use externally seeded random numbers
    seed         use an explicit integer or string as a seed
Examples
Basic Examples (3)
Train a FeatureExtractorFunction on a simple dataset:
Extract features from a new example:
Extract features from a list of examples:
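The three steps above might look as follows. This is a minimal sketch: the data values are made up for illustration, and the numeric feature values returned depend on the trained extractor.

```wolfram
(* Train on a small mixed numeric/nominal dataset (illustrative values) *)
fe = FeatureExtraction[{{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}}]

(* Extract features from one new example *)
fe[{2.1, "A"}]

(* Extract features from a list of examples *)
fe[{{2.1, "A"}, {5.2, "B"}}]
```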
Train a feature extractor on a dataset of images:
Scope (32)
Input Shape (9)
Train a feature extractor on a list of examples with a single feature:
Extract features from a new example:
Extract features from multiple new examples:
Train a feature extractor on a list of examples with multiple features:
Extract features from multiple new examples:
Train a feature extractor on a mixed-type dataset:
Extract features from a new example:
Train a feature extractor from a list of associations:
Extract features from a new example:
Extract features from multiple new examples:
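A sketch of the association-based workflow above, assuming hypothetical keys "age" and "gender" and invented values:

```wolfram
(* Each training example is an association of named data elements *)
data = {
   <|"age" -> 32, "gender" -> "M"|>,
   <|"age" -> 45, "gender" -> "F"|>,
   <|"age" -> 28, "gender" -> "F"|>,
   <|"age" -> 51, "gender" -> "M"|>};

fe = FeatureExtraction[data]

(* A single new example, given with the same keys *)
fe[<|"age" -> 40, "gender" -> "F"|>]

(* Several new examples at once *)
fe[{<|"age" -> 40, "gender" -> "F"|>, <|"age" -> 23, "gender" -> "M"|>}]
```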
Train a feature extractor from data given as feature lists:
Train a feature extractor from a Tabular:
Train a feature extractor from a Dataset:
Train a feature extractor from a dataset that contains missing values:
Extractor Specifications (10)
Specify the feature extractor "SentenceVector" on a single textual feature:
Train a feature extractor using the "StandardizedVector" method:
Extract features from a new example:
Since this feature extractor is invertible, the FeatureExtractorFunction property "OriginalData" can be used to perform the inverse extraction:
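The round trip described above can be sketched like this. The data is illustrative; the inverse step should recover the input up to numerical precision.

```wolfram
(* Train a standardizing extractor on two numeric features (illustrative values) *)
data = {{1.4, 2.1}, {1.5, 2.3}, {2.3, 3.9}, {5.4, 9.7}};
fe = FeatureExtraction[data, "StandardizedVector"]

(* Forward extraction on a new example *)
features = fe[{2.0, 3.5}]

(* Inverse extraction via the "OriginalData" property *)
fe[features, "OriginalData"]
```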
Train a feature extractor on text using the "TFIDF" method followed by the "DimensionReducedVector" method:
Extract features on the training set:
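A possible version of this chained extractor, using a handful of invented sentences as training texts:

```wolfram
(* Illustrative training texts, not taken from the original notebook *)
texts = {"the cat is grey", "my cat is fast", "this dog is scary", "the big dog"};

(* "TFIDF" is applied first, then the result is dimension reduced *)
fe = FeatureExtraction[texts, {"TFIDF", "DimensionReducedVector"}]

(* Extract features on the training set *)
fe[texts]
```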
Train a feature extractor on texts and images using the text-only "TFIDF" method:
Features will only be extracted from the text part:
Specify the feature extraction on multiple features by position:
Use the feature extractor on new features:
A list of two items will be assumed to be a single input of two features:
Train a feature extractor with the "IndicatorVector" method on only the second nominal variable:
The first nominal variable is dropped:
Use the Identity extractor method to copy the first variable:
A variable can be copied multiple times:
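The part specifications above could be written as follows. The dataset is a made-up pair of nominal variables; outputs are omitted because the exact encoding order is determined during training.

```wolfram
data = {{"A", "x"}, {"B", "y"}, {"A", "z"}, {"B", "x"}};

(* Encode only the second variable; the unmentioned first variable is dropped *)
fe = FeatureExtraction[data, 2 -> "IndicatorVector"];
fe[{"B", "y"}]

(* Keep the first variable unchanged with Identity *)
fe2 = FeatureExtraction[data, {1 -> Identity, 2 -> "IndicatorVector"}];
fe2[{"B", "y"}]

(* Mention the same part twice to copy it twice *)
fe3 = FeatureExtraction[data, {1 -> Identity, 1 -> Identity, 2 -> "IndicatorVector"}];
fe3[{"B", "y"}]
```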
Specify the feature extraction on multiple features by key:
Use the feature extractor on new features:
Using the feature extractor on a list will assume the same ordering of features as originally specified:
Generate a feature extractor using a custom function:
Apply the extractor on the training set:
Chain the custom extractor with the "StandardizedVector" method:
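As a sketch of the custom-function case, the following uses a pure function that takes the logarithm of each example. The choice of Log and the data values are assumptions made for illustration.

```wolfram
data = {1.5, 2.4, 3.1, 10.2, 25.7};

(* A custom function is applied to each example *)
fe = FeatureExtraction[data, Log[#] &];
fe[data]

(* Chain the custom function with a built-in method; extractors run in order *)
fe2 = FeatureExtraction[data, {Log[#] &, "StandardizedVector"}];
fe2[data]
```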
Feature Types (10)
Create a feature extractor for textual data using the "SentenceVector" extractor with no training:
Input type is inferred from the specified extractor. Use the feature extractor on some examples:
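A data-free extractor of this kind might be created as below. The sentences are invented, and the first call may need to download a pretrained model.

```wolfram
(* No training examples: the input type is inferred from "SentenceVector" *)
fe = FeatureExtraction[None, "SentenceVector"]

(* Apply the untrained extractor to some texts *)
fe[{"The cat sat on the mat.", "Stock prices fell sharply today."}]
```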
Create a feature extractor for examples with implicit textual and image features:
Features will be extracted from both parts:
Train a feature extractor on textual data:
Train a feature extractor with the "IndicatorVector" method on nominal variables:
Train a feature extractor to compute term frequency-inverse document frequency vectors from texts:
The term frequency-inverse document frequency matrix of the training set can be computed in a SparseArray:
The "TFIDF" method can also be used on tokenized data (nominal bags):
Train a feature extractor on a list of DateObject instances:
Extract features from a new DateObject:
A string date can also be given:
Train a feature extractor on a list of Graph instances:
Extract features from a new graph:
Train a feature extractor on a list of TimeSeries instances:
Train a feature extractor on Molecule data:
Train a feature extractor on a list of Audio instances:
Information (3)
Get Information from a trained FeatureExtractorFunction:
Options (4)
FeatureNames (2)
Train a feature extractor and give a name to each feature:
Use the association format to extract features from a new example:
The list format can still be used:
Use FeatureNames to set up names and refer to them in FeatureExtraction[examples,{spec1->ext1,…}]:
Extract features on a new example using the names to specify the features:
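The FeatureNames usage above can be sketched as follows, with hypothetical names "size" and "class" and illustrative data:

```wolfram
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* Name the two features at training time *)
fe = FeatureExtraction[data, FeatureNames -> {"size", "class"}];

(* New examples can be given as associations or as plain lists *)
fe[<|"size" -> 2.1, "class" -> "A"|>]
fe[{2.1, "A"}]

(* Names can also be used as part specifications *)
fe2 = FeatureExtraction[data, {"class" -> "IndicatorVector"},
   FeatureNames -> {"size", "class"}];
fe2[<|"size" -> 2.1, "class" -> "A"|>]
```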
FeatureTypes (2)
Train a feature extractor with the "IndicatorVector" method on a simple dataset:
The first feature has been interpreted as numerical. Since the "IndicatorVector" method only acts on nominal features, the first feature is unchanged:
Use FeatureTypes to enforce the interpretation of the first feature as nominal:
Now both features are encoded as indicator vectors:
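A sketch of the comparison above, on a small invented dataset whose first feature is an integer:

```wolfram
data = {{1, "A"}, {2, "B"}, {3, "A"}, {1, "B"}};

(* Default interpretation: the first feature is numerical and passes through *)
fe = FeatureExtraction[data, "IndicatorVector"];
fe[{2, "B"}]

(* Force both features to be treated as nominal *)
fe2 = FeatureExtraction[data, "IndicatorVector",
   FeatureTypes -> {"Nominal", "Nominal"}];
fe2[{2, "B"}]
```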
Creating a feature extractor with no training infers the expected data type from the specific extractor:
Applications (3)
Image Search (1)
Construct a dataset of dog images:
Train an extractor function from this dataset:
Generate a NearestFunction on the extracted features of the dataset:
Using the NearestFunction, construct a function that displays the nearest image of the dataset:
Use this function on images that are not in the dataset:
This feature extractor function can also be used to delete image pairs that are too similar:
Text Search (1)
Load the text of Alice in Wonderland:
Split the text into sentences:
Train a feature extractor on these sentences:
Generate a NearestFunction with the sentences' features:
Using the NearestFunction, construct a function that displays the nearest sentence in Alice in Wonderland:
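The text search steps could be put together roughly as follows. This is a sketch, not the original notebook: the helper name nearestSentence is hypothetical, automatic extraction is assumed, and training on the full text may take some time.

```wolfram
(* Load the text and split it into sentences *)
text = ExampleData[{"Text", "AliceInWonderland"}];
sentences = TextSentences[text];

(* Train an extractor and index the sentences by their features *)
fe = FeatureExtraction[sentences];
nf = Nearest[fe[sentences] -> sentences];

(* Hypothetical helper: return the sentence whose features are closest *)
nearestSentence[s_String] := First[nf[fe[s]]]

nearestSentence["The rabbit was late"]
```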
Imputation (1)
Load the "MNIST" dataset from ExampleData and keep the images:
Convert images to numerical data and separate the dataset into a training set and a test set:
The dimension of the dataset is 784:
Create a feature extractor using the "MissingImputed" method:
Replace some values of a test-set vector by Missing[] and visualize it:
Impute missing values using the FeatureExtractorFunction[…]:
Visualize the original image, the image with missing values, and the imputed image:
Properties & Relations (4)
Train a feature extractor from data with named features:
Unrecognized keys will be ignored:
FeatureExtraction[…,"ExtractedFeatures"] is equivalent to FeatureExtract[…]:
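For instance, on an illustrative dataset the two calls below should give matching results, assuming the same extractor specification and default options for both:

```wolfram
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* Extracted features requested as a property of FeatureExtraction *)
a = FeatureExtraction[data, Automatic, "ExtractedFeatures"]

(* The same features computed directly with FeatureExtract *)
b = FeatureExtract[data]
```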
The "FeatureDistance" property is equivalent to using FeatureDistance on the extractor:
Compute the FeatureExtractorFunction first:
Construct a feature distance for this feature extractor:
The two distance functions are identical:
Creating a FeatureExtractorFunction on some training data creates a feature space representing those features:
Using different training data can result in a differently sized feature space:
Creating the same item with no data will result in an untrained function that will consistently give the same results in the same feature space:
Possible Issues (7)
Training an extractor on anonymous data will use automatic feature names:
Using custom names when applying the function will give a feature missing error:

Feature names can be specified at training time:
Check the feature names of a FeatureExtractorFunction:
The custom name can now be used:
The FeatureExtraction property "ReconstructedData" can be used to obtain the data after extraction and reconstruction:
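A minimal sketch of this property with an invertible extractor; the data values are illustrative:

```wolfram
data = {{1.4, 2.1}, {1.5, 2.3}, {2.3, 3.9}, {5.4, 9.7}};

(* Extract features, then invert the extraction, in one call *)
reconstructed = FeatureExtraction[data, "StandardizedVector", "ReconstructedData"]
```

Since standardization is invertible, the result should match the training data up to numerical precision.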
Some feature extractors can only perform an approximation of the inverse extraction:
Some feature extractors cannot be inverted:

The property "ReconstructedData" cannot be used without training data:

Some extractors can be created without needing data:
Others require examples to initialize them:

Similarly, not all properties are supported:

Extractors that do not match the data type are ignored:
The input type is "Nominal", so the "LowerCasedText" extractor ignores the input:
Similarly, forcing the input to "Text" will cause the "IndicatorVector" extractor to be ignored:
The "ConformedData" extractor requires additional information to operate in a data-free context:

Specify the FeatureTypes explicitly:
The feature type can also be implicitly inferred from subsequent extractors:
The automatic feature extraction often applies a dimension reduction step:
Explicit feature extractors do not include dimension reduction and typically result in longer vectors:
Use the "DimensionReducedVector" method to add a dimension reduction step:
Dimension reduction must be trained on the available features and therefore cannot be applied when no data is provided:

History
Introduced in 2016 (11.0) | Updated in 2017 (11.2) ▪ 2020 (12.1) ▪ 2020 (12.2) ▪ 2021 (12.3) ▪ 2025 (14.3)