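FeatureExtraction[examples]
generates a FeatureExtractorFunction[…] trained on the given examples.
FeatureExtraction[examples,spec]
uses the specified feature extractor method spec.
FeatureExtraction[examples,spec,props]
gives the feature extraction properties specified by props.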


Details and Options
- FeatureExtraction is typically used to define a function that processes raw data into usable features (e.g. for training a machine learning algorithm).
- FeatureExtraction can be used on many types of data, including numbers, text, audio, images, graphs and time series, as well as combinations of these.
- Possible values of examples are:
    {example1,…}    a list of training examples
    Dataset[…]      a Dataset object
    Tabular[…]      a Tabular object
    None            no training examples
- Each examplei can be a single data element, a list of data elements or an association of data elements.
- Possible values for spec include:
    extractor                    use the specified extractor method
    part->extractor              apply the extractor to the specific example part
    {part1->extractor1,…}        specify extractors for specific parts
- Possible feature extractor methods extractor include:
    Automatic                    automatic extraction
    Identity                     give data unchanged
    "ConformedData"              conformed images, colors, dates, etc.
    "NumericVector"              numeric vector from any data
    "name"                       a named extractor method
    f                            applies function f to each example
    {extractor1,extractor2,…}    use a sequence of extractors in turn
- Possible forms of part are:
    All                          all parts of each example
    i                            the ith part of each example
    {i1,i2,…}                    parts i1, i2, … of each example
    "key"                        the part with the specified key in each example
    {"key1","key2",…}            the parts named "keyi" in each example
- When explicitly specifying parts, any unmentioned parts are dropped when extracting features.
- FeatureExtraction[examples] is equivalent to FeatureExtraction[examples,Automatic], which is typically equivalent to FeatureExtraction[examples,"NumericVector"].
- The "NumericVector" method will typically convert examples to numeric vectors, impute missing values and reduce the dimension using DimensionReduction.
- Feature extractor methods specific to a single data type are applied only to data elements whose types they are compatible with. Other data elements are returned unchanged.
- Not all specific feature extractors are available when examples is None.
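- The specific extractors are:
- Numeric data:
    "DiscretizedVector"              discretized numeric data
    "DimensionReducedVector"         numeric vectors of reduced dimension
    "MissingImputed"                 data with missing values imputed
    "StandardizedVector"             numeric data processed with Standardize
- Nominal data:
    "IndicatorVector"                nominal data "one-hot encoded" with indicator vectors
    "IntegerVector"                  nominal data encoded with integers
- Text:
    "LowerCasedText"                 text with every character lowercased
    "SegmentedCharacters"            text segmented into characters
    "SegmentedWords"                 text segmented into words
    "SentenceVector"                 semantic vector from a text
    "TFIDF"                          term frequency-inverse document frequency vector
    "WordVectors"                    sequence of semantic vectors from a text (English only)
- Images:
    "FaceFeatures"                   semantic vector from an image of a human face
    "ImageFeatures"                  semantic vector from an image
    "PixelVector"                    vector of pixel values from an image
- Audio objects:
    "AudioFeatures"                  sequence of semantic vectors from an audio object
    "AudioFeatureVector"             semantic vector from an audio object
    "LPC"                            audio linear prediction coefficients
    "MelSpectrogram"                 audio spectrogram with logarithmic frequency bins
    "MFCC"                           sequence of audio mel-frequency cepstral coefficient vectors
    "SpeakerFeatures"                sequence of semantic vectors for a speaker
    "SpeakerFeatureVector"           semantic vector for a speaker
    "Spectrogram"                    audio spectrogram
- Video objects:
    "VideoFeatures"                  sequence of semantic vectors from a video object
    "VideoFeatureVector"             semantic vector from a video object
- Graphs:
    "GraphFeatures"                  numeric vector summarizing graph properties
- Molecules:
    "AtomPairs"                      Boolean vector from pairs of atoms and the path lengths between them
    "MoleculeExtendedConnectivity"   Boolean vector from enumerated molecule subgraphs
    "MoleculeFeatures"               numeric vector summarizing molecule properties
    "MoleculeTopologicalFeatures"    Boolean vector from circular atom neighborhoods
- In FeatureExtraction[examples,extractors,props], props can be a single property or a list of properties. Possible properties include: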
    "ExtractorFunction"      FeatureExtractorFunction[…] (default)
    "ExtractedFeatures"      examples after feature extraction
    "ReconstructedData"      examples after extraction and inverse extraction
    "FeatureDistance"        FeatureDistance[…] generated from the extractor
- The "ExtractedFeatures" and "ReconstructedData" properties are not available when examples is None.
- The "ReconstructedData" property can be computed only when every specified extractor is invertible.
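- The following options can be given:
    FeatureNames     Automatic    names to assign to elements of the examplei
    FeatureTypes     Automatic    feature types to assume for elements of the examplei
    RandomSeeding    1234         what seeding of pseudorandom generators should be done internally
- Possible settings for RandomSeeding include:
    Automatic        automatically reseed every time the function is called
    Inherited        use externally seeded random numbers
    seed             use an explicit integer or string as a seed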
Examples
Basic Examples (3)
Train a FeatureExtractorFunction on a simple dataset:
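A minimal sketch of this basic usage, assuming a small made-up dataset of mixed numeric and nominal examples (the values are illustrative and not taken from the original notebook):

```wolfram
(* Hypothetical toy data: each example has a numeric part and a nominal part *)
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* Train a FeatureExtractorFunction with the default (Automatic) method *)
fe = FeatureExtraction[data];

(* Apply it to a single new example, then to a list of new examples *)
fe[{1.2, "A"}]
fe[{{1.2, "A"}, {4.9, "B"}}]
```

The exact feature vectors depend on the training data, so no output is shown here.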
Scope (32)
Input Shape (9)
Train a feature extractor on a list of examples with a single feature:
Extract features from multiple new examples:
Train a feature extractor on a list of examples with multiple features:
Extract features from multiple new examples:
Train a feature extractor on a mixed-type dataset:
Train a feature extractor from a list of associations:
Extract features from multiple new examples:
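Training from associations could look like the following sketch. The keys "age" and "gender" and all values are hypothetical, chosen only to show the shape of the input:

```wolfram
(* Hypothetical training set: each example is an association of named features *)
data = {
   <|"age" -> 32, "gender" -> "M"|>,
   <|"age" -> 25, "gender" -> "F"|>,
   <|"age" -> 41, "gender" -> "F"|>,
   <|"age" -> 37, "gender" -> "M"|>
   };

(* Train with the automatic method *)
fe = FeatureExtraction[data];

(* Extract features from several new examples given in the same form *)
fe[{<|"age" -> 29, "gender" -> "F"|>, <|"age" -> 45, "gender" -> "M"|>}]
```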
Train a feature extractor from data given as feature lists:
Extract features from a Tabular object:
Extract features from a Dataset object:
Train a feature extractor from a dataset that contains missing values:
Extractor Specifications (10)
Specify the feature extractor "SentenceVector" on a single textual feature:
Train a feature extractor using the "StandardizedVector" method:
Extract features from a new example:
Since this feature extractor is invertible, the FeatureExtractorFunction property "OriginalData" can be used to perform the inverse extraction:
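As a sketch of the round trip, assuming a small made-up numeric dataset and assuming the "OriginalData" property is requested by passing it as a second argument to the trained function:

```wolfram
(* Hypothetical numeric data with two features per example *)
data = {{1.4, 3.}, {1.5, 5.}, {2.3, 4.}, {5.4, 9.}};

(* Standardize each feature using statistics learned from the data *)
fe = FeatureExtraction[data, "StandardizedVector"];

(* Forward extraction on a new example *)
features = fe[{1.2, 4.}]

(* Inverse extraction: should recover (approximately) the original example *)
fe[features, "OriginalData"]
```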
Train a feature extractor on text using the "TFIDF" method followed by the "DimensionReducedVector" method:
Extract features on the training set:
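Chaining two extractors is done by giving them as a list. A minimal sketch, with a handful of invented sentences standing in for a real corpus:

```wolfram
(* Hypothetical mini corpus *)
texts = {
   "the cat sat on the mat",
   "the dog chased the cat",
   "stock prices rose sharply today",
   "markets fell as prices dropped",
   "a dog and a cat played outside",
   "investors sold stock today"
   };

(* First compute TFIDF vectors, then reduce their dimension *)
fe = FeatureExtraction[texts, {"TFIDF", "DimensionReducedVector"}];

(* Extract features on the training set *)
fe[texts]
```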
Train a feature extractor on texts and images using the text-only "TFIDF" method:
Features will only be extracted from the text part:
Specify the feature extraction on multiple features by position:
Use the feature extractor on new features:
A list of two items will be assumed to be a single input of two features:
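The position-based specification can be sketched as follows, again with illustrative data that is not from the original page:

```wolfram
(* Hypothetical data: part 1 is numeric, part 2 is nominal *)
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* Assign a different extractor to each position *)
fe = FeatureExtraction[data, {1 -> "StandardizedVector", 2 -> "IndicatorVector"}];

(* A list of examples *)
fe[{{1.2, "A"}, {4.8, "B"}}]

(* A flat list of two items is treated as one example with two features *)
fe[{1.2, "A"}]
```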
Train a feature extractor with the "IndicatorVector" method on only the second nominal variable:
The first nominal variable is dropped:
Use the Identity extractor method to copy the first variable:
A variable can be copied multiple times:
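The three cases above might be written as in this sketch. The data is hypothetical, and the form that lists part 1 twice is an assumption about how a variable is copied more than once:

```wolfram
(* Hypothetical data with two nominal variables *)
data = {{"A", "x"}, {"B", "y"}, {"A", "y"}, {"B", "x"}};

(* Only part 2 is mentioned, so part 1 is dropped from the output *)
fe1 = FeatureExtraction[data, 2 -> "IndicatorVector"];
fe1[{"A", "x"}]

(* Identity passes part 1 through unchanged alongside the encoded part 2 *)
fe2 = FeatureExtraction[data, {1 -> Identity, 2 -> "IndicatorVector"}];
fe2[{"A", "x"}]

(* Assumed form: mentioning part 1 twice copies it twice *)
fe3 = FeatureExtraction[data, {1 -> Identity, 1 -> Identity, 2 -> "IndicatorVector"}];
fe3[{"A", "x"}]
```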
Specify the feature extraction on multiple features by key:
Use the feature extractor on new features:
Using the feature extractor on a list will assume the same ordering of features as originally specified:
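A sketch of the key-based specification, assuming hypothetical keys "size" and "class":

```wolfram
(* Hypothetical training data given as associations *)
data = {
   <|"size" -> 1.4, "class" -> "A"|>,
   <|"size" -> 1.5, "class" -> "A"|>,
   <|"size" -> 2.3, "class" -> "B"|>,
   <|"size" -> 5.4, "class" -> "B"|>
   };

(* Assign an extractor to each key *)
fe = FeatureExtraction[data, {"size" -> "StandardizedVector", "class" -> "IndicatorVector"}];

(* New example given by key *)
fe[<|"size" -> 1.2, "class" -> "A"|>]

(* New example given as a plain list, read in the order the keys were specified *)
fe[{1.2, "A"}]
```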
Generate a feature extractor using a custom function:
Apply the extractor on the training set:
Chain the custom extractor with the "StandardizedVector" method:
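A custom extractor is any function applied to each example. The following sketch uses a squaring function on made-up numeric data, first alone and then chained with "StandardizedVector":

```wolfram
(* Hypothetical one-feature numeric data *)
data = {1., 2., 3., 4., 5.};

(* Custom extractor: square each example *)
fe = FeatureExtraction[data, Function[x, x^2]];
fe[data]

(* Chain the custom function with a built-in method; extractors run in order *)
feChained = FeatureExtraction[data, {Function[x, x^2], "StandardizedVector"}];
feChained[data]
```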
Feature Types (10)
Create a feature extractor for textual data using the "SentenceVector" extractor with no training:
Input type is inferred from the specified extractor. Use the feature extractor on some examples:
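An untrained extractor is obtained by passing None as the examples. A minimal sketch with invented sentences (the first call may need to download the underlying model):

```wolfram
(* No training data: the input type (text) is inferred from the extractor *)
fe = FeatureExtraction[None, "SentenceVector"];

(* Apply to some example sentences *)
fe[{"The cat sat on the mat.", "Stock prices rose today."}]
```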
Create a feature extractor for examples with implicit textual and image features:
Features will be extracted from both parts:
Train a feature extractor on textual data:
Train a feature extractor with the "IndicatorVector" method on nominal variables:
The term frequency-inverse document frequency matrix of the training set can be computed as a SparseArray:
Train a feature extractor on a list of DateObject instances:
Extract features from a new DateObject:
A string date can also be given:
Train a feature extractor on a list of Graph instances:
Train a feature extractor on a list of TimeSeries instances:
Train a feature extractor on Molecule data:
Train a feature extractor on a list of Audio instances:
Information (3)
Get Information from a trained FeatureExtractorFunction:
Options (4)
FeatureNames (2)
Use FeatureNames to set names and refer to them in FeatureExtraction[examples,{spec1->ext1,…}]:
Extract features on a new example using the names to specify the features:
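A sketch of naming features at training time, assuming the hypothetical names "size" and "class" for the two parts of each example:

```wolfram
(* Hypothetical unnamed data *)
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* Name the parts with FeatureNames and refer to those names in the spec *)
fe = FeatureExtraction[data,
   {"size" -> "StandardizedVector", "class" -> "IndicatorVector"},
   FeatureNames -> {"size", "class"}];

(* New example addressed by name *)
fe[<|"size" -> 1.2, "class" -> "A"|>]
```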
FeatureTypes (2)
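Train a feature extractor with the "IndicatorVector" method on a simple dataset:
The first feature is interpreted as numerical. Since the "IndicatorVector" method acts only on nominal features, the first feature is left unchanged:
Use FeatureTypes to enforce the interpretation of the first feature as nominal: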
Now both features are encoded as indicator vectors:
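The effect of FeatureTypes can be sketched as below. The data is invented, and the sketch assumes automatic type inference treats the first column as numerical:

```wolfram
(* Hypothetical data: first part real-valued, second part a string label *)
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* Default types: only the nominal second part is expected to be encoded *)
feDefault = FeatureExtraction[data, "IndicatorVector"];
feDefault[{1.4, "A"}]

(* Force both parts to be treated as nominal *)
feNominal = FeatureExtraction[data, "IndicatorVector",
   FeatureTypes -> {"Nominal", "Nominal"}];
feNominal[{1.4, "A"}]
```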
Creating a feature extractor with no training infers the expected data type from the specific extractor:
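Applications (3)
Image Search (1)
Generate a NearestFunction on the extracted features of the dataset:
Using the NearestFunction, construct a function that displays the nearest images in the dataset:
Text Search (1)
Generate a NearestFunction with the features of each sentence:
Using the NearestFunction, construct a function that displays the nearest sentences in Alice in Wonderland: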
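The text search workflow might be sketched as follows. The helper name nearestSentences, the default of three results and the query string are all hypothetical, and the automatic extractor is an assumption, since the extractor used on the original page is not shown:

```wolfram
(* Split the example text into sentences *)
sentences = TextSentences[ExampleData[{"Text", "AliceInWonderland"}]];

(* Train an extractor on the sentences (automatic method assumed) *)
fe = FeatureExtraction[sentences];

(* Index the sentences by their feature vectors *)
nf = Nearest[fe[sentences] -> sentences];

(* Hypothetical helper: the n sentences nearest to a query in feature space *)
nearestSentences[query_String, n_Integer : 3] := nf[fe[query], n]

nearestSentences["Alice is very curious"]
```

Searching in feature space rather than on raw strings lets Nearest use an ordinary numeric distance.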
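Imputation (1)
Load the "MNIST" dataset from ExampleData and keep the images:
Create a feature extractor using the "MissingImputed" method:
Replace some values of a test set vector with Missing[] and visualize it:
Impute the missing values with the FeatureExtractorFunction[…]: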
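The same imputation idea can be tried on a much smaller scale. This sketch replaces the MNIST vectors with a tiny invented numeric dataset so that it runs quickly:

```wolfram
(* Hypothetical numeric data where the second feature is roughly twice the first *)
data = {{1.0, 2.1}, {2.0, 3.9}, {3.0, 6.2}, {4.0, 8.1}, {5.0, 9.8}};

(* Train an extractor that fills in missing values *)
fe = FeatureExtraction[data, "MissingImputed"];

(* The Missing[] entry is replaced by an imputed value *)
fe[{3.5, Missing[]}]
```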
Properties & Relations (4)
Train a feature extractor from data with named features:
Unrecognized keys will be ignored:
FeatureExtraction[…,"ExtractedFeatures"] is equivalent to FeatureExtract[…]:
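A sketch of the two routes to the extracted features, using illustrative data and the automatic method in both calls:

```wolfram
(* Hypothetical data *)
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* Features of the training examples, requested as a property *)
viaProperty = FeatureExtraction[data, Automatic, "ExtractedFeatures"]

(* The same request through FeatureExtract *)
viaFeatureExtract = FeatureExtract[data]
```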
The "FeatureDistance" property is equivalent to using FeatureDistance on the extractor:
Compute the FeatureExtractorFunction first:
Construct a feature distance for this feature extractor:
The two distance functions are identical:
Creating a FeatureExtractorFunction on some training data creates a feature space representing those features:
Using different training data can result in a differently sized feature space:
Creating the same extractor with no data will result in an untrained function that consistently gives the same results in the same feature space:
Possible Issues (7)
Training an extractor on anonymous data will use automatic feature names:
Using custom names when applying the function will give a feature missing error:

Feature names can be specified at training time:
Check the feature names of a FeatureExtractorFunction:
The custom name can now be used:
The FeatureExtraction property "ReconstructedData" can be used to obtain the data after extraction and reconstruction:
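A minimal sketch of requesting this property, assuming made-up numeric data and the invertible "StandardizedVector" method:

```wolfram
(* Hypothetical numeric data *)
data = {{1.4, 3.}, {1.5, 5.}, {2.3, 4.}, {5.4, 9.}};

(* Training examples after extraction followed by inverse extraction *)
FeatureExtraction[data, "StandardizedVector", "ReconstructedData"]

(* Several properties can be requested at once as a list *)
FeatureExtraction[data, "StandardizedVector", {"ExtractedFeatures", "ReconstructedData"}]
```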
Some feature extractors can only perform an approximation of the inverse extraction:
Some feature extractors cannot be inverted:

The property "ReconstructedData" cannot be used without training data:

Some extractors can be created without needing data:
Others require examples to initialize them:

Similarly, not all properties are supported:

Extractors that do not match the data type are ignored:
The input type is "Nominal", so the text-only "LowerCasedText" extractor is ignored:
Similarly, forcing the input type to "Text" will cause the "IndicatorVector" extractor to be ignored:
The "ConformedData" extractor requires additional information to operate in a data-free context:

Specifying the FeatureTypes explicitly:
The feature type can also be implicitly inferred from subsequent extractors:
The automatic feature extraction often applies a dimension reduction step:
Explicit feature extractors do not include dimension reduction and typically result in longer vectors:
Use the "DimensionReducedVector" method to add a dimension reduction step:
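Adding the reduction step explicitly could look like this sketch, which uses random numeric vectors as stand-in data. Whether and by how much the dimension shrinks depends on the data, so no output lengths are claimed:

```wolfram
(* Hypothetical data: 50 examples with 10 numeric features each *)
SeedRandom[1];
data = RandomReal[1, {50, 10}];

(* Explicit extractor without dimension reduction *)
feExplicit = FeatureExtraction[data, "StandardizedVector"];
Length[feExplicit[First[data]]]

(* Same extractor followed by a dimension reduction step *)
feReduced = FeatureExtraction[data, {"StandardizedVector", "DimensionReducedVector"}];
Length[feReduced[First[data]]]
```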
Dimension reduction must be trained on the available features and therefore cannot be applied when no data is provided:

Wolfram Research (2016), FeatureExtraction, Wolfram Language function, https://reference.wolfram.com/language/ref/FeatureExtraction.html (updated 2021).