MissingValueSynthesis
is an option for functions such as Classify that specifies how missing values should be replaced.
Details
- Missing value synthesis, also known as missing value imputation, is done by conditioning a distribution on known values, as in SynthesizeMissingValues.
- Missing values are typically represented by Missing[…].
- MissingValueSynthesis can be used at training time, inference time or to update the synthesizer of an existing model.
- Classify[data,…,MissingValueSynthesis->synth] can be used to specify a missing synthesis method or model for training (and similarly for other training functions).
- ClassifierFunction[…][example,MissingValueSynthesis->synth] can be used to temporarily overwrite the synthesis method during classifier inference (and similarly for other machine learning models).
- Classify[ClassifierFunction[…],MissingValueSynthesis->synth] can be used to overwrite the internal missing synthesizer of the classifier (and similarly for other machine learning models).
- Possible settings for MissingValueSynthesis include:
    Automatic    automatically choose distribution method and synthesis strategy
    None         do not use any missing synthesizer
    method       use the specified method
    strategy     how to synthesize from the distribution
    assoc        specify both distribution method and synthesis strategy
- Possible settings for method include:
    Automatic                   automatically choose the distribution method
    "Multinormal"               use a multivariate normal (Gaussian) distribution
    "ContingencyTable"          discretize data and store each possible probability
    "KernelDensityEstimation"   use a kernel mixture distribution
    "DecisionTree"              use a decision tree to compute probabilities
    "GaussianMixture"           use a mixture of Gaussian (normal) distributions
    LearnedDistribution[…]      use the specified distribution
- Possible settings for strategy include:
    Automatic          automatically choose the synthesis strategy
    "RandomSampling"   randomly sample from the conditioned distribution
    "ModeFinding"      attempt to find the mode of the conditioned distribution
- In the form MissingValueSynthesis->assoc, the association assoc should be of the form <|"LearningMethod"->method,"EvaluationStrategy"->strategy|>.
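The three usage modes listed in the details (training time, inference time, updating an existing model) might look as follows. This is a minimal sketch: the toy dataset and the names data, c and c2 are hypothetical, and the particular settings were picked only for illustration.

```wolfram
(* Hypothetical toy data: two numeric features -> class label *)
data = {{1.2, 3.1} -> "a", {1.5, 2.9} -> "a", {1.1, 3.4} -> "a", {1.4, 3.0} -> "a",
        {3.8, 0.9} -> "b", {4.1, 1.2} -> "b", {3.6, 0.7} -> "b", {3.9, 1.0} -> "b"};

(* 1. Training time: choose the distribution method used for imputation *)
c = Classify[data, MissingValueSynthesis -> "Multinormal"];

(* 2. Inference time: temporarily override the synthesis strategy for one call *)
c[{1.3, Missing[]}, MissingValueSynthesis -> "RandomSampling"]

(* 3. Update: return a classifier whose internal synthesizer setting is overwritten *)
c2 = Classify[c, MissingValueSynthesis -> "RandomSampling"];
c2[{1.3, Missing[]}]
```

The override in step 2 affects only that evaluation, while c2 in step 3 keeps the new setting for all later calls.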
Examples
Basic Examples (2)
Train a predictor with two input features:
Get the prediction for an example that has a missing value:
Set the missing value synthesis to replace missing variables with their most likely value given known values (which is the default behavior):
Replace missing variables with random samples conditioned on known values:
Averaging over many random imputations is usually the best strategy and also gives an estimate of the uncertainty caused by the imputation:
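The predictor steps above can be sketched together as follows. The training data is a hypothetical toy set and the names data, p and samples are illustrative, not taken from the original page; outputs are omitted because they depend on the trained model.

```wolfram
(* Hypothetical toy data: two input features -> numeric target *)
data = {{1.0, 2.1} -> 3.0, {2.0, 3.9} -> 6.1, {3.0, 6.2} -> 9.0,
        {4.0, 8.1} -> 12.2, {5.0, 9.8} -> 15.1, {6.0, 12.1} -> 17.9};

(* Train a predictor with two input features *)
p = Predict[data];

(* Prediction for an example with a missing second feature *)
p[{2.5, Missing[]}]

(* Replace the missing value with its most likely value given the known one *)
p[{2.5, Missing[]}, MissingValueSynthesis -> "ModeFinding"]

(* Replace the missing value with a random sample conditioned on the known one *)
p[{2.5, Missing[]}, MissingValueSynthesis -> "RandomSampling"]

(* Average over many random imputations; the spread reflects imputation uncertainty *)
samples = Table[p[{2.5, Missing[]}, MissingValueSynthesis -> "RandomSampling"], 100];
{Mean[samples], StandardDeviation[samples]}
```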
Specify a learning method during training to control how the distribution of data is learned:
Predict an example with missing values using the "KernelDensityEstimation" distribution to condition values:
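One way to read the two steps above is sketched below, assuming the "KernelDensityEstimation" method is chosen at training and then used implicitly at evaluation. The dataset and the names data and pKDE are hypothetical.

```wolfram
(* Hypothetical toy data: two input features -> numeric target *)
data = {{1.0, 2.1} -> 3.0, {2.0, 3.9} -> 6.1, {3.0, 6.2} -> 9.0,
        {4.0, 8.1} -> 12.2, {5.0, 9.8} -> 15.1, {6.0, 12.1} -> 17.9};

(* Learn the feature distribution with kernel density estimation during training *)
pKDE = Predict[data, MissingValueSynthesis -> "KernelDensityEstimation"];

(* The missing feature is imputed by conditioning the learned distribution on the known value *)
pKDE[{2.5, Missing[]}]
```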
Provide an existing LearnedDistribution at training to use it when imputing missing values during training and later evaluations:
Specify an existing LearnedDistribution to synthesize missing values for an individual evaluation:
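Both uses of an existing LearnedDistribution can be sketched as follows. Here the distribution is learned from the feature vectors of a hypothetical toy dataset with LearnDistribution; the names data, ld, p1 and p2 are illustrative assumptions.

```wolfram
(* Hypothetical toy data: two input features -> numeric target *)
data = {{1.0, 2.1} -> 3.0, {2.0, 3.9} -> 6.1, {3.0, 6.2} -> 9.0,
        {4.0, 8.1} -> 12.2, {5.0, 9.8} -> 15.1, {6.0, 12.1} -> 17.9};

(* Learn a distribution over the input features only *)
ld = LearnDistribution[Keys[data]];

(* Use it for imputation during training and in later evaluations *)
p1 = Predict[data, MissingValueSynthesis -> ld];
p1[{2.5, Missing[]}]

(* Or use it for a single evaluation of a predictor trained with default settings *)
p2 = Predict[data];
p2[{2.5, Missing[]}, MissingValueSynthesis -> ld]
```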
Control both the learning method and the evaluation strategy by passing an association at training:
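A minimal sketch of the association form, using the "LearningMethod" and "EvaluationStrategy" keys described in the details. The dataset, the name pBoth and the chosen settings are hypothetical.

```wolfram
(* Hypothetical toy data: two input features -> numeric target *)
data = {{1.0, 2.1} -> 3.0, {2.0, 3.9} -> 6.1, {3.0, 6.2} -> 9.0,
        {4.0, 8.1} -> 12.2, {5.0, 9.8} -> 15.1, {6.0, 12.1} -> 17.9};

(* Set the distribution method and the synthesis strategy together *)
pBoth = Predict[data,
   MissingValueSynthesis -> <|"LearningMethod" -> "Multinormal",
     "EvaluationStrategy" -> "RandomSampling"|>];

(* Each evaluation now imputes by sampling from the conditioned multinormal distribution *)
pBoth[{2.5, Missing[]}]
```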
Train a classifier with two input features:
Get class probabilities for an example that has a missing value:
Set the missing value synthesis to replace missing variables with their most likely value given known values (which is the default behavior):
Replace missing variables with random samples conditioned on known values:
Averaging over many random imputations is usually the best strategy and also gives an estimate of the uncertainty caused by the imputation:
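The classifier steps above can be sketched together as follows. The training data is a hypothetical toy set and the names data, c and probs are illustrative; Merge is used here as one possible way to aggregate the sampled probability associations.

```wolfram
(* Hypothetical toy data: two numeric features -> class label *)
data = {{1.2, 3.1} -> "a", {1.5, 2.9} -> "a", {1.1, 3.4} -> "a", {1.4, 3.0} -> "a",
        {3.8, 0.9} -> "b", {4.1, 1.2} -> "b", {3.6, 0.7} -> "b", {3.9, 1.0} -> "b"};

(* Train a classifier with two input features *)
c = Classify[data];

(* Class probabilities for an example with a missing second feature *)
c[{1.3, Missing[]}, "Probabilities"]

(* Replace the missing value with its most likely value given the known one *)
c[{1.3, Missing[]}, "Probabilities", MissingValueSynthesis -> "ModeFinding"]

(* Replace the missing value with a random sample conditioned on the known one *)
c[{1.3, Missing[]}, "Probabilities", MissingValueSynthesis -> "RandomSampling"]

(* Average the class probabilities over many random imputations *)
probs = Table[
   c[{1.3, Missing[]}, "Probabilities", MissingValueSynthesis -> "RandomSampling"], 100];
Merge[probs, Mean]

(* Per-class spread caused by the imputation *)
Merge[probs, StandardDeviation]
```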
Text
Wolfram Research (2021), MissingValueSynthesis, Wolfram Language function, https://reference.wolfram.com/language/ref/MissingValueSynthesis.html.