Unsupervised Machine Learning

Unsupervised machine learning analyzes unlabeled data to discover hidden relationships. It finds hidden patterns, clusters of similar examples, underlying data distributions or simpler data representations. Common use cases are disease diagnosis, market basket analysis, consumer grouping and anomaly detection. Unsupervised machine learning also helps with data visualization. The Wolfram Language offers a large collection of unsupervised learning methods, accessible via goal-based functions that automate a large part of the processing pipeline (such as feature selection and extraction, model selection and cross-validation). These functions make it possible to analyze all kinds of data, such as text, images or graphs, beyond just the standard arrays.

Cluster Analysis

ClusterClassify classify data into clusters

ClusteringMeasurements analyze the result of a clustering process

FindClusters  ▪  ClusteringTree  ▪  ClusteringComponents  ▪  ...
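
As a minimal sketch of how the clustering functions above fit together, the following uses a small made-up list of numbers (the data and the choice of 2 clusters are assumptions for illustration only):

```wolfram
(* Toy data: two well-separated groups of numbers, invented for this sketch *)
data = {1.0, 1.2, 0.9, 10.1, 9.8, 10.3};

(* Partition the data into 2 clusters *)
FindClusters[data, 2]

(* Learn a reusable ClassifierFunction, then assign new values to clusters *)
c = ClusterClassify[data, 2];
c[{1.1, 9.9}]

(* Return a cluster index for each element instead of grouping the elements *)
ClusteringComponents[data, 2]
```

FindClusters returns the grouped elements directly, while ClusterClassify is the better fit when new examples need to be assigned to clusters later.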

Feature Extraction

FeatureExtraction find how to extract features from data

FeatureExtract  ▪  FeatureExtractorFunction  ▪  FeatureNearest
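
A short sketch of feature extraction on mixed numeric and nominal data follows. The examples are invented, and the extracted feature vectors depend on the automatically chosen method:

```wolfram
(* Toy examples mixing a number and a nominal value, invented for this sketch *)
examples = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* Learn a FeatureExtractorFunction from the examples *)
fe = FeatureExtraction[examples];

(* Apply the learned extractor to a new example *)
fe[{2.0, "A"}]

(* Extract features for all examples in a single step *)
FeatureExtract[examples]

(* Build a nearest-neighbor function that works in the learned feature space *)
nf = FeatureNearest[examples];
nf[{1.6, "A"}]
```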

Anomaly Detection

AnomalyDetection learn an anomaly detector function from data

FindAnomalies  ▪  DeleteAnomalies  ▪  AnomalyDetectorFunction  ▪  RarerProbability
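
The anomaly detection functions can be tried on a toy dataset as sketched below. The values are invented, and which points are flagged depends on the learned distribution and the default acceptance threshold:

```wolfram
(* Toy data: values near 1 plus one value that is far from the rest *)
data = {1.0, 1.1, 0.9, 1.2, 1.05, 0.95, 8.0};

(* List the examples considered anomalous with respect to the data itself *)
FindAnomalies[data]

(* Return the data with the detected anomalies removed *)
DeleteAnomalies[data]

(* Train a reusable AnomalyDetectorFunction on the values near 1 only *)
ad = AnomalyDetection[Most[data]];

(* Test new values; the result is a list of True/False flags *)
ad[{1.0, 50.0}]
```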

Dimensionality Reduction

DimensionReduction find how to project data onto lower-dimensional space

DimensionReduce  ▪  DimensionReducerFunction
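
A minimal sketch of dimensionality reduction, assuming random 5-dimensional vectors as stand-in data and a target dimension of 2:

```wolfram
(* Stand-in data: 100 random vectors of length 5 *)
data = RandomReal[1, {100, 5}];

(* Learn a DimensionReducerFunction that maps vectors to 2 dimensions *)
dr = DimensionReduction[data, 2];

(* Reduce a single vector with the learned function *)
dr[First[data]]

(* Reduce the whole dataset in one step *)
reduced = DimensionReduce[data, 2];
Dimensions[reduced]
```

DimensionReduction is preferable when the same projection must be applied to data that arrives later; DimensionReduce is a shortcut for a one-off reduction.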

Distribution Modeling

LearnDistribution learn the underlying distribution of any type of data

FindDistribution find a representation for data in terms of named distributions

SynthesizeMissingValues fill in missing values by imputing from existing data
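
The following sketch contrasts the three functions above. The sample is drawn from a normal distribution purely for illustration, and the small table with a missing entry is invented:

```wolfram
(* Illustrative sample drawn from a standard normal distribution *)
sample = RandomVariate[NormalDistribution[0, 1], 500];

(* Learn a distribution without assuming a parametric family *)
ld = LearnDistribution[sample];

(* A learned distribution can be sampled and queried like other distributions *)
RandomVariate[ld, 3]
PDF[ld, 0.]

(* Look for a named distribution that describes the sample *)
FindDistribution[sample]

(* Fill in a missing entry using values synthesized from the remaining data *)
SynthesizeMissingValues[{{1.1, 2.0}, {2.2, Missing[]}, {3.1, 6.1}, {4.0, 8.2}}]
```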

Visualization

FeatureSpacePlot visualize dimension-reduced feature space in 2D

FeatureSpacePlot3D visualize dimension-reduced feature space in 3D

FeatureValueImpactPlot plot the impact of a feature value on a model result

FeatureImpactPlot plot the impact of each feature together

CumulativeFeatureImpactPlot plot the cumulative impact of each feature

FeatureValueDependencyPlot plot the result dependency on a feature value

Dendrogram visualize hierarchical clusters
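
As a brief sketch, the feature-space and hierarchical-clustering plots can be produced directly from data. The random vectors and the short list of numbers are placeholders chosen for illustration:

```wolfram
(* Placeholder data: 200 random vectors of length 5 *)
points = RandomReal[1, {200, 5}];

(* Plot the data in an automatically learned 2D feature space *)
FeatureSpacePlot[points]

(* The same idea in 3D *)
FeatureSpacePlot3D[points]

(* Show the hierarchical clustering of a small list of numbers *)
Dendrogram[{1, 2, 10, 4, 8}]
```

The feature impact and dependency plots are not shown here because they take a trained model as input rather than raw data.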

Specific Unsupervised Learning Methods

FindGraphCommunities find communities or clusters in graphs

SmoothKernelDistribution find kernel density estimates for data

FindHiddenMarkovStates infer hidden Markov states from a sequence of data

Eigensystem  ▪  SingularValueDecomposition  ▪  PrincipalComponents  ▪  KarhunenLoeveDecomposition
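
A few of these more specific functions can be sketched as follows. The graph, sample and matrix are all invented for illustration:

```wolfram
(* A small graph made of two triangles joined by a single edge *)
g = Graph[{1 <-> 2, 2 <-> 3, 3 <-> 1, 4 <-> 5, 5 <-> 6, 6 <-> 4, 3 <-> 4}];

(* Group the vertices into communities *)
FindGraphCommunities[g]

(* Kernel density estimate of an illustrative normal sample *)
sample = RandomVariate[NormalDistribution[0, 1], 200];
skd = SmoothKernelDistribution[sample];
PDF[skd, 0.5]

(* Transform the rows of a random matrix into principal components *)
m = RandomReal[1, {50, 3}];
PrincipalComponents[m]
```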

Unsupervised Learning Methods

"GaussianMixture" use a mixture of Gaussian (normal) distributions

"KernelDensityEstimation" use a kernel mixture distribution

"Multinormal" use a multivariate normal (Gaussian) distribution

"KMeans"  ▪  "DBSCAN"  ▪  "Autoencoder"  ▪  "TSNE"  ▪  "UMAP"  ▪  ...
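
These method names are passed through the Method option of the goal-based functions. The sketch below assumes synthetic data made of two Gaussian blobs; which methods a given function accepts is listed on that function's own reference page:

```wolfram
(* Synthetic data: two Gaussian blobs in 3 dimensions, 50 points each *)
data = Join[
   RandomVariate[NormalDistribution[0, 1], {50, 3}],
   RandomVariate[NormalDistribution[6, 1], {50, 3}]];

(* Cluster with k-means, asking for 2 clusters *)
FindClusters[data, 2, Method -> "KMeans"];

(* Reduce to 2 dimensions with t-SNE *)
DimensionReduce[data, 2, Method -> "TSNE"];

(* Model the distribution as a mixture of Gaussians *)
LearnDistribution[data, Method -> "GaussianMixture"]
```

When Method is left at its default value, the function selects a method automatically.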