"Agglomerate" (Machine Learning Method)
- Method for FindClusters, ClusterClassify and ClusteringComponents.
- Partitions data into clusters of similar elements using a hierarchical agglomerative clustering method.
Details & Suboptions
- "Agglomerate" is a hierarchical clustering method. It works well when clusters have similar densities and are isotropic. However, it can fail when clusters have different sizes, and it is sensitive to the choice of the dissimilarity function.
- The following plots show the results of the "Agglomerate" method applied to toy datasets:
- "Agglomerate" constructs a hierarchy of clusters. At the beginning of the procedure, the method assigns every data point to its own cluster. It then iteratively merges clusters: at each iteration, the two most similar clusters are combined. Merging stops when the specified number of clusters is reached. When the number of clusters is not specified, merging stops when the dissimilarity of every remaining pair of clusters exceeds a given threshold.
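The merging procedure described above can be sketched in a few lines. This is only an illustration of the general idea, assuming single linkage with Euclidean distance and a fixed target number of clusters; it is not the built-in implementation, and the names singleLinkage, mergeClosest and naiveAgglomerate are hypothetical helpers introduced here.

```wolfram
(* Hypothetical helper: single-linkage dissimilarity, i.e. the smallest
   Euclidean distance between any point of a and any point of b *)
singleLinkage[a_List, b_List] := N[Min[Outer[EuclideanDistance, a, b, 1]]]

(* Hypothetical helper: merge the pair of clusters with the smallest dissimilarity *)
mergeClosest[clusters_List] := Module[{pairs, best},
  pairs = Subsets[Range[Length[clusters]], {2}];
  best = First[
    MinimalBy[pairs, singleLinkage[clusters[[#[[1]]]], clusters[[#[[2]]]]] &]];
  (* remove the two chosen clusters and append their union *)
  Append[Delete[clusters, List /@ best], Join @@ clusters[[best]]]
]

(* Start with one cluster per data point and merge until n clusters remain *)
naiveAgglomerate[data_List, n_Integer?Positive] :=
  NestWhile[mergeClosest, List /@ data, Length[#] > n &]

naiveAgglomerate[{{0, 0}, {0, 1}, {5, 5}, {5, 6}, {10, 0}}, 2]
```

The sketch recomputes every pairwise dissimilarity at each step, which is simple to read but scales poorly with the number of points. The threshold-based stopping rule is not covered here.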
- The following suboption can be given:
ClusterDissimilarityFunction    "Single"    linkage for the cluster dissimilarity
- Cluster dissimilarity is defined by a linkage function, which can be specified by the suboption ClusterDissimilarityFunction. Possible settings for ClusterDissimilarityFunction include:
"Single"    distance between the clusters' closest points
"Complete"    distance between the clusters' furthest points
"Average"    average distance between the clusters' points
"Centroid"    distance between the cluster means
"Median"    distance between the cluster medians
"Ward"    average squared distance between the clusters' points
"WeightedAverage"    weighted-average intercluster dissimilarity
f    pure function
- "Complete" linkage tends to break up clusters, in contrast to "Average", "Centroid", "Median" and "Ward", which tend to find relatively compact, well-separated clusters of regular size.
Examples
Basic Examples (3)
Create a ClassifierFunction to identify two clusters of images using the "Agglomerate" method:
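The original input for this example is not reproduced here. A minimal sketch of such a call, assuming a small hypothetical set of stand-in images (three dark and three bright uniform squares) in place of the original image data:

```wolfram
(* Hypothetical stand-in data: three dark and three bright 16x16 grayscale images *)
images = Table[
   Image[ConstantArray[g, {16, 16}]],
   {g, {0., 0.05, 0.1, 0.9, 0.95, 1.}}];

(* Build a ClassifierFunction that assigns images to one of two clusters *)
c = ClusterClassify[images, 2, Method -> "Agglomerate"];

(* Cluster labels assigned to the training images *)
c[images]
```

The variable names images and c are illustrative only. With real image data, the cluster assignments depend on the features extracted from the images.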
Obtain nearby clusters using different settings of ClusterDissimilarityFunction:
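The original input for this example is not reproduced here. A minimal sketch, assuming a small hypothetical one-dimensional dataset and a requested count of three clusters, that passes the suboption inside the Method specification:

```wolfram
(* Hypothetical data: loosely grouped numbers *)
data = {1, 2, 3, 10, 11, 12, 20, 21, 30};

(* Cluster the same data with several linkage settings and pair each
   setting with the clusters it produces *)
Table[
  link -> FindClusters[data, 3,
    Method -> {"Agglomerate", ClusterDissimilarityFunction -> link}],
  {link, {"Single", "Complete", "Average", "Ward"}}]
```

Comparing the results side by side shows how the choice of linkage changes which points are grouped together.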
Introduced in 2020