"KMedoids" (Machine Learning Method)
- Method for FindClusters, ClusterClassify and ClusteringComponents.
- Partitions data into clusters of similar elements using a k-medoids clustering algorithm.
Details & Suboptions
- The "KMedoids" method, also known as Partitioning Around Medoids (PAM), is a simple and fast centroid-based method. "KMedoids" works well when clusters have similar sizes and are locally distributed around their centroids (a.k.a. medoids). When clusters have very different sizes, are intertwined, or when outliers are present, "KMedoids" is likely to give poor results.
- The following plots show the results of the "KMedoids" method applied to toy datasets:
- The "KMedoids" method aims to find k medoids defining k clusters. Each data point is assigned to its nearest medoid. All points assigned to a given medoid form a cluster.
- The procedure to find the best k medoids is the same as in "KMeans", except that the medoids are not defined as the mean of a cluster. Instead, a cluster medoid is defined as the most central data point in the cluster, that is, the data point whose average distance to the other points in the cluster is minimal. Because "KMedoids" does not compute means like "KMeans", it can be used in non-numeric spaces (a distance function is sufficient).
- Since the initial centroids are chosen randomly, results might differ from one evaluation to the next.
- The suboption "InitialCentroids" can be used to specify the initial centroids as a list of data points. Each initial centroid must match an existing data point.
- The following suboption can be given:
"InitialCentroids"    Automatic    a list of initial centroids
Examples
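The procedure described in the details above (assign each point to its nearest medoid, then recompute each medoid as the most central point of its cluster) can be illustrated with a minimal Python sketch. This is not the Wolfram Language implementation: the names `k_medoids`, `medoid`, `edit_distance` and the `initial_medoids` parameter are hypothetical and chosen only for illustration, and the stopping rule (stop when the medoids no longer change) is an assumption. Strings with an edit distance are used to show that only a distance function is required.

```python
import random


def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def medoid(points, dist):
    """Return the point with the smallest total distance to the others.

    Within one cluster, minimizing the total distance is the same as
    minimizing the average distance. Ties go to the earliest point.
    """
    return min(points, key=lambda p: sum(dist(p, q) for q in points))


def assign(data, medoids, dist):
    """Group every data point with its nearest medoid."""
    clusters = [[] for _ in medoids]
    for p in data:
        nearest = min(range(len(medoids)), key=lambda i: dist(p, medoids[i]))
        clusters[nearest].append(p)
    return clusters


def k_medoids(data, k, dist, initial_medoids=None, max_iter=100, seed=None):
    """Alternate assignment and medoid updates until the medoids are stable."""
    data = list(data)
    if initial_medoids is None:
        # Random initialization: results can change from run to run.
        medoids = random.Random(seed).sample(data, k)
    else:
        medoids = list(initial_medoids)
        # Each initial medoid must be an existing data point.
        if len(medoids) != k or any(m not in data for m in medoids):
            raise ValueError("initial medoids must be k existing data points")
    for _ in range(max_iter):
        clusters = assign(data, medoids, dist)
        # Keep the old medoid if a cluster happens to be empty.
        updated = [medoid(c, dist) if c else m for c, m in zip(clusters, medoids)]
        if updated == medoids:
            return medoids, clusters
        medoids = updated
    return medoids, assign(data, medoids, dist)


words = ["cat", "bat", "hat", "house", "mouse", "louse"]
medoids, clusters = k_medoids(words, 2, edit_distance,
                              initial_medoids=["cat", "house"])
print(medoids)
print(clusters)
```

Passing `initial_medoids` makes the run deterministic, which mirrors the role of the "InitialCentroids" suboption; leaving it out falls back to a random choice of starting points, so repeated runs may produce different clusterings.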
Basic Examples (3)
Train a ClassifierFunction on a list of strings:
Possible Issues (1)
Train a ClassifierFunction using "KMedoids" for two clusters and find clusters in the test set: