
FindClusters

FindClusters[{e1, e2, ...}]
partitions the e_i into clusters of similar elements.
FindClusters[{e1->v1, e2->v2, ...}]
returns the v_i corresponding to the e_i in each cluster.
FindClusters[{e1, e2, ...}->{v1, v2, ...}]
gives the same result.
FindClusters[{e1, e2, ...}, n]
partitions the e_i into exactly n clusters.
  • If the e_i are lists of True and False, FindClusters by default uses a distance function based on the normalized fraction of elements that disagree.
  • If the e_i are strings, FindClusters by default uses a distance function based on the number of point changes (single-character insertions, deletions, or substitutions) needed to get from one string to another.
  • A Method option can be used to specify different methods of clustering. Possible settings include:
"Agglomerate"    find clustering hierarchically
"Optimize"       find clustering by local optimization
Find clusters of nearby values:
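A minimal sketch of this call. The sample values and the name data are illustrative assumptions, not recovered from the original input cell:

```mathematica
(* Illustrative data: a few loosely separated groups of integers *)
data = {1, 2, 10, 12, 3, 1, 13, 25};

(* With no second argument, FindClusters chooses the number of clusters itself *)
FindClusters[data]
```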
 
Find exactly four clusters:
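One way this might look, reusing the same illustrative values (an assumption, since the original cell is not available):

```mathematica
data = {1, 2, 10, 12, 3, 1, 13, 25};

(* The second argument requests exactly 4 clusters *)
clusters = FindClusters[data, 4];

(* Check how many clusters came back *)
Length[clusters]
```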
 
Represent clustered elements with the right-hand sides of each rule:
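A sketch under assumed data: the keys are clustered, and the symbols a through h (hypothetical labels) are what appears in the result. The second call shows the equivalent list -> list form described in the usage above.

```mathematica
(* Cluster on the left-hand sides, report the right-hand sides *)
FindClusters[{1 -> a, 2 -> b, 10 -> c, 12 -> d, 3 -> e, 1 -> f, 13 -> g, 25 -> h}]

(* Equivalent form: one rule between two lists of equal length *)
FindClusters[{1, 2, 10, 12, 3, 1, 13, 25} -> {a, b, c, d, e, f, g, h}]
```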
Cluster vectors of real values:
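For instance, with two artificially separated groups of random 2D points (the data, seed, and ranges are assumptions made for this sketch):

```mathematica
SeedRandom[1];  (* fixed seed so the sketch is repeatable *)

(* 10 points in the unit square and 10 points in a square shifted to {5, 6} *)
data = Join[RandomReal[{0, 1}, {10, 2}], RandomReal[{5, 6}, {10, 2}]];

FindClusters[data]
```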
Cluster data of any precision:
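A possible illustration using 30-digit numbers; the values are assumed, and N[..., 30] is used only to produce arbitrary-precision input:

```mathematica
(* Convert exact integers to numbers with 30 digits of precision *)
data = N[{1, 2, 10, 12, 3, 1, 13, 25}, 30];

FindClusters[data]
```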
Cluster Boolean 0, 1 or True, False data:
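A sketch with a small hand-written bit table (assumed data). The same rows are clustered once as 0/1 vectors and once as True/False vectors; the two calls need not produce identical groupings.

```mathematica
(* Rows of 0/1 values *)
bits = {{0, 0, 0, 1}, {0, 0, 1, 1}, {1, 1, 1, 0}, {1, 1, 0, 0}, {0, 1, 0, 1}};
FindClusters[bits]

(* The same rows expressed as True/False *)
bools = bits /. {0 -> False, 1 -> True};
FindClusters[bools]
```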
Cluster string data:
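As an illustration, with a hypothetical word list chosen so that some words differ by only a few characters:

```mathematica
words = {"apple", "apply", "ample", "banana", "bandana", "cabana"};

(* Strings are compared with the default string distance described in the notes above *)
FindClusters[words]
```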
Find clusters in 10^4 five-dimensional vectors:
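A rough sketch with uniformly random data (an assumption; the original input is not available). Only the cluster sizes are displayed, since the full result is large.

```mathematica
SeedRandom[1];

(* 10^4 vectors, each with 5 real components in [0, 1] *)
data = RandomReal[1, {10^4, 5}];

clusters = FindClusters[data];

(* Size of each cluster found *)
Length /@ clusters
```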
Use ManhattanDistance as the measure of distance for continuous data:
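One way to pass the distance, shown on assumed sample points:

```mathematica
data = {{1., 1.}, {1.5, 2.}, {2., 1.}, {8., 8.}, {8.5, 9.}, {9., 8.}, {1., 9.}};

(* DistanceFunction selects how dissimilarity between points is measured *)
FindClusters[data, DistanceFunction -> ManhattanDistance]
```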
Clusters obtained with the default SquaredEuclideanDistance:
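For comparison, a sketch on assumed sample points that omits the option, followed by the same call with the default named explicitly:

```mathematica
data = {{1., 1.}, {1.5, 2.}, {2., 1.}, {8., 8.}, {8.5, 9.}, {9., 8.}, {1., 9.}};

(* No DistanceFunction given: the default for continuous data applies *)
FindClusters[data]

(* Naming the default explicitly *)
FindClusters[data, DistanceFunction -> SquaredEuclideanDistance]
```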
Use DiceDissimilarity as the measure of distance for Boolean data:
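A small sketch with hand-written Boolean vectors (assumed data):

```mathematica
bools = {
   {True, True, False, False},
   {True, True, True, False},
   {False, False, True, True},
   {False, True, True, True},
   {True, False, False, False}};

FindClusters[bools, DistanceFunction -> DiceDissimilarity]
```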
Clusters obtained with the default JaccardDissimilarity:
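The corresponding default call might look like this, again on assumed Boolean vectors:

```mathematica
bools = {
   {True, True, False, False},
   {True, True, True, False},
   {False, False, True, True},
   {False, True, True, True},
   {True, False, False, False}};

(* No DistanceFunction given: the default for Boolean data applies *)
FindClusters[bools]
```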
Use HammingDistance as the measure of distance for string data:
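A sketch with an assumed word list. HammingDistance compares strings position by position, so every string here has the same length.

```mathematica
(* All strings have 4 characters, as HammingDistance requires equal lengths *)
words = {"cart", "card", "care", "bolt", "boat", "bold"};

FindClusters[words, DistanceFunction -> HammingDistance]
```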
Clusters obtained with the default EditDistance:
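The default call on an assumed list of equal-length words, for side-by-side comparison with a Hamming-based clustering:

```mathematica
words = {"cart", "card", "care", "bolt", "boat", "bold"};

(* No DistanceFunction given: the default for strings applies *)
FindClusters[words]
```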
Define a distance function as a pure function:
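For example, an absolute-difference distance written as a pure function of two arguments (the data values are illustrative):

```mathematica
data = {1, 2, 10, 12, 3, 1, 13, 25};

(* #1 and #2 are the two elements being compared *)
FindClusters[data, DistanceFunction -> (Abs[#1 - #2] &)]
```

Any function that takes two elements and returns a non-negative number can be supplied the same way.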
Cluster the data hierarchically:
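A sketch using the "Agglomerate" setting listed under the Method option, on assumed sample points:

```mathematica
data = {{1., 1.}, {1.5, 2.}, {2., 1.}, {8., 8.}, {8.5, 9.}, {9., 8.}, {1., 9.}};

FindClusters[data, Method -> "Agglomerate"]
```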
Clusters obtained with the default method:
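Omitting the Method option leaves the choice to FindClusters. A sketch on assumed sample points:

```mathematica
data = {{1., 1.}, {1.5, 2.}, {2., 1.}, {8., 8.}, {8.5, 9.}, {9., 8.}, {1., 9.}};

(* No Method option given *)
FindClusters[data]
```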
Find and visualize clusters in bivariate data:
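One possible visualization, assuming two synthetic groups of points. ListPlot draws each cluster as a separate dataset, so clusters appear in different styles.

```mathematica
SeedRandom[1];

(* Two groups of 50 points each, in squares offset from one another *)
data = Join[RandomReal[{0, 1}, {50, 2}], RandomReal[{2, 3}, {50, 2}]];

clusters = FindClusters[data];

(* Each sublist of points is plotted as its own dataset *)
ListPlot[clusters]
```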
Cluster genomic sequences based on the number of element-wise differences:
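A sketch with short made-up sequences (hypothetical data, not real genomic records). Counting element-wise differences corresponds to HammingDistance, which needs sequences of equal length.

```mathematica
(* Hypothetical 8-base sequences, all the same length *)
seqs = {"ACGTACGT", "ACGTACGA", "ACGAACGT", "TTGCAAGC", "TTGCAAGG", "TTGGAAGC"};

FindClusters[seqs, DistanceFunction -> HammingDistance]
```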
FindClusters groups data while Nearest gives the elements closest to a given value:
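To make the contrast concrete, a sketch on assumed values, with 11 as an arbitrary query point:

```mathematica
data = {1, 2, 10, 12, 3, 1, 13, 25};

(* Groups all elements into clusters *)
FindClusters[data]

(* Returns only the element or elements closest to the query value 11 *)
Nearest[data, 11]

(* Returns the 3 elements closest to 11 *)
Nearest[data, 11, 3]
```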
The order of elements can have an effect on the clusters found:
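A way to probe this is to cluster the same values in two different orders. The data and seed are assumptions, and the two results may or may not differ for this particular input.

```mathematica
SeedRandom[1];
data = RandomReal[1, {30, 2}];

(* Same points, original order versus reversed order *)
c1 = FindClusters[data, 3];
c2 = FindClusters[Reverse[data], 3];

(* Compare the clusterings as sets of sorted clusters *)
Sort[Sort /@ c1] === Sort[Sort /@ c2]
```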
Divide a square into n segments by clustering uniformly distributed random points:
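A sketch of the idea, with n and the number of points chosen arbitrarily:

```mathematica
SeedRandom[1];
n = 5;  (* number of segments, chosen for illustration *)

(* 1000 points uniformly distributed over the unit square *)
pts = RandomReal[1, {1000, 2}];

(* Each of the n clusters is drawn as a separate dataset *)
ListPlot[FindClusters[pts, n], AspectRatio -> 1]
```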
Cluster words beginning with "ax" in the English dictionary:
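This could be done with DictionaryLookup and a string pattern; the variable name words is an assumption:

```mathematica
(* All dictionary words that start with "ax" *)
words = DictionaryLookup["ax" ~~ ___];

FindClusters[words]
```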
New in 6