DistanceMatrix
DistanceMatrix[{u1,u2,…}]
gives the matrix of distances between each pair of elements ui, uj.
DistanceMatrix[{u1,u2,…},{v1,v2,…}]
gives the matrix of distances between each pair of elements ui, vj.
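As a minimal sketch of the two calling forms, assuming small illustrative point lists (the values below are not taken from the original page):

```wolfram
(* Three illustrative points in the plane *)
pts = {{0, 0}, {3, 4}, {6, 8}};

(* One argument: 3x3 matrix of distances between every pair of elements of pts *)
DistanceMatrix[pts]

(* Two arguments: 3x2 matrix; entry (i, j) is the distance
   from pts[[i]] to the j-th element of the second list *)
DistanceMatrix[pts, {{0, 0}, {1, 1}}]
```

In the one-argument form with a metric such as EuclideanDistance, the result is symmetric with zeros on the diagonal.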
Details and Options
- DistanceMatrix works for a variety of data, including numerical, geospatial, textual, visual, dates and times, as well as combinations of these.
- Each ui can be a single data element, a list of data elements or an association of data elements. In DistanceMatrix[data,…], data can also be a Dataset object.
- The following options can be given:
  DistanceFunction    Automatic    the distance metric to use
  FeatureExtractor    Identity     how to preprocess data
  FeatureNames        Automatic    feature names to assign for data
  FeatureTypes        Automatic    feature types to assume for data
  PerformanceGoal     Automatic    aspects of performance to try to optimize
  RandomSeeding       1234         what seeding of pseudorandom generators should be done internally
  WorkingPrecision    Automatic    precision to use for numerical data
- The setting for DistanceFunction can be any distance or dissimilarity function or a function f defining a distance between two values.
- By default, the following distance functions are used for different types of elements:
  EuclideanDistance              numeric data
  ImageDistance                  images
  JaccardDissimilarity           Boolean data
  EditDistance                   text and nominal sequences
  Abs[DateDifference[#1,#2]]&    dates and times
  ColorDistance                  colors
  GeoDistance                    geospatial data
  Boole[SameQ[#1,#2]]&           nominal data
  HammingDistance                nominal vector data
  WarpingDistance                numerical sequences
- For images, colors or audio objects and a distance function f, DistanceFunction->f is passed to ImageDistance, ColorDistance or AudioDistance, respectively.
- All images are first conformed using ConformImages.
- By default, when data elements are mixed-type vectors, distances are computed independently for each type and combined using Norm.
- Possible settings for PerformanceGoal include:
  "Speed"      minimize computation time
  "Quality"    maximize precision and accuracy
  Automatic    automatic tradeoff between speed and precision
- Possible settings for RandomSeeding include:
  Automatic    automatically reseed every time the function is called
  Inherited    use externally seeded random numbers
  seed         use an explicit integer or string as a seed
Examples
Basic Examples (3)
Scope (10)
Compute a distance matrix from images:
Compute a distance matrix from strings:
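The original input for this example is not preserved here. A minimal sketch, assuming three illustrative words (for text, the default listed in the Details section is EditDistance):

```wolfram
(* Illustrative words, not from the original page *)
words = {"kitten", "sitting", "mitten"};

(* 3x3 matrix of pairwise string distances *)
DistanceMatrix[words]

(* The edit distance for a single pair, computed directly for comparison *)
EditDistance["kitten", "sitting"]  (* 3 *)
```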
Compute a distance matrix from Boolean vectors:
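A possible input for this case, assuming three made-up Boolean vectors of equal length (the Details section lists JaccardDissimilarity as the default for Boolean data):

```wolfram
(* Illustrative Boolean vectors *)
flags = {{True, False, True}, {True, True, False}, {False, False, True}};

DistanceMatrix[flags]

(* The same dissimilarity measure applied directly to one pair *)
JaccardDissimilarity[flags[[1]], flags[[2]]]
```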
Compute a distance matrix from a list of date objects:
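A sketch of what such an input might look like, using arbitrary dates chosen only for illustration:

```wolfram
(* Three illustrative calendar dates *)
dates = {DateObject[{2020, 1, 1}], DateObject[{2020, 1, 15}], DateObject[{2020, 3, 1}]};

(* Pairwise distances between the dates *)
DistanceMatrix[dates]
```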
Compute a distance matrix from geodetic positions:
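One way to try this, assuming approximate latitude and longitude pairs picked for illustration (roughly Paris, London and New York):

```wolfram
(* Approximate {latitude, longitude} positions in degrees; illustrative values *)
cities = {
   GeoPosition[{48.86, 2.35}],
   GeoPosition[{51.51, -0.13}],
   GeoPosition[{40.71, -74.01}]
   };

DistanceMatrix[cities]

(* A single pair computed directly, for comparison with the matrix entry *)
GeoDistance[cities[[1]], cities[[2]]]
```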
Compute a distance matrix from nominal sequences:
Compute a distance matrix from numerical sequences:
Compute a distance matrix on nominal vectors:
Compute a distance matrix from mixed-type vectors:
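As a rough illustration, assuming vectors that pair one numeric feature with one string feature (the data is invented):

```wolfram
(* Each element has a numeric feature and a nominal (string) feature *)
mixed = {{1.2, "red"}, {3.4, "blue"}, {1.0, "red"}, {2.8, "blue"}};

(* Feature types are inferred automatically here *)
DistanceMatrix[mixed]
```

According to the Details section, distances for such data are computed independently for each type and then combined using Norm.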
Compute a distance matrix from a dataset formatted as a list of associations:
Compute the same distance matrix with a column-oriented dataset:
Data can also be given in a Dataset object:
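The three input layouts described in the preceding examples can be sketched as follows, assuming a small invented table with hypothetical "age" and "height" features:

```wolfram
(* Row-oriented: a list of associations, one per data element *)
rows = {
   <|"age" -> 32, "height" -> 1.70|>,
   <|"age" -> 45, "height" -> 1.82|>,
   <|"age" -> 27, "height" -> 1.65|>
   };
DistanceMatrix[rows]

(* Column-oriented: an association of feature lists holding the same values *)
columns = <|"age" -> {32, 45, 27}, "height" -> {1.70, 1.82, 1.65}|>;
DistanceMatrix[columns]

(* The same rows wrapped in a Dataset object *)
DistanceMatrix[Dataset[rows]]
```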
Options (9)
DistanceFunction (3)
Compute a distance matrix from integer vectors using SquaredEuclideanDistance as a distance function:
Compute a distance matrix with the ManhattanDistance:
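A minimal sketch of the DistanceFunction option, assuming a small set of invented integer vectors. The last call uses a hypothetical pure function to show the "function f" form mentioned in the Details section:

```wolfram
(* Illustrative integer vectors *)
ints = {{1, 2, 3}, {4, 6, 3}, {0, 0, 0}};

(* Named distance functions *)
DistanceMatrix[ints, DistanceFunction -> SquaredEuclideanDistance]
DistanceMatrix[ints, DistanceFunction -> ManhattanDistance]

(* A user-defined function of two arguments: largest coordinate difference *)
DistanceMatrix[ints, DistanceFunction -> (Max[Abs[#1 - #2]] &)]
```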
FeatureExtractor (1)
FeatureNames (1)
Use FeatureNames to name features, and refer to their names in further specifications:
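A sketch of how this could look, assuming invented data and the hypothetical feature names "size" and "label". The names assigned by FeatureNames are reused as keys in the FeatureTypes specification:

```wolfram
(* Illustrative two-feature data: a number and a category *)
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

DistanceMatrix[data,
 FeatureNames -> {"size", "label"},
 FeatureTypes -> <|"size" -> "Numerical", "label" -> "Nominal"|>]
```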
FeatureTypes (1)
Use FeatureTypes to enforce the interpretation of the first feature as nominal:
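For instance, assuming invented data whose first feature is an integer code rather than a quantity:

```wolfram
(* First feature: integer category code; second feature: a measurement *)
data = {{1, 2.3}, {2, 1.5}, {1, 3.2}};

(* Default interpretation: both features treated as numeric *)
DistanceMatrix[data]

(* Force the first feature to be treated as nominal *)
DistanceMatrix[data, FeatureTypes -> {"Nominal", "Numerical"}]
```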
PerformanceGoal (1)
Generate 2000 random numerical vectors of length 1000:
Compute their distance matrix and benchmark the operation:
Perform the same operation with PerformanceGoal set to "Speed":
Compare timing and accuracies of the previous results with a reference:
When PerformanceGoal->"Speed", centering the data can increase the precision:
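The steps of this example can be sketched as below. The reference computation via Outer and the variable names are assumptions made for illustration, not the code from the original page, and timings will vary by machine:

```wolfram
(* 2000 random numerical vectors of length 1000 *)
data = RandomReal[1, {2000, 1000}];

(* Time the default setting and the "Speed" setting *)
{tDefault, dmDefault} = AbsoluteTiming[DistanceMatrix[data]];
{tSpeed, dmSpeed} =
  AbsoluteTiming[DistanceMatrix[data, PerformanceGoal -> "Speed"]];

(* Reference matrix computed pair by pair; slower but straightforward *)
reference = Outer[EuclideanDistance, data, data, 1];

(* Compare timings and the largest absolute deviation from the reference *)
{tDefault, tSpeed}
{Max[Abs[dmDefault - reference]], Max[Abs[dmSpeed - reference]]}

(* Center the data by subtracting the mean vector; Euclidean distances
   are unchanged by this shift, so the same reference still applies *)
mean = Mean[data];
centered = (# - mean) & /@ data;
Max[Abs[DistanceMatrix[centered, PerformanceGoal -> "Speed"] - reference]]
```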
RandomSeeding (1)
DistanceMatrix gives the same result when evaluated multiple times, even when randomness is involved.
Generate a pair of 20-dimensional vectors:
Compute its distance matrix several times using a feature extractor involving randomness:
Use different values for the RandomSeeding option to compute the distance matrices:
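A rough sketch of these steps, assuming a hypothetical feature extractor named noisy that adds random noise each time it is applied (this extractor is an illustration, not the one used on the original page):

```wolfram
(* A pair of 20-dimensional vectors *)
vecs = RandomReal[1, {2, 20}];

(* Hypothetical extractor involving randomness *)
noisy = (# + RandomReal[1, Length[#]]) &;

(* Default RandomSeeding: repeated evaluations are expected to agree *)
Table[DistanceMatrix[vecs, FeatureExtractor -> noisy], 3]

(* Explicit seeds: each seed may give a different matrix *)
Table[
 DistanceMatrix[vecs, FeatureExtractor -> noisy, RandomSeeding -> seed],
 {seed, {1, 2, 3}}]
```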
WorkingPrecision (1)
Compute the distance matrix for 500 random numerical vectors of length 100 that have a precision of 30:
DistanceMatrix uses arbitrary-precision computation:
Using WorkingPrecision->MachinePrecision can speed up the computation:
But the results are not as precise:
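The comparison above might be reproduced along these lines. Variable names are illustrative, and timings depend on the machine:

```wolfram
(* 500 random vectors of length 100, generated with 30 digits of precision *)
data = RandomReal[1, {500, 100}, WorkingPrecision -> 30];

(* Default setting: inspect the precision of the result *)
{tArbitrary, dmArbitrary} = AbsoluteTiming[DistanceMatrix[data]];
Precision[dmArbitrary]

(* Machine precision: typically faster *)
{tMachine, dmMachine} =
  AbsoluteTiming[DistanceMatrix[data, WorkingPrecision -> MachinePrecision]];
{tArbitrary, tMachine}

(* Largest absolute difference between the two results *)
Max[Abs[dmArbitrary - dmMachine]]
```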
When vectors are similar, changing the value of WorkingPrecision can lead to significantly different results:
Text
Wolfram Research (2015), DistanceMatrix, Wolfram Language function, https://reference.wolfram.com/language/ref/DistanceMatrix.html (updated 2017).