DistanceMatrix[{u1, u2, …}] gives the matrix of distances between each pair of elements ui, uj.
DistanceMatrix[{u1, u2, …}, {v1, v2, …}] gives the matrix of distances between each pair of elements ui, vj.
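As a minimal sketch of the two calling forms, assuming small hand-picked numeric vectors (the data values are illustrative only):

```wolfram
(* Pairwise distances within a single list of numeric vectors;
   the result is a 3 x 3 matrix with zeros on the diagonal *)
DistanceMatrix[{{0, 0}, {3, 4}, {6, 8}}]

(* Distances from each element of the first list to each element
   of the second list; the result has 2 rows and 3 columns *)
DistanceMatrix[{{0, 0}, {3, 4}}, {{6, 8}, {0, 1}, {1, 1}}]
```

In the two-list form, the entry in row i and column j is the distance between ui and vj.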
Details and Options
- DistanceMatrix works for a variety of data, including numerical, geospatial, textual, visual, dates and times, as well as combinations of these.
- Each ui can be a single data element, a list of data elements or an association of data elements. In DistanceMatrix[data,…], data can also be a Dataset object.
- The following options can be given:
  DistanceFunction    Automatic    the distance metric to use
  FeatureExtractor    Identity     how to preprocess data
  FeatureNames        Automatic    feature names to assign for data
  FeatureTypes        Automatic    feature types to assume for data
  PerformanceGoal     Automatic    aspects of performance to try to optimize
  RandomSeeding       1234         what seeding of pseudorandom generators should be done internally
  WorkingPrecision    Automatic    precision to use for numerical data
- The setting for DistanceFunction can be any distance or dissimilarity function or a function f defining a distance between two values.
- By default, the following distance functions are used for different types of elements:
  EuclideanDistance               numeric data
  ImageDistance                   images
  JaccardDissimilarity            Boolean data
  EditDistance                    text and nominal sequences
  Abs[DateDifference[#1,#2]]&     dates and times
  ColorDistance                   colors
  GeoDistance                     geospatial data
  Boole[SameQ[#1,#2]]&            nominal data
  HammingDistance                 nominal vector data
  WarpingDistance                 numerical sequences
- All images are first conformed using ConformImages.
- By default, when data elements are mixed-type vectors, distances are computed independently for each type and combined using Norm.
- Possible settings for PerformanceGoal include:
  "Speed"      minimize computation time
  "Quality"    maximize precision and accuracy
  Automatic    automatic tradeoff between speed and precision
- Possible settings for RandomSeeding include:
  Automatic    automatically reseed every time the function is called
  Inherited    use externally seeded random numbers
  seed         use an explicit integer or string as a seed
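The DistanceFunction setting described in the notes above can be sketched as follows. The data values and the choice of alternative metrics are assumptions made for illustration:

```wolfram
data = {{1, 2}, {4, 6}, {7, 10}};

(* A built-in distance function in place of the default *)
DistanceMatrix[data, DistanceFunction -> ManhattanDistance]

(* A pure function f defining the distance between two values;
   here, the maximum absolute coordinate difference *)
DistanceMatrix[data, DistanceFunction -> (Norm[#1 - #2, Infinity] &)]
```

Any function that takes two data elements and returns a non-negative number can be supplied in the same way.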
Examples
Basic Examples (3)
Data can also be given in a Dataset object:
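A minimal sketch, assuming a small Dataset of associations with hypothetical numeric columns "x" and "y":

```wolfram
(* Each row is an association of named numeric features *)
ds = Dataset[{
   <|"x" -> 1.0, "y" -> 2.0|>,
   <|"x" -> 4.0, "y" -> 6.0|>,
   <|"x" -> 7.0, "y" -> 10.0|>
   }];

(* Distances between each pair of rows *)
DistanceMatrix[ds]
```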
Use FeatureNames to name features, and refer to their names in further specifications:
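One way this might look, assuming two-feature data and the hypothetical feature names "age" and "gender":

```wolfram
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* Name the two features, then use those names to specify the feature types *)
DistanceMatrix[data,
 FeatureNames -> {"age", "gender"},
 FeatureTypes -> <|"age" -> "Numerical", "gender" -> "Nominal"|>]
```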
Use FeatureTypes to enforce the interpretation of the first feature as nominal:
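A minimal sketch, assuming integer-valued data whose first column would otherwise be treated as numeric (the values are illustrative):

```wolfram
data = {{1, 2.5}, {1, 3.0}, {2, 2.5}, {3, 9.0}};

(* Default interpretation: both features are treated as numeric *)
DistanceMatrix[data]

(* Treat the first feature as a nominal label rather than a number *)
DistanceMatrix[data, FeatureTypes -> {"Nominal", "Numerical"}]
```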
Perform the same operation with PerformanceGoal set to "Speed":
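A sketch of such a comparison, assuming randomly generated numeric data (the array size and seed are arbitrary choices, and the timings depend on the machine):

```wolfram
SeedRandom[1];
data = RandomReal[1, {500, 10}];

(* Time the computation with the default setting *)
First[AbsoluteTiming[DistanceMatrix[data];]]

(* Time the same computation with PerformanceGoal set to "Speed" *)
First[AbsoluteTiming[DistanceMatrix[data, PerformanceGoal -> "Speed"];]]
```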
When PerformanceGoal -> "Speed", centering the data can increase the precision:
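A minimal sketch of centering, assuming numeric data located far from the origin (the offset 10^6 is an arbitrary illustrative choice). Euclidean distances do not change under translation, so both calls compute the same quantity mathematically:

```wolfram
SeedRandom[1];
data = 10^6 + RandomReal[1, {100, 5}];

(* Subtract the column means from every row *)
centered = (# - Mean[data]) & /@ data;

(* Compare the results for the raw and the centered data *)
raw = DistanceMatrix[data, PerformanceGoal -> "Speed"];
shifted = DistanceMatrix[centered, PerformanceGoal -> "Speed"];
Max[Abs[raw - shifted]]
```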
DistanceMatrix gives the same result when evaluated multiple times, even when randomness is involved.
Use different values for the RandomSeeding option to compute the distance matrices:
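A sketch of the three kinds of settings, assuming small numeric data (the seed values are arbitrary). For plain numeric vectors the results may well be identical regardless of the seed, since little or no randomness is involved:

```wolfram
data = {{1, 2}, {4, 6}, {7, 10}};

(* Explicit integer and string seeds *)
DistanceMatrix[data, RandomSeeding -> 42]
DistanceMatrix[data, RandomSeeding -> "my seed"]

(* Reseed automatically on every call *)
DistanceMatrix[data, RandomSeeding -> Automatic]
```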
DistanceMatrix uses arbitrary-precision computation:
When vectors are similar, changing the value of WorkingPrecision can lead to significantly different results:
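A minimal sketch, assuming two exact vectors that differ by an amount far below machine precision (the difference 10^-20 is an illustrative choice):

```wolfram
data = {{1, 1}, {1, 1 + 10^-20}};

(* At machine precision the two vectors are indistinguishable *)
DistanceMatrix[data, WorkingPrecision -> MachinePrecision]

(* With 30 digits of working precision the difference can be resolved *)
DistanceMatrix[data, WorkingPrecision -> 30]
```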