Sequence Alignment & Comparison

The Wolfram Language includes state-of-the-art algorithms for sequence alignment and comparison, capable of handling strings and lists containing very large numbers of elements.

SequenceAlignment find alignments between strings, allowing insertion and deletion

Diff return a representation of diffs between two expressions

Diff3 return a representation of the three-way diff between two expressions and their common ancestor
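As a brief illustration of the alignment entry above, the following sketch calls SequenceAlignment on two strings that differ at two positions. The input strings are made up for this example. The result interleaves shared runs with {first, second} pairs marking where the inputs differ.

```wolfram
(* Align two strings; differing segments are returned as {s1part, s2part} pairs *)
SequenceAlignment["abcXabcXabc", "abcYabcYabc"]
(* → {"abc", {"X", "Y"}, "abc", {"X", "Y"}, "abc"} *)
```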

LongestCommonSubsequence find the longest contiguous subsequence in common

LongestCommonSequence find the longest sequence in common, perhaps disjoint

LongestCommonSubsequencePositions  ▪  LongestCommonSequencePositions
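The difference between the contiguous and the possibly disjoint variants is easiest to see side by side. This is a minimal sketch; the input strings are chosen only for illustration.

```wolfram
(* Longest contiguous run shared by both strings *)
LongestCommonSubsequence["abcdefg", "xxcdeyy"]
(* → "cde" *)

(* Longest common sequence; matching elements need not be adjacent *)
LongestCommonSequence["a1b2c3", "axbycz"]
(* → "abc" *)

(* Positions of the contiguous common run in each input *)
LongestCommonSubsequencePositions["abcdefg", "xxcdeyy"]
```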

StringPosition  ▪  StringCases  ▪  StringCount

SequencePosition  ▪  SequenceCases  ▪  SequenceCount
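The string functions and their list counterparts follow the same pattern of position, cases, and count. The sketch below uses small made-up inputs; the symbols a, b, c are assumed to have no values assigned.

```wolfram
(* Strings: where a substring occurs, and how often *)
StringPosition["abracadabra", "bra"]
(* → {{2, 4}, {9, 11}} *)

StringCount["abracadabra", "a"]
(* → 5 *)

(* Lists: the same questions for a sublist or sequence pattern *)
SequencePosition[{a, b, c, a, b}, {a, b}]
(* → {{1, 2}, {4, 5}} *)

SequenceCases[{a, b, c, a, b}, {a, _}]
(* → {{a, b}, {a, b}} *)

SequenceCount[{a, b, c, a, b}, {a, b}]
(* → 2 *)
```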

Subsequences all contiguous subsequences of a list

LongestOrderedSequence find the longest ordered sequence in a list, perhaps disjoint
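A short sketch of the two list functions above, with illustrative inputs. Subsequences enumerates contiguous runs (including the empty one), while LongestOrderedSequence may skip elements.

```wolfram
(* All contiguous subsequences, shortest first *)
Subsequences[{a, b, c}]
(* → {{}, {a}, {b}, {c}, {a, b}, {b, c}, {a, b, c}} *)

(* Only the subsequences of length 2 *)
Subsequences[{a, b, c, d}, {2}]
(* → {{a, b}, {b, c}, {c, d}} *)

(* Longest ordered sequence; elements need not be adjacent *)
LongestOrderedSequence[{1, 5, 2, 3, 4}]
(* → {1, 2, 3, 4} *)
```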

WarpingCorrespondence  ▪  CanonicalWarpingCorrespondence

Similarity and Distance Measures

SmithWatermanSimilarity  ▪  NeedlemanWunschSimilarity

EditDistance  ▪  DamerauLevenshteinDistance  ▪  HammingDistance  ▪  ...
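The distance measures differ in which edit operations they count. The comparison below is a minimal sketch on made-up strings; the similarity calls are shown without outputs because their values depend on the default scoring settings.

```wolfram
(* Insertions, deletions, and substitutions *)
EditDistance["kitten", "sitting"]
(* → 3 *)

(* A swap of adjacent characters counts as two edits here ... *)
EditDistance["abc", "acb"]
(* → 2 *)

(* ... but as a single transposition here *)
DamerauLevenshteinDistance["abc", "acb"]
(* → 1 *)

(* Number of differing positions in equal-length strings *)
HammingDistance["karolin", "kathrin"]
(* → 3 *)

(* Global and local alignment similarity scores *)
NeedlemanWunschSimilarity["AGGCTA", "AGCTA"]
SmithWatermanSimilarity["AGGCTA", "AGCTA"]
```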

Nearest find sequences nearest with respect to a distance measure

FindClusters find clusters of sequences with respect to a distance measure

Dendrogram  ▪  ClusteringTree

DistanceMatrix construct the matrix of pairwise distances
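The functions above accept a distance measure through the DistanceFunction option. The sketch below passes EditDistance explicitly; the word lists are hypothetical sample data.

```wolfram
(* Pairwise edit distances between three words *)
DistanceMatrix[{"cat", "cot", "dog"}, DistanceFunction -> EditDistance]
(* → {{0, 1, 3}, {1, 0, 2}, {3, 2, 0}} *)

(* Closest word to a misspelled query *)
Nearest[{"apple", "banana", "cherry"}, "aple", DistanceFunction -> EditDistance]
(* → {"apple"} *)

(* Group similar strings; the grouping depends on the method chosen automatically *)
FindClusters[{"cat", "cot", "cut", "dog", "dig", "dug"}, DistanceFunction -> EditDistance]
```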

BioSequence a string-based representation of chained structures such as DNA

GenomeLookup find exact matches with human and other genomes