Sequence Alignment & Comparison

The Wolfram Language includes state-of-the-art algorithms for sequence alignment and comparison, capable of handling strings and lists containing very large numbers of elements.

SequenceAlignment find alignments between strings, allowing insertion and deletion

LongestCommonSubsequence find the longest contiguous subsequence in common

LongestCommonSequence find the longest sequence in common, perhaps disjoint

LongestCommonSubsequencePositions  ▪  LongestCommonSequencePositions
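
As a minimal sketch of how the alignment functions above differ, the following compares the contiguous and non-contiguous variants on two short strings. The strings s1 and s2 are made up for illustration, and the output comments assume the default SimilarityRules and GapPenalty settings.

```wolfram
(* Two illustrative strings that differ in a single character *)
s1 = "ABCDxEF";
s2 = "ABCDyEF";

(* Longest contiguous run shared by both strings *)
LongestCommonSubsequence[s1, s2]   (* "ABCD" *)

(* Longest shared sequence when gaps are allowed *)
LongestCommonSequence[s1, s2]      (* "ABCDEF" *)

(* Matching runs appear as strings, differing runs as pairs *)
SequenceAlignment[s1, s2]          (* {"ABCD", {"x", "y"}, "EF"} *)
```

The contiguous result stops at the first mismatch, while the non-contiguous result skips over it and continues with "EF".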

StringPosition  ▪  StringCases  ▪  StringCount

SequencePosition  ▪  SequenceCases  ▪  SequenceCount
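
The String* functions operate on strings and the Sequence* functions are their list counterparts. The sketch below runs both families on small inputs chosen for illustration; it assumes the symbols a, b and c have no assigned values.

```wolfram
(* List versions: positions, count and matches of a sublist *)
list = {a, b, c, a, b};
SequencePosition[list, {a, b}]   (* {{1, 2}, {4, 5}} *)
SequenceCount[list, {a, b}]      (* 2 *)
SequenceCases[list, {a, _}]      (* {{a, b}, {a, b}} *)

(* String versions of the same three queries *)
StringPosition["abcab", "ab"]    (* {{1, 2}, {4, 5}} *)
StringCount["abcab", "ab"]       (* 2 *)
StringCases["abcab", "a" ~~ _]   (* {"ab", "ab"} *)
```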

Subsequences all subsequences of a list

LongestOrderedSequence find the longest ordered sequence in a list, perhaps disjoint

WarpingCorrespondence  ▪  CanonicalWarpingCorrespondence

Similarity and Distance Measures

SmithWatermanSimilarity  ▪  NeedlemanWunschSimilarity

EditDistance  ▪  DamerauLevenshteinDistance  ▪  HammingDistance  ▪  ...
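
The distance measures above count different kinds of edits. A short sketch on standard textbook word pairs, chosen here for illustration, shows where they diverge:

```wolfram
(* Insertions, deletions and substitutions each cost 1 *)
EditDistance["kitten", "sitting"]           (* 3 *)

(* Swapping two adjacent characters takes two edits here... *)
EditDistance["abc", "acb"]                  (* 2 *)

(* ...but counts as a single transposition here *)
DamerauLevenshteinDistance["abc", "acb"]    (* 1 *)

(* Number of differing positions; inputs must have equal length *)
HammingDistance["karolin", "kathrin"]       (* 3 *)
```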

Nearest find sequences nearest with respect to a distance measure

FindClusters find clusters of sequences with respect to a distance measure

Dendrogram  ▪  ClusteringTree

DistanceMatrix construct the matrix of pairwise distances
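
Any of the distance measures can be supplied through the DistanceFunction option. The following is a minimal sketch with a made-up word list; EditDistance is passed explicitly so that the choice of measure is visible.

```wolfram
(* Illustrative data: three words and a misspelled query *)
words = {"apple", "banana", "cherry"};

(* Closest word to the query under edit distance *)
Nearest[words, "aple", DistanceFunction -> EditDistance]   (* {"apple"} *)

(* Pairwise edit distances, returned as a symmetric matrix *)
DistanceMatrix[{"apple", "aple", "banana"}, DistanceFunction -> EditDistance]
```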

BioSequence a string-based representation of chained structures such as DNA

GenomeLookup find exact matches with human and other genomes