SequenceAlignment

SequenceAlignment[s1,s2]

finds an optimal alignment of sequences of elements in the strings, lists or biomolecular sequences s1 and s2, and yields a list of successive matching and differing sequences.

Details and Options

  • SequenceAlignment[s1,s2] gives a list of the form {seg1,seg2,…}, where each segi is either a single string or sequence of list elements u, representing a matching segment, or a pair {u1,u2}, representing segments that differ between s1 and s2.
  • SequenceAlignment by default finds a global Needleman-Wunsch alignment of the complete strings or lists s1 and s2.
  • With the option setting Method->"Local", it finds a local Smith-Waterman alignment.
  • For sufficiently similar strings or lists, local and global alignment methods give the same result.
  • The following options can be given:
    GapPenalty         0           additional cost for each alignment gap
    IgnoreCase         False       whether to ignore case of letters in strings
    MergeDifferences   True        whether to combine adjacent differences
    Method             "Global"    alignment algorithm to be used
    SimilarityRules    Automatic   rules for similarities between elements
  • SequenceAlignment attempts to find an alignment that maximizes the total similarity score.
  • With the default setting SimilarityRules->Automatic, each match between two elements contributes 1 to the total similarity score, while each mismatch, insertion, or deletion contributes -1.
  • Various named similarity matrices are supported, as specified in the notes for SimilarityRules.
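
The default scoring and the output form described in the notes above can be illustrated with a small Python sketch. This is not the Wolfram Language implementation: the function name `global_alignment` and its parameters are hypothetical, gaps are scored at a flat -1 per element with no additional GapPenalty, and ties between equally good alignments may be broken differently than SequenceAlignment breaks them. Matching runs are returned as plain strings and differing runs as pairs, mirroring the {seg1,seg2,…} form.

```python
def global_alignment(s1, s2, match=1, mismatch=-1, indel=-1):
    """Needleman-Wunsch global alignment of two strings (illustrative only).

    Returns (score, segments). A matching run is a plain string; a differing
    run is a (u1, u2) pair, where either side may be empty.
    """
    n, m = len(s1), len(s2)
    # score[i][j] = best total similarity for aligning s1[:i] with s2[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * indel
    for j in range(1, m + 1):
        score[0][j] = j * indel
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s1[i - 1] == s2[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # match or mismatch
                              score[i - 1][j] + indel,     # element only in s1
                              score[i][j - 1] + indel)     # element only in s2

    # Trace back from the bottom-right corner, collecting aligned columns.
    columns = []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and s1[i - 1] == s2[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            columns.append((s1[i - 1], s2[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + indel:
            columns.append((s1[i - 1], ""))
            i -= 1
        else:
            columns.append(("", s2[j - 1]))
            j -= 1
    columns.reverse()

    # Merge adjacent columns into matching runs and differing runs.
    segments = []
    for a, b in columns:
        same = a == b
        if segments and isinstance(segments[-1], str) and same:
            segments[-1] += a
        elif segments and isinstance(segments[-1], tuple) and not same:
            u1, u2 = segments[-1]
            segments[-1] = (u1 + a, u2 + b)
        else:
            segments.append(a if same else (a, b))
    return score[n][m], segments


print(global_alignment("abcXdef", "abcdef"))
```

Here six matches and one deletion give a total similarity of 5, and the segment list contains the matching run "abc", the differing pair ("X", ""), and the matching run "def".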

Examples


Basic Examples  (3)

Globally align two similar strings:

Global alignment of two strings:

Local alignment of the same strings:

Global alignment of two instances of BioSequence:

Options  (4)

SimilarityRules  (2)

Align two short protein sequences:

Assigning a negative score to the deletion of "V" gives a different alignment:

Align with type-specific similarity rules that align degenerate letters:

Without the degenerate similarity rules, a perfect degenerate alignment is missed:

GapPenalty  (1)

By default, an alignment is found with two gaps:

Increasing the penalty for gaps forces another alignment with fewer gaps:
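
As a rough Python sketch of why a larger gap penalty can favor an alignment with fewer gaps, the hypothetical helper below scores an alignment that is already written out as two equal-length rows, with "-" marking a gap. It assumes that GapPenalty is charged once per contiguous run of gap characters, on top of the default -1 per inserted or deleted element; that reading of "additional cost for each alignment gap" is an assumption, and the strings are made up for illustration.

```python
def alignment_score(row1, row2, gap_penalty=0, match=1, mismatch=-1, indel=-1):
    """Score a given gapped alignment (illustrative; '-' marks a gap)."""
    if len(row1) != len(row2):
        raise ValueError("alignment rows must have equal length")
    score = 0
    prev_gap = None  # which row held a gap in the previous column, if any
    for a, b in zip(row1, row2):
        if a == "-" or b == "-":
            gap_row = 1 if a == "-" else 2
            score += indel              # per-element cost of the gap
            if gap_row != prev_gap:     # a new gap run starts here
                score -= gap_penalty
            prev_gap = gap_row
        else:
            score += match if a == b else mismatch
            prev_gap = None
    return score


two_gaps = ("aXbYc", "a-b-c")   # three matches, two separate gaps
one_gap = ("aXbYc", "a--bc")    # two matches, one mismatch, one gap of length 2

for penalty in (0, 3):
    print(penalty,
          alignment_score(*two_gaps, gap_penalty=penalty),
          alignment_score(*one_gap, gap_penalty=penalty))
```

With no gap penalty the two-gap alignment scores higher (1 versus -1); with a penalty of 3 the single-gap alignment wins (-4 versus -5), even though it contains a mismatch.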

MergeDifferences  (1)

This gives insertions, deletions, and replacements as separate differences:

Applications  (2)

This gives the global alignment of two similar strings:

This shows the difference between global and local string alignment:
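
The contrast between the two methods can also be sketched in Python by comparing the best achievable scores. The function `best_score` below is a hypothetical illustration, not the Wolfram Language implementation; it uses the default scoring of +1 per match and -1 per mismatch, insertion, or deletion, with no additional gap penalty. In local (Smith-Waterman style) mode, unaligned prefixes and suffixes cost nothing; in global (Needleman-Wunsch style) mode, every element of both strings must be accounted for.

```python
def best_score(s1, s2, local=False, match=1, mismatch=-1, indel=-1):
    """Best global or local alignment score of two strings (illustrative)."""
    m = len(s2)
    # In local mode a prefix can be skipped for free, so border cells are 0.
    prev = [0 if local else j * indel for j in range(m + 1)]
    best = 0
    for i in range(1, len(s1) + 1):
        cur = [0 if local else i * indel]
        for j in range(1, m + 1):
            sub = match if s1[i - 1] == s2[j - 1] else mismatch
            cell = max(prev[j - 1] + sub, prev[j] + indel, cur[j - 1] + indel)
            if local:
                cell = max(cell, 0)     # a local alignment may restart anywhere
                best = max(best, cell)  # and may end anywhere
            cur.append(cell)
        prev = cur
    return best if local else prev[m]


print(best_score("abcdef", "xxabcdefyy"))              # global
print(best_score("abcdef", "xxabcdefyy", local=True))  # local
print(best_score("abc", "abc"), best_score("abc", "abc", local=True))
```

For the first pair, the global score is 2 (six matches minus four unmatched flanking elements), while the local score is 6 because the flanks are ignored. For identical strings the two scores coincide.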

Neat Examples  (1)

Compare two very similar genes:

Wolfram Research (2008), SequenceAlignment, Wolfram Language function, https://reference.wolfram.com/language/ref/SequenceAlignment.html (updated 2020).
