GenBank (.gb, .gbk)

  • MIME type: chemical/seq-na-genbank
  • GenBank molecular biology format.
  • Native format of the U.S. National Center for Biotechnology Information (NCBI) database.
  • Standard format for storing and exchanging annotated DNA sequences.
  • Plain text format.
  • Developed in 1982 as part of the NIH GenBank project.
  • Import supports all versions of the GenBank file format.

Import and Export

  • Import["file.gb"] imports a DNA sequence from a GenBank file.
  • Import["file.gb"] returns a string representing the sequence stored in the file.
  • Import["file.gb", elem] imports the specified element from a GenBank file.
  • Import["file.gb", {elem, suba, subb, ...}] imports a subelement.
  • Import["file.gb", {{elem1, elem2, ...}}] imports multiple elements.
  • The import format can be specified with Import["file", "GenBank"] or Import["file", {"GenBank", elem, ...}].
  • See the reference pages for full general information on Import and Export.
  • ImportString supports the GenBank format.
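
The forms above might be combined as in the following sketch. The file name sample.gb is a placeholder for a GenBank file of your own, and the variable names are purely illustrative.

```wolfram
(* "sample.gb" is a hypothetical file name; substitute a real GenBank file *)
file = "sample.gb";

(* Name the format explicitly, which helps when the extension is not .gb or .gbk *)
seq = Import[file, "GenBank"];

(* Explicit format combined with a single element *)
title = Import[file, {"GenBank", "Definition"}];

(* Read the raw text, then parse it from a string instead of a file *)
text = Import[file, "Text"];
seqFromString = ImportString[text, {"GenBank", "Sequence"}]
```

Giving the format explicitly is mainly useful when the file extension does not identify the file as GenBank.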


  • General Import elements:
    "Elements"    list of elements and options available in this file
    "Rules"       full list of rules for each element and option
    "Options"     list of rules for options, properties, and settings
  • Data representation elements:
    "Features"    all sequence annotations, given as a list of rules
    "Sequence"    DNA or protein sequence as a string
    "Plaintext"   sequences as formatted text
    "Comment"     miscellaneous comments on sequence
  • Import uses the "Sequence" element by default for the GenBank format.
  • Metainformation elements:
    "Locus"                   locus description
    "Definition"              GenBank file title
    "NCBIAccession"           NCBI accession number
    "NCBIAccessionVersion"    versioned NCBI accession number
    "GenBankID"               GenBank database identifier
    "Project"                 name of the sequencing project
    "Keywords"                list of keywords
    "Organism"                source organism referenced in the file
    "Segment"                 sequence segment, if divided into multiple GenBank files
    "Source"                  source organism
    "Reference"               bibliographic reference, given as a list of rules
    "Comments"                comments stored in the file, given as a list of strings

Examples

Basic Examples (6)

This returns the available elements for a sample GenBank file:
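
As a sketch, assuming a GenBank file saved locally under the hypothetical name sample.gb, the call could look like this:

```wolfram
(* "sample.gb" is a placeholder file name, not a file shipped with the system *)
Import["sample.gb", "Elements"]
```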

File title:
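
One possible form, again using the placeholder file sample.gb, requests the "Definition" element:

```wolfram
(* "sample.gb" is a placeholder file name *)
Import["sample.gb", "Definition"]
```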

Basic locus information:
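
With the same assumed file, the "Locus" element would be requested like this:

```wolfram
(* "sample.gb" is a placeholder file name *)
Import["sample.gb", "Locus"]
```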

Import information about the source organism:
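
Both "Source" and "Organism" describe the source organism, so this sketch requests the two together with the multiple-element form. The file name is an assumption.

```wolfram
(* "sample.gb" is a placeholder file name *)
(* The doubled braces request several elements in one call *)
Import["sample.gb", {{"Source", "Organism"}}]
```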

Extract the accession number and GenBank identifier:
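
A minimal version, assuming the placeholder file sample.gb, asks for "NCBIAccession" and "GenBankID" in a single call:

```wolfram
(* "sample.gb" is a placeholder file name *)
Import["sample.gb", {{"NCBIAccession", "GenBankID"}}]
```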

Read the first letters of the DNA sequence:
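
The sketch below imports the sequence as a string and keeps a prefix of it. The file name and the choice of 60 letters are assumptions made for illustration.

```wolfram
(* "sample.gb" is a placeholder file name *)
seq = Import["sample.gb", "Sequence"];

(* Take at most 60 letters, guarding against sequences shorter than that *)
StringTake[seq, Min[60, StringLength[seq]]]
```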

Import a plaintext version of the sequence:
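
Under the same assumption about the file name, the "Plaintext" element can be requested directly:

```wolfram
(* "sample.gb" is a placeholder file name *)
Import["sample.gb", "Plaintext"]
```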

Read a list of bibliographic references and extract the first one:
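
A hedged sketch follows. It assumes the placeholder file sample.gb holds more than one reference, so that "Reference" yields a list with one entry per reference.

```wolfram
(* "sample.gb" is a placeholder file name *)
refs = Import["sample.gb", "Reference"];

(* Assumes refs is a non-empty list with one entry per bibliographic reference *)
First[refs]
```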

New in 7