FASTQ (.fastq, .fq)

  • Import and Export support all common variants of the FASTQ file format, including short-read sequencing data and long sequences.
  • Background

      MIME type: chemical/seq-na-fastq
      FASTQ molecular biology format.
      Standard format for storing and exchanging DNA sequences with base qualities.
      Plain text format.
      Stores nucleic acid sequences and base qualities as character strings.
      Various conventions are in use to represent meta-information.

    Import and Export

    • Import["file.fastq"] imports DNA sequences from a FASTQ file.
    • Export["file.fastq",expr] exports a sequence or a list of sequences to the FASTQ format.
    • Import["file.fastq"] returns a list of strings representing the sequences stored in the file.
    • Export["file.fastq",{seq,qual}] exports a character string representing a DNA sequence with base qualities to FASTQ.
    • Export["file.fastq",{{seq1,seq2,…},{qual1,qual2,…}}] exports multiple DNA sequences with base qualities.
    • Import["file.fastq",elem] imports the specified element from a FASTQ file.
    • Import["file.fastq",{{elem1,elem2,…}}] imports multiple elements.
    • The import format can be specified with Import["file","FASTQ"] or Import["file",{"FASTQ",elem,…}].
    • Export["file.fastq",expr,elem] creates a FASTQ file by treating expr as specifying element elem.
    • Export["file.fastq",{expr1,expr2,…},{{elem1,elem2,…}}] treats each expr_i as specifying the corresponding elem_i.
    • Export["file.fastq",expr,opt1->val1,…] exports expr with the specified option elements taken to have the specified values.
    • Export["file.fastq",{elem1->expr1,elem2->expr2,…},"Rules"] uses rules to specify the elements to be exported.
    • See the reference pages for full general information on Import and Export.
    • ImportString and ExportString support the FASTQ format.


    • General Import elements:
    • "Elements"    list of elements and options available in this file
      "Rules"       full list of rules for each element and option
      "Options"     list of rules for options, properties, and settings
    • Data representation elements:
    • "Header"       raw header lines
      "Sequence"     DNA sequences as a list of strings
      "Qualities"    base qualities as a list of strings
    • Import uses the "Sequence" element by default for the FASTQ format.
    • Additional data elements:
    • "Data"           "Header", "Sequence", and "Qualities" elements combined in a list
      "LabeledData"    list of rules for each sequence stored in the file
    • The Wolfram Language uses the standard IUB/IUPAC abbreviations for nucleic acids:
    • A    adenosine
      R    purine (G or A)
      Y    pyrimidine (T or C)
      K    ketone (G or T)
      M    amino group (A or C)
      S    strong interaction (G or C)
      W    weak interaction (A or T)
      B    C or G or T
      D    A or G or T
      H    A or C or T
      V    A or C or G
      N    any nucleic acid (A or C or G or T)
      -    gap of indeterminate length
    • The Wolfram Language uses ASCII characters for the base qualities.
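
    As an illustration of what an ASCII-encoded quality string carries, the sketch below converts one to numeric scores. It assumes the common Sanger-style convention (score = character code minus 33), which is not stated above; other FASTQ variants use a different offset. The quality string is made up.

```wolfram
(* hypothetical quality string; the offset 33 is an assumption (Sanger-style encoding) *)
qual = "II5+!";
ToCharacterCode[qual] - 33
(* {40, 40, 20, 10, 0} *)
```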


    • Advanced Export option:
    • "LineWidth"    70    maximum number of characters in a line
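
    A minimal sketch of passing this option, using ExportString so that no file is written. The sequence and quality strings are invented for illustration and the value 20 is arbitrary; how a long record is wrapped is decided by the exporter, so no output is shown.

```wolfram
seq = "GATTACAGATTACAGATTACAGATTACAGATTACAGATTACA";        (* 42 made-up bases *)
qual = StringJoin[ConstantArray["I", StringLength[seq]]];  (* one quality character per base *)
ExportString[{seq, qual}, "FASTQ", "LineWidth" -> 20]
```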



    Scope (6)

    This reads the raw header lines from a sample FASTQ file:
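
    A minimal sketch of such an import. The sample file is not named here, so a made-up two-record FASTQ string stands in for it and ImportString is used in place of Import:

```wolfram
(* made-up FASTQ text: each record is header, sequence, separator, qualities *)
fastq = "@read1\nGATTACA\n+\nIIIIIII\n@read2\nACGTN\n+\nFFFFF\n";
ImportString[fastq, {"FASTQ", "Header"}]
```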


    Read the DNA sequence:
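
    A sketch under the same stand-in assumption (an invented FASTQ string rather than a sample file). With no element given, the default "Sequence" element is imported:

```wolfram
(* made-up FASTQ text with two records *)
fastq = "@read1\nGATTACA\n+\nIIIIIII\n@read2\nACGTN\n+\nFFFFF\n";
ImportString[fastq, "FASTQ"]
```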


    Read the DNA sequence with qualities:
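
    One way to request both elements at once, sketched with an invented FASTQ string in place of a file. The element list follows the general {"format", {elem1, elem2}} form of ImportString:

```wolfram
(* made-up FASTQ text with two records *)
fastq = "@read1\nGATTACA\n+\nIIIIIII\n@read2\nACGTN\n+\nFFFFF\n";
ImportString[fastq, {"FASTQ", {"Sequence", "Qualities"}}]
```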


    This converts a short sequence to the FASTQ format, automatically adding default header information:
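
    A sketch of such a conversion, using ExportString with the {seq,qual} form described for Export. The sequence and quality strings are invented; no header is supplied, so the exporter is left to generate one:

```wolfram
(* made-up sequence and a quality string of the same length *)
ExportString[{"GATTACA", "IIIIIII"}, "FASTQ"]
```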


    This exports two sequences:
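
    A sketch using the {{seq1,seq2,…},{qual1,qual2,…}} form, again with invented data and ExportString instead of a file:

```wolfram
seqs  = {"GATTACA", "ACGTN"};    (* made-up sequences *)
quals = {"IIIIIII", "FFFFF"};    (* one quality string per sequence, matching lengths *)
ExportString[{seqs, quals}, "FASTQ"]
```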


    This exports a pair of headers and sequences:
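
    A hedged sketch using the "Rules" form. It assumes that the export element names mirror the import elements "Header", "Sequence", and "Qualities"; the header names and data are invented:

```wolfram
(* element names are assumed to match the import elements listed for this format *)
ExportString[
  {"Header"    -> {"read1", "read2"},
   "Sequence"  -> {"GATTACA", "ACGTN"},
   "Qualities" -> {"IIIIIII", "FFFFF"}},
  {"FASTQ", "Rules"}]
```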


    Importing the previous output using the "Data" element gives raw headers and sequences:
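
    A self-contained sketch of this import. An invented FASTQ string stands in for the exported output:

```wolfram
(* made-up FASTQ text standing in for previously exported output *)
fastq = "@read1\nGATTACA\n+\nIIIIIII\n@read2\nACGTN\n+\nFFFFF\n";
ImportString[fastq, {"FASTQ", "Data"}]
```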


    Import as a list of rules:
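
    A sketch with the "LabeledData" element, using an invented FASTQ string as input:

```wolfram
(* made-up FASTQ text with two records *)
fastq = "@read1\nGATTACA\n+\nIIIIIII\n@read2\nACGTN\n+\nFFFFF\n";
ImportString[fastq, {"FASTQ", "LabeledData"}]
```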


    See Also

    "MOL"  "PDB"  "XYZ"  "FASTA"  "SFF"

    Introduced in 2012