FASTQ (.fastq, .fq)

Import and Export support all common variants of the FASTQ file format, including short-read sequencing data and long sequences.

Background & Context

- MIME type: chemical/seq-na-fastq
- FASTQ molecular biology format.
- Standard format for storing and exchanging DNA sequences with base qualities.
- Plain text format.
- Stores nucleic acid sequences and base qualities as character strings.
- Various conventions are in use to represent meta-information.

Import & Export

Import["file.fastq"] imports DNA sequences from a FASTQ file.
Export["file.fastq",expr] exports a sequence or a list of sequences to the FASTQ format.
Import["file.fastq"] returns a list of strings representing the sequences stored in the file.
Export["file.fastq",{seq,qual}] exports a character string representing a DNA sequence with base qualities to FASTQ.
Export["file.fastq",{{seq₁,seq₂,…},{qual₁,qual₂,…}}] exports multiple DNA sequences with base qualities.
Import["file.fastq",elem] imports the specified element from a FASTQ file.
Import["file.fastq",{{elem₁,elem₂,…}}] imports multiple elements.
The import format can be specified with Import["file","FASTQ"] or Import["file",{"FASTQ",elem,…}].
Export["file.fastq",expr,elem] creates a FASTQ file by treating expr as specifying element elem.
Export["file.fastq",{expr₁,expr₂,…},{{elem₁,elem₂,…}}] treats each exprᵢ as specifying the corresponding elemᵢ.
Export["file.fastq",expr,opt₁->val₁,…] exports expr with the specified option elements taken to have the specified values.
Export["file.fastq",{elem₁->expr₁,elem₂->expr₂,…},"Rules"] uses rules to specify the elements to be exported.
See the following reference pages for full general information:

	Import, Export	import from or export to a file
	CloudImport, CloudExport	import from or export to a cloud object
	ImportString, ExportString	import from or export to a string
	ImportByteArray, ExportByteArray	import from or export to a byte array
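
As a minimal sketch of the Import forms above, the following parses a FASTQ record held in a string rather than in a file. The record (read name, bases and quality characters) is made up for illustration, and ImportString is used only so that the example does not depend on a file on disk.

```wolfram
(* One hypothetical FASTQ record: header, sequence, separator, qualities *)
fastq = "@read1\nGATTACA\n+\nIIIIIII\n";

(* Default import, which uses the "Sequence" element *)
ImportString[fastq, "FASTQ"]

(* Format and element given explicitly *)
ImportString[fastq, {"FASTQ", "Qualities"}]
```

With a file on disk, the same format and element specifications can be passed to Import.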

Import Elements

General Import elements:

	"Elements"	list of elements and options available in this file
	"Summary"	summary of the file
	"Rules"	list of rules for all available elements

Data representation elements:

	"Header"	raw header lines
	"Sequence"	DNA sequences as a list of strings
	"Qualities"	base qualities as a list of strings

Import uses the "Sequence" element by default for the FASTQ format.
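
The elements listed so far can be requested one at a time or several at once. This sketch assumes a two-record FASTQ string invented for illustration. Results are not shown, since the exact form of the returned expressions depends on the importer.

```wolfram
(* Two hypothetical records; each quality string is as long as its sequence *)
fastq = "@read1\nGATTACA\n+\nIIIIIII\n@read2\nTTAGGC\n+\n######\n";

(* List the elements available for this data *)
ImportString[fastq, {"FASTQ", "Elements"}]

(* Raw header lines only *)
ImportString[fastq, {"FASTQ", "Header"}]

(* Sequences and base qualities in one call *)
ImportString[fastq, {"FASTQ", {"Sequence", "Qualities"}}]
```
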
Additional data elements:

	"Data"	"Header", "Sequence", and "Qualities" elements combined in a list
	"LabeledData"	list of rules for each sequence stored in the file

The Wolfram Language uses the standard IUB/IUPAC abbreviations for nucleic acids:

	A	adenine
	C	cytosine
	G	guanine
	T	thymine
	U	uracil
	R	purine (G or A)
	Y	pyrimidine (T or C)
	K	ketone (G or T)
	M	amino group (A or C)
	S	strong interaction (G or C)
	W	weak interaction (A or T)
	B	C or G or T
	D	A or G or T
	H	A or C or T
	V	A or C or G
	N	any nucleotide (A or C or G or T)
	-	gap of indeterminate length
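
The table above can serve as a quick sanity check on imported sequences. The following sketch tests whether a string uses only these codes. The helper name validSequenceQ is hypothetical and not part of the Wolfram Language, and the check assumes uppercase letters.

```wolfram
(* All single-letter codes from the table, plus the gap character *)
iupac = Alternatives @@ Characters["ACGTURYKMSWBDHVN-"];

(* Hypothetical helper: True if every character is one of the codes *)
validSequenceQ[seq_String] := StringMatchQ[seq, iupac ..]

validSequenceQ["GATTNACA"]   (* True *)
validSequenceQ["GATTXACA"]   (* False, since X is not in the table *)
```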

The Wolfram Language uses ASCII characters for the base qualities.
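
How the quality characters map to numeric scores is not specified on this page. A common convention, assumed in this sketch, is the Phred+33 encoding, in which the score is the character code minus 33. Some older files use an offset of 64 instead, so the offset should be checked against the source of the data.

```wolfram
(* A hypothetical quality string as returned by the "Qualities" element *)
qual = "II5#";

(* Assumed Phred+33 encoding: subtract 33 from each character code *)
scores = ToCharacterCode[qual] - 33   (* {40, 40, 20, 2} *)

(* Estimated per-base error probability, 10^(-Q/10) *)
N[10^(-scores/10)]
```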

Options

Advanced Export option:

	"LineWidth"	70	maximum number of characters in a line

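A minimal sketch of the "LineWidth" option, using ExportString so that the result can be inspected directly. The sequence is made up, and the output is not shown because the default header and quality characters added by Export are not documented on this page.

```wolfram
(* A hypothetical 35-base sequence *)
seq = StringRepeat["GATTACA", 5];

(* Limit lines to 10 characters instead of the default 70 *)
ExportString[seq, "FASTQ", "LineWidth" -> 10]

(* The same export with the default line width, for comparison *)
ExportString[seq, "FASTQ"]
```
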
Examples

Basic Examples (6)

This reads the raw header lines from a sample FASTQ file:

Read the DNA sequence:

Read the DNA sequence with qualities:

This converts a short sequence to the FASTQ format, automatically adding default header information:

This exports two sequences:

This exports a pair of headers and sequences:

Importing the previous output using the "Data" element gives raw headers and sequences:

Import as a list of rules:
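
The export and re-import steps described in these examples can be sketched as a round trip through a string. The headers and sequences are made up, and the sketch assumes that "Header" and "Sequence" are accepted as Export elements in the same way they are listed for Import. No output is shown.

```wolfram
(* Export a pair of hypothetical headers and sequences *)
out = ExportString[
   {{"read1", "read2"}, {"GATTACA", "TTAGGC"}},
   "FASTQ", {{"Header", "Sequence"}}];

(* Re-import the raw headers, sequences and qualities together *)
ImportString[out, {"FASTQ", "Data"}]

(* Re-import as a list of rules for each sequence *)
ImportString[out, {"FASTQ", "LabeledData"}]
```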
