Chemical & Biomolecular Formats

The Wolfram Language can import, and often export, standard formats used in chemistry, molecular biology, and bioinformatics, routinely handling a full range of molecular types as well as genome-sized datasets.
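Whether a given format can be exported as well as imported may depend on the installed version. As a minimal sketch, the built-in lists $ImportFormats and $ExportFormats can be checked for a few of the format names on this page. The selection of names and the variables formats and support are illustrative choices, not part of the original guide.

```wl
(* A few format names taken from the lists on this page *)
formats = {"XYZ", "MOL", "SDF", "PDB", "FASTA", "GenBank"};

(* For each format, record whether this installation can import and export it *)
support = Table[
   {f, MemberQ[$ImportFormats, f], MemberQ[$ExportFormats, f]},
   {f, formats}];

(* Display the result as a table: format, importable?, exportable? *)
Grid[Prepend[support, {"Format", "Import", "Export"}], Frame -> All]
```

The table reflects the local installation, so the entries may differ between versions.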

Chemical Formats

"XYZ" XYZ molecule geometry file (.xyz)

"MOL" MDL MOL format (.mol)

"MOL2" Tripos MOL2 format (.mol2)

"SDF" MDL SDF format (.sdf)

"SMILES" SMILES chemical format (.smi)

"HIN" HyperChem molecular data format (.hin)

"CML" Chemical Markup Language (.cml)

"CDX" ChemDraw Exchange format (.cdx)

"CDXML" ChemDraw Exchange XML format (.cdxml)

"Cube" Gaussian Cube file (.cub)

"FCHK" Formatted Checkpoint file (.fchk)

"GaussianLog" Gaussian log file (.log)

"JCAMP-DX" chemical spectroscopy format (.jdx, .dx, .jcm)

Bioinformatics Formats

"GenBank" NCBI GenBank sequence format (.gb, .gbk)

"FASTA" DNA, RNA, and amino acid sequence format (.fasta, .fa, .fsa, .mpfa)

"FASTQ" DNA and RNA sequence format with base qualities (.fastq, .fq)

"NEXUS" NEXUS phylogenetic data format (.nex, .ndk)

"AgilentMicroarray" microarray data format (.txt)

"Affymetrix" microarray data format (.cel, .cdf, .chp, .gin, .psi)

"SFF" DNA sequence flowgram format (.sff)

Molecular Biology Formats

"PDB" Protein Data Bank format (.pdb)

"MMCIF" MMCIF 3D molecular model format (.cif)

"FCS" flow cytometry data format (.fcs, .lmd)

Common Elements

"Molecule" a symbolic representation of the molecule model

"StructureDiagram" chemical structure diagram

"Graphics3D" 3D molecular graphics

"VertexCoordinates" 3D coordinates of atoms

"Sequence" base-pair or amino acid sequence

"Elements" all available elements
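The elements listed above are requested by name in the second argument of Import or ImportString. The following is a minimal sketch, assuming that the "XYZ" and "FASTA" importers accept in-memory strings through ImportString. The water geometry and the sequence record are made-up illustrative data, and the variable names are hypothetical.

```wl
(* Illustrative XYZ data for a water molecule, held in a string so no file is needed *)
xyz = "3\nwater\nO 0.000 0.000 0.117\nH 0.000 0.757 -0.469\nH 0.000 -0.757 -0.469\n";

(* Illustrative FASTA record with a single short DNA sequence *)
fasta = ">seq1 hypothetical example record\nACGTACGTGA\n";

(* Ask which elements the importer offers for this data *)
elements = ImportString[xyz, {"XYZ", "Elements"}]

(* Extract the atom coordinates only *)
coords = ImportString[xyz, {"XYZ", "VertexCoordinates"}]

(* Extract the sequence data from the FASTA record *)
seqs = ImportString[fasta, {"FASTA", "Sequence"}]
```

Not every format provides every element, so querying "Elements" first is a reasonable way to see what a particular file offers. For files on disk, the same element specification can be passed to Import with a file path in place of the string.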