This is documentation for Mathematica 8, which was
based on an earlier version of the Wolfram Language.

GenomeData

GenomeData["gene"]
gives the DNA sequence for the specified gene on the reference human genome.
GenomeData["gene", "property"]
gives the value of the specified property for the human gene gene.
GenomeData[{"chr", {n1, n2}}]
gives the sequence from positions n1 to n2 on chromosome chr in the reference human genome.
  • Genes are specified by their standard names, given as strings.
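  • Human chromosomes can be specified by standard names such as "Chromosome1" or "ChromosomeX", or by the integers 1 through 22 together with short string forms for the remaining chromosomes.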
  • GenomeData[{"chr", {n1, n2}}] gives the 5' to 3' sequence from positions n1 to n2 on the top strand of chromosome chr. Sequence positions are measured relative to the 5' end of the top strand.
  • GenomeData[{"chr", {-n1, -n2}}] gives the 5' to 3' sequence from positions n1 to n2 on the bottom strand of chromosome chr. Sequence positions are measured relative to the 5' end of the bottom strand.
  • Gene sequence properties include:
"FullSequence"          the full sequence for the gene
"FullSequencePosition"  start and end positions of the gene
"SequenceLength"        length of the gene in base pairs
  • Gene location properties include:
"Chromosome"   chromosome on which the gene is located
"LocusList"    locus for the gene as a list
"LocusString"  locus for the gene as a string
"Orientation"  forward (5' to 3') or reverse (3' to 5') as +1 or -1
  • Protein and transcription properties include:
"CodingSequenceLists"      lists of coding sequences for the gene
"CodingSequencePositions"  lists of region positions for each coding sequence
"CodingSequences"          concatenated coding sequences for the gene
"ExonSequences"            list of sequences of exons for the gene
"IntronSequences"          list of sequences of introns for the gene
"ProteinNames"             names of the proteins coded for by the gene
"UTRSequences"             list of sequences of untranslated terminal regions of the gene
  • Functional properties include:
"BiologicalProcesses"  biological processes associated with gene products
"CellularComponents"   cellular components in which gene products are found
"InteractingGenes"     genes interacting with this gene or its products
"MolecularFunctions"   molecular functions of gene products
  • Gene identification properties include:
"AlternateNames"            common synonyms
"GenBankIndices"            GenBank index number strings
"GeneID"                    GeneID number string
"GeneOntologyIDs"           Gene Ontology ID strings
"MIMNumbers"                Mendelian Inheritance in Man index number strings
"Name"                      common English name
"NCBIAccessions"            NCBI accession strings
"ProteinGenBankIndices"     GenBank index number strings for protein products
"ProteinNCBIAccessions"     NCBI accession strings for protein products
"StandardName"              standard Mathematica name
"TranscriptGenBankIndices"  GenBank index number strings for RNA products
"TranscriptNCBIAccessions"  NCBI accession strings for RNA products
"UniProtAccessions"         UniProt accession strings
  • Overall properties of chromosomes include:
"SequenceLength"        length of the chromosome in base pairs
"UnsequencedPositions"  start and end positions where the sequence is unknown
  • Properties related to lists of bands for chromosomes include:
"GBandLocusStrings"     names of G-band loci
"GBandScaledPositions"  scaled start and end positions of all G-bands
"GBandStainingCodes"    cytogenetic staining codes for all G-bands
"GBandStainingLevels"   relative staining levels for all G-bands
  • GenomeData["property", "ann"] gives various annotations associated with a property. Typical annotations include:
"Name"          common English names
"StandardName"  standard Mathematica names
"Units"         units in which values are given
Get the full DNA sequence of a gene on the human genome:
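The original input for this example is not preserved here. As a minimal sketch, assuming "BRCA1" is a recognized standard gene name and that the sequence is returned as a string, the lookup might read:

```wolfram
(* Full reference sequence of the gene; "BRCA1" is an assumed example name *)
seq = GenomeData["BRCA1"];

(* Abbreviated display of a long sequence *)
Short[seq]

(* Number of bases in the sequence *)
StringLength[seq]
```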
 
Get the DNA sequence for part of a chromosome:
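A hedged sketch of a range lookup follows. It assumes the chromosome is given by a name such as "Chromosome1" and the range as a pair of positions {n1, n2}; the specific positions are arbitrary and chosen only for illustration.

```wolfram
(* Bases 1000000 through 1000020 on the top strand of chromosome 1 *)
(* The chromosome name and both positions are illustrative assumptions *)
GenomeData[{"Chromosome1", {1000000, 1000020}}]
```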
 
Get a list of genes on a chromosome:
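The original input is not preserved here, and a dedicated gene class for each chromosome may exist. One possible approach that relies only on the "Chromosome" property is sketched below. It assumes that GenomeData[] returns the list of all gene names and that "ChromosomeY" is a valid chromosome name; the sample size of 500 is arbitrary.

```wolfram
(* All gene names known to GenomeData (assumed form of the call) *)
genes = GenomeData[];

(* Limit to a small sample so the sketch runs quickly; drop Take for a full scan *)
sample = Take[genes, 500];

(* Keep genes whose "Chromosome" property matches the assumed name "ChromosomeY" *)
Select[sample, GenomeData[#, "Chromosome"] === "ChromosomeY" &]
```

A full scan issues one property lookup per gene and can be slow.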
 
Get the Mathematica standard name of the chromosome where a gene resides:
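As a sketch, assuming "BRCA1" is a recognized gene name, the chromosome name can be retrieved and then reused as a chromosome specification:

```wolfram
(* Standard name of the chromosome that carries the gene *)
chr = GenomeData["BRCA1", "Chromosome"]

(* Reuse that name to look up a chromosome-level property *)
GenomeData[chr, "SequenceLength"]
```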
 
Get the chromosome position of a gene:
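A possible form of this lookup is sketched below. It assumes that "FullSequencePosition" returns a pair of start and end positions, and that "BRCA1" is a recognized gene name. The span computed from the pair is shown next to "SequenceLength" for comparison; whether the two agree is not verified here.

```wolfram
(* Assumed gene name, used only for illustration *)
gene = "BRCA1";

(* Start and end positions of the gene on its chromosome *)
pos = GenomeData[gene, "FullSequencePosition"]

(* Number of bases implied by the two positions *)
Abs[Last[pos] - First[pos]] + 1

(* Reported gene length, for comparison *)
GenomeData[gene, "SequenceLength"]
```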
Obtain a list of gene names:
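A minimal sketch, assuming that calling GenomeData with no arguments returns the list of all gene names:

```wolfram
(* List of all gene names (assumed form of the call) *)
genes = GenomeData[];

(* How many genes are available *)
Length[genes]

(* A few entries from the start of the list *)
Take[genes, 5]
```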
Find the English name of a gene:
Get a list of gene classes:
Get a list of genes involved in signal transduction:
Get a list of classes a gene belongs to:
Test whether a gene belongs to a class:
Get the DNA sequence of a gene:
Get the chromosome position of a gene:
Get the Mathematica standard name of the chromosome where a gene resides:
Get the orientation of the gene on the chromosome:
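The orientation can be read alongside the other location properties. The sketch below uses property names from the lists earlier on this page; the gene name "BRCA1" is an assumption chosen for illustration.

```wolfram
(* Assumed gene name, used only for illustration *)
gene = "BRCA1";

(* Location-related property names documented for genes *)
props = {"Chromosome", "LocusString", "Orientation", "SequenceLength"};

(* Pair each property name with its value for the gene *)
TableForm[{#, GenomeData[gene, #]} & /@ props]
```

An "Orientation" value of +1 indicates forward (5' to 3') and -1 indicates reverse (3' to 5').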
Get the DNA sequence for part of a chromosome on the bottom strand:
Get the positions of coding sequences for a gene:
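A hedged sketch, assuming "BRCA1" is a recognized gene name with at least one coding sequence:

```wolfram
(* Region positions for each coding sequence of the gene *)
cds = GenomeData["BRCA1", "CodingSequencePositions"];

(* Number of coding sequences reported *)
Length[cds]

(* Abbreviated view of the nested position lists *)
Short[cds]
```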
Make a log plot of the distribution of lengths of human chromosomes:
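One way such a plot might be built is sketched below. The chromosome names are constructed on the assumption that they follow the pattern "Chromosome1" through "Chromosome22", plus "ChromosomeX" and "ChromosomeY"; the plot styling is a choice made here, not the original's.

```wolfram
(* Assumed chromosome names: "Chromosome1" ... "Chromosome22", "ChromosomeX", "ChromosomeY" *)
chromosomes = Join[
   Table["Chromosome" <> ToString[i], {i, 22}],
   {"ChromosomeX", "ChromosomeY"}];

(* Length of each chromosome in base pairs *)
lengths = GenomeData[#, "SequenceLength"] & /@ chromosomes;

(* Lengths on a logarithmic vertical axis *)
ListLogPlot[lengths, Filling -> Axis,
 AxesLabel -> {"chromosome", "base pairs"}]
```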
Make a log rank plot of the lengths of genes for human chromosome 22:
Make a plot of average coding sequence length versus gene length:
Visualize a gene as an image:
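The original visualization is not preserved here. As one possible approach, each base can be mapped to an integer code and the codes laid out as a grid of colored cells. The gene name, the 4096-base cutoff, the row width of 64, and the color assignments are all assumptions made for this sketch.

```wolfram
(* Sequence of an assumed example gene *)
seq = GenomeData["BRCA1"];

(* Use at most the first 4096 bases *)
n = Min[4096, StringLength[seq]];

(* Map each base to an integer code; any other character becomes 0 *)
codes = Characters[StringTake[seq, n]] /.
   {"A" -> 1, "C" -> 2, "G" -> 3, "T" -> 4, _String -> 0};

(* Rows of 64 bases; an incomplete final row is dropped by Partition *)
ArrayPlot[Partition[codes, 64],
 ColorRules -> {0 -> Gray, 1 -> Red, 2 -> Green, 3 -> Blue, 4 -> Yellow}]
```

A sequence shorter than 64 bases yields no complete row, so the row width would need to be reduced in that case.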
Show the first 20 genes on chromosome 12:
Get a sequence from the top strand of chromosome 1:
Get the complementary sequence from the bottom strand:
Show that the bottom strand is complementary to the top strand:
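The relationship between the two strands can be illustrated without any genome data: reading the bottom strand 5' to 3' gives the reverse complement of the top strand. The sketch below uses a made-up fragment and a hypothetical helper, reverseComplement; neither comes from the original page.

```wolfram
(* Toy top-strand fragment written 5' to 3'; not real genome data *)
top = "ATGCCGTTA";

(* Hypothetical helper: reverse the string, then swap A<->T and C<->G *)
reverseComplement[s_String] :=
  StringReplace[StringReverse[s],
   {"A" -> "T", "T" -> "A", "C" -> "G", "G" -> "C"}];

(* Bottom strand read 5' to 3' *)
bottom = reverseComplement[top]
(* "TAACGGCAT" *)

(* Applying the operation twice recovers the top strand *)
reverseComplement[bottom] === top
(* True *)
```

The same helper could be applied to a top-strand sequence returned by GenomeData and compared with the corresponding bottom-strand sequence.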
Use GenomeData to extract the sequences found by GenomeLookup:
Find the 5 shortest genes in the human genome:
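A sketch of one possible approach follows, assuming that GenomeData[] returns all gene names and that "SequenceLength" is an integer when available. Genes with a non-integer value are skipped.

```wolfram
(* All gene names (assumed form of the call) *)
genes = GenomeData[];

(* Length of every gene; this issues one lookup per gene and can be slow *)
lengths = GenomeData[#, "SequenceLength"] & /@ genes;

(* Keep only {gene, length} pairs with an integer length *)
pairs = Select[Transpose[{genes, lengths}], IntegerQ[Last[#]] &];

(* The five pairs with the smallest lengths *)
Take[SortBy[pairs, Last], 5]
```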
New in 7