"ESMAtlas" (Service Connection)

Connect to the ESMAtlas API using the Wolfram Language to get information about protein structures predicted by ESMFold.

Connecting & Authenticating

ServiceConnect["ESMAtlas"] creates a connection to the ESMAtlas API. If a previously saved connection can be found, it will be used; otherwise, a new authentication request will be launched.

Requests

ServiceExecute["ESMAtlas","request",params] sends a request to the ESMAtlas API, using parameters params. The following requests are available.

BioMolecule Structures

Request:

"FoldSequence": generate 3D coordinates for a peptide sequence

Parameters:
  • "BioSequence" (default: None): sequence to be folded, either a BioSequence or a string

Request:

"PredictedStructure": get the predicted structure of a BioMolecule from the ESM Metagenomic Atlas database

Parameters:
  • "MGnifyID" (default: None): MGnifyID of the predicted structure

Properties of Predicted Structures

Request:

"StructureConfidencePrediction": get a dataset with values indicating the confidence behind the 3D embedding of the sequence

Parameters:
  • "MGnifyID" (default: None): MGnifyID of the predicted structure

Request:

"Sequence": get the BioSequence of a biomolecule in the ESM Metagenomic Atlas from its "MGnifyID"

Parameters:
  • "MGnifyID" (default: None): MGnifyID of the predicted structure

Request:

"SequenceEmbedding": get the 2560-dimensional embedding vector obtained by averaging the final-layer activations of the ESM2 model over the sequence length

Parameters:
  • "MGnifyID" (default: None): MGnifyID of the predicted structure
Examples

Basic Examples (5)

Create a new connection by launching an authentication dialog:
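
A minimal sketch of this step. The call is the one described under Connecting & Authenticating; the variable name esm is only an illustrative choice.

```wolfram
(* Open a connection to the service; a saved connection is reused if one exists *)
esm = ServiceConnect["ESMAtlas"]
```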

Fold a BioSequence of 400 residues or fewer and obtain the predicted BioMolecule structure:
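
One way this request might look, assuming parameters are passed as a list of rules. The peptide string is an arbitrary short sequence chosen for illustration, and seq and folded are hypothetical variable names.

```wolfram
(* Arbitrary short peptide, used only for illustration *)
seq = BioSequence["Peptide", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"];

(* Ask the service for predicted 3D coordinates of the sequence *)
folded = ServiceExecute["ESMAtlas", "FoldSequence", {"BioSequence" -> seq}]
```

Because the "BioSequence" parameter also accepts a string, the raw sequence string could be passed in place of seq.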

Visualize the structure of the BioMolecule:
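
A possible sketch, assuming the "FoldSequence" result is a BioMolecule that BioMoleculePlot3D can display. The sequence is again an arbitrary illustrative peptide.

```wolfram
(* Fold an illustrative peptide, passed here as a plain string *)
folded = ServiceExecute["ESMAtlas", "FoldSequence",
   {"BioSequence" -> "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"}];

(* Render the predicted structure in 3D *)
BioMoleculePlot3D[folded]
```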

Get the predicted structure of a protein from the ESM Metagenomic Atlas by providing its "MGnifyID":
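
A hedged sketch of the lookup. The identifier below is an example value assumed here for illustration and does not come from this page; substitute any valid MGnify protein ID.

```wolfram
(* Example identifier assumed for illustration; replace with a valid MGnify ID *)
id = "MGYP002537940442";

(* Retrieve the stored structure prediction for that entry *)
structure = ServiceExecute["ESMAtlas", "PredictedStructure", {"MGnifyID" -> id}]
```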

Visualize the structure obtained from the database:
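
This could be sketched as follows, assuming the "PredictedStructure" result is a BioMolecule. The identifier is an assumed example value, not one given on this page.

```wolfram
(* Example identifier assumed for illustration; replace with a valid MGnify ID *)
structure = ServiceExecute["ESMAtlas", "PredictedStructure",
   {"MGnifyID" -> "MGYP002537940442"}];

(* Render the database structure in 3D *)
BioMoleculePlot3D[structure]
```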

Get the prediction confidence values for a structure in the ESM Metagenomic Atlas:
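
A minimal sketch of the request. The identifier is an assumed example value and conf is a hypothetical variable name.

```wolfram
(* Example identifier assumed for illustration; replace with a valid MGnify ID *)
id = "MGYP002537940442";

(* Fetch the confidence measures recorded for this prediction *)
conf = ServiceExecute["ESMAtlas", "StructureConfidencePrediction", {"MGnifyID" -> id}]
```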

The confidence data has the following keys:

  • "PredictedAlignedError": a measure of how confident the prediction model is in the relative position of two residues within the predicted structure
  • "PredictedLocalDistanceDifferenceTest": a measure of local confidence, indicating how confident the model is at the level of an individual residue
  • "PredictedTemplateModelingScore": derived from the template modeling score, which measures the global accuracy of the predicted structure and is largely independent of local inaccuracies in the prediction
Visualize the confidence of the prediction:
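
One plausible way to do this is to plot the per-residue local confidence. The sketch assumes that the "PredictedLocalDistanceDifferenceTest" entry holds one score per residue and that the result can be indexed by key; neither detail is confirmed here, and the identifier is an assumed example value.

```wolfram
(* Example identifier assumed for illustration; replace with a valid MGnify ID *)
conf = ServiceExecute["ESMAtlas", "StructureConfidencePrediction",
   {"MGnifyID" -> "MGYP002537940442"}];

(* Normal removes a Dataset wrapper if one is present *)
plddt = Normal[conf["PredictedLocalDistanceDifferenceTest"]];

(* Plot the confidence score along the sequence *)
ListLinePlot[plddt, AxesLabel -> {"Residue", "pLDDT"}, PlotRange -> All]
```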

Visualize the predicted aligned error, which measures how confident the prediction model is in the relative position of two residues within the predicted structure. The error is given in angstroms; larger values indicate lower confidence:
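
A sketch of one way to draw this, assuming the "PredictedAlignedError" entry is a square residue-by-residue matrix. The identifier is an assumed example value.

```wolfram
(* Example identifier assumed for illustration; replace with a valid MGnify ID *)
conf = ServiceExecute["ESMAtlas", "StructureConfidencePrediction",
   {"MGnifyID" -> "MGYP002537940442"}];

(* Normal removes a Dataset wrapper if one is present *)
pae = Normal[conf["PredictedAlignedError"]];

(* Each cell shows the expected error for one pair of residues *)
MatrixPlot[pae, PlotLegends -> Automatic, FrameLabel -> {"Residue", "Residue"}]
```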

Get the BioSequence of a biomolecule from the ESM Metagenomic Atlas using the "MGnifyID" of the biomolecule:
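
A minimal sketch of the request. The identifier is an assumed example value, not one given on this page.

```wolfram
(* Example identifier assumed for illustration; replace with a valid MGnify ID *)
id = "MGYP002537940442";

(* Retrieve the amino acid sequence stored for this entry *)
sequence = ServiceExecute["ESMAtlas", "Sequence", {"MGnifyID" -> id}]
```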

Get the embedding vector for a given protein using its "MGnifyID". The vector is obtained by averaging the final-layer activations of the ESM2 model over the sequence length:
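
A hedged sketch of the request. The identifier is an assumed example value, and the Dimensions check assumes the result is returned as a plain numeric vector.

```wolfram
(* Example identifier assumed for illustration; replace with a valid MGnify ID *)
id = "MGYP002537940442";

(* Retrieve the mean-pooled ESM2 embedding for this entry *)
embedding = ServiceExecute["ESMAtlas", "SequenceEmbedding", {"MGnifyID" -> id}];

(* Inspect the size; the request description states 2560 dimensions *)
Dimensions[embedding]
```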