MBOX (.mbox)
- Import supports all common variants of the MBOX file format.
Background & Context
- MIME type: application/mbox
- Unix mailbox format.
- Holds a collection of email messages.
- Native archive format of email clients such as Unix mail, Thunderbird, and many others.
- Textual format with encoded binary data.
- Stores messages in EML format, concatenated with separator lines.
- Supports RFC 4155.
Import
- Import["file.mbox"] imports an MBOX file, returning a list of message summaries given as associations.
- Import["file.mbox"] returns an expression of the form {msg1,msg2,…}, where the msg_i are associations giving basic elements of individual mail messages.
- Import["file.mbox",elem] imports the specified element from an MBOX file.
- Import["file.mbox",{elem,suba,subb,…}] imports a subelement.
- Import["file.mbox",{{elem1,elem2,…}}] imports multiple elements.
- The import format can be specified with Import["file","MBOX"] or Import["file",{"MBOX",elem,…}].
- See the following reference pages for full general information:
  Import             import from a file
  CloudImport        import from a cloud object
  ImportString       import from a string
  ImportByteArray    import from a byte array
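The import forms above can be tried without a file on disk. The following is a minimal sketch that builds a tiny two-message mailbox as a string and reads it with ImportString. The addresses, dates, and message IDs are made up for illustration, and the exact output may vary with the Wolfram Language version.

```wolfram
(* Hypothetical two-message mailbox; each message begins with a "From " separator line *)
mbox = StringRiffle[{
    "From alice@example.com Thu Jan  1 10:00:00 2015",
    "From: Alice <alice@example.com>",
    "To: Bob <bob@example.com>",
    "Subject: Hello",
    "Date: Thu, 1 Jan 2015 10:00:00 +0000",
    "Message-ID: <1@example.com>",
    "",
    "Hi Bob.",
    "",
    "From bob@example.com Thu Jan  1 11:00:00 2015",
    "From: Bob <bob@example.com>",
    "To: Alice <alice@example.com>",
    "Subject: Re: Hello",
    "Date: Thu, 1 Jan 2015 11:00:00 +0000",
    "Message-ID: <2@example.com>",
    "In-Reply-To: <1@example.com>",
    "",
    "Hi Alice.",
    ""}, "\n"];

(* Default import: a list of message summaries, one association per message *)
ImportString[mbox, "MBOX"]

(* Import a single element *)
ImportString[mbox, {"MBOX", "Subject"}]

(* Import several elements at once *)
ImportString[mbox, {"MBOX", {"From", "Subject"}}]
```

For a mailbox stored on disk, the same element specifications can be passed to Import in place of ImportString.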
Import Elements
- General Import elements:
  "Elements"    list of elements and options available in this file
  "Summary"     summary of the file
  "Rules"       list of rules for all available elements
- Complete mailbox elements:
  "MessageSummaries"       list of associations giving basic elements for each message
  "MessageElements"        list of associations giving main elements for each message
  "FullMessageElements"    list of associations giving all available message elements
  "MessageCount"           number of messages appearing in the mailbox
- Import by default uses the "MessageSummaries" element.
- Summary elements:
  "From"               sender names and email addresses
  "ToList"             lists of recipient names and addresses
  "CcList"             lists of copied recipient names and addresses
  "BccList"            lists of blind-copied recipient names and addresses
  "OriginatingDate"    client dates and times from email headers
  "Subject"            subjects of the emails
  "BodyPreview"        list of short previews of message bodies
  "HasAttachments"     whether each message contains any attachments
  "MessageID"          message ID for each message
- "MessageSummaries" includes all summary elements.
- Additional message elements:
  "FromAddress"           sender raw email addresses
  "FromName"              sender full names
  "ToAddressList"         lists of recipient addresses
  "ToNameList"            lists of recipient full names
  "CcAddressList"         lists of copied recipient addresses
  "CcNameList"            lists of copied recipient full names
  "BccAddressList"        lists of blind-copied recipient addresses
  "BccNameList"           lists of blind-copied recipient full names
  "ReplyToList"           lists of reply-to names and addresses
  "ReplyToAddressList"    lists of reply-to addresses
  "ReplyToNameList"       lists of reply-to full names
  "Body"                  message bodies as strings
  "AttachmentList"        lists of processed attachments as expressions
- "MessageElements" includes all summary and message elements excluding "BodyPreview" and "HasAttachments".
- More detailed information for each email can be imported from the following categories.
- Message-body elements:
  "BodyPreview"       list of short previews of message bodies
  "Body"              message bodies as strings
  "NewBodyContent"    parts of the bodies that are not replies or forwards
  "QuotedContent"     parts of the bodies that are quoted
- Threading elements:
  "ThreadCount"                number of threads in the mailbox
  "ThreadGraph"                threads in the mailbox represented as a Graph
  "ThreadEmailCount"           number of emails in each thread
  "ThreadTimeInterval"         interval from the first to last email in each thread
  "ThreadDuration"             duration from first to last email in each thread
  "ThreadMessageIDList"        list of message IDs for all emails in each thread
  "ThreadFromList"             list of senders for each thread
  "ReferenceMessageIDGraph"    a Graph of connections to "reference" messages
- Message-routing elements:
  "Precedence"                declared mail precedences
  "ReturnPath"                declared return paths for the mail
  "ReturnReceiptRequested"    whether return receipts are requested
  "DeliveryChainHostnames"    lists of hostnames on mail delivery chains
  "DeliveryChainRecords"      lists of full records on mail delivery chains
- Mail-header elements:
  "Plaintext"                 complete raw email as a string
  "HeaderString"              complete email headers as a string
  "HeaderRules"               list of rules for all headers
  "CharacterEncoding"         character encoding for email content
  "ContentType"               MIME content type of email body
  "MIMEVersion"               version of the MIME standard
  "ReplyToMessageID"          lists of any IDs of messages to which each message replies
  "ReferenceMessageIDList"    ID of "reference" messages, typically on a thread
- Message-origination elements:
  "OriginatingMailClient"        types of originating mail clients
  "OriginatingIPAddress"         IP addresses of originating client machines
  "OriginatingHostname"          hostnames of originating client machines
  "OriginatingCountry"           geoIP-inferred originating countries
  "OriginatingDate"              client dates and times from email headers
  "OriginatingTimeZone"          client time zones based on email headers
  "ServerOriginatingDate"        dates and times on originating servers
  "ServerOriginatingTimeZone"    time zones of originating servers
- Attachment elements:
  "HasAttachments"           whether each message contains any attachments
  "AttachmentNames"          list of attachment names
  "AttachmentList"           lists of processed attachments as expressions
  "AttachmentSummaries"      lists of associations giving basic attachment elements
  "AttachmentData"           lists of associations giving raw encoded data and metadata
  "AttachmentDecodedData"    lists of associations giving raw decoded data and metadata
  "AttachmentDetails"        lists of associations giving attachment content and metadata
- The element "AttachmentDetails" gives, for each message, a list containing one association per attachment. The elements of this association are typically as follows:
  "Name"                  name assigned to the attachment
  "MIMEType"              MIME type of the content
  "Content"               imported content
  "ContentDisposition"    content disposition of the attachment
  "ModificationDate"      modification date recorded for the attachment
  "ByteCount"             number of bytes in the decoded content
- The element "AttachmentDecodedData" gives, for each message, a list containing one association per attachment. The elements of this association are typically as follows:
  "Name"                  name assigned to the attachment
  "MIMEType"              MIME type of the content
  "DecodedContent"        raw decoded content as a byte array
  "ContentDisposition"    content disposition of the attachment
  "ModificationDate"      modification date recorded for the attachment
  "ByteCount"             number of bytes in the raw decoded content
- The element "AttachmentData" gives, for each message, a list containing one association per attachment. The elements of this association are typically as follows:
  "Name"                       name assigned to the attachment
  "MIMEType"                   MIME type of the content
  "RawContent"                 raw encoded content as a string
  "ContentTransferEncoding"    content transfer encoding of "RawContent"
  "ContentDisposition"         content disposition of the attachment
  "ModificationDate"           modification date recorded for the attachment
  "ByteCount"                  number of bytes in the raw encoded content
- "AttachmentSummaries" includes "Name", "MIMEType" and the "ByteCount" of the decoded contents for each attachment.
- Subelements for partial data import for a message element elem can take message specifications in the form {elem,msgs}, where msgs can be any of the following:
  n                  nth email
  -n                 counts from the end
  messageid          specific email message ID
  {spec1,spec2,…}    a list of email indices or message IDs
- Subelements can also be given in the form {elem,msgs,keys} for "FullMessageElements", "MessageElements" and "MessageSummaries", where keys can be any element in the association.
- Subelements for accessing part of a thread element elem in the form of {elem,spec} can take the following specification spec:
  n            nth thread, based on the starting date
  messageid    the thread containing the specific message ID
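The message and thread subelement forms described in this section can be sketched as follows. This is an illustrative example only: the two-message mailbox is made up, the single-key form of keys is an assumption based on the {elem,msgs,keys} description, and no particular output is claimed.

```wolfram
(* Hypothetical mailbox holding one two-message thread *)
mbox = StringRiffle[{
    "From alice@example.com Thu Jan  1 10:00:00 2015",
    "From: Alice <alice@example.com>",
    "To: Bob <bob@example.com>",
    "Subject: Hello",
    "Date: Thu, 1 Jan 2015 10:00:00 +0000",
    "Message-ID: <1@example.com>",
    "",
    "Hi Bob.",
    "",
    "From bob@example.com Thu Jan  1 11:00:00 2015",
    "From: Bob <bob@example.com>",
    "To: Alice <alice@example.com>",
    "Subject: Re: Hello",
    "Date: Thu, 1 Jan 2015 11:00:00 +0000",
    "Message-ID: <2@example.com>",
    "In-Reply-To: <1@example.com>",
    "",
    "Hi Alice.",
    ""}, "\n"];

(* {elem, msgs}: subject of the first message, then of the last message *)
ImportString[mbox, {"MBOX", "Subject", 1}]
ImportString[mbox, {"MBOX", "Subject", -1}]

(* {elem, msgs} with a list of message indices *)
ImportString[mbox, {"MBOX", "From", {1, 2}}]

(* {elem, msgs, keys}: one key from the summary of the first message (assumed form) *)
ImportString[mbox, {"MBOX", "MessageSummaries", 1, "Subject"}]

(* Thread elements: number of threads, then the message IDs in the first thread *)
ImportString[mbox, {"MBOX", "ThreadCount"}]
ImportString[mbox, {"MBOX", "ThreadMessageIDList", 1}]
```

Restricting an import to specific messages or threads in this way avoids processing elements for the whole mailbox when only a few entries are needed.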
Options
- Import option:
  "AttachmentRules"    <||>    rules to control how to import attachments
- The setting for "AttachmentRules" is an association that can contain:
  fmt->None    import attachments of format fmt as None
  fmt->elem    import element elem when importing fmt attachments
  fmt->fun     use a pure function fun on the decoded byte array
- The format specification fmt can be any format supported by $ImportFormats or a MIME type.
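As a rough sketch of the three kinds of rule, the following combines them in one association. The file name "file.mbox" is a placeholder for a mailbox whose messages carry attachments, and the choice of formats ("PDF", "PNG", and the MIME type "text/csv") is an assumption made purely for illustration.

```wolfram
(* Placeholder path; substitute a mailbox that contains attachments *)
file = "file.mbox";

rules = <|
    "PDF" -> None,                (* import PDF attachments as None *)
    "PNG" -> "ImageSize",         (* import only the "ImageSize" element of PNG attachments *)
    "text/csv" -> (Length[#] &)   (* apply a pure function to the decoded byte array *)
    |>;

(* Processed attachments for each message, imported according to the rules *)
Import[file, "AttachmentList", "AttachmentRules" -> rules]
```

Attachments whose format has no matching rule are presumably imported with their default behavior.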
Examples
Basic Examples (3)
Scope (6)
Import Elements (62)
Data Representation (10)
"MessageSummaries" (2)
"MessageElements" (2)
"FullMessageElements" (1)
Content Parsing (2)
Threading Elements (8)
Mail Address Header Elements (19)
General Header Elements (4)
Advanced Header Elements (11)
"DeliveryChainRecords" (1)
Import the delivery chain record as an Association: