Developing an Import Converter

The Wolfram Language provides functions that allow users to write their own file format converters and integrate them with the Wolfram Language Import and Export framework. You can implement format converters and use Import to import data from arbitrary formats.

The Wolfram System also includes source code that illustrates how to implement and register format converters. These can be found in the folders $InstallationDirectory/SystemFiles/Formats/format, where format is one of the following: BDF, DIF, MTP, SMILES, SurferGrid, TGF, or TLE. The registration code is placed in the files Import.m or Export.m, and the converter implementations reside in the file Converter.m.

The interface between Import and low-level converter functions is specified by RegisterImport (under the ImportExport` context). In essence, ImportExport`RegisterImport tells the Import and Export framework which functions to call when importing specific elements of a file format.

The following terminology is used throughout this tutorial:

A low-level function takes a file or stream as input and returns a list of rules containing the imported data. There are two types of low-level functions: (1) the default importer, which is called by the framework when importing an element not explicitly registered; and (2) the conditional importer, which imports a specific element registered in the second argument of ImportExport`RegisterImport.

A post-importer or post-import function, registered in the third argument of ImportExport`RegisterImport, takes as input the output of the low-level functions.

There are several forms of ImportExport`RegisterImport, summarized below. The rest of this tutorial works through progressively more advanced examples that show each form in detail.

ImportExport`RegisterImport["format",defaultFunction]
register a single defaultFunction to be used by the Import framework as default importer when importing a file of the type "format"

ImportExport`RegisterImport["format",{"elem1":>conditionalFunction1,"elem2":>conditionalFunction2,…,defaultFunction}]
register multiple elements ("elem1", "elem2", …) and respective converter functions (conditionalFunction1, conditionalFunction2, …) to be used by the Import framework; also register defaultFunction to be used when an element requested does not match any registered elements

ImportExport`RegisterImport["format",{conditionalFuncs,defaultFunction},{"elem3":>postFunction3,"elem4":>postFunction4,…}]
register additional converter functions whose input is the output of one of the low-level functions

Default Importer

For example, suppose you have a file format containing three header lines followed by four columns of numbers.

Registration and Implementation of a New Format

One possible design is to import the header information and the numbers into two separate elements, one for the header and one for the data. This can be implemented using ImportExport`RegisterImport.

In this particular case, you are telling the Import and Export framework to call a single function, the default importer, when importing any element of the new format.

By default, the framework passes the file name to the low-level function, so the default importer takes as input the file name and a set of options. This function must return a list of rules in the form of {"elem1"->value1, "elem2"->value2, …}.
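
The registration and default importer described above might be sketched as follows. The format name "MyFormat1", the function name MyFormat1`MyFormat1Import, and the element names "Header" and "Data" are hypothetical choices made for this illustration, and the sample file is written to the temporary directory only so that the example is self-contained.

```wolfram
(* Register one default importer for the hypothetical format "MyFormat1". *)
ImportExport`RegisterImport["MyFormat1", MyFormat1`MyFormat1Import];

(* The default importer receives the file name (plus any options) and
   returns a list of rules, one rule per element. *)
MyFormat1`MyFormat1Import[filename_String, options___] :=
  Module[{strm, header, data},
    strm = OpenRead[filename];
    header = Table[Read[strm, String], {3}];     (* three header lines *)
    data = ReadList[strm, Table[Number, {4}]];   (* rows of four numbers *)
    Close[strm];
    {"Header" -> header, "Data" -> data}
  ];

(* Write a small sample file and import individual elements from it. *)
sampleFile1 = FileNameJoin[{$TemporaryDirectory, "sample.myformat1"}];
Export[sampleFile1, "line 1\nline 2\nline 3\n1 2 3 4\n5 6 7 8\n", "Text"];
Import[sampleFile1, {"MyFormat1", "Header"}]
Import[sampleFile1, {"MyFormat1", "Data"}]
```

Because only a default importer is registered, every element request runs the same function and the framework picks the matching rule from its result.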

Importing a File of the New Format

Import can now use the new format as a valid file format.

Conditional Raw Importers

When a format contains many elements, it may be useful and efficient to import specific elements with specific low-level functions. This can be achieved by giving a list of rules in the form of "elem"->func as the second argument of ImportExport`RegisterImport. The list, however, must end with the name of the default importer, which is called when importing elements that do not match any explicitly defined in the list.

Registration and Implementation of a New Format with Conditional Importers

The registration for the format "MyFormat2" tells the Import and Export framework how to import its files:
(1) use the conditional importer when importing the element it is registered for, and
(2) use the default importer for all other elements.

The low-level functions again have the same structure, taking a file name and (optionally) a list of options, and returning a list of rules in the form of {"elem1"->value1, "elem2"->value2, …}.
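
A minimal sketch of a registration with one conditional importer and a default importer is shown here. The function names MyFormat2`ImportHeader and MyFormat2`ImportData and the element names "Header" and "Data" are assumptions made for illustration. The rule is written with :>, which is also an assumption about the accepted rule form.

```wolfram
(* "Header" gets its own conditional importer; every other element
   falls through to the default importer listed last. *)
ImportExport`RegisterImport["MyFormat2",
  {"Header" :> MyFormat2`ImportHeader, MyFormat2`ImportData}];

(* Conditional importer: reads only the three header lines. *)
MyFormat2`ImportHeader[filename_String, options___] :=
  Module[{strm, header},
    strm = OpenRead[filename];
    header = Table[Read[strm, String], {3}];
    Close[strm];
    {"Header" -> header}
  ];

(* Default importer: skips the header and reads the numeric rows. *)
MyFormat2`ImportData[filename_String, options___] :=
  Module[{strm, data},
    strm = OpenRead[filename];
    Skip[strm, String, 3];
    data = ReadList[strm, Table[Number, {4}]];
    Close[strm];
    {"Data" -> data}
  ];

(* Sample file for trying the two importers. *)
sampleFile2 = FileNameJoin[{$TemporaryDirectory, "sample.myformat2"}];
Export[sampleFile2, "line 1\nline 2\nline 3\n1 2 3 4\n5 6 7 8\n", "Text"];
Import[sampleFile2, {"MyFormat2", "Header"}]   (* runs ImportHeader *)
Import[sampleFile2, {"MyFormat2", "Data"}]     (* runs ImportData *)
```

The benefit of this split is that a request for the header never reads the numeric rows, which matters when the data section is large.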

Import Using MyFormat2

The import elements of "MyFormat2" produce the same output as those of the previous format, but now two different functions are called for the two different elements.

Specifying Subelements

By default, the framework imports subelements using Part.

For files containing several large datasets, it may be efficient to directly import specific datasets. For example, you can directly import a dataset from a file with the "EDF" file format.

You can specify the import of subelements by registering a dedicated low-level function for them.

The output of the low-level function must match the form .

As before, the output of the other low-level functions must be a list of rules in the form of {"elem1"->value1, "elem2"->value2, …}.

The import of string subelements now calls the appropriate low-level function.

Post-Importers

You may have to build elements based on other elements. For example, if the data to be imported is a list of numbers representing a grayscale image, then importing the image element requires first importing the numeric data element. This section gives two examples of such derived elements.

The post-importer takes as input the output of the conditional importer when a matching element name exists; otherwise, the post-importer takes as input the output of the default importer.

Unlike conditional and default importers, the post-importer simply returns the value of the element.

Registration and Implementation of a New Format with Post-Importers

To illustrate the differences between a conditional importer and a post-importer, the format from the previous section is extended with two additional elements. The first element is imported via a conditional importer. The second element, however, is imported via a post-importer.

The registration for the format "MyFormat3" tells the Import and Export framework how to import its files:
(1) for the elements that have conditional importers, call the corresponding conditional importers,
(2) for the element that has a post-importer, call the default importer first, and use its output as input for the post-importer, and
(3) for all other elements, call the default importer.

The conditional and default importers have the same structure as before.

Notice that the conditional importer for the derived element has to explicitly call the default importer and extract the data manually.

Since the element handled by the post-importer has no conditional importer registered under its name, the post-importer takes as input the output of the default importer.
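
The contrast between the two kinds of derived element can be sketched as follows. All function names and the element names "Data", "DataMax", and "ColumnTotals" are hypothetical. The post-importer signature (the rule list followed by options) is an assumption based on the description above, not a confirmed interface.

```wolfram
(* One conditional importer, a default importer, and one post-importer. *)
ImportExport`RegisterImport["MyFormat3",
  {"DataMax" :> MyFormat3`ImportDataMax, MyFormat3`ImportData},
  {"ColumnTotals" :> MyFormat3`PostColumnTotals}];

(* Default importer: skips three header lines, reads rows of four numbers. *)
MyFormat3`ImportData[filename_String, options___] :=
  Module[{strm, data},
    strm = OpenRead[filename];
    Skip[strm, String, 3];
    data = ReadList[strm, Table[Number, {4}]];
    Close[strm];
    {"Data" -> data}
  ];

(* Conditional importer for a derived element: it calls the default
   importer itself, extracts the data, and returns a list of rules. *)
MyFormat3`ImportDataMax[filename_String, options___] :=
  {"DataMax" -> Max["Data" /. MyFormat3`ImportData[filename, options]]};

(* Post-importer: receives the rules produced by the default importer
   and returns the element value itself rather than a rule. *)
MyFormat3`PostColumnTotals[rules_, options___] := Total["Data" /. rules];

sampleFile3 = FileNameJoin[{$TemporaryDirectory, "sample.myformat3"}];
Export[sampleFile3, "line 1\nline 2\nline 3\n1 2 3 4\n5 6 7 8\n", "Text"];
Import[sampleFile3, {"MyFormat3", "DataMax"}]
Import[sampleFile3, {"MyFormat3", "ColumnTotals"}]
```

The post-importer is the shorter of the two because the framework supplies the default importer's output, whereas the conditional importer has to do that work itself.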

Import Using MyFormat3

From a user perspective, there is no difference between an element implemented using a post-import function or low-level function.

The first of the two new elements is registered as a conditional importer.
The import of the second new element calls the post-importer.

Options to RegisterImport

ImportExport`RegisterImport has several options that control how the framework calls the low-level functions and which elements it exposes.

"FunctionChannels" and "BinaryFormat"

In the examples above, the low-level functions accept a file name as an argument, and the functions open a stream to the file. The framework can instead pass an InputStream directly to the low-level functions if you specify the "FunctionChannels" option to ImportExport`RegisterImport.

By specifying the option "BinaryFormat"->True, the framework passes a binary stream to the low-level importer.

By default, "FunctionChannels" is set so that the framework passes file names. The default value of "BinaryFormat" is False.

Example

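For a format registered with "FunctionChannels" set to pass streams and "BinaryFormat" left at its default, the low-level function eFunc takes an InputStream (plus options) as input, and the framework passes a (non-binary) stream to eFunc.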
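
A stream-based registration might look like the sketch below. The option value "FunctionChannels"->{"Streams"} is stated here from memory of the framework and should be treated as an assumption, as should passing options directly after the second argument. The format, function, and element names are hypothetical.

```wolfram
(* Ask the framework to open the file and pass an InputStream. *)
ImportExport`RegisterImport["MyStreamFormat", MyStreamFormat`ImportLines,
  "FunctionChannels" -> {"Streams"}];

(* The importer reads from the stream it is given; it does not open
   or close the file itself. *)
MyStreamFormat`ImportLines[strm_InputStream, options___] :=
  {"Lines" -> ReadList[strm, String]};

sampleFile4 = FileNameJoin[{$TemporaryDirectory, "sample.mystream"}];
Export[sampleFile4, "a\nb\nc\n", "Text"];
Import[sampleFile4, {"MyStreamFormat", "Lines"}]
```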

"AvailableElements"

By default, when importing an element not explicitly registered as a conditional importer or a post-importer, the framework evaluates the default importer. If no matching element is found in the default importer, the framework generates an error message and returns $Failed.

If you specify the option "AvailableElements" and then attempt to import an element not present in the specified list, the framework directly returns $Failed and generates an error message without calling any low-level importer.

Example

For a format registered with an "AvailableElements" list that does not include "foo", when you call Import[filename,{"format","foo"}], the framework will return $Failed without evaluating the default importer eDefaultFunc.

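Note that it is an error to specify an "AvailableElements" list that omits an element registered with a conditional importer. In this case, Import[filename,{"format","elem2"}] will return $Failed because "elem2" is not in the list specified by "AvailableElements".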
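
As an illustration of the behavior described in this section, the following sketch registers a hypothetical format with an explicit element list. The format, function, and element names are invented for the example, and passing the option directly after the second argument is an assumption about the calling form.

```wolfram
(* Only "Lines" and "LineCount" are declared as importable elements. *)
ImportExport`RegisterImport["MyListedFormat", MyListedFormat`ImportAll,
  "AvailableElements" -> {"Lines", "LineCount"}];

MyListedFormat`ImportAll[filename_String, options___] :=
  With[{lines = ReadList[filename, String]},
    {"Lines" -> lines, "LineCount" -> Length[lines]}];

sampleFile5 = FileNameJoin[{$TemporaryDirectory, "sample.mylisted"}];
Export[sampleFile5, "a\nb\nc\n", "Text"];

Import[sampleFile5, {"MyListedFormat", "LineCount"}]   (* a listed element *)

(* "Foo" is not in the list, so, per the description above, the
   framework is expected to fail without calling ImportAll. *)
Import[sampleFile5, {"MyListedFormat", "Foo"}]
```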

"DefaultElement"

If you specify "DefaultElement"->elem, where elem is the name of an element, the framework imports elem when no Import element is specified.
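
A small sketch of this option, using hypothetical format, function, and element names, and again assuming that options may follow the second argument directly:

```wolfram
(* "Lines" is returned when Import is called without an element. *)
ImportExport`RegisterImport["MyDefaultElemFormat", MyDefaultElemFormat`ImportAll,
  "DefaultElement" -> "Lines"];

MyDefaultElemFormat`ImportAll[filename_String, options___] :=
  With[{lines = ReadList[filename, String]},
    {"Lines" -> lines, "LineCount" -> Length[lines]}];

sampleFile6 = FileNameJoin[{$TemporaryDirectory, "sample.mydefault"}];
Export[sampleFile6, "a\nb\nc\n", "Text"];

(* No element is given, so the default element is imported. *)
Import[sampleFile6, "MyDefaultElemFormat"]
```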

"Sources"

The option "Sources" can be used to specify file paths to .m, .mx, or Wolfram Symbolic Transfer Protocol (WSTP) .exe files that contain definitions of the low-level functions. The framework will automatically use Get or Install appropriately for the source files.