Foreign Function Interface

Introduction

The foreign function interface (FFI) allows the Wolfram Language to call functions exported from external libraries. The libraries need to be dynamic shared libraries but do not need modification to add a special interface layer. If a library can be used from another program, such as one written in C, it should be callable with FFI. It is also fast to set up, since no compiled code is created.

Some of the features of the Wolfram Language FFI include working with a range of types (both atomic and composite), supporting the use of callbacks, working with very low-level structures and integrating with an automatic memory management system.

A key function is ForeignFunctionLoad, which is used here to load the function "addone" from the example library "compilerDemoBase", giving argument and result types. The result is a ForeignFunction object:

The function can be called in the normal way for a Wolfram Language function:

When this function is called, the input, which is an expression and comes from the Wolfram interpreter, is converted into the native C type. This form of conversion is a key part of the Wolfram Compiler.

The Wolfram Language functions that work with libraries using FFI can take the library as a relative or an absolute path. Also, the platform-specific extension can be left out.

Libraries

A useful function for working with libraries is FindLibrary, which will return the absolute pathname, looking in paclets and other library paths. It also takes care of adding the appropriate extension for your platform:

The compilerDemoBase library is included in the Wolfram Language distribution. You can see its source as follows:

This shows the addone function, along with a number of others that will be used for examples in this tutorial:

The source includes a header file "WolframLibrary.h" only to get a definition for the macro DLLEXPORT. The macro specifies that the function is to be exported from the library; the details of how this is done are different on different platforms. This is the only reason this header file is included.

You can also load existing libraries. Alternatively, you can create your own libraries from within the Wolfram Language. This is quite useful for learning about using FFI and will be used in other examples in this tutorial.

Function Arguments and Results

Arguments to and results from FFI functions are determined by specifying their types, with support for a range of different types. There is a list on the function page for ForeignFunctionLoad. Some types are atomic types such as "CInt" or "CDouble" that map onto standard definitions for the platform. They are also supported in the Wolfram Compiler. This section will show how to work with different types.

Pointers

Pointers hold the addresses of blocks of memory and can be passed into functions, perhaps so that the function can store an individual value; they are supported by the FFI functions. One example is the addonePointer function from the compilerDemoBase library; the C source is shown below.

int addonePointer (int in, int* out) {
    *out = in + 1;
    return 0;
}

This function can be called with FFI using ForeignFunctionLoad as shown below. The pointer argument is given a "RawPointer" type wrapping the type of the thing that the pointer points to, in this case "CInt". Again, this form of type specification is very much part of the Wolfram Compiler:

Of course, this creates the function, but to call it you need to be able to make a Wolfram expression that works for the pointer. This can be done with RawMemoryAllocate:

The result from RawMemoryAllocate is actually a ManagedObject that contains a RawPointer. Managing something allows memory to be collected when it is no longer used and is discussed in detail in a later section. Typically, you can use a ManagedObject in places that expect the thing that it is managing.

This calls the function. The return value is 0, as expected:

To see the value stored in the pointer, you can use RawMemoryRead:

So this has shown how to load a function that stores a value in a pointer. A suitable pointer was created, the function was called and the stored value was retrieved.
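For comparison, the same allocate, call and read sequence can be sketched in plain C. This is only an illustration: the helper name callAddonePointer is hypothetical and not part of compilerDemoBase, and malloc and free here only roughly play the role that RawMemoryAllocate and its managed object play on the Wolfram Language side.

```c
#include <stdlib.h>

/* Same body as the library function shown above. */
int addonePointer(int in, int* out) {
    *out = in + 1;
    return 0;
}

/* Hypothetical helper: allocate storage for one int, let the
   function fill it in, read the value back, then free the storage.
   Returns -1 if the allocation fails. */
int callAddonePointer(int in) {
    int* cell = (int*)malloc(sizeof(int));   /* allocate the pointer */
    if (cell == NULL)
        return -1;
    addonePointer(in, cell);                 /* the call that stores a value */
    int value = *cell;                       /* read the stored value */
    free(cell);                              /* release the storage */
    return value;
}
```

The explicit free is the step that a managed object is meant to take care of automatically.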

The result of RawMemoryAllocate is actually a "RawPointer" wrapped in a "ManagedObject". This gives a memory management system to raw blocks of memory. The details of this are discussed in a later section.

One important type of pointer is "OpaqueRawPointer". This is equivalent to the C type void*, a pointer that does not really point to anything specific.

Arrays

Arrays are another form of memory usage, a block of memory that can store one or more elements of the same type; they are also supported by the FFI functions. One example is used in the populateArray function from the compilerDemoBase library as shown below.

int populateArray(long* arr, long len) {
    for(int i = 0; i < len; i++)
        arr[i] = i*i;
    return 0;
}

This function can be called with FFI with a suitable call to ForeignFunctionLoad. The array is given a "RawPointer" type with the type of the thing that the array holds, in this case "CLong":

Of course, this creates the function, but to call it you need to be able to make a Wolfram expression that works for the pointer. This can be done with RawMemoryAllocate:

Now you can call the function; the return value is 0 as expected:

The entire contents of the array can be read into a list with RawMemoryImport:

The value at the start of the array can be read with RawMemoryRead:

RawMemoryRead can also take an offset from the start of the array (note that this is an offset, not a part number):

RawMemoryImport can return a NumericArray; this is useful if you want to preserve the type of the data:

RawMemoryImport can return a ByteArray. This will read raw bytes from memory, so if your data elements are larger than one byte, you will have to adjust the length. In this case each integer has 8 bytes:

When you look at the actual bytes, you can see how they are stored to make up the larger data element (the byte order is known as the endianness). Working with raw memory can start to descend into low-level details such as this:
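The byte layout can also be inspected from C. The following is a minimal sketch with hypothetical helper names (bytesOfInt64 and isLittleEndian are not part of any library mentioned here); it copies the bytes of a 64-bit integer in memory order and reports which end holds the least significant byte.

```c
#include <stdint.h>
#include <string.h>

/* Copy the 8 bytes of a 64-bit integer into out, in memory order. */
void bytesOfInt64(int64_t value, unsigned char out[8]) {
    memcpy(out, &value, sizeof value);
}

/* Returns 1 if the least significant byte is stored first
   (little-endian), 0 otherwise. */
int isLittleEndian(void) {
    unsigned char bytes[8];
    bytesOfInt64(1, bytes);
    return bytes[0] == 1;
}
```

On a little-endian machine the value 4 appears as the byte 4 followed by seven zero bytes; on a big-endian machine the 4 comes last.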

RawMemoryImport and RawMemoryExport are covered in more detail in a later section.

Strings

Strings are represented in the C language as arrays of bytes with a zero byte at the end. A C program that creates and returns a string is shown in the following.

Note that this example creates a library in the C language, but any other language that is compatible with C could be used to make the library.

This program can be compiled into a library with the Wolfram Language. (Note that this requires the installation of a C compiler on your machine.):

This loads the function using ForeignFunctionLoad. The return type is an array of "UnsignedInteger8":

The function is called. The result is a raw block of memory; it is unmanaged, so the memory will be lost if it is not freed somehow:

The raw data can be imported into a string. Since it is null-terminated, there is no need to give a length:

RawMemoryImport with the "String" format is useful because it figures out the length. But the length could be given:
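As an illustration of the kind of function this section describes, here is a minimal C sketch that returns a zero-terminated string. The name createString and the message are assumptions made for this example, not the tutorial's original source. The memory comes from malloc, so the caller is responsible for freeing it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: return a heap-allocated, zero-terminated copy
   of a fixed message. The caller owns the memory and must free it.
   Returns NULL if the allocation fails. */
char* createString(void) {
    const char* msg = "hello";
    char* out = (char*)malloc(strlen(msg) + 1);   /* +1 for the zero byte */
    if (out != NULL)
        strcpy(out, msg);
    return out;
}
```

Because the data ends with a zero byte, a reader can find the length by scanning for that byte, which is what strlen does in C.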

Structs

Structs are data types typically introduced to hold several related data elements. They are supported in most compiled languages such as the C language. Example code that returns a pointer to a struct is shown below:

This program can be compiled into a library with the Wolfram Language. (Note that this requires the installation of a C compiler on your machine.):

This loads the function using ForeignFunctionLoad. The return is a pointer to a struct that contains a string and a "CInt". The struct is formed by placing the types of the fields inside { }. There is no need to give names to the fields:

The function is called. The result is a raw block of memory; it is unmanaged, so the memory will be lost if it is not freed (this is discussed in a later section):

Data can be imported with RawMemoryRead. It returns the fields of the struct in a list:

The string that was stored in the struct can be read:
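A function of the kind described in this section might look like the following C sketch. The struct name Record, its field names and the function createRecord are hypothetical; the only property taken from the text is that the struct holds a string and an int, in that order, and that a pointer to it is returned.

```c
#include <stdlib.h>

/* Hypothetical struct with a string field and an int field. */
typedef struct {
    const char* name;
    int value;
} Record;

/* Return a pointer to a heap-allocated struct; the caller must free it.
   Returns NULL if the allocation fails. */
Record* createRecord(void) {
    Record* rec = (Record*)malloc(sizeof(Record));
    if (rec != NULL) {
        rec->name = "example";   /* string literal with static storage */
        rec->value = 42;
    }
    return rec;
}
```

Note that the string field is itself a pointer, which is why reading the struct and reading the string are separate steps.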

Structs as Arguments

Structs can be passed as arguments. In this example, a struct is passed in (note that this is not a pointer to a struct):

Compile the library:

Load the function; the argument is a struct of two fields both with "CInt" type:

When calling the function, the struct is made by placing the values of its fields into a list {args}:
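On the C side, passing a struct by value means the function receives its own copy of the fields. A minimal sketch, with the hypothetical names IntPair and sumPair, of a function that takes a struct of two int fields:

```c
/* Hypothetical struct of two int fields, passed by value. */
typedef struct {
    int first;
    int second;
} IntPair;

/* The function works on its own copy of the struct. */
int sumPair(IntPair p) {
    return p.first + p.second;
}
```

The order of the values in the list used for the call corresponds to the order of the fields in the struct.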

Structs Nested in Other Structs

A struct can be nested inside of another struct:

Compile the library:

Load the function; the argument is a struct of two fields both with "CInt" type:

Call the function; the result is a pointer to a struct:

Import the fields of the struct that the pointer refers to with RawMemoryRead. There are two fields, a "CInt" and a pointer to another struct:

This reads the contents of the inner struct:
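The shape described here, an outer struct holding an int and a pointer to an inner struct, can be sketched in C as follows. All names (Inner, Outer, createOuter, freeOuter) are hypothetical and chosen for this example only.

```c
#include <stdlib.h>

/* Hypothetical inner struct with two int fields. */
typedef struct {
    int x;
    int y;
} Inner;

/* Hypothetical outer struct: an int and a pointer to an Inner. */
typedef struct {
    int id;
    Inner* inner;
} Outer;

/* Build an Outer on the heap that points at a heap copy of values.
   Returns NULL if either allocation fails. */
Outer* createOuter(int id, Inner values) {
    Outer* outer = (Outer*)malloc(sizeof(Outer));
    Inner* inner = (Inner*)malloc(sizeof(Inner));
    if (outer == NULL || inner == NULL) {
        free(outer);
        free(inner);
        return NULL;
    }
    *inner = values;
    outer->id = id;
    outer->inner = inner;
    return outer;
}

/* Free the inner struct first, then the outer one. */
void freeOuter(Outer* outer) {
    if (outer != NULL) {
        free(outer->inner);
        free(outer);
    }
}
```

Reading the outer struct yields the int and an address; following that address is a second read, which matches the two-step reading described above.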

Callback Functions

Callback functions allow external libraries to execute code passed from the environment that invoked them. An example is provided with the createArray function from the compilerDemoBase library as shown below.

long* createArray(long (*fun)(long)) {
    long* out = (long*)malloc(sizeof(long) * 10);
    for(long i = 0; i < 10; i++) {
        out[i] = fun(i);
    }
    return out;
}

This function can be called with FFI with a suitable call to ForeignFunctionLoad as shown below. The callback is passed with an "OpaqueRawPointer" type, which means that many things can be passed in. But if something that is not a proper function is passed in, then the call is likely to crash or behave unpredictably:

A callback that calls the evaluator can be created easily with CreateForeignCallback:

This shows the data stored in the array:
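To see what the library does with the callback, here is a C-only sketch that passes an ordinary C function to createArray. The callback square and the helper sumOfCallbackValues are hypothetical, and the copy of createArray has an allocation check added that the library excerpt above does not have.

```c
#include <stdlib.h>

/* Same logic as the library function shown above, with an
   allocation check added for this sketch. */
long* createArray(long (*fun)(long)) {
    long* out = (long*)malloc(sizeof(long) * 10);
    if (out == NULL)
        return NULL;
    for (long i = 0; i < 10; i++) {
        out[i] = fun(i);
    }
    return out;
}

/* Hypothetical callback with the signature createArray expects. */
long square(long x) {
    return x * x;
}

/* Sum the ten values produced by the callback, then free the array.
   Returns -1 if the array could not be allocated. */
long sumOfCallbackValues(long (*fun)(long)) {
    long* arr = createArray(fun);
    if (arr == NULL)
        return -1;
    long total = 0;
    for (int i = 0; i < 10; i++)
        total += arr[i];
    free(arr);
    return total;
}
```

The callback is called once for each index from 0 to 9, so with square the array holds the squares of those indices.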

Raw Data

This section discusses how raw data passed between foreign functions and the Wolfram Language is handled. Key issues are how data is created in the Wolfram Language, how it is written to and read from, and how it is managed, including how it is freed.

Foreign Function Allocation

The createArray function from the compilerDemoBase library creates data by using the C memory allocator malloc as shown below. When this function is called, the allocation is done and the data is returned to the Wolfram Evaluator.

long* createArray(long (*fun)(long)) {
    long* out = (long*)malloc(sizeof(long) * 10);
    for(long i = 0; i < 10; i++) {
        out[i] = fun(i);
    }
    return out;
}

This memory can be freed with a call to the C memory function free. This could be done with an FFI function or with a Wolfram Compiler function. In order to have this done automatically, it could be wrapped in a ManagedObject as shown below.

Wolfram Language Allocation

Data can be created in the Wolfram Language to pass to FFI functions. RawMemoryAllocate creates uninitialized data and returns a managed raw pointer. Uninitialized data can be passed to an FFI function to be initialized there or it can be written to by RawMemoryWrite.

The following allocates a pointer to a 64-bit integer:

The following writes into the pointer:

Information about the pointer can be seen. More details about the nature of pointers are covered later:

RawMemoryAllocate can also allocate an array:

To fill in the elements of the array, you need a loop that uses an offset to write to each location. Remember that the offsets start at 0:

RawMemoryRead can read the elements; again, this has to be done in a loop with offsets starting at 0:

It is important to understand the difference between an array and a pointer to an array. In this example, a pointer to a raw pointer of "UnsignedInteger8" (which is a C string) is created:

To write to this, it needs a raw pointer of "UnsignedInteger8", which can be created with RawMemoryExport:

Now the pointer can be written to, using the pointer that holds the string:

RawMemoryRead reads the value that is stored but does not interpret it more deeply. So if you read from a pointer to a pointer, you get a pointer back:
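The distinction between a pointer and the data it refers to is the same one found in C. In the following sketch (the function names storeString and readCell are hypothetical), a cell of type pointer-to-pointer stores the address of a string, and reading the cell gives back that address rather than the characters.

```c
#include <stddef.h>

/* Store the address of a string in a pointer-to-pointer cell. */
void storeString(const char** cell, const char* str) {
    *cell = str;
}

/* Reading the cell gives back the stored pointer, not the characters.
   A second dereference is needed to reach the string data. */
const char* readCell(const char** cell) {
    return *cell;
}
```

Getting from the returned pointer to the characters is a separate step, just as a further import is needed after reading a pointer to a pointer.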

Pointers to structs can be created:

Write data into a pointer to a struct:

The data can be read from the struct:

The details of how the result of RawMemoryAllocate is freed are shown later.

To summarize, RawMemoryAllocate can make pointers and arrays. RawMemoryWrite can fill them in, but only one element at a time. RawMemoryRead reads from pointers and arrays, but only one element at a time.

Exporting and Importing

RawMemoryExport and RawMemoryImport provide higher-level functions for working with raw pointers and arrays than the allocate, read and write functions.

The following creates a pointer to an "Integer16" initialized to the value 10:

If a list is used, an array is created initialized with the appropriate values:

This data can be imported into a list with RawMemoryImport:

It can also be imported into a NumericArray; this preserves the type of the data in the result:

It can also be imported into a ByteArray, which returns the result as bytes:

Since the array being imported was byte-sized, the result has the same length as the original data:

Here the array being imported has elements that contain two bytes, and this is reflected in the output byte array:

RawMemoryExport and RawMemoryImport are more convenient for working with array data.

Strings

Certain expressions, such as strings, have automatic conversions, so for a string, no type argument is needed (this is also the case for NumericArray and ByteArray):

When a string is imported, there is no need to specify a length:

If a length is specified, the string may be truncated:

The actual bytes stored in the string can be imported:

Strings also have issues concerning character encoding. This string has some non-ASCII characters:

They can be read in normally:

But if the actual bytes are examined, the default UTF8 encoding is shown:

The character encoding can be changed. Here ISOLatin1 is used. Now there are only 3 bytes stored:

This string cannot be imported with the default encoding without encountering errors:

If the correct character encoding is set, there is no error:

The default encoding of UTF8 is very common, so typically you do not need to think about these encoding issues.
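The difference in byte counts between the two encodings can be illustrated with a small C sketch. The functions utf8Encode and latin1Encode are hypothetical helpers written for this example, and the UTF-8 encoder only handles code points below U+0800. A character such as é (U+00E9) takes two bytes in UTF-8 but one byte in ISO Latin-1.

```c
#include <stddef.h>

/* Encode one code point below U+0800 as UTF-8.
   Returns the number of bytes written to out (1 or 2), or 0 if the
   code point is outside the range this sketch handles. */
size_t utf8Encode(unsigned int cp, unsigned char out[2]) {
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));     /* lead byte */
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));   /* continuation */
        return 2;
    }
    return 0;
}

/* Encode one code point as ISO Latin-1: one byte for U+0000..U+00FF.
   Returns 1 on success, or 0 if the code point does not fit. */
size_t latin1Encode(unsigned int cp, unsigned char out[1]) {
    if (cp > 0xFF)
        return 0;
    out[0] = (unsigned char)cp;
    return 1;
}
```

The single Latin-1 byte 0xE9 is not valid UTF-8 on its own, since in UTF-8 it is a lead byte that must be followed by continuation bytes. This is one way that reading Latin-1 data with the default encoding can produce errors.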

Another issue for strings is zero termination if the string actually contains a zero byte. Here there is a string with one character, the zero byte. When it is imported, the zero byte's contents are interpreted as a zero byte termination:

You can see both zero bytes:

If the length of the string is specified, it can be read correctly:

Strings passed to programs that use C strings cannot contain zero byte contents, so this is really a fringe issue.

Structs

You can work with structs using RawMemoryExport. The following creates a pointer to a struct. Note the use of two levels of list. One is necessary to form the struct from its arguments and the other to specify that this is one element in an array:

You can read the data with RawMemoryImport:

If you want your struct to contain other raw data, such as a string, you need to fill this in explicitly:

On reading with RawMemoryImport, the data comes back in a raw format and is not interpreted more deeply. To do this, a separate step would be needed:

It is hard for RawMemoryImport to interpret data more deeply because the data may contain cycles, which would prevent returning an expression tree.

Raw Pointers and Memory Management

A key element of Wolfram Language FFI functions is working with raw memory. This section discusses how this works and how memory for this can be managed.

RawMemoryAllocate allocates memory for elements of a type. It returns a managed pointer or an array:

Information returns some useful details on the result:

A sample function for working with raw pointers is addonePointer:

The function can be called with the raw pointer being the second argument. Note that even though the raw pointer is wrapped in ManagedObject, this is stripped when the call is made:

This reads the data from the pointer.

The ManagedObject actually contains an unmanaged raw pointer. The actual raw pointer can be extracted:

Again, Information returns useful details:

The address of the raw pointer can be extracted:

The raw pointer can be passed into the foreign function:

One key advantage of managing allocations is that memory will be collected when it is no longer used by the Wolfram Language. This can be demonstrated by running a memory leak checking function, loaded from a test utility:

This runs the RawMemoryAllocate and shows that memory is not being leaked:

When you use a ManagedObject to call an FFI function, the value is extracted and passed in. The value is however still managed, so this is really just borrowing the value. If a function were to keep a handle to a borrowed value, it would not be safe if the ManagedObject that held it was freed. This is because the borrowed value would also be freed.

If you want to keep a value for longer than its ManagedObject, you can use UnmanageObject:

You can see that the ManagedObject is no longer active.

If UnmanageObject is used on a ManagedObject, the memory may be lost. The memory leak checker detects that memory is being lost for each execution:

One way to prevent the leaking of memory is to call RawMemoryFree on the unmanaged raw pointer. The memory leak checker returns True, which indicates that no memory is being lost:

It is also possible to take the unmanaged object and manage it again. This is a little strange, since you could just not call UnmanageObject in the first place. However, it does work:

The Wolfram Compiler also supports ManagedObject. In compiled code, it is useful for working with low-level memory allocations, making sure that things are freed up appropriately.

Note that memory allocated with RawMemoryAllocate that requires freeing (perhaps because it was detached from a ManagedObject) must be freed with RawMemoryFree.

Collecting from FFI Allocations

When memory allocated in a function called by FFI is returned to the Wolfram Language, another function is typically needed to collect it.

The following code contains one function to allocate memory and another to free it. In a typical application, it might be that one function returns an instance of a struct data type and another frees it (and any resources it holds):

This program can be compiled into a library with the Wolfram Language. (Note that this requires the installation of a C compiler on your machine.):

The functions for allocating and freeing memory are loaded with ForeignFunctionLoad:

This calls the function to allocate the memory:

This frees the memory.

It is sometimes tedious to always have to remember to call a freeing function. To avoid this, you can use CreateManagedObject, using the freeing function.

Now when the managed object is created, it will be freed when it is no longer used:

Note that memory allocated by a raw allocation function, such as the C function malloc, must be freed with a call to its corresponding function, such as the C function free.
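A typical allocate and free pair of the kind described in this section can be sketched in C as follows. The struct Resource and the functions resourceCreate and resourceFree are hypothetical names for this example. The freeing function releases the resource held by the struct before releasing the struct itself.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct that owns a heap-allocated label. */
typedef struct {
    char* label;
    int count;
} Resource;

/* Allocate a Resource and a private copy of label.
   Returns NULL if any allocation fails. */
Resource* resourceCreate(const char* label, int count) {
    Resource* res = (Resource*)malloc(sizeof(Resource));
    if (res == NULL)
        return NULL;
    res->label = (char*)malloc(strlen(label) + 1);
    if (res->label == NULL) {
        free(res);
        return NULL;
    }
    strcpy(res->label, label);
    res->count = count;
    return res;
}

/* Free the owned label first, then the struct. Safe to call on NULL. */
void resourceFree(Resource* res) {
    if (res == NULL)
        return;
    free(res->label);
    free(res);
}
```

Exporting a dedicated freeing function like this keeps allocation and deallocation inside the same library, so both use the same allocator.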

Converting RawPointers

You can convert raw pointers into other raw pointers.

First, create a pointer to an "Integer64":

Write some data to the pointer:

Create another pointer that uses the same address but is considered to be a pointer to a "Real64":

The data stored in the "Integer64" pointer is as expected:

The data stored in the "Real64" pointer is a conversion of the bits in the integer to a floating-point number:
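The same reinterpretation of bits can be written in C. The sketch below uses hypothetical helper names and assumes that double is a 64-bit IEEE 754 value, which holds on common platforms but is not guaranteed by the C standard. It uses memcpy rather than a pointer cast to avoid aliasing problems.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a 64-bit integer as a double.
   Assumes double is 64 bits wide. */
double bitsToDouble(int64_t bits) {
    double result;
    memcpy(&result, &bits, sizeof result);
    return result;
}

/* Reinterpret the bits of a double as a 64-bit integer. */
int64_t doubleToBits(double value) {
    int64_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

No numeric conversion takes place: a small integer such as 10 read as a double comes out as a tiny subnormal number, not as 10.0.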

You can also convert the pointer to an integer into an "OpaqueRawPointer":

This is useful to call a function that takes a void* argument.

You can also create a raw pointer from an integer. This creates a raw pointer to an "Integer64" with the address 10 (use of this will likely lead to a problem):

An equivalent operation can be done for an "OpaqueRawPointer". Again, this is almost certainly a problem:

One useful operation is to create a pointer to a type and store 0:

And also for "OpaqueRawPointer":

This is useful to build a null pointer, which is used in C and certain other languages.

One important aspect of generating pointers that point to the same memory location is that managed objects are set up to free up their contents appropriately. As long as one reference is still alive, the memory is not freed. This can be seen with the memory leak checker:

Working with raw pointers in this way is useful because it allows you to convert between different types of pointers and to create pointers from integers. In many languages this is called bit casting. It is of course quite dangerous; any mistake is liable to lead to a program termination or undefined behavior.

Function Pointers

You can extract the address of exported functions from a library with ForeignPointerLookup.

This returns the address of the function "addone" from the example library "compilerDemoBase":

This is its address:

You can use this pointer as a callback to another function.

You use it in ForeignFunctionLoad if you know the type of the function:

Information also returns useful information on the ForeignFunction object:

This returns the address of the function, which matches that above:

Libraries

FFI functions that work with libraries such as ForeignFunctionLoad and ForeignPointerLookup can take the name of the library in an abbreviated form or in a full path name.

This uses the name with the library extension dropped:

You can add the library extension for your platform. On this machine it is dylib:

The FFI functions that work with libraries use FindLibrary to resolve the library name. You can call it directly; it returns an absolute name:

The absolute name can be used to load functions:

FindLibrary looks in a number of standard places, as specified by $LibraryPath:

It also looks in paclets that advertise a library extension. As an efficiency tip, if many FFI functions are being loaded, it may be faster to use FindLibrary once to create the absolute name and then use this in ForeignFunctionLoad.

Dependencies

When a library is loaded, it is necessary that any libraries that it requires are also loaded. If the system library paths have been set correctly, this will happen automatically. If not, they can be loaded first with LibraryLoad.

This loads the compilerDemoBase library:

If there are any circular dependencies in libraries, for example, two libraries that each depend on the other, then the only solution is to set up system library paths. Typically such cases are rare.

The Wolfram Compiler

The Wolfram Compiler supports types that work for FFI. For example, a raw pointer can be returned from compiled code:

The same thing can be done for "OpaqueRawPointer":

You should note that the compiler function ToRawPointer also generates a raw pointer, but one that is allocated in the stack frame of the function in which it is called. Therefore it is not a safe thing to return.

It would be possible to use compiled code to generate a type such as a "CArray" and then cast this to a raw pointer:

These types of operations can be useful to prepare and work with the FFI tools.

Managed

You can work with managed objects in the compiler. This can be useful for preparing raw data to pass to other programs but making sure things are collected.

Here an array of integers is created and then discarded:

When the code is run, there is a memory leak:

This code puts the array into a managed object:

There is no memory leak:

The syntax of managed objects in the compiler is slightly different from the FFI syntax. Future work will unify these.

Interesting Examples

Calling the OpenSSL Library

The OpenSSL library is bundled with the Wolfram Language to support secure communication. It has some interesting functions that can be called with FFI.

This returns the path to the library. It is different on different platforms, but this is the value on this platform:

This loads a function that returns the version of the library:

Call the version number function:

The OpenSSL library has a function for generating random bytes. It takes an array of bytes and the length of the array. It can be loaded as below:

Generate an array of bytes to be filled in:

Call the function:

Return the list of bytes that were generated:

Using the Wolfram Compiler to Call a Library

It is also possible to call functions from a library using the Wolfram Compiler. Typically, this requires more effort and is slower to generate the function. However, it has a simpler execution path.

First, a declaration for the function from the library is required:

Now a function that calls the library function is compiled:

When the function is called, the expected result is generated:

Using FFI on a Library Created by the Wolfram Compiler

You can generate libraries from the Wolfram Compiler and then you can call them with FFI. One convenient way to do this is to create a CompiledComponent that exports raw functions.

The following declares a component called "demoLibrary" that contains one function called square:

The following declares a raw library export for square:

Now the library for the component is built:

This shows the library that was created:

To use the library, the component has to be loaded:

Now the function can be loaded from the library with FFI:

This calls the function:

There are many ways to create libraries and call functions, including using the Wolfram Compiler and FFI. Using some mixture of all of these is a useful way to learn and experiment to find the solution that works best for you.

Using the Wolfram Compiler to Create a Callback

You can use the Wolfram Compiler to create a callback. This will provide a callback that executes faster than one written with the Wolfram interpreter.

This loads the createArray function from the compilerDemoBase library:

A callback that calls the evaluator can be created with CreateForeignCallback:

This calls the function passing in the callback:

Now you want to create the callback with the Wolfram Compiler. First, create a compiled function that has the correct type and functionality:

The actual raw function is stored in the CompiledCodeFunction:

You now need to convert the raw function pointer from the CompiledCodeFunction object into an OpaqueRawPointer:

Now you can invoke the createArray function using this callback:

The array that was returned has the expected values: