Programming OpenCL in Mathematica is simple since the user need not write C wrapper code—which can be quite verbose, difficult to understand, and hard to debug. Using OpenCLLink also guarantees compatibility as new versions of the standard are released.

This section gives a brief introduction to OpenCL programming. It uses OpenCLFunctionLoad, which lets users load OpenCL code and call it from within Mathematica.

| Function | Description |
| --- | --- |
| OpenCLFunctionLoad | load an OpenCL function from source into Mathematica |

OpenCL programming in Mathematica.

Users are advised to read the much more detailed tutorial "CUDALink Programming".

## OpenCL Programming

An OpenCL program is a small piece of code that performs a computation on each element of an input list. This first program will add 2 to each element.

```c
__kernel void addTwo_kernel(__global mint * arry, mint len) {
    int index = get_global_id(0);
    if (index >= len) return;
    arry[index] += 2;
}
```

The following is the decomposition of the above program into sections.

`__kernel void addTwo_kernel(__global mint * arry, mint len) {`

The `__kernel` qualifier declares that the function runs on the OpenCL device. The rest of the line gives the function name and its arguments; pointer arguments that refer to device global memory carry the `__global` qualifier.

`int index = get_global_id(0);`

This gets the global index of the thread executing the function. When n threads are launched, the index values range from 0 to n - 1.

`if (index >= len) return;`

This makes sure that the program does not write to memory beyond the length of the input array. The number of threads launched is always a multiple of the block size, so this check is needed whenever the size of the input array is not itself a multiple of the block size.

`arry[index] += 2;`

This adds two to each element.
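The role of the bounds check can be illustrated on the CPU. The following is a minimal sketch in plain C, not OpenCLLink code: a loop variable stands in for `get_global_id(0)`, `int` stands in for `mint`, and `padded_global_size`, `add_two_work_item`, and `launch_add_two` are hypothetical helpers introduced only for this illustration. It assumes, as described above, that the number of threads launched is the input length rounded up to a multiple of the block size.

```c
#include <stddef.h>

/* Smallest multiple of block_size that is >= len (hypothetical helper). */
size_t padded_global_size(size_t len, size_t block_size) {
    return ((len + block_size - 1) / block_size) * block_size;
}

/* Body of addTwo_kernel for a single thread.
   'index' plays the role of get_global_id(0). */
void add_two_work_item(int *arry, int len, int index) {
    if (index >= len) return;   /* surplus threads do nothing */
    arry[index] += 2;
}

/* Sequential stand-in for a kernel launch: runs one "thread" per index
   in the padded range and returns how many threads were run. */
size_t launch_add_two(int *arry, int len, size_t block_size) {
    size_t global_size = padded_global_size((size_t)len, block_size);
    for (size_t index = 0; index < global_size; ++index) {
        add_two_work_item(arry, len, (int)index);
    }
    return global_size;
}
```

With a 10-element input and a block size of 16, `launch_add_two` runs 16 threads. The last 6 take the early return, so nothing past the tenth element is modified.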

Once the program is written, it can be loaded into Mathematica using OpenCLLink. This is done using OpenCLFunctionLoad.

First, assign the program to a string.

This loads the function. The arguments to OpenCLFunctionLoad are the source code, the name of the function to load, the function signature, and the block dimension.


This runs the function.


The result is the set of output list elements.

## Porting CUDA to OpenCL

Because OpenCLLink handles the C wrapper code that OpenCL programming requires, the user can concentrate on the OpenCL kernel code, and the kernel is the only code that needs to be ported from CUDA.

In terms of OpenCL program porting, there are one-to-one function renames between CUDA and OpenCL. The following table gives the correspondence.

In this section, use the above table to port the following CUDA code to OpenCL.

```cuda
__global__ void myKernel(mint * global0Id, mint * global1Id, mint width, mint height) {
    int xIndex = threadIdx.x + blockDim.x * blockIdx.x;
    int yIndex = threadIdx.y + blockDim.y * blockIdx.y;
    int index = xIndex + yIndex*width;
    if (xIndex < width && yIndex < height) {
        global0Id[index] = threadIdx.x;
        global1Id[index] = threadIdx.y;
    }
}
```

The following is the translation of the CUDA code into OpenCL.

The following was changed when porting:

• `__global__ void myKernel` became `__kernel void myKernel`
• `mint * global0Id, mint * global1Id` became `__global mint * global0Id, __global mint * global1Id`
• `threadIdx.x + blockDim.x * blockIdx.x` became `get_global_id(0)`
• `threadIdx.y + blockDim.y * blockIdx.y` became `get_global_id(1)`
• `threadIdx.x` became `get_local_id(0)`
• `threadIdx.y` became `get_local_id(1)`
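Applying the changes listed above to the CUDA kernel gives the `myKernel` function in the sketch below. To make the result checkable without a GPU, the sketch wraps the kernel in a small CPU-only shim written in plain C. The shim is an assumption made for illustration and is not part of OpenCLLink: the empty `__kernel` and `__global` macros, the `typedef int mint` (OpenCLLink supplies the real `mint`), the `emu_*` variables, and `launch_myKernel` are all hypothetical. The shim computes `get_global_id(d)` as group ID times local size plus local ID, which mirrors the CUDA expression `blockIdx * blockDim + threadIdx` that it replaces.

```c
#include <stddef.h>

/* ---- CPU-only shim: hypothetical, for illustration only ---- */
#define __kernel              /* OpenCL qualifiers expand to nothing in plain C */
#define __global
typedef int mint;             /* assumed width; OpenCLLink defines the real type */

static size_t emu_group_id[2];    /* counterpart of blockIdx  */
static size_t emu_local_id[2];    /* counterpart of threadIdx */
static size_t emu_local_size[2];  /* counterpart of blockDim  */

size_t get_local_id(unsigned dim) { return emu_local_id[dim]; }

size_t get_global_id(unsigned dim) {
    return emu_group_id[dim] * emu_local_size[dim] + emu_local_id[dim];
}

/* ---- Kernel after applying the changes in the list above ---- */
__kernel void myKernel(__global mint * global0Id, __global mint * global1Id, mint width, mint height) {
    int xIndex = get_global_id(0);
    int yIndex = get_global_id(1);
    int index = xIndex + yIndex*width;
    if (xIndex < width && yIndex < height) {
        global0Id[index] = get_local_id(0);
        global1Id[index] = get_local_id(1);
    }
}

/* ---- Sequential stand-in for a 2D launch with the given block size ---- */
void launch_myKernel(mint *global0Id, mint *global1Id, mint width, mint height,
                     size_t block_x, size_t block_y) {
    /* Round the grid up so that every (x, y) position is covered. */
    size_t groups_x = ((size_t)width + block_x - 1) / block_x;
    size_t groups_y = ((size_t)height + block_y - 1) / block_y;
    emu_local_size[0] = block_x;
    emu_local_size[1] = block_y;
    for (size_t gy = 0; gy < groups_y; ++gy)
        for (size_t gx = 0; gx < groups_x; ++gx)
            for (size_t ly = 0; ly < block_y; ++ly)
                for (size_t lx = 0; lx < block_x; ++lx) {
                    emu_group_id[0] = gx; emu_group_id[1] = gy;
                    emu_local_id[0] = lx; emu_local_id[1] = ly;
                    myKernel(global0Id, global1Id, width, height);
                }
}
```

Because the kernel stores local IDs, each row of `global0Id` repeats the values 0 through `block_x - 1`, and each column of `global1Id` repeats 0 through `block_y - 1`.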

In terms of loading the code, the only change that is needed is to replace CUDAFunctionLoad with OpenCLFunctionLoad.


This sets the input parameters.

This launches the function.

This visualizes the results.


### Terminology

Users should note that CUDA and OpenCL differ in some of their terminology. Mathematica combines the two, using whichever term is more descriptive.

The following table gives the translation in terminology:

| CUDA | OpenCL |
| --- | --- |
| streaming multiprocessor | device |
| multiprocessor | compute unit |
| global memory | global memory |
| shared memory | local memory |
| local memory | private memory |
| kernel | program |
| block | work group |
| thread | work item |

## Memory

The behavior of the memory manager is the same between CUDALink and OpenCLLink. The memory manager is discussed in depth in the CUDALink Memory Guide.

Since memory is bound to one link, CUDALink and OpenCLLink memories are not interchangeable.

