CUDA Functions

CUDALink is a built-in Wolfram Language package that provides a simple and powerful interface for using CUDA within the Wolfram Language's streamlined workflow.

CUDALink provides you with carefully tuned linear algebra, discrete Fourier transforms, and image processing algorithms. You can also write your own CUDALink modules with minimal effort. Using CUDALink from within the Wolfram Language gives you access to the Wolfram Language's features, including visualization, import/export, and programming capabilities.

This section discusses the built-in CUDALink functions and presents a handful of applications.

List Processing

CUDALink list processing functions are designed to mimic the existing Wolfram Language functions, and, while less general than the Wolfram Language's implementation, they do provide the most commonly used functions. CUDALink implements the following list processing functions.

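CUDAMap         map a function to each element of an input list
CUDAFold        given an initial value x and a function f, this returns f[... f[f[x, a], b] ...] for the input list {a, b, ...}
CUDAFoldList    given an initial value x and a function f, this returns {x, f[x, a], f[f[x, a], b], ...} for the input list {a, b, ...}
CUDASort        sort a given list
CUDATotal       find the total value of a given list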

CUDALink's list processing functions.

The above functions can be used like any other Wolfram Language function. To use them, first load the CUDALink application.
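The input cell is not reproduced here. A minimal sketch of the loading step follows; the CUDAQ check is an addition to confirm that a supported GPU and driver are present and is not required by the functions above.

```wolfram
(* Load CUDALink; evaluate this once per kernel session *)
Needs["CUDALink`"]

(* Optional check (an addition, not part of the original example):
   gives True when CUDALink can use a GPU on this machine *)
CUDAQ[]
```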

Once loaded, the above functions can be used. This maps the function Cos to a random list.
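The mapping step can be sketched as follows. The list length of 10 and the name list are assumptions chosen for illustration.

```wolfram
Needs["CUDALink`"]

(* Hypothetical input: 10 random reals between 0 and 1 *)
list = RandomReal[1, 10];

(* Apply Cos to every element on the GPU *)
CUDAMap[Cos, list]
```

On most hardware the result should agree with Cos[list] up to floating-point rounding.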

Computations can be chained together. Here you can find the total of the above list using CUDAFold (an operation called reduction in the GPU programming field).
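A sketch of the chained computation, assuming a 10-element random input; the name cosList is hypothetical.

```wolfram
Needs["CUDALink`"]

(* Map Cos over a hypothetical random list on the GPU *)
cosList = CUDAMap[Cos, RandomReal[1, 10]];

(* Reduce with Plus, starting from 0., to obtain the total *)
CUDAFold[Plus, 0., cosList]
```

The value can be compared with Total[cosList] as a sanity check.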


The above list operators are pivotal to many algorithms. Here, a few are discussed.

Line of Sight

Given a height map, the line of sight problem finds all points on the height map visible from a single point. It does so by first transforming the height map to an angular map, and then performing a Max fold on the angular map. The results from the maxAngle list can then be easily used to determine if a point is visible or not.

This generates a sample height map.

This computes the angular map. Here you use the first point in the height map as a reference angle.

The maxAngle list is computed using CUDAFoldList.

A point is marked as visible if angularMap_i > maxAngle_i, that is, if its angle exceeds the largest angle found before it.

This displays all points visible from the reference point.
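The line of sight steps above can be sketched in one self-contained cell. The terrain profile, the sample spacing, and the names xs, visibleQ, and visible are assumptions made for illustration; the original height map is not reproduced here. The fold is started at -Pi/2 (an assumption) so that points below the observer can still be visible.

```wolfram
Needs["CUDALink`"]

(* Hypothetical terrain profile: heights sampled every 0.01 units along one direction *)
xs = Range[0., 10., 0.01];
heightMap = Sin[xs] + Cos[3 xs] + xs/4;

(* Elevation angle from the first sample (the observer) to every later sample *)
angularMap = ArcTan[Rest[xs] - First[xs], Rest[heightMap] - First[heightMap]];

(* Running maximum of the angles. Entry i of Most[maxAngle] is the largest
   angle among the samples before angularMap[[i]]; the fold starts at -Pi/2,
   which lies below every possible elevation angle. *)
maxAngle = CUDAFoldList[Max, -N[Pi/2], angularMap];

(* A sample is visible when its angle exceeds every angle before it *)
visibleQ = Thread[angularMap > Most[maxAngle]];

(* Shift by 1 because angularMap starts at the second sample of heightMap *)
visible = 1 + Pick[Range[Length[angularMap]], visibleQ];

(* Terrain as a line, visible samples as points *)
ListPlot[{Transpose[{xs, heightMap}],
  Transpose[{xs[[visible]], heightMap[[visible]]}]},
 Joined -> {True, False}]
```

Because CUDAFoldList returns one more element than its input (the initial value comes first), Most is used to align the running maximum with angularMap.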

Random Walk

Random walks are a common tool in many applications, such as the analysis of Brownian motion in physics. This shows a random walk in one dimension using CUDAFoldList.

Choosing discrete random numbers, the walk can be performed on a lattice.
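Both walks can be sketched as a running sum of random steps. The step count of 500 and the step distributions are assumptions for illustration.

```wolfram
Needs["CUDALink`"]

(* Continuous walk: steps drawn uniformly from the interval -1 to 1 *)
ListLinePlot[CUDAFoldList[Plus, 0., RandomReal[{-1, 1}, 500]]]

(* Lattice walk: every step is exactly -1 or +1 *)
ListLinePlot[CUDAFoldList[Plus, 0, RandomChoice[{-1, 1}, 500]]]
```

Each plot shows the position after every step, starting from the initial value passed to CUDAFoldList.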


Histogram

Histograms are commonly used to place elements in bins. Here, you can use CUDASort to simplify the histogram calculation. This sorts the input image.

Once the list is sorted, counting the occurrences of a value only requires scanning forward from its first occurrence until the value changes. For the first bin, for example, you count elements until one is not equal to the first element.

The resulting histogram is plotted using ListLinePlot.
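A sketch of the sort-based histogram follows. Random 8-bit gray levels stand in for the input image (an assumption), and the run counting is done on the CPU with Split, which may differ from the original example.

```wolfram
Needs["CUDALink`"]

(* Hypothetical input: 10000 gray levels between 0 and 255.
   Pixel values of a real image could be obtained with
   Flatten[ImageData[img, "Byte"]]. *)
data = RandomInteger[{0, 255}, 10000];

(* Sort on the GPU so that equal values become adjacent *)
sorted = CUDASort[data];

(* Each run of equal values is one bin; its length is the bin count *)
runs = Split[sorted];
values = First /@ runs;
counts = Length /@ runs;

ListLinePlot[Transpose[{values, counts}]]
```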

Dot Product

This finds the dot product of two vectors.
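One way to sketch a dot product with the list functions above is to total the elementwise product. The vectors a and b are hypothetical, and the original example may use a different combination of functions.

```wolfram
Needs["CUDALink`"]

(* Two hypothetical vectors of equal length *)
a = RandomReal[1, 1000];
b = RandomReal[1, 1000];

(* The elementwise product is formed by the Wolfram Language;
   the summation runs on the GPU *)
CUDATotal[a b]
```

The result should agree with a . b up to floating-point rounding.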

Image Processing

The functions in the CUDALink image processing module fall into three categories. The first is convolution, which is optimized for CUDA. The second is morphology, which provides operations such as erosion, dilation, opening, and closing. The third is the binary operators: image multiplication, division, subtraction, and addition. All operations work on either images or lists.

CUDAImageConvolve    convolve the input with the specified kernel
CUDABoxFilter        convolve the input with a BoxMatrix kernel
CUDAErosion          perform morphological erosion
CUDADilation         perform morphological dilation
CUDAOpening          perform morphological opening
CUDAClosing          perform morphological closing
CUDAClamp            clamp the values between a range
CUDAColorNegate      invert the values of input
CUDAImageAdd         add two inputs
CUDAImageSubtract    subtract two inputs
CUDAImageMultiply    multiply two inputs
CUDAImageDivide      divide two inputs

CUDALink Image Base Operations.

To use any of these functions, load the CUDALink application if you have not already done so.

CUDALink's image processing functions, like the Wolfram Language's, accept images as input. Here you can find the gradient of an input image.
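A sketch of a gradient computation, assuming a Sobel kernel and the "Mandrill" test image from ExampleData; the image and kernel used in the original example are not shown here.

```wolfram
Needs["CUDALink`"]

(* Assumed input image; any Image object can be used instead *)
img = ExampleData[{"TestImage", "Mandrill"}];

(* Sobel kernel that responds to horizontal intensity changes *)
sobelX = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};

(* Convolve on the GPU; the result is again an image *)
CUDAImageConvolve[img, sobelX]
```

Transposing the kernel gives the gradient in the vertical direction.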

Since the CUDA image processing functions behave like Wolfram Language functions, you can combine them with existing Wolfram Language functions. Here, you can apply CUDAImageMultiply to all combinations of a set of images.
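The combination step can be sketched with Table. The three images are assumptions derived from a single test image so that their dimensions and channels match.

```wolfram
Needs["CUDALink`"]

(* Hypothetical image set built from one test image, resized for display *)
img = ImageResize[ExampleData[{"TestImage", "Mandrill"}], 128];
imgs = {img, ColorNegate[img], Blur[img, 5]};

(* Multiply every pair of images on the GPU and lay the results out in a grid *)
Grid[Table[CUDAImageMultiply[a, b], {a, imgs}, {b, imgs}]]
```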

The CUDA image processing functions work with the Wolfram Language's dynamic evaluators, such as Manipulate, Dynamic, and Animate. Here, you can use Animate to create an animation of how an image behaves as it is convolved with different GaussianMatrix radius sizes.
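A sketch of such an animation, assuming the "Mandrill" test image and radii from 1 to 15; both choices are illustrative.

```wolfram
Needs["CUDALink`"]

(* Assumed input image *)
img = ExampleData[{"TestImage", "Mandrill"}];

(* Re-run the GPU convolution for each radius r as the animation advances *)
Animate[CUDAImageConvolve[img, GaussianMatrix[r]], {r, 1, 15, 1}]
```

Replacing Animate with Manipulate gives a slider for the radius instead of a running animation.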


Creating New Image Processing Operators

CUDALink's image processing operators are building blocks for more complicated operators. Here, you can define the CUDADarker operator, which is similar to the Darker operator in the Wolfram Language.

The function can then be used.
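One possible definition is sketched below. It assumes that CUDAImageMultiply accepts a constant as its second argument and that darkening is modeled as scaling every pixel by 1 - amount; the default of 0.5 is also an assumption, and the built-in Darker may blend colors differently.

```wolfram
Needs["CUDALink`"]

(* Scale all pixel values by 1 - amount; amount = 0 leaves the image unchanged.
   The scalar second argument is an assumed form of CUDAImageMultiply. *)
CUDADarker[img_, amount_ : 0.5] := CUDAImageMultiply[img, 1 - amount]

(* Usage on an assumed test image *)
img = ExampleData[{"TestImage", "Mandrill"}];
CUDADarker[img, 0.3]
```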

Input Smoothing

Many algorithms require the input to be smoothed before processing. This defines a random input list.

This plots the results, showing the input to be very noisy.

You can use the fact that the image processing functions also operate on lists to smooth out the input list.

Geographical Data Processing

Since all image processing functions are also list processing functions, you can process any data that can be represented by a Wolfram Language list. In this example, you can use CUDAClamp to process geographic elevation data by clamping values in the elevation map.

This loads the data from the Wolfram servers.

This creates an interface that allows the user to vary the clamp parameters.
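The interface can be sketched as a Manipulate around CUDAClamp. A synthetic surface stands in for the downloaded elevation data (an assumption), and the slider ranges are chosen to fit that surface.

```wolfram
Needs["CUDALink`"]

(* Hypothetical stand-in for the elevation map: heights between 0 and 2000 *)
elevation = Table[1000. (1 + Sin[x] Cos[y]),
   {x, 0., 2 Pi, 0.05}, {y, 0., 2 Pi, 0.05}];

(* Values below low are raised to low; values above high are lowered to high *)
Manipulate[
 ReliefPlot[CUDAClamp[elevation, low, high]],
 {{low, 500., "low"}, 0., 1000.},
 {{high, 1500., "high"}, 1000., 2000.}]
```

Keeping the two slider ranges disjoint ensures that low never exceeds high.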

Acquired Image Processing

The following example requires a web camera. CurrentImage returns an error if no camera is detected.

This creates an interface where the user can process input images from the web camera in real time.

Linear Algebra and Fourier Transforms

CUDALink provides specialized data types that download data to the GPU and allow some basic linear algebra operations to be carried out with CUDA-enhanced functions. These operations also work with general lists and CUDAMemory objects. The new data types are typically preferred: they automatically reclaim their memory when the expression is no longer used, and they are simpler to work with because they always reside on the GPU. However, they are not yet as widely supported as general lists or CUDAMemory objects.

CUDAVector          a vector of data that resides on the GPU
CUDAMatrix          a matrix of data that resides on the GPU
CUDASparseVector    a sparse vector of data that resides on the GPU
CUDASparseMatrix    a sparse matrix of data that resides on the GPU

Data types that can work with data stored on a CUDA-enabled GPU.

CUDADot               give the product of vectors and matrices
CUDAArgMaxList        give the index of the element with maximum absolute value
CUDAArgMinList        give the index of the element with minimum absolute value
CUDAFourier           find the Fourier transform
CUDAInverseFourier    find the inverse Fourier transform

Linear algebra and Fourier transform operations using CUDA.

If you have not already done so, load the CUDALink application.

Here, a matrix of data that lives on the GPU is created.

This multiplies the two vectors.

This extracts the data, returning a NumericArray.

The data stored can be seen with another application of Normal.
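The steps above can be sketched as follows. The inputs of the original cells are not shown, so this assumes a small random matrix multiplied by itself and assumes that CUDAMatrix can be constructed directly from a list; the names gpuMat, gpuProd, and numArr are hypothetical.

```wolfram
Needs["CUDALink`"]

(* Create a 3x3 matrix whose data lives on the GPU *)
gpuMat = CUDAMatrix[RandomReal[1, {3, 3}]];

(* Multiply on the GPU; the product also stays on the GPU *)
gpuProd = CUDADot[gpuMat, gpuMat];

(* First Normal: copy the data back as a NumericArray *)
numArr = Normal[gpuProd]

(* Second Normal: convert the NumericArray to an ordinary list *)
Normal[numArr]
```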

Operations on the GPU can be significantly faster. This creates a large matrix and also makes a GPU version.

This carries out matrix multiplication on the CPU.

The GPU version is much faster.
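A sketch of such a comparison, assuming a 2000 by 2000 random matrix and the list form of the CUDAMatrix constructor. Timings depend on the hardware, and no particular speedup is implied.

```wolfram
Needs["CUDALink`"]

(* CPU matrix and a GPU copy of the same data *)
mat = RandomReal[1, {2000, 2000}];
gpuMat = CUDAMatrix[mat];

(* Seconds for the CPU product *)
First[AbsoluteTiming[mat . mat;]]

(* Seconds for the GPU product *)
First[AbsoluteTiming[CUDADot[gpuMat, gpuMat];]]
```

The first GPU call in a session can include one-time initialization, so evaluating the GPU timing a second time gives a fairer comparison.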

Another example of faster operations comes from CUDAFourier used on a CUDAVector.

This works on the CPU data.

The GPU version is much faster.
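The Fourier comparison can be sketched the same way, assuming a random vector of 2^20 elements and the list form of the CUDAVector constructor. Timings depend on the hardware.

```wolfram
Needs["CUDALink`"]

(* CPU vector and a GPU copy of the same data *)
vec = RandomReal[1, 2^20];
gpuVec = CUDAVector[vec];

(* Seconds for the CPU transform *)
First[AbsoluteTiming[Fourier[vec];]]

(* Seconds for the GPU transform *)
First[AbsoluteTiming[CUDAFourier[gpuVec];]]
```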

Typically operations on the GPU are faster than on the CPU for large amounts of data.


Linear algebra and Fourier analysis have many applications that are beyond the scope of this tutorial. Here is a simple example of the kind of operations made possible by these CUDALink features.

Image Transformation

This transposes an input image.


Along with these useful functions, CUDALink bundles many examples that showcase the capabilities of programming with CUDALink. The source of these examples is bundled with the Wolfram System.

CUDAFluidDynamics       compute and render a fluid dynamics simulation
CUDAVolumetricRender    render volumetric data

Example applications of CUDALink.

Fluid Dynamics

This approximates the solution of the Navier-Stokes equations on a torus.

Volumetric Render

This reads in the dataset.

This renders the data by ray tracing the voxels.