CUDALink Setup
This section describes how CUDALink is set up and configured for your machine. It will also help you track down and correct problems.
Core Setup and Testing
CUDALink is designed to work automatically after
Mathematica is installed with no special configuration. You can test this by using the
CUDAQ function.
Load the CUDALink application, then call CUDAQ to check whether CUDALink is supported. If CUDAQ returns True, CUDALink will work.
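The load-and-test steps described here can be sketched as the following input. It is a minimal sketch; the result depends on your hardware and drivers.

```mathematica
(* Load the CUDALink application *)
Needs["CUDALink`"]

(* Returns True if CUDALink is supported on this machine *)
CUDAQ[]
```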
If CUDAQ does not return True, CUDALink will not work. However, you might be able to configure your machine to fix this. The rest of this section describes steps you can carry out to try to enable CUDALink.
CUDA GPU
You should first confirm that you have GPU hardware supported by CUDA. If you are not certain, you can see the list of hardware supported in the
GPU Hardware section. In addition, you should check that your
operating system is supported.
If you do not have supported hardware, you will not be able to use
CUDALink.
CUDA Driver
For
CUDALink to work, you need to have an up-to-date driver. You can check this for your system by running
CUDADriverVersion.
After CUDALink is loaded, CUDADriverVersion returns information on the CUDA driver version.
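A minimal sketch of the driver check follows. The value returned depends on the driver installed on your system and should be compared with the minimum version for your platform.

```mathematica
Needs["CUDALink`"]

(* Reports the version of the installed CUDA driver *)
CUDADriverVersion[]
```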
On Linux and Windows, the driver needs to be at least 257 for
CUDALink to work. On Mac OS X, the driver needs to be at least 3.1. If your driver is not up to date, you may be able to update it from the
NVIDIA Drivers website.
In addition, you can check the driver information directly from your computer, as described in
"Checking the NVIDIA Driver".
CUDA Resources
CUDALink automatically downloads and installs some of its functionality when you first use a
CUDALink function such as
CUDAQ. In fact, you may see information about the download of the resources (which are about 80 MB in size).
Under certain circumstances (for example, if you are not connected to the internet or have disabled Mathematica's internet access) the download will not work. You can test whether the resources have been installed by using CUDAResourcesInformation.
If the result is an empty list, then the resources are not installed. This must be corrected before CUDALink can work.
If your internet connectivity has been restored, you can use CUDAResourcesInstall to download and install the CUDALink resources.
Alternatively, you may install the resources manually. You can download the resources from the Wolfram Research website at the
Wolfram CUDAResources web page, taking the current file for your machine.
When you have downloaded the resources, you can install them in
Mathematica with
CUDAResourcesInstall, using the path to the download file.
After this, you should be able to execute
CUDAResourcesInformation and see the installation.
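The resource checks and both installation routes described in this section can be sketched as follows. The file path is a placeholder, not a real file name; substitute the path of the file you downloaded. Use only one of the two install calls.

```mathematica
Needs["CUDALink`"]

(* An empty list means the resources are not installed *)
CUDAResourcesInformation[]

(* Route 1: download and install from the Wolfram Research website
   (requires internet access) *)
CUDAResourcesInstall[]

(* Route 2: install from a manually downloaded file.
   The path below is hypothetical. *)
CUDAResourcesInstall["/path/to/downloaded/resources/file"]

(* Confirm that the installation is now visible *)
CUDAResourcesInformation[]
```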
Further Setup and Configuration
You can confirm that
CUDALink is working by executing
CUDAQ and seeing that this returns
True. However, some further configuration and testing may be useful, in particular if you want to use
CUDAFunctionLoad.
More detailed information on your hardware is available from SystemInformation. For example, you can tell how many cores are available and whether double-precision computations are supported. You can also see how many CUDA devices are installed on your machine and which is the fastest.
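The same kind of hardware information can also be queried programmatically. The sketch below uses CUDAInformation, $CUDADeviceCount, and $CUDADevice, which are CUDALink symbols not otherwise mentioned in this section; treat it as an illustration rather than the only way to inspect the hardware.

```mathematica
Needs["CUDALink`"]

(* Properties of every detected CUDA device *)
CUDAInformation[]

(* Properties of device 1 only *)
CUDAInformation[1]

(* Number of CUDA devices detected *)
$CUDADeviceCount

(* Device that CUDALink uses for computation *)
$CUDADevice
```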
C Compiler
If you want to run your own CUDA kernels with
CUDAFunctionLoad, you will also need a C compiler. You can confirm that a suitable C compiler is available with
CUDACCompilers.
CUDACCompilers returns a list of the suitable compilers it finds.
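A minimal sketch of the compiler check:

```mathematica
Needs["CUDALink`"]

(* Lists the C compilers that CUDALink considers usable *)
CUDACCompilers[]
```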
Updating CUDA Resources
As described in the section on CUDA resources, CUDALink automatically downloads and installs some of its functionality when you first use a CUDALink function such as CUDAQ. That section also shows how to test and run the installation.
In addition, you can uninstall and then reinstall the CUDA resources to pick up any updates. You can do this with CUDAResourcesUninstall followed by CUDAResourcesInstall.
You can also check the Wolfram Research website from within Mathematica to see if an update is available.
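The uninstall and reinstall cycle can be sketched as follows. The final call, with an Update option, is an assumption about how the update check is spelled; confirm it against the CUDAResourcesInstall reference page before relying on it.

```mathematica
Needs["CUDALink`"]

(* Remove the installed CUDA resources *)
CUDAResourcesUninstall[]

(* Download and install them again, picking up the current version *)
CUDAResourcesInstall[]

(* Assumed form of the update check; the option name is not confirmed here *)
CUDAResourcesInstall[Update -> True]
```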
In addition, you can check the site at the
Wolfram CUDAResources web page to see if a new version is available.
Nondefault Installation
CUDALink performs many system checks, testing to see if it is supported. The NVIDIA driver library, NVIDIA driver version, CUDA library path, CUDA library version, and the ability to load the
CUDALink runtime libraries are checked before a function such as
CUDAQ is called. If any fail, either because the file does not exist in the expected location or because the file is not of a supported version, then an error message is returned and
CUDAQ fails.
By default, CUDALink looks in the standard locations where the installation takes place. However, you can set system environment variables to choose different locations. If the system has a CUDA-capable card and the proper software is installed but CUDAQ still fails, the most likely cause is that the driver was installed in a nonstandard location. The following sections detail the environment variables checked.
NVIDIA_DRIVER_LIBRARY_PATH
The absolute path to the NVIDIA driver library. The library is installed by the NVIDIA driver package downloaded from the
NVIDIA driver download website.
| "Windows" | "C:\\Windows\\System32\\nvapi.dll" |
| "Windows-x86-64" | "C:\\Windows\\System32\\nvapi64.dll" |
| "Linux" | "/usr/lib/libnvidia-tls.so.*" |
| "Linux-x86-64" | "/usr/lib64/libnvidia-tls.so.*" |
| "MacOSX-x86" | "/Library/Frameworks/CUDA.framework/Versions/Current/CUDA" |
| "MacOSX-x86-64" | "/Library/Frameworks/CUDA.framework/Versions/Current/CUDA" |
Default paths to the NVIDIA driver library, used when the environment variable is not defined.
After detection, the result is stored in

.
CUDA_LIBRARY_PATH
The absolute path to the CUDA library. The library is installed by the NVIDIA driver package downloaded from the
NVIDIA driver download site.
| "Windows" | "C:\\Windows\\System32\\nvcuda.dll" |
| "Windows-x86-64" | "C:\\Windows\\System32\\nvcuda.dll" |
| "Linux" | "/usr/lib/libcuda.so" |
| "Linux-x86-64" | "/usr/lib64/libcuda.so" |
| "MacOSX-x86" | "/usr/local/cuda/lib/libcuda.dylib" |
| "MacOSX-x86-64" | "/usr/local/cuda/lib/libcuda.dylib" |
Default paths to the CUDA library, used when the environment variable is not defined.
After detection, the result is stored in

.
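The sketch below shows one way to inspect and override the two environment variables from within a session, using the built-in Environment and SetEnvironment functions. The paths are hypothetical placeholders. It is also an assumption that CUDALink picks up values set inside the session before it is loaded; setting the variables in the operating system before starting Mathematica is the more conservative route.

```mathematica
(* Current values; Environment returns $Failed for a variable that is not set *)
Environment["NVIDIA_DRIVER_LIBRARY_PATH"]
Environment["CUDA_LIBRARY_PATH"]

(* Hypothetical nonstandard locations; replace with the real paths on your system *)
SetEnvironment["NVIDIA_DRIVER_LIBRARY_PATH" -> "/opt/nvidia/lib64/libnvidia-tls.so.1"]
SetEnvironment["CUDA_LIBRARY_PATH" -> "/opt/nvidia/lib64/libcuda.so"]

(* Load CUDALink afterward and repeat the support check *)
Needs["CUDALink`"]
CUDAQ[]
```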
Common Errors
Timeout
On some systems, the configuration limits the number of seconds a GPU computation can run before the operating system terminates it. On Windows Vista and 7, the timeout is two seconds by default.
When a computation is terminated, the screen may briefly go black and a task bar popup may appear.
Windows users can disable this behavior by following the
Microsoft WDDM guide.
From within
Mathematica, the registry key can be set to time out in seven seconds by running the following as administrator.
Developer`WriteRegistryKeyValues["HKEY_LOCAL_MACHINE\\System\\CurrentControlSet\\Control\\GraphicsDrivers", {"TdrDelay" -> 7}]
The default value is 2. Other operating systems impose similar limits, with options to disable them.
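Before or after changing the timeout, it can be useful to read the key back. The sketch below uses Developer`ReadRegistryKeyValues, the read counterpart of the function used above; it applies to Windows only. If no TdrDelay entry appears in the result, the value has presumably not been set explicitly.

```mathematica
(* Windows only: list the values stored under the GraphicsDrivers key *)
Developer`ReadRegistryKeyValues[
  "HKEY_LOCAL_MACHINE\\System\\CurrentControlSet\\Control\\GraphicsDrivers"]
```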
For
CUDALink, a forced termination of graphics computation is equivalent to a crash.
CUDALink will thus consider the current state of the GPU to be invalid and only a
Mathematica kernel restart will reset the state.
In severe cases, a forced termination by the operating system may cause the system to hang.
No NVCC Compiler
The CUDA toolkit is automatically installed for users and is included in the

paclet. To check which version of the toolkit is installed, do the following.
Functions that compile CUDA code accept the

. This option can be set to a location of an existing NVCC compiler. Here, you compile using the NVCC compiler located in C:\CUDA.
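As an illustration of compiling against an NVCC compiler in C:\CUDA, the sketch below loads a small kernel with CUDAFunctionLoad. It assumes that "CompilerInstallation" is the option meant here; the kernel itself (addTwo) is a made-up example and not part of the original text.

```mathematica
Needs["CUDALink`"]

(* Illustrative kernel: adds 2 to each element of an integer array.
   mint is the integer type that CUDALink defines for kernel code. *)
src = "
__global__ void addTwo(mint *arr, mint len) {
    int index = threadIdx.x + blockIdx.x * blockDim.x;
    if (index < len)
        arr[index] += 2;
}";

(* Assumption: \"CompilerInstallation\" points CUDALink at the NVCC location *)
addTwo = CUDAFunctionLoad[src, "addTwo",
  {{_Integer, _, "InputOutput"}, _Integer}, 256,
  "CompilerInstallation" -> "C:\\CUDA"]
```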
No C Compiler
Since the NVCC compiler requires a C compiler, the NVCC compiler will not be detected without a valid C compiler.
CUDALink requires a supported C compiler to be installed on the system. Supported C compilers are Microsoft Visual Studio 2005 and 2008 on Windows and GCC 4.1, 4.2, or 4.3 on Linux and Mac OS X. Mac OS X users can get the compiler from the OS X development package (which includes other tools like Xcode).
If the compiler is not in a standard installation, then the
"CompilerInstallation" option can be given to use the nondefault installation.
Systems with Different GPU Manufacturers
On systems with GPU devices from more than one manufacturer, the NVIDIA driver must be installed last. This allows the operating system to run the CUDA card using the proper driver. Note that some operating systems, such as Windows Vista, do not allow users to run two video cards from different manufacturers.
Laptops that perform automatic video card switching (between an Intel and NVIDIA card, for example) may interfere with
CUDALink's initialization of the CUDA device. On such systems, users can either disable the video card switching behavior, or use tools that allow the user to switch from the menu bar (search the web for "dual-GPU switch" followed by the operating system).
Using CUDALink over Remote Desktop
CUDALink will not work over the Windows Remote Desktop (RDP). Users can use other protocols like VNC to perform remote computation. Unix users can use either SSH or X tunneling.
A recommended alternative to VNC is to set up remote kernels. This allows you to get the interactivity of the front end while running the kernel on a remote CUDA machine. Information on how to set up remote kernels is found in the
documentation page.
Using CUDALink inside a Virtual Machine
Most virtual machines emulate the video card, so you cannot access the GPU for CUDA computation. Unless your virtual machine software supports GPU computation and the machine is configured to allow the virtual machine to use the GPU,
CUDALink is not supported in a virtual machine.
Multiple CUDA Devices
CUDALink is supported on multiple devices using
Mathematica's parallel tools. Worker kernels can be launched for each GPU. The maximum number of GPUs supported is limited only by the number of kernels a user can launch.
For information on using multiple devices, refer to
.
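A minimal sketch of the one-worker-per-GPU arrangement described above follows. It assumes that the worker kernel IDs run from 1 to the number of devices, which holds for freshly launched kernels but not necessarily after kernels have been closed and relaunched, and that $CUDADevice is set on each worker before any CUDA computation runs there.

```mathematica
Needs["CUDALink`"]

(* Launch one worker kernel per CUDA device *)
LaunchKernels[$CUDADeviceCount];

(* Load CUDALink on every worker *)
ParallelNeeds["CUDALink`"];

(* Assign a different device to each worker; assumes $KernelID is 1, 2, ... *)
ParallelEvaluate[$CUDADevice = $KernelID];

(* Confirm which device each worker will use *)
ParallelEvaluate[$CUDADevice]
```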
Headless Terminal
CUDALink is supported on headless terminals, although sometimes on Linux, due to system configuration, the NVIDIA devices do not get the proper permission. For more information on configuring
CUDALink for use on headless terminals, refer to
.