Legacy Documentation

Parallel Computing Toolkit (2000)

This is documentation for an obsolete product.

Introduction

Parallel Computation with Mathematica

The MathLink communication protocol can be used to control several Mathematica kernel processes from within Mathematica. This feature allows the implementation of a distributed-memory environment for parallel programming. Parallel language constructs, such as a parallel version of Map, can easily be implemented on top of these primitive operations.
Parallel Computing Toolkit is written entirely in Mathematica and is therefore machine-independent. It has been tested on Unix, Linux, Windows, and Macintosh platforms. This product can be used in heterogeneous networks. All client and application code is distributed through MathLink. No common file system is necessary.

To perform computations in parallel, you need to be able to perform the following tasks:
  • start processes and wait for processes to finish
  • schedule processes on available processors
  • exchange data between processes and synchronize access to common resources
In the Mathematica environment, the term processor refers to a running Mathematica kernel, whereas a job or process is an expression to be evaluated.

Parallel Computing Toolkit Features

The main features of Parallel Computing Toolkit (PCT) are:
  • distributed memory, master/slave parallelism
  • written in Mathematica
  • machine independent
  • MathLink communication with remote kernels
  • exchange of symbolic expressions and programs with remote kernels, not only numbers and arrays
  • heterogeneous network, multiprocessor machines, LAN and WAN
  • virtual process scheduling or explicit process distribution to available processors
  • virtual shared memory, synchronization, locking
  • latency hiding
  • parallel functional programming and automatic parallelization support
  • failure recovery, automatic reassignment of stranded processes on failed remote computers

Requirements

To use Parallel Computing Toolkit, you need three things: access to a number of remote computers capable of running Mathematica (or a local multiprocessor machine), a suitable network connection between your local computer and the remote machines, and the required number of Mathematica licenses. Note that even if a network is set up, security restrictions may limit your ability to start Mathematica on remote computers.
To start Mathematica on a remote computer, the remote computer must run an rsh or ssh daemon or another remote login or execution service. The chapter Starting Remote Kernels discusses the available options in detail.
An alternative approach that works on any computer connected to a TCP/IP network, even without an rsh daemon, is to start the desired kernels manually on each remote machine and then connect to the waiting kernels from the local machine.

Overview of Remote Execution

The method used to start remote kernels depends on both the operating system of your local computer and the types of remote computers you use. You can start kernels on remote computers that have an operating system different from the one you are using locally.
This section covers typical Parallel Computing Toolkit commands you would use to start remote kernels on Windows, Mac OS X, or Unix systems. The chapter Starting Remote Kernels describes how to start kernels manually and provides details on the commands presented in this section.
Parallel Computing Toolkit provides a high-level command LaunchSlave for connecting to and starting kernels on remote computers. The command has the following general form.
  LaunchSlave["remotehost", "oscommands"]
The variable remotehost is the name of the remote computer on which you will start a kernel. On a local network, this can be a simple hostname. On a wide-area network, this would typically be a domain name, such as host.example.com. If you have a multiprocessor machine and can therefore start additional kernels locally, use "localhost" rather than the computer name.
The oscommands argument is passed to the command interpreter on your computer; its form depends on your operating system. This argument can be a series of commands that start a kernel. Certain values are interpolated into the command string to make this feature more general. Typical oscommands are:
  • ssh: The name of your local ssh client command. This command is used to establish a secure connection to a remote computer. ssh is provided with most versions of Unix, and it is available as third-party software for Windows.
  • rsh: The name of your local rsh client command. It works similarly to ssh and uses a widely supported standard protocol, but provides only minimal security features.
  • math: The name of the command on the remote computer to start the Mathematica kernel. You may have to give a full pathname such as /usr/local/bin/math.
  • $mathkernel: The full pathname of the command used to start a local kernel.
A remote host may require your login name before you can establish a connection. In this case, your login name on the remote computer (username) becomes part of the second argument of LaunchSlave.
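The interpolation of values into the command string can be pictured with Mathematica's built-in StringForm, which fills numbered slots written as `1`, `2`, and so on. The following is only a sketch of the idea, not the toolkit's own code: the template, the host name, the link name, the login name, and the protocol value are all hypothetical placeholders chosen for illustration.

```mathematica
(* Hypothetical command template with numbered slots; the template
   actually used by the toolkit may differ. *)
template = "ssh `1` -l `3` math -mathlink -linkmode Connect `4` -linkname '`2`'";

(* Illustrative values only: host, link name, login name, protocol. *)
host = "host.example.com";
link = "8000@local.example.com";
user = "alice";
proto = "-linkprotocol TCPIP";

(* StringForm replaces slot `n` with the n-th argument that follows the
   template; ToString turns the result into an ordinary string. *)
command = ToString[StringForm[template, host, link, user, proto]]
```

The resulting string is the kind of operating-system command that would be handed to the command interpreter, with each slot replaced by the corresponding value.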
Before running any commands, load the Parallel Computing Toolkit main package into your local Mathematica session with the Needs command.
In[1]:=

Working on a Unix or Macintosh Computer

To connect to a remote computer running Unix or Mac OS X and start a Mathematica kernel there, call LaunchSlave with the name of the remote host and omit the oscommands argument.
This command uses the value of the variable $RemoteCommand as the default oscommands argument. The slots `1` through `4` in that command string are replaced by the remote hostname, the link name, the login name, and the link protocol, respectively.
In[3]:=
Out[3]=
If this is not appropriate for a particular remote host, you can supply your own custom command.
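As a sketch of such a custom command, the call below passes an explicit oscommands string as the second argument of LaunchSlave. The host name, the login name alice, and the kernel path are hypothetical, and the kernel options your installation expects may differ; the chapter Starting Remote Kernels remains the authoritative reference.

```mathematica
(* Assumes the toolkit has already been loaded with Needs. *)
(* Hypothetical host, login name, and kernel path. Slots `2` and `4` are
   left in place for the link name and the link protocol. *)
LaunchSlave["host.example.com",
  "ssh `1` -l alice /usr/local/bin/math -mathlink -linkmode Connect `4` -linkname '`2`'"]
```

Writing the login name and the full kernel path directly into the string is one way to handle a host whose account name or installation directory differs from the defaults.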
To connect to your own local machine and start a kernel there (recommended for testing and if you have a multiprocessor machine), use the following command.
In[4]:=
Out[4]=

Working on Windows

Note that establishing connections to remote Windows computers requires third-party software (some of which is available for free) and special installation; see the detailed discussion in the chapter Starting Remote Kernels. Establishing connections from your local Windows PC to other computers, however, is easy.
To connect to a remote computer that runs an rsh daemon, call LaunchSlave with the name of the remote host and omit the oscommands argument.
This command uses the value of the variable $RemoteCommand as the default oscommands argument. The slots `1` through `4` in that command string are replaced by the remote hostname, the link name, the login name, and the link protocol, respectively.
In[3]:=
Out[3]=
If this is not appropriate for a particular remote host, you can supply your own custom command.
To connect to your own local machine and start a kernel there, use the following command.
In[4]:=
Out[4]=

Simple Parallel Computations

Once you have successfully started at least one remote kernel, you can begin to use Parallel Computing Toolkit.
First, you can ask each remote kernel to identify itself. For each remote kernel, the result lists the kernel's unique ID, the remote host's name, Mathematica's identifier for the remote operating system, the kernel's process ID, and the Mathematica version running on the remote computer.
In[6]:=
Out[6]//TableForm=
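The identification query described above might look like the following sketch. It assumes a toolkit function RemoteEvaluate that evaluates an expression on every remote kernel and a toolkit variable $ProcessorID holding each kernel's unique ID; neither name appears in the text of this section, so treat both as assumptions. The remaining variables are standard Mathematica kernel variables.

```mathematica
(* RemoteEvaluate and $ProcessorID are assumed toolkit names (see above).
   $MachineName, $SystemID, $ProcessID, and $Version are standard kernel
   variables, evaluated here on each remote kernel. *)
TableForm[
  RemoteEvaluate[{$ProcessorID, $MachineName, $SystemID, $ProcessID, $Version}]]
```

Each row of the resulting table would describe one remote kernel.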
You can also run the same Mathematica command on all remote kernels; normally, all the results returned should agree. Here a definite integral is evaluated on each of three remote kernels.
In[7]:=
Out[7]=
Here four definite integrals with different lower bounds are computed in parallel.
In[8]:=
Out[8]=
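A computation of this kind could be sketched as follows, assuming the toolkit's parallel version of Map is called ParallelMap (the name is an assumption here). The integrand and the lower bounds are illustrative and are not those of the original example.

```mathematica
(* Each lower bound yields an independent integral, so the four
   evaluations can be distributed across the available kernels.
   Illustrative integrand: Exp[-x]. *)
ParallelMap[
  Integrate[Exp[-x], {x, #, Infinity}] &,
  {0, 1, 2, 3}]
```

Replacing ParallelMap with Map gives the sequential equivalent, which is a convenient way to check that the parallel result agrees with a purely local computation.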
The remaining chapters of the tutorial provide many more examples of typical parallel computations you can perform with the help of Parallel Computing Toolkit.

Cleaning Up

When you have completed your parallel computations, you should stop all remote kernels before exiting your local Mathematica kernel and front end.
In[9]:=
Out[9]=