Legacy Documentation

Parallel Computing Toolkit (2000)

This is documentation for an obsolete product.

Starting Remote Kernels

Configurations of local computing infrastructure vary widely from site to site. Unfortunately, there is no simple way to accommodate all possible setups. This chapter describes the various configurations in detail, so the content is fairly technical. Your local system administrator may be able to help you set up remote connections.
The way to start a remote (slave) kernel depends on the operating systems of the local and remote machines, the properties of the network, and the security measures in effect. Note that you can also start slave kernels on the local machine where the master kernel is running. This is particularly useful for testing and on multiprocessor machines.

MathLink Communication Modes

Parallel Computing Toolkit uses MathLink to communicate with remote kernels. Once a connection has been established, it is used for any further communication with the remote kernel. The MathLink connection provides a machine-independent channel for Mathematica expressions between the controlling (master) kernel and the remote (slave) kernels.
In general, establishing a connection to a remote kernel requires two steps. First, the remote kernel must be started; then it must be instructed to establish a MathLink connection to the master kernel. Both of these tasks can be performed with the command LaunchSlave[], which uses various MathLink commands depending on its arguments.

Active Connection (LinkLaunch)

An active connection is initiated from the master kernel by using the MathLink function LinkLaunch["oscommands", options]. The argument oscommands is an operating system command that makes a connection to a remote machine and starts a Mathematica kernel on that remote machine.
LinkLaunch[] is used by default for launching slave kernels on the local machine in the form LaunchSlave["localhost", options]. The PCT command LaunchSlave["remotehost", "oscommands", ConnectionType->LinkLaunch] also uses the active connection method.

Callback Connection (LinkCreate)

For kernels on remote machines it is generally better to establish separate MathLink connections, rather than using the command channel opened by LinkLaunch. The master kernel opens a MathLink link in Listen mode using LinkCreate[]; then the remote kernel is instructed to connect to the listening link.
The following command creates such a listening link and runs oscommands to start a remote kernel that connects to it.
LaunchSlave["remotehost", "oscommands", options]
The argument oscommands is usually a template that may contain the sequences `1`, `2`, `3`, and `4`, which are replaced by values computed by the code in LaunchSlave. `1` is replaced by the hostname, `2` by the name of the link created, `3` by the user name, and `4` by the MathLink -linkprotocol specification for nondefault protocols. Examples of the use of these placeholders are given in the sections that follow.
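The placeholder syntax is the same as that of the built-in function StringForm, so the expansion of a template can be previewed in any kernel. The following is a minimal sketch: whether LaunchSlave uses StringForm internally is an assumption, and the template, host names, user name, and port numbers are hypothetical values chosen for illustration.

```mathematica
(* hypothetical command template using all four placeholders *)
template = "rsh `1` -l `3` math -mathlink -linkmode Connect `4` -linkname `2`";

(* fill in the placeholders in the order `1`, `2`, `3`, `4` *)
ToString[StringForm[template,
  "remotehost",                 (* `1`: remote host name *)
  "1234@master,1235@master",    (* `2`: name of the listening link *)
  "jdoe",                       (* `3`: remote user name *)
  "-linkprotocol TCPIP"]]       (* `4`: protocol setting *)
```

The result is the command string with each placeholder replaced by the corresponding argument, which is a convenient way to check a template before handing it to LaunchSlave.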

Passive Connection

Active or callback connections may not be available because of operating system deficiencies or security measures. If this is the case, another method for establishing a connection is available. You can manually start a kernel on a remote machine and instruct it to open a TCPIP port on which to listen for connection requests. This is usually achieved by providing the command-line arguments -mathlink -linkcreate to the command to start the kernel, usually math. (Under Windows, add -linkprotocol TCPIP.) The started kernel will tell you the ports on which it is listening.
In the master kernel you can make a connection to a listening kernel with the following MathLink command.
LinkConnect["port1@hostname,port2@hostname"]
The Parallel Computing Toolkit command ConnectSlave["port1@hostname,port2@hostname"] will connect to a listening remote kernel.

Link Objects

The result of a successful LinkLaunch or LinkConnect connection is a MathLink link object having the following form.
The name is taken from the argument of LinkLaunch or LinkConnect and allows you to identify the object. Parallel Computing Toolkit keeps track of the available remote kernels by maintaining a list of such link objects in the variable $Slaves. To obtain the raw link object from a remote kernel object, use LinkObject[kernel].
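For reference, a link object has the general form LinkObject["name", n, m], where the two integers are assigned by the kernel. The sketch below uses only built-in MathLink functions. It opens a listening link with LinkCreate instead of LinkLaunch or LinkConnect, so that it can be evaluated without a remote kernel; the variable name link is arbitrary.

```mathematica
(* open a listening TCPIP link; no kernel is started by this call *)
link = LinkCreate[LinkProtocol -> "TCPIP"]

(* the first element is the link name, of the form "port1@host,port2@host" *)
First[link]

(* close the link again when it is no longer needed *)
LinkClose[link]
```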

Remote Execution Options

To use active or callback connections, you need a way to execute a command to start Mathematica on the remote computer.
The available methods for remote command execution depend on the operating system of the master and slave machines. The network applications rsh and ssh are standard under Unix. Under Windows you can use any rsh program that may be provided with the system or available from a number of sources.
Your local machine, from which you want to initiate a connection, needs an ssh or rsh client program; the remote machine needs a corresponding daemon.

Remote Execution under Unix and Mac OS X

To make a connection from your local Unix or Mac OS X machine you can use the rsh or ssh programs.
The secure shell ssh is a replacement for rsh that offers cryptographic authentication and encryption of the communication between the local and remote machines. It is therefore usable in situations where the security of rsh is insufficient, such as on the Internet. If your site requires ssh, please contact your system administrator about your local setup. Parallel Computing Toolkit has been tested with Version 2 of ssh.

Using ssh

To test whether ssh is configured correctly, the following command can be given in a shell window.
Here, remotehost is the name of the remote machine and math is the command to start a Mathematica kernel on the remote machine. If the remote machine is outside of the local area network, then remotehost must be a fully qualified domain name. If the math command is not on the search path, the full pathname can be given instead, for example /usr/local/bin/math.
It is a good idea to try to establish a connection in a shell window to see whether everything is set up correctly before trying to use the given remote host in Parallel Computing Toolkit. If everything is fine, the remote kernel should print the familiar In[1]:= prompt. You can then use Quit[] to terminate the remote kernel and the connection to the remote machine.
Once ssh is working, you can use the following command to start a kernel on a remote Unix machine.
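A sketch of such a command is shown here. It is assembled from the kernel arguments described in this chapter (-mathlink, -linkmode Connect, -linkprotocol TCPIP, -linkname) and is not necessarily the exact form used by the Toolkit; remotehost stands for a real host name, and quoting requirements may differ with the shells involved.

```mathematica
(* callback connection through ssh:
   `1` becomes the host name, `2` the name of the local listening link *)
LaunchSlave["remotehost",
  "ssh `1` math -mathlink -linkmode Connect -linkprotocol TCPIP -linkname `2`"]
```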
The placeholder `1` is replaced by the remote host name remotehost, `2` is replaced by the link specification of the link created by LinkCreate[]. The resulting command is then executed by the operating system. LaunchSlave[] supports a number of additional placeholders to accommodate more complicated situations; see the section Configuring Parallel Computing Toolkit.
If you leave out the second argument, the following default is used.
For an active connection that uses standard input and output as MathLink transport, use the following command.
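As a hedged sketch, an active connection through ssh could be requested as follows, using the ConnectionType option described in this chapter. The command string is an assumption based on the standard -mathlink kernel argument, and remotehost is a stand-in for a real host name.

```mathematica
(* active connection: the kernel talks MathLink over the standard input
   and output of the ssh command itself *)
LaunchSlave["remotehost", "ssh `1` math -mathlink",
  ConnectionType -> LinkLaunch]
```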
The placeholder `1` is replaced by the remote host name remotehost. The resulting command is then executed by the operating system.
You may have to prefix the remote kernel command math with the appropriate pathname on the remote machine, such as /usr/local/bin/math.

Using rsh

The following shell command starts an interactive Mathematica kernel on a remote machine.
Here, remotehost is the name of the remote machine and math is the command to start a Mathematica kernel. If the remote machine is outside of the local area network, then remotehost must be a fully qualified domain name. If the math command is not on the search path, the full path name can be given instead, for example /usr/local/bin/math.
It is a good idea to try to establish a connection in a shell window to see whether everything is set up correctly before trying to use the given remote host in Parallel Computing Toolkit. If everything is fine, the remote kernel should print the familiar In[1]:= prompt. You can then use Quit[] to terminate the remote kernel and the connection to the remote machine.
Once rsh is working, you can use the following Toolkit command to start a kernel on a remote Unix machine, using a callback connection.
The placeholder `1` is replaced by the remote host name remotehost, `2` is replaced by the link specification of the link created by LinkCreate[]. The resulting command is then executed by the operating system.
For an active connection that uses standard input and output as MathLink transport, use the following command.
The placeholder `1` is replaced by the remote host name remotehost. The resulting command is then executed by the operating system.
You may have to prefix the remote kernel command math with the appropriate pathname, such as /usr/local/bin/math.
To start a kernel on a remote Windows machine, the remote machine must have an rsh daemon running. You can start a remote kernel on a Windows machine from a Unix host by using commands similar to those explained in the section Remote Execution under Windows. For example, issue the following command on a local Unix machine to start a remote kernel on a remote Windows host.

Security considerations

You can only use rsh if you are allowed to log in to the remote machine without a password. For this to work, your local machine must be listed in the remote machine's /etc/hosts.equiv or ~/.rhosts file. See the Unix manual pages for rlogin and rsh for more details and consult your system administrator.

Starting kernels on your local machine

For testing, and if you have a multiprocessor machine available, you can also start kernels on your local machine where you operate the master kernel and the front end.
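In the forms of LaunchSlave listed in this chapter, a local kernel is requested by giving "localhost" as the only argument. The following short sketch first inspects the launch command and then starts one kernel.

```mathematica
(* the operating system command used to start a local kernel *)
$mathkernel

(* start one slave kernel on the local machine *)
LaunchSlave["localhost"]
```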
This command uses the value of the variable $mathkernel as the command to launch a kernel. It should be set up suitably for your Mathematica installation.
A typical value for Unix is shown here.
Here is a typical value for Mac OS X.

Remote machines running Mac OS X

No math script is installed on Mac OS X. To launch a remote kernel on a Mac OS X machine, you can either give the full pathname of the MathKernel executable or write your own math script; see the support pages support.wolfram.com/applicationpacks/parallel/ for more information.
The kernel is typically at $InstallationDirectory/Contents/MacOS/MathKernel.
The Mathematica 5.2 pathname contains space characters, so the pathname needs to be enclosed in double quotes and the space escaped by a backslash. Here is an example command to launch a Mathematica 5.2 kernel on a Mac OS X machine.

Remote Execution under Windows

Under Windows you can use any available rsh program to start remote kernels. The remote kernel can then establish a TCPIP connection back to the local kernel. This is most easily done by creating a MathLink object locally to which the remote kernel can establish a callback connection.
On the remote end, the Mathematica kernel command-line arguments -linkmode Connect -linkprotocol TCPIP -linkname port1@host,port2@host instruct the kernel to connect to an open port on your local machine, host. The Toolkit will provide the port1@host,port2@host argument for you. Use `2` to interpolate it into the command string.
You need an rsh daemon on all remote machines. Note that Mathematica does not provide an rsh daemon.
If your version of Windows includes rsh, you can use these arguments to LaunchSlave to make connections to remote hosts.
This command uses the following template for starting the remote kernel:
The first placeholder `1` will be replaced by the hostname remotehost as usual; the second placeholder `2` will be replaced by a link object created on the local machine to which the remote kernel can connect. The placeholder `3` is replaced by the username. If the remote host does not require a login user name, omit the -l `3` option. The placeholder `4` is replaced by the correct -linkprotocol setting.
You may have to prefix the remote kernel command math with the appropriate pathname, such as /usr/local/bin/math for a remote Unix system. Be sure to give the correct user name for connecting to the remote machine.
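A template with the four placeholders described above might look like the following sketch. The order and spelling of the rsh options depend on the rsh client in use, so the exact string is an assumption and not necessarily the Toolkit's built-in default; remotehost is a stand-in for a real host name.

```mathematica
(* `1` host name, `3` remote user name, `4` protocol setting, `2` link name *)
LaunchSlave["remotehost",
  "rsh `1` -l `3` math -mathlink -linkmode Connect `4` -linkname `2`"]
```

If the remote host does not require a login user name, the same template without the -l `3` part applies.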

Available third-party software

A list of third-party rsh daemons and clients can be found at support.wolfram.com/applicationpacks/parallel/. The author, MathConsult AG, and Wolfram Research, Inc. do not endorse any of the products listed at that URL. We provide this information in the hope that it may be useful.
If you have an ssh client installed on Windows, you may be able to connect to a remote Unix or Mac OS X slave using ssh. Neither Parallel Computing Toolkit nor Mathematica supplies an ssh client for Windows, but several are available commercially. You can then use the following command to launch remote kernels:
The exact arguments needed may vary with network and machine configurations.

Starting a kernel on a local Windows machine

For testing and in multiprocessor machines, you can conveniently start a kernel on the local machine. Use this command.
This command uses the value of the variable $mathkernel as the command to launch a kernel. It should be set up suitably for your Mathematica installation.

Remote machines running Mac OS X

No math script is installed on Mac OS X. To launch a remote kernel on a Mac OS X machine, you can either give the full pathname of the MathKernel executable or write your own math script; see the support pages support.wolfram.com/applicationpacks/parallel/ for more information.
The kernel is typically at $InstallationDirectory/Contents/MacOS/MathKernel.
The Mathematica 5.2 pathname contains space characters, so the pathname needs to be enclosed in double quotes and the space escaped by a backslash. Here is an example command to launch a Mathematica 5.2 kernel on a Mac OS X machine.

Passive Connections

If your local computer does not provide an rsh client or the remote computer does not provide an rsh daemon, you have to start the required remote kernels manually on each remote computer.
These command-line options should be given to the kernel command on the remote machines.
Under Windows, you should add -linkprotocol TCPIP.
The kernel will start up and tell you the address or linkname where it is listening. Addresses have the form port@host or port1@host,port2@host, where port is a TCP port number (a decimal integer) and host is the computer's name.
With this information, you can establish a connection from your local kernel to the remote one with the following command.
Under Windows, you should include the option setting LinkProtocol->"TCPIP".
Alternatively, ConnectSlave can take an already established MathLink link object as its argument.
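Both variants can be sketched as follows. The address is a hypothetical value of the kind a manually started kernel reports, and the variable names are arbitrary; LinkConnect is the built-in MathLink function.

```mathematica
(* hypothetical address reported by the manually started remote kernel *)
address = "1234@remotehost,1235@remotehost";

(* connect to the listening kernel; the option is required under Windows *)
kernel = ConnectSlave[address, LinkProtocol -> "TCPIP"]

(* alternative: open the link yourself and pass the link object instead *)
(* link = LinkConnect[address, LinkProtocol -> "TCPIP"];
   kernel = ConnectSlave[link] *)
```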
Note that you may be able to use a Telnet application to log in to a remote computer, so you can give the commands described in the following section from your local machine. Otherwise, you will have to enter the command at the computer's console.

Unix and Mac OS X Remote Computers

To start a kernel on Unix, give the following command in a shell or Telnet window.
Mathematica will output the listening ports on standard output. If math is not on your search path, give an absolute path name, such as /usr/local/bin/math.
Once you receive the listening port information, you can connect to the waiting remote kernel from your local computer with the command ConnectSlave["linkname",LinkProtocol->"TCPIP"] (you can omit the option setting on a Unix local computer).

Windows Remote Computer

To start a kernel on Windows, give the following command in an MS-DOS window on the remote computer.
Mathematica will open a small panel that displays the ports on which it is listening. You must close this panel before the connection can be used.
Once you have the listening port information, connect to the waiting remote kernel from your local computer with the command ConnectSlave["linkname",LinkProtocol->"TCPIP"].

Using a Running Kernel

You can also prepare a running kernel as a remote kernel for parallel computations. Start the kernel by double-clicking the MathKernel (not Mathematica) icon or by starting math in a shell window. You can then create the required link from within the Mathematica kernel as follows.
1. Launch (double-click) MathKernel. Do not launch Mathematica! A window with the prompt In[1]:= appears.
2. At the In[1]:= prompt, give the following command shown here with the expected form of the result.
In[1]:=
3. Take note of the port numbers that appear in place of port.
4. At the next input prompt, In[2]:=, give the following command. No output will be produced.
In[2]:=
With this information, you can now connect to the waiting remote kernel from your local computer with the command ConnectSlave["port1@host,port2@host"].
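One way to carry out steps 2 and 4 with built-in MathLink functions is sketched below. That these are the exact inputs intended in the steps above is an assumption; the variable name link is arbitrary.

```mathematica
(* step 2: create a listening link; the result has the form
   LinkObject["port1@host,port2@host", n, m] *)
link = LinkCreate[LinkProtocol -> "TCPIP"]

(* step 4: make the new link the kernel's parent link, so that the kernel
   takes its input from whichever master kernel connects to it *)
$ParentLink = link
```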

Configuring Parallel Computing Toolkit

This section lists the commands available in PCT and shows you how you can prepare a configuration file to automate the task of starting the remote kernels that are usually available to you.

Launching Remote Kernels

To start a remote kernel and add it to the list of available slave processors, use the command LaunchSlave.
LaunchSlave["remotehost", "oscommands"]
use the operating system (shell) command oscommands to start a kernel on a remote machine named remotehost and have it connect to a link created on the local machine
LaunchSlave["remotehost", "oscommands", ConnectionType->LinkLaunch]
use the operating system (shell) command oscommands to start a kernel on a remote machine named remotehost using LinkLaunch (no separate MathLink connections)
LaunchSlave["localhost", "oscommands"]
use the operating system (shell) command oscommands to start a kernel on the local machine (using LinkLaunch)
LaunchSlave["localhost"]
use the operating system (shell) command stored in $mathkernel to start a kernel on the local machine
LaunchSlave["remotehost"]
use the operating system (shell) command stored in $RemoteCommand to start a kernel
$ProcessorID
a unique integer assigned to each remote kernel (numbers are assigned starting with 1)

Starting remote kernels.

The argument oscommands can be a template containing the character sequences `1` for connection type LinkLaunch and `1` through `4` for connection type LinkCreate.
placeholder    name           meaning
`1`            host name      the remote host name, the first argument of LaunchSlave
`2`            link name      the name of the link object created
`3`            remote user    the user name on the remote machine, the value of $RemoteUserName, which defaults to $UserName
`4`            protocol       a suitable -linkprotocol proto setting for the MathLink argument list

Placeholders in operating system command templates.

If there is no LinkProtocol->"proto" setting in LaunchSlave, the placeholder `4` expands to the empty string for local connections (to use the native default protocol) and to -linkprotocol TCPIP for remote connections. If an explicit LinkProtocol->"proto" setting exists, `4` expands to -linkprotocol proto.
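The rule for `4` can be written down as a small function. This is only a model of the behavior described above, not the Toolkit's code: the name protocolArgument is hypothetical, a missing setting is represented by Automatic, and a local connection is recognized simply by the host name "localhost".

```mathematica
(* hypothetical helper modeling the expansion of placeholder `4` *)
protocolArgument[proto_, host_String] :=
  Which[
    proto =!= Automatic, "-linkprotocol " <> proto,  (* explicit setting *)
    host === "localhost", "",                        (* local: native default *)
    True, "-linkprotocol TCPIP"                      (* remote: TCPIP *)
  ]

protocolArgument[Automatic, "localhost"]    (* empty string *)
protocolArgument[Automatic, "remotehost"]   (* "-linkprotocol TCPIP" *)
protocolArgument["TCPIP", "localhost"]      (* "-linkprotocol TCPIP" *)
```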
The following options can be given in LaunchSlave.
option name       default value
InitCode          $InitCode        a sequence of commands (wrapped inside Hold) to send to each remote kernel upon startup
ConnectionType    Automatic        the MathLink connection type, which can be either LinkLaunch or LinkCreate; by default, LinkLaunch is used for local kernels, and LinkCreate is used for remote kernels
LinkProtocol      Automatic        this option is passed on to LinkLaunch or LinkCreate; by default, "TCPIP" is used for remote kernels, and no setting is used for local kernels
LinkHost          ""               this option is passed on to LinkCreate; it can be used to specify the interface on which the link is listening
ProcessorSpeed    1                an estimate of the relative speed of the remote kernel

Options of LaunchSlave.

The default value of the variable $InitCode is Hold[$DisplayFunction=Identity;].
These options can also be given to ConnectSlave[].
If all or most of your remote hosts can be reached with the same command, you can set $RemoteCommand to a suitable command template that is used by default in LaunchSlave.
For Unix and Mac OS X, the default value is
One requirement is that the command return quickly, even though Mathematica keeps running. If it does not return, you can put the command into the background with a setting like the following.
Under Windows, $RemoteCommand is set by default to
If $RemoteCommand is set up correctly, you can simply use the following commands to start a kernel on remote hosts named, for example, host1 and host2.
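A minimal sketch, assuming an ssh-based template is suitable for both hosts. The template string is hypothetical and should be adapted to your remote shell and kernel path; host1 and host2 are the example names used above.

```mathematica
(* hypothetical default template for all remote hosts *)
$RemoteCommand = "ssh `1` math -mathlink -linkmode Connect `4` -linkname `2`";

(* each call fills the template with the given host name *)
LaunchSlave["host1"]
LaunchSlave["host2"]
```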
To connect to your local machine (recommended for testing and if you have a multiprocessor machine), you should be able to use LaunchSlave["localhost"].
You may want to verify that $mathkernel contains the appropriate command for invoking a local kernel by evaluating $mathkernel.

Using Passive Connections

For passive connections, you should manually start the remote kernels with the -linkcreate argument as described earlier, note the ports on which the remote kernels are listening, and use ConnectSlave for each remote kernel.
ConnectSlave["linkname"]    connect to a listening link on the given computer
ConnectSlave[link]          connect to an existing MathLink object

Connecting to listening kernels.

Port numbers will usually be different each time you start a remote kernel; therefore, this method cannot easily be automated.

Preparing a Host Description List

To automate the task of starting remote kernels, you can prepare a list of available machines.
RemoteMachine["remotehost", "oscommands"]
a host description for a computer named remotehost using the command oscommands for connection and using the default connection type defined for LaunchSlave
RemoteMachine["remotehost"]
host description for a computer named remotehost using the default command $RemoteCommand for connection
"remotehost"
simple host name; shortcut for RemoteMachine["remotehost"]
$RemoteCommand
the default oscommands to use
RemoteMachine["localhost"]
host description for the local machine
$mathkernel
the default command to start a local kernel

Host description entries.

The command LaunchSlaves[list] takes a list of such host descriptions as an argument and tries to establish a connection to each of the hosts listed.
Finally, you can assign a list of host descriptions to the global variable $AvailableMachines and use LaunchSlaves[] without an argument, which will consult this variable.
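The different kinds of host description can be combined in one list, as in the following sketch. The host names and the rsh template are hypothetical.

```mathematica
$AvailableMachines = {
  "host1",                        (* plain name: uses $RemoteCommand *)
  RemoteMachine["host2"],         (* equivalent to the plain string form *)
  RemoteMachine["host3",          (* host with its own command template *)
    "rsh `1` /usr/local/bin/math -mathlink -linkmode Connect `4` -linkname `2`"],
  RemoteMachine["localhost"]      (* one kernel on the local machine *)
};

(* without an argument, LaunchSlaves consults $AvailableMachines *)
LaunchSlaves[]
```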

Defining a Default Configuration

Note that you can put assignments for $AvailableMachines and $RemoteCommand into your personal Mathematica kernel startup file init.m.
You do not need to load Parallel Computing Toolkit to define a default configuration. There is a smaller package Parallel`Configuration` that you can load instead.
Alternatively, you can put assignments for $AvailableMachines and $RemoteCommand into a notebook and evaluate them to set up the connections. You can use one of the samples here as a template. Copy the appropriate cell group into a new notebook and save it under a name such as UnixInit.nb or WindowsInit.nb. Then you can simply open this notebook and evaluate its cells to set up your remote kernels.

Sample configuration for Windows

Some of the input cells in this template have been made inactive (not evaluatable), because they contain commands for optional features. Enable these cells on a case-by-case basis according to your needs. These commands will evaluate properly only if you have access to a Windows machine on which to start a kernel and you substitute valid values for variable arguments.
Load the Parallel Computing Toolkit package.
Set your default remote command if the default is not suitable.
Set the default remote username if it is different from your local user name.
If you have ssh available, you can set $RemoteCommand to use ssh.
Set the default initialization for your remote kernels.
List any normally available machines, filling in the hostname variable in each entry.
Now you can try to start a remote kernel on all defined remote machines.
You can also put LaunchSlave or ConnectSlave commands for special cases here and evaluate them as needed.
Start a kernel on the local machine.
Connect to a manually started remote kernel.
Now verify that all remote kernels are operating correctly by collecting information about them.
After finishing your computations, you should close all connections.

Sample configuration for Unix and Mac OS X

Some of the input cells in this template have been made inactive (not evaluatable), because they contain commands for optional features. Enable these cells on a case-by-case basis according to your needs. These commands will evaluate properly only if you have access to a Unix machine on which to start a kernel and you substitute valid values for variable arguments.
Load the Parallel Computing Toolkit package.
Set your default $RemoteCommand.
Set the default remote username if it is different from your local user name.
Set the default initialization for your remote kernels.
List any normally available machines.
Now you can try to start a remote kernel on all defined remote machines.
You can also put LaunchSlave or ConnectSlave commands for special cases here and evaluate them as needed.
Start a kernel on the local machine.
Connect to a manually started remote kernel.
Now verify that all remote kernels are operating correctly by collecting information about them.
After finishing your computations, close all connections.
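The steps of this template can be sketched as a sequence of inputs. All values are hypothetical placeholders to be replaced with your own. The package context name Parallel` in the first line is an assumption, and the listing command is built from the accessor functions described in this chapter, not taken from the original template.

```mathematica
(* load the package; the context name is an assumption *)
Needs["Parallel`"]

(* default remote command and remote user name *)
$RemoteCommand = "ssh `1` math -mathlink -linkmode Connect `4` -linkname `2`";
$RemoteUserName = "jdoe";

(* default initialization for remote kernels (the documented default) *)
$InitCode = Hold[$DisplayFunction = Identity;];

(* normally available machines, then start a kernel on each of them *)
$AvailableMachines = {"host1", "host2"};
LaunchSlaves[]

(* special cases, evaluated as needed *)
LaunchSlave["localhost"]
ConnectSlave["1234@host3,1235@host3"]

(* verify the connected kernels *)
{ProcessorID[#], ProcessorName[#]} & /@ $Slaves

(* close all connections when done *)
CloseSlaves[]
```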

Kernel Initialization

To prevent the execution of the initialization commands you may have put into your init.m file, add the argument -noinit to any kernel invocation command. This is recommended unless you have put specific commands for initializing remote kernels into init.m.
You can put your remote kernel initialization commands into the PCT variable $InitCode.
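For example, the default value can be extended with further settings, as in this sketch. The additional $HistoryLength setting is only an illustration of the pattern and not a recommendation from the Toolkit.

```mathematica
(* commands are wrapped in Hold so that they are evaluated on the
   remote kernels and not on the master kernel *)
$InitCode = Hold[
  $DisplayFunction = Identity;   (* the documented default *)
  $HistoryLength = 0;            (* illustration: keep no In/Out history *)
];
```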

Housekeeping

The list of available remote kernels is given in $Slaves. This is a read-only variable that contains the active kernel objects you have previously opened with LaunchSlaves, LaunchSlave, or ConnectSlave.
Length[$Slaves] gives you the number of currently connected remote kernels, that is, the available degree of parallelism.

The Remote Kernel Object

The properties of the remote kernel objects can be obtained with these functions.
ProcessorID[kernel]       a unique integer assigned to each kernel
ProcessorName[kernel]     the name of the machine on which the kernel is running
ProcessorSpeed[kernel]    an estimate of the relative speed of the remote processor
LinkObject[kernel]        the raw MathLink LinkObject that connects to the remote kernel

Properties of remote kernel objects.

To get a nicely formatted listing of properties of the remote kernel connections, use this command. The command is followed by output from a sample session.
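A comparable listing can be assembled from the accessor functions in the table above. This sketch is not necessarily the Toolkit's own listing command; the column headings are arbitrary.

```mathematica
(* one row per connected kernel: ID, host name, relative speed *)
TableForm[
  {ProcessorID[#], ProcessorName[#], ProcessorSpeed[#]} & /@ $Slaves,
  TableHeadings -> {None, {"ID", "host", "speed"}}]
```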

Remote Properties

The variable $ProcessorID is set on each remote kernel to its own processor ID.
To get a nicely formatted listing of this and other standard properties of the remote kernels, use this command. The command is followed by output from a sample session.

Troubleshooting

If you get an error message and the result $Failed when using LaunchSlave, the connection could not be established. There are a number of reasons this can happen:
  • The remote computer cannot be reached over the network, or you do not have sufficient privileges to execute remote commands on the computer.
  • The remote computer does not run an ssh or rsh daemon. Such daemons are standard under Unix and Mac OS X and available as third-party products under Windows.
  • Mathematica may not be installed correctly on the remote computer, the math command may not be on your search path, or you do not have a sufficient number of Mathematica licenses.
  • Your remote execution command on Windows has exceeded the low, arbitrary limit on command length that Microsoft imposes on command execution. Please refer to the section Remote Execution under Windows for more details. In most cases, Parallel Computing Toolkit will tell you that running the command has failed with exit code -1.
You can still continue to use any remote kernels that you could launch correctly; failed connections will never be used by the Toolkit.
The variable $Slaves gives the current list of remote connections that started up normally. If there is at least one, you can continue to work with this package. Evaluating the expression $Slaves will return the value of this variable.
To diagnose network problems, you can use the netstat operating system command in a Unix shell or MS-DOS window. You should try command-line arguments to find which will work on your operating system; most likely it will be one of the following.
The output of netstat will list existing TCP connections to remote computers. Each remote kernel will occupy one or two such TCP connections.

Tracing MathLink Commands

With debugging and tracing enabled, LaunchSlave[] will show you which MathLink commands it runs to establish a connection with a remote kernel. To use these features, you have to load the debugging package before loading the toolkit itself.
In[1]:=
In[2]:=
Now you can enable MathLink tracing.
In[3]:=
Out[3]=

Sample trace of a callback connection

This sample output shows how a default connection to a remote host is established. The output shown is
  • the LinkCreate[] command used to establish a listening link on the local machine
  • the resulting link created
  • the command run to start the remote kernel with all placeholders filled in
  • the exit code of this command (should be 0)
  • the remote kernel object of the new kernel
In[4]:=
Out[4]=

Sample trace of a LinkLaunch connection

This sample output shows how a default connection to the local host is established. The output shown is
  • the LinkLaunch[] command used to start the kernel, containing the operating system command to start the kernel itself and the LinkLaunch[] options used
  • the resulting link created
  • the remote kernel object of the new kernel
In[5]:=
Out[5]=
To turn off tracing when you are done, use
In[6]:=
Out[6]=

Resetting and Terminating Remote Kernels

Resetting Kernels

If remote kernels do not respond after you have aborted the master kernel during a parallel computation, you can try to reset them and bring them back into a usable state.
Abort[kernel]    abort a kernel (interrupt any running evaluations)
ResetSlaves[]    discard any processes in the queue and abort all running evaluations

Abort and reset kernels.

ResetSlaves can be used after a parallel computation has been aborted with the menu command Kernel > Abort Evaluation or the corresponding keyboard shortcut.
For some remote kernel connections, notably for kernels started on remote machines using LinkLaunch, there may be no way to interrupt them. If a remote evaluation takes too long or is in an infinite loop, you must terminate the remote kernel process using the appropriate operating system command.

Clearing Kernels

Between different parallel computations, you may want to make sure that all remote kernels delete any variable definitions that may have been set. Rather than terminating and restarting all kernels, you can use ClearSlaves.
ClearSlaves[]    clears all variables in the remote kernel's Global` context and forgets any shared variables and exported environments

Clearing definitions.

Definitions for symbols in contexts other than Global` are not cleared.
Any definitions of global symbols exported with ExportEnvironment will become unavailable. Any shared global variables will become unshared.

Terminating Kernels

When you are done with your parallel computations, close any open remote kernel connections. This frees the resources occupied on the remote machines and closes the open network connections.
Close[kernel]    closes the given connection link and removes it from $Slaves
CloseSlaves[]    terminates all open connections

Terminating kernels.

Note that exiting the local master kernel may or may not close the open connections cleanly. Always use CloseSlaves[] before exiting the master kernel.