Legacy Documentation

Parallel Computing Toolkit (2000)

This is documentation for an obsolete product.

Virtual Shared Memory

Shared Memory versus Distributed Memory

Special-purpose multiprocessing hardware comes in two types, shared memory and distributed memory. In a shared-memory machine, all processors have access to a common main memory. In a distributed-memory machine, each processor has its own main memory, and the processors are connected through a sophisticated network. A collection of networked PCs is also a kind of distributed-memory parallel machine.
Communication between processors is an important prerequisite for all but the most trivial parallel processing tasks. In a shared-memory machine, a processor can simply write a value into a particular memory location, and all other processors can read this value. In a distributed-memory machine, exchanging values of variables involves explicit communication over the network.

Virtual Shared Memory

Virtual shared memory is a programming model that allows processors on a distributed-memory machine to be programmed as if they had shared memory. A software layer takes care of the necessary communication in a transparent way.
Parallel Computing Toolkit uses independent Mathematica kernels as parallel processors. These kernels do not share a common memory, even if they happen to reside on the same machine. The package Parallel`VirtualShared`, which is part of the Toolkit, implements virtual shared memory for these remote kernels.
The package is normally set up to be autoloaded the first time you declare a shared variable. To load it explicitly, use Needs["Parallel`VirtualShared`"].
The result is a simple programming model. If a variable a is shared, any kernel that reads the variable (simply by evaluating it) obtains the common value maintained by the master kernel. Any kernel that changes the value of a, for example by assigning it with a = val, modifies the one global copy of a, so all other kernels that subsequently read the variable see its new value.
The drawback of a shared variable is that every read or write access requires communication over the network, so it is slower than access to a local, unshared variable.
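As a minimal sketch of this model (kernel-control commands such as RemoteEvaluate are assumed from the Toolkit; exact names may differ in your setup):

```mathematica
Needs["Parallel`VirtualShared`"]  (* normally autoloaded *)

a = 0;               (* the variable keeps its value in the master kernel *)
SharedVariables[a]   (* declare a as shared *)

(* every remote kernel now operates on the one global copy *)
RemoteEvaluate[a++]  (* atomic increment, performed in the master kernel *)
a                    (* the master kernel sees the updated value *)
```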

Declaring Shared Variables

SharedVariables[{s1, s2, ...}]: declares the symbols si as shared variables
SharedVariables[s1, s2, ...]: same as SharedVariables[{s1, s2, ...}]
SharedDownValues[{f1, f2, ...}]: declares the symbols fi as shared functions or data types
SharedDownValues[f1, f2, ...]: same as SharedDownValues[{f1, f2, ...}]

Declaring shared variables and downvalues.

The command SharedVariables has the attribute HoldAll to prevent evaluation of the given variables, which usually have values.
The effect of SharedVariables or SharedDownValues is that all currently connected and newly launched remote kernels will perform all accesses to the shared variables through the master kernel.
$SharedVariables: the list of currently shared variables (wrapped in Hold[])
$SharedDownValues: the list of currently shared downvalues (wrapped in Hold[])
ClearShared[s1, s2, ...]: unshares the given variables or downvalues
ClearShared[]: unshares all variables and downvalues

Manipulating the set of shared variables and downvalues.

Clearing kernels with ClearSlaves[] will also clear any shared variables and downvalues.

SharedVariables

A variable s that has been declared shared with SharedVariables[s] exists only in the master (local) kernel. The following operations on a remote kernel are redefined so that they have the described effect.
s: evaluating the variable consults the master kernel for its current value
s = e, s := e: assigning a value to s performs the assignment in the master kernel
s++, s--, ++s, --s: the increment/decrement operation is performed in the master kernel (this operation is atomic and can be used for synchronization)
TestAndSet[s, e]: if s has no value or its value is Null, sets the value to e; otherwise, does nothing and returns the current value of s (this operation is atomic and can be used for synchronization)
Part[Unevaluated[s], i]: extracts a part of s; the operation transmits only the requested part over the MathLink connection, not the whole value of s
s[[i]] = e: replaces the specified part of the variable with a new value; the old value of s must have the necessary structure to permit the part assignment

Operations on shared variables.

For technical reasons, every shared variable must have a value. If the variable in the master kernel does not have a value, it is set to Null.
Note that other forms of assignments, such as conditional assignments involving side conditions, are not supported.
The customary form of part extraction, s[[i]], will transmit the whole value of s to the slave kernels. Use Part[Unevaluated[s],i] to transmit only the ith component.
If a variable is Protected at the time you declare it as shared, remote kernels can only access the variable, but not change its value.
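For example, the part operations might be used as follows (a sketch; RemoteEvaluate is assumed to evaluate its argument on a remote kernel):

```mathematica
s = Table[i^2, {i, 10}];
SharedVariables[s]

(* on a remote kernel: transmit only the third element, not the whole list *)
RemoteEvaluate[Part[Unevaluated[s], 3]]   (* 9 *)

(* part assignment is performed on the master kernel's copy *)
RemoteEvaluate[s[[3]] = 0]
```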

SharedDownValues

A symbol f that has been declared shared with SharedDownValues[f] exists only in the master (local) kernel. The following operations on a remote kernel are redefined so that they have the described effect.
f[i], f[i, j], ...: evaluating the function or array element f[i], and so forth, consults the master kernel for the symbol's current downvalue
f[i] = e, f[i, j] = e, f[i] := e, ...: defining a value for f[i], and so forth, performs the definition in the master kernel
f[i]++, f[i]--, ++f[i], --f[i]: the increment/decrement operation is performed in the master kernel (this operation is atomic and can be used for synchronization)
TestAndSet[f[i], e]: if f[i] has no value or its value is Null, sets the value to e; otherwise, does nothing and returns the current value of f[i] (this operation is atomic and can be used for synchronization)

Operations on shared functions.

For technical reasons, every expression of the form f[...] must have a value. If the expression f[...] in the master kernel does not evaluate, the result is set to Null.
Note that other forms of assignments, such as conditional assignments involving side conditions, are not supported.
You can define shared functions, as in the following. Be sure that the symbol x does not have a value in either the remote kernels or the master kernel, and that x is not itself a shared variable.
If you make a delayed assignment on a remote kernel, the right side of the definition will be evaluated on the remote kernel when you use the function. In an immediate assignment, it is evaluated on the master kernel.
If you make a delayed assignment on the master kernel, the right side of the definition will be evaluated on the master kernel when you use the function. To cause the right side to be evaluated on the remote kernel nevertheless, use SendBack[]:
You can implement indexed variables or arrays using shared downvalues of the form x[1], x[2], and so forth.
If a function is Protected when you declare it as shared, remote kernels can only use it, but not change its definition.
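The behavior of delayed and immediate definitions might be sketched as follows (a hypothetical session; the exact effect of SendBack[] is assumed from the description above):

```mathematica
SharedDownValues[f]

(* delayed definition made on a remote kernel: the right side, i + x,
   is evaluated on the remote kernel each time f[i] is used there *)
RemoteEvaluate[f[i_] := i + x]

(* delayed definition made on the master kernel: the right side is
   evaluated on the master kernel; wrapping it in SendBack[] forces
   evaluation on the remote kernel that calls f instead *)
f[i_] := SendBack[i + x]

(* indexed variables: shared downvalues of the form tab[1], tab[2], ... *)
SharedDownValues[tab]
RemoteEvaluate[tab[1] = 42; tab[1]]
```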

Basic Example

Load the toolkit package, then start a few local or remote kernels.
In[1]:=
Assign the initial value 17 to x and declare x as a shared variable.
In[2]:=
At least two remote kernels should be running. Assign them to two variables for easier use.
In[4]:=
The kernel r1 now has access to the common value of x.
In[5]:=
Out[5]=
Kernel r2 can change the value of x to 18.
In[6]:=
Out[6]=
The local copy of x on the master kernel has been changed as well.
In[7]:=
Out[7]=
Kernel r1 sees the new value, too.
In[8]:=
Out[8]=
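Put together, the session described above might look like this (a sketch; LaunchSlave, $Slaves, and RemoteEvaluate are assumed from the Toolkit's kernel-control interface):

```mathematica
Needs["Parallel`VirtualShared`"]
LaunchSlave["localhost"]; LaunchSlave["localhost"];

x = 17;             (* initial value, kept in the master kernel *)
SharedVariables[x]

{r1, r2} = Take[$Slaves, 2];

RemoteEvaluate[x, r1]        (* 17: r1 reads the common value *)
RemoteEvaluate[x = 18, r2]   (* r2 changes the shared value *)
x                            (* 18: the master's copy changed as well *)
RemoteEvaluate[x, r1]        (* 18: r1 sees the new value, too *)
```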

Synchronization

In a situation where several concurrently running remote kernels access the same shared variable for reading and writing, there is no guarantee that the variable is not modified by another process between the time you read its value and the time you write a new one. Any value written by another process in the meantime is simply overwritten.

Example: Critical Sections

This classic example of uncontrolled access to a shared variable illustrates the problem. To try out this example, you should have between two and ten remote kernels running.
The code inside the first argument of ParallelMap is the client code that is executed independently on the available remote kernels. The code reads the shared variable y, stores its value in a local variable a, performs some computation (here simulated with Pause), and then tries to increment y by setting it to a + 1. By that time, however, the value of y is most likely no longer equal to a, because another process has changed it.
In[9]:=
In[10]:=
Out[11]=
If this code were run sequentially (by changing ParallelMap into Map), the final value of y would be 10, but with enough parallel processes, it will most likely be lower.
In[12]:=
Out[12]=
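The uncontrolled update described above can be sketched as follows (the timing via Pause and Random is only meant to make the race visible):

```mathematica
y = 0;
SharedVariables[y]

ParallelMap[
  Module[{a},
    a = y;             (* read the shared variable *)
    Pause[Random[]];   (* simulate some computation *)
    y = a + 1          (* write back; may overwrite another process's update *)
  ] &,
  Range[10]
]

y   (* 10 when run sequentially; most likely less when run in parallel *)
```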
The code between reading the variable y and setting it to a new value is called a critical section. During its execution, no other process should read or write y. To reserve a critical section, a process can acquire an exclusive lock before entering the critical section and release the lock after leaving the critical section.
Parallel Computing Toolkit provides the operation TestAndSet[lck, e] to acquire a lock. The argument lck must be a shared variable; once a process has set lck to a unique value, no other process can set it. To release the lock, the process that acquired it simply sets lck to Null.
Here is the previous example with the additional code to implement locking. The processes use the unique integer function parameter # as the value of the lock. This value is guaranteed to be different for each process. To acquire the lock, a process performs TestAndSet[lck,#] and then checks whether the result is equal to #. If it is, the locking was successful. If it is not, some other process currently holds the lock. The returned value will even tell you which process holds the lock.
If locking fails, a process has no choice but to try again until it eventually succeeds. Note that between attempts to acquire the lock (inside While) the process waits for a short time. Otherwise, processes waiting for a lock held by another process would put a heavy load on the master kernel.
In[13]:=
In[14]:=
Out[15]=
In[16]:=
Out[16]=
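With locking added, the client code might read as follows (a sketch; # is the process's unique integer parameter, used as the lock value):

```mathematica
y = 0; lck = Null;
SharedVariables[{y, lck}]

ParallelMap[
  (
    (* acquire the lock; retry with a delay while another process holds it *)
    While[TestAndSet[lck, #] =!= #, Pause[0.1]];
    (* critical section: no other process reads or writes y now *)
    Module[{a}, a = y; Pause[Random[]]; y = a + 1];
    lck = Null   (* release the lock *)
  ) &,
  Range[10]
]

y   (* now reliably 10 *)
```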
Locking slows down a computation because remote processes may have to wait for one another. In this example the result is essentially sequential execution. You should keep critical sections as short as possible. If a process sets a lock but never releases it, a deadlock may occur in which any other process waiting to acquire the lock will wait forever.
You can also use other atomic operations such as lck++ for locking purposes.

Tracing the Computation of This Example

For debugging shared-variable operations, you can enable tracing, provided you load the debug package before the Toolkit itself.
In[1]:=
In[2]:=
In[3]:=
Now you can enable SharedMemory tracing.
In[4]:=
Out[4]=
In[5]:=
Out[6]=
In[7]:=