Virtual Shared Memory

Shared Memory versus Distributed Memory

Special-purpose multiprocessing hardware comes in two types, shared memory and distributed memory. In a shared-memory machine, all processors have access to a common main memory. In a distributed-memory machine, each processor has its own main memory, and the processors are connected through a sophisticated network. A collection of networked PCs is also a kind of distributed-memory parallel machine.

Communication between processors is an important prerequisite for all but the most trivial parallel processing tasks. In a shared-memory machine, a processor can simply write a value into a particular memory location, and all other processors can read this value. In a distributed-memory machine, exchanging values of variables involves explicit communication over the network.

Virtual Shared Memory

Virtual shared memory is a programming model that allows processors on a distributed-memory machine to be programmed as if they had shared memory. A software layer takes care of the necessary communication in a transparent way.

The Wolfram Language uses independent kernels as parallel processors. It is clear that these kernels do not share a common memory, even if they happen to reside on the same machine. However, the Wolfram Language provides functions that implement virtual shared memory for these remote kernels.

This is done with a simple programming model. If a variable a is shared, any kernel that reads the variable (simply by evaluating it) reads a common value that is maintained by the master kernel. Any kernel that changes the value of a, for example by assigning it with a=val, modifies the one global copy of the variable a, so that all other kernels that subsequently read the variable see its new value.

The drawback of a shared variable is that every access for read or write requires communication over the network, so it is slower than access to a local unshared variable.

Declaring Shared Variables and Functions

SetSharedVariable[s1,s2,...]    declare the symbols si as shared variables
SetSharedFunction[f1,f2,...]    declare the symbols fi as shared functions or data types

Declaring shared variables and functions.

The command SetSharedVariable has the attribute HoldAll to prevent evaluation of the given variables, which usually have values.

The effect of SetSharedVariable or SetSharedFunction is that all currently connected and newly launched remote kernels will perform all accesses to the shared variables through the master kernel.

$SharedVariables    the list of currently shared variables (wrapped in Hold[])
$SharedFunctions    the list of currently shared functions (wrapped in Hold[])
UnsetShared[s1,s2,...]    stop the sharing of the given variables or functions
UnsetShared[patt]    stop the sharing of all variables and functions whose names match the string pattern patt

Manipulating the set of shared variables and functions.
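
The functions in the two tables above can be combined as in the following minimal sketch. The symbols a and b and their values are illustrative assumptions, not part of the original text, and the sketch assumes that parallel subkernels can be launched on your machine.

```wolfram
(* Assumption: parallel subkernels can be launched on this machine. *)
LaunchKernels[];

(* Give the illustrative symbols values, then declare them shared. *)
a = 1; b = {1, 2, 3};
SetSharedVariable[a, b];

(* Inspect the currently shared variables (wrapped in Hold). *)
$SharedVariables

(* Stop sharing a; b remains shared. *)
UnsetShared[a];
$SharedVariables
```

After the call to UnsetShared, the second evaluation of $SharedVariables should no longer list a.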

Clearing kernels with Parallel`Developer`ClearSlaves[] will also clear any shared variables and downvalues.

Shared Variables

A variable s that has been declared shared with SetSharedVariable[s] exists only in the master (local) kernel. The following operations on a remote kernel are redefined so that they have the described effect.

s    evaluation of the variable will consult the master kernel for the variable's current value
s=e, s:=e    assigning a value to s will perform the assignment in the master kernel
s++, s--, ++s, --s, s+=k, s-=k, s*=k, s/=k, AppendTo[s,k]    the increment/decrement operation is performed in the master kernel (this operation is atomic and can be used for synchronization)
Part[Unevaluated[s],i]    extract a part of s; the operation will transmit only the requested part over the Wolfram Symbolic Transfer Protocol (WSTP) connection, not the whole value of s
s[[i]]=e    replace the specified part of the variable with a new value; the old value of s must have the necessary structure to permit the part assignment

Operations on shared variables.

For technical reasons, every shared variable must have a value. If the variable in the master kernel does not have a value, it is set to Null.

Note that other forms of assignments, such as conditional assignments involving side conditions, are not supported.

The customary form of part extraction, s[[i]], will transmit the whole value of s to the remote kernels. Use Part[Unevaluated[s],i] to transmit only the ith component.
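
The operations listed above can be tried out with a short sketch such as the following. The names count and data are hypothetical and chosen only for illustration; the sketch assumes that subkernels are already running or are launched on demand by the parallel functions.

```wolfram
(* count and data are illustrative names, not from the original text. *)
count = 0; data = Range[1000];
SetSharedVariable[count, data];

(* Each increment is carried out in the master kernel. *)
ParallelDo[count++, {i, 100}];
count

(* Request only the third element of data from the master kernel,
   instead of transmitting the whole list to every subkernel. *)
ParallelEvaluate[Part[Unevaluated[data], 3]]
```

Because the increment is atomic, no update should be lost even though several kernels increment count concurrently. ParallelEvaluate returns one result per running kernel.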

If a variable is Protected at the time you declare it as shared, remote kernels can read the variable but cannot change its value.

Basic Example

Start a few local or remote kernels.

Assign the initial value 17 to x and declare x as a shared variable.

At least two remote kernels should be running. Assign them to two variables for easier use.

The kernel r1 now has access to the common value of x.

Kernel r2 can change the value of x to 18.

The local copy of x on the master kernel has been changed as well.

Kernel r1 sees the new value, too.
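
The steps of this example can be sketched as one session. This is a reconstruction from the descriptions above rather than the original input cells; in particular, the use of Take[Kernels[], 2] to pick the two kernels is an assumption.

```wolfram
(* Start a few local kernels; the number depends on your configuration. *)
LaunchKernels[];

(* Assign the initial value 17 to x and declare x as shared. *)
x = 17;
SetSharedVariable[x];

(* Pick two of the running kernels for easier reference. *)
{r1, r2} = Take[Kernels[], 2];

(* Kernel r1 reads the common value of x. *)
ParallelEvaluate[x, r1]

(* Kernel r2 changes the value of x to 18. *)
ParallelEvaluate[x = 18, r2]

(* The value on the master kernel has changed as well. *)
x

(* Kernel r1 sees the new value, too. *)
ParallelEvaluate[x, r1]
```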

Shared Functions

A symbol f that has been declared shared with SetSharedFunction[f] exists only in the master (local) kernel. The following operations on a remote kernel are redefined so that they have the described effect.

f[i], f[i,j], ...    evaluation of the function or array element f[i], and so forth, will consult the master kernel for the symbol's current downvalue
f[i]=e, f[i,j]=e, f[i]:=e, ...    defining a value for f[i], and so forth, will perform the definition in the master kernel
f[[i]]++, f[[i,j]]--, ++f[[i]], --f[[i]]    the increment/decrement operation is performed in the master kernel (this operation is atomic and can be used for synchronization)

Operations on shared functions.

For technical reasons, every expression of the form f[...] must have a value. If the expression f[...] in the master kernel does not evaluate, the result is set to Null.

Note that other forms of assignments, such as conditional assignments involving side conditions, are not supported.

You can define shared functions, as in the following. Be sure that the symbol x does not have a value in either the remote kernels or in the master kernel. The symbol x should not be a shared variable.

If you make a delayed assignment on a remote kernel, the right side of the definition will be evaluated on the kernel where you use the function. Any immediate assignment is always evaluated on the master kernel.

You can implement indexed variables or arrays using shared downvalues of the form x[1], x[2], and so forth.
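
A minimal sketch of such an indexed variable follows. The symbol tbl is hypothetical and is used here instead of x so that it cannot clash with any existing definitions; the sketch assumes that subkernels are running or are launched on demand.

```wolfram
(* tbl is a hypothetical symbol; make sure it has no prior definitions. *)
Clear[tbl];
SetSharedFunction[tbl];

(* Each assignment tbl[i] = ... is performed in the master kernel. *)
ParallelDo[tbl[i] = i^2, {i, 5}];

(* Read the entries back on the master kernel. *)
Table[tbl[i], {i, 5}]
```

All definitions end up as downvalues of tbl in the master kernel, so every kernel that later evaluates tbl[i] sees the same entries.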

If a function is Protected when you declare it as shared, remote kernels can use it but cannot change its definition.

Synchronization

In a situation where several concurrently running remote kernels access the same shared variable for reading and writing, there is no guarantee that the value of a variable is not changed by another process between the time you read a value and write a new value. Any other new value that another process wrote in the meantime would get overwritten.

Example: Critical Sections

This classic example of uncontrolled access to a shared variable illustrates the problem. To try out this example, you should have between two and 10 remote kernels running.
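
The following is a sketch of such an example, reconstructed from the description in this section rather than copied from the original input. The random delay produced by Pause[RandomReal[]] is an assumption that stands in for a real computation.

```wolfram
(* Shared counter, initially zero. *)
y = 0;
SetSharedVariable[y];

ParallelMap[
  Module[{a},
    a = y;                (* read the shared variable *)
    Pause[RandomReal[]];  (* simulate some computation *)
    y = a + 1             (* write back a possibly stale value + 1 *)
  ] &,
  Range[10]
];

(* Inspect the final value of the shared variable. *)
y
```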

The code inside the first argument of ParallelMap is the client code that is executed independently on the available remote kernels. The code reads the shared variable y, stores its value in a local variable a, performs some computations (here simulated with Pause), and then wants to increment the value of y by setting it to a+1. But by that time, the value of y is most likely no longer equal to a, because another process will have changed it.

If this code were run sequentially (by changing ParallelMap into Map), the final value of y would be 10, but with enough parallel processes, it will most likely be lower.

The code between reading the variable y and setting it to a new value is called a critical section. During its execution, no other process should read or write y. To reserve a critical section, a process can acquire an exclusive lock before entering the critical section and release the lock after leaving the critical section.

The Wolfram Language provides the function CriticalSection[lck,expr] to acquire a lock, evaluate expr, and then release the lock. Once a process has acquired the lock, no other process can do so until it is released. The lock is released when the expression finishes evaluation.

Here is the previous example with the additional code to implement locking. If a kernel fails to acquire a lock, it has no choice but to try again until it eventually succeeds.
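
A minimal sketch of the locked version follows. The lock symbol lcky is hypothetical, and the lock is passed as a list, which is the form of CriticalSection this sketch assumes. The sketch also assumes that CriticalSection itself waits until the lock becomes available, so it contains no explicit retry loop.

```wolfram
(* Shared counter and a hypothetical lock symbol lcky. *)
y = 0;
SetSharedVariable[y];
Clear[lcky];

ParallelMap[
  Module[{a},
    (* Only one kernel at a time may evaluate the body. *)
    CriticalSection[{lcky},
      a = y;
      Pause[RandomReal[]];
      y = a + 1
    ]
  ] &,
  Range[10]
];

(* Inspect the final value of the shared variable. *)
y
```

With the read and the write protected by the same lock, no increment should be lost, at the cost of running the critical sections one after another.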

Note that between attempts to acquire the lock (inside While) the process waits for a while. Otherwise, processes waiting to acquire a lock that is reserved for another process will put a heavy load onto the master kernel.

Locking slows down a computation because remote processes may have to wait for one another. In this example, the result is essentially sequential execution. You should keep critical sections as short as possible. If two processes each hold a lock and then try to acquire each other's lock, a deadlock occurs in which both processes wait forever.

Tracing the Computation of This Example

For debugging shared variable operations, you can enable tracing provided you loaded the debug package before the toolkit itself.

Now you can enable SharedMemory tracing.
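
The two steps above might look as follows. This is a sketch based on the Parallel`Debug` package as I understand it; the option name Tracers and the object $Parallel are assumptions that you should check against the debugging documentation for your version. The package has to be loaded in a fresh session, before any parallel function is used.

```wolfram
(* In a fresh session, load the debug package first. *)
Needs["Parallel`Debug`"]

(* Assumed option and tracer names: enable tracing of shared-memory operations. *)
SetOptions[$Parallel, Tracers -> {SharedMemory}]
```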