# Random Number Generation

Introduction | Methods |

Random Generation Functions | Statistical Distributions |

Seeding and Localization | References |

## Introduction

The ability to generate pseudorandom numbers is important for simulating events, estimating probabilities and other quantities, making randomized assignments or selections, and numerically testing symbolic results. Such applications may require uniformly distributed numbers, nonuniformly distributed numbers, elements sampled with replacement, or elements sampled without replacement.

The functions RandomReal, RandomInteger, and RandomComplex generate uniformly distributed random numbers. RandomVariate generates numbers for built-in distributions. RandomPrime generates primes within a range. The functions RandomChoice and RandomSample sample from a list of values with or without replacement. The elements may have equal or unequal weights. A framework is also included for defining additional methods and distributions for random number generation.

A sequence of nonrecurring events can be simulated via RandomSample. For instance, the probability of randomly sampling the integers 1 through in order might be simulated.

In[10]:= |

Out[10]= |

In[11]:= |

Out[11]= |

Random number generation is at the heart of Monte Carlo estimates. An estimate of an expected value of a function can be obtained by generating values from the desired distribution and finding the mean of applied to those values.

In[12]:= |

Out[12]= |

In[13]:= |

Out[13]= |

Random processes can be simulated by generating a series of numbers with the desired properties. A random walk can be created by recursively summing pseudorandom numbers.

In[14]:= |

Out[14]= |

Substitution of random numbers can be used to test the equivalence of symbolic expressions. For instance, the absolute difference between two expressions could be evaluated at randomly generated points to test for inequality of the expressions.

In[15]:= |

Out[15]= |

In[16]:= |

Out[16]= |

RandomPrime chooses prime numbers with equal probability, which can be useful—for instance, to generate large primes for RSA encryption. The prime numbers are uniformly distributed on the primes in the range but are not uniformly distributed on the entire range because primes are in general not uniformly distributed over ranges of positive integers.

In[17]:= |

Out[17]= |

## Random Generation Functions

The main functions are RandomReal, RandomInteger, RandomComplex, RandomVariate, RandomChoice, and RandomSample. RandomReal, RandomInteger, and RandomComplex generate numbers given some range of numeric values. RandomVariate generates numbers from a statistical distribution. RandomChoice and RandomSample generate elements from finite sets that may include non-numeric values.

### Random Numbers

RandomReal generates pseudorandom real numbers over a specified range of real values. RandomInteger generates pseudorandom integer numbers over a specified range of integer values. RandomComplex generates pseudorandom complex numbers over a specified rectangular region in the complex plane. RandomVariate generates pseudorandom numbers from a specified statistical distribution. RandomPrime generates prime numbers with equal probability within a range.

RandomReal[] | give a pseudorandom real number in the range 0 to 1 |

RandomReal[{x_{min},x_{max}}] | give a pseudorandom real number in the range to |

RandomReal[x_{max}] | give a pseudorandom real number in the range 0 to |

RandomReal[domain,n] | give a list of n pseudorandom reals |

RandomReal[domain,{n_{1},n_{2},...}] | give an ××... array of pseudorandom reals |

RandomInteger[{i_{min},i_{max}}] | give a pseudorandom integer in the range |

RandomInteger[i_{max}] | give a pseudorandom integer in the range |

RandomInteger[] | pseudorandomly give 0 or 1 with probability |

RandomInteger[domain,n] | give a list of n pseudorandom integers |

RandomInteger[domain,{n_{1},n_{2},...}] | give an ××... array of pseudorandom integers |

Generation of random integers.

RandomComplex[] | give a pseudorandom complex number in the unit square |

RandomComplex[{z_{min},z_{max}}] | give a pseudorandom complex number in the rectangle bounded by and |

RandomComplex[z_{max}] | give a pseudorandom complex number in the rectangle bounded by 0 and |

RandomComplex[domain,n] | give a list of n pseudorandom complex numbers |

RandomComplex[domain,{n_{1},n_{2},...}] | give an ××... array of pseudorandom complex numbers |

Generation of random complex numbers.

RandomComplex[dist] | give a pseudorandom value from the distribution dist |

RandomComplex[dist,n] | give a list of n pseudorandom values from dist |

RandomComplex[dist,{n_{1},n_{2},...}] | give an ××... array of pseudorandom values from dist |

Generation of random values from a distribution.

RandomPrime[{i_{min},i_{max}}] | give a pseudorandom prime in the range |

RandomPrime[i_{max}] | give a pseudorandom prime in the range 2 to |

RandomPrime[domain,n] | give a list of n pseudorandom primes |

RandomPrime[domain,{n_{1},n_{2},...}] | give an ××... array of pseudorandom primes |

When the domain is specified in terms of and , RandomReal and RandomInteger generate uniformly distributed numbers over the specified range. RandomVariate uses rules defined for the specified distribution. Additionally, mechanisms are included for defining new methods and distributions.

The two-argument interface provides a convenient way to obtain multiple random numbers at once. Even more importantly, there is a significant efficiency advantage to generating a large number of pseudorandom numbers at once.

In[18]:= |

Out[18]= |

In[19]:= |

Out[19]= |

For multidimensional arrays with dimensions through , the total number of required pseudorandom numbers is generated and then partitioned. This makes the multidimensional array generation as efficient as possible because the total number of random values is generated as efficiently as possible and the time required for partitioning is negligible.

In[20]:= |

Out[20]= |

In[21]:= |

Out[21]= |

In[22]:= |

Out[22]= |

For statistical distributions, the speed advantage of generating many numbers at once can be even greater. In addition to the efficiency benefit inherited from the uniform number generators used, many statistical distributions also benefit from vectorized evaluation of elementary and special functions. For instance, WeibullDistribution benefits from vector evaluations of the elementary functions Power, Times, and Log.

In[23]:= |

Out[23]= |

In[24]:= |

Out[24]= |

Random number generation can be useful in exploratory investigations. For instance, you might look for occurrences of a random sequence of digits in a longer sequence of digits.

In[25]:= |

Out[25]= |

In[26]:= |

In[27]:= |

Out[27]= |

Random number generation is also highly useful in estimating distributions for which closed-form results are not known or known to be computationally difficult. Properties of random matrices provide one example.

In[28]:= |

Out[28]= |

In[29]:= |

Out[29]= |

An example of simulating a multivariate distribution is the Gibbs sampler used in Bayesian statistics [1]. The Gibbs sampler provides a means by which to simulate values from multivariate distributions provided the distributions of each coordinate conditional on the other coordinates are known. Under some restrictions, the distribution of random vectors constructed by iteratively sampling from the conditional distributions will converge to the true multivariate distribution.

The following example will construct a Gibbs sampler for an example given by Casella and George [2]. The distribution of interest is bivariate. The conditional distribution of given is a binomial, and the conditional distribution of given is a beta. As Casella and George mention, various strategies for detecting convergence and sampling using the Gibbs sampler have been suggested. For simplicity, assume that convergence will occur within 1000 iterations. A sample of size from the distribution will be taken as the values following the 1000 iteration. It should be noted that these values will, however, be dependent.

In[30]:= |

A Gibbs sampler could also be defined as a distribution object within the distribution framework for random number generation. An example of this particular Gibbs sampler as a distribution object is provided in "Defining Distributional Generators".

In[31]:= |

In[32]:= |

Out[33]= |

In[34]:= |

Out[34]= |

Conditional distributions should closely match the assumed binomial and beta distributions provided there is enough data for the conditional distribution. The greatest amount of data occurs when the densities of the marginal distributions are highest, so those values can be used for comparisons. The following graphics compare the empirical and assumed conditional distributions, using bins of width .05 for estimating probabilities of continuous values.

In[35]:= |

Out[35]= |

In[36]:= |

Out[36]= |

#### Arbitrary-Precision Reals and Complexes

By default, RandomReal and RandomComplex generate machine-precision numbers. RandomVariate generates machine numbers for continuous distributions by default. Arbitrary-precision numbers can be obtained by setting the WorkingPrecision option.

option name | default value | |

WorkingPrecision | MachinePrecision | precision of the arithmetic to use in calculations |

Option for RandomReal, RandomComplex, and RandomVariate.

The option is valid for uniformly distributed reals, complexes, and reals from built-in distributions. WorkingPrecision can also be incorporated into user-defined distributions.

In[37]:= |

Out[37]= |

In[38]:= |

Out[38]= |

Increased WorkingPrecision can be useful in simulations where loss of precision can be expected and highly accurate results are necessary. Increased precision can also be used to estimate the precision loss in computations.

In[39]:= |

Out[39]= |

If the precision of the input is less than the specified WorkingPrecision, the function will warn of the problem. The precision of the input will then be artificially increased to generate a pseudorandom number of the desired precision.

In[40]:= |

Out[40]= |

WorkingPrecision is not an option for RandomInteger. Integers have infinite precision, so the precision is completely specified by the function name.

In[41]:= |

Out[41]= |

### Random Elements

RandomChoice and RandomSample generate pseudorandom selections from a list of possible elements. The elements can be numeric or non-numeric.

RandomChoice[{e_{1},e_{2},...}] | give a pseudorandom choice of one of the |

RandomChoice[list,n] | give a list of n pseudorandom choices from list |

RandomChoice[list,{n_{1},n_{2},...}] | give ××... pseudorandom choices from list |

RandomChoice[{w_{1},w_{2},...}->{e_{1},e_{2},...}] | |

give a pseudorandom choice weighted by the | |

RandomChoice[wlist->elist,n] | give a list of n weighted choices |

RandomChoice[wlist->elist,{n_{1},n_{2},...}] | |

give an array of ××... array of weighted choices |

RandomSample[{e_{1},e_{2},...},n] | give a pseudorandom sample of n of the |

RandomSample[{w_{1},w_{2},...}->{e_{1},e_{2},...},n] | |

give a pseudorandom sample of n of the chosen using weights | |

RandomSample[{e_{1},e_{2},...}] | give a pseudorandom permutation of the |

RandomSample[wlist->elist] | give a pseudorandom permutation of elist using initial weights wlist |

The main difference between RandomChoice and RandomSample is that RandomChoice selects from the with replacement, while RandomSample samples without replacement. The number of elements chosen by RandomChoice is not limited by the number of elements in elist, and an element may be chosen more than once. The size of a sample returned by RandomSample is limited by the number of elements in elist, and the number of occurrences of a distinct element in that sample is limited by the number of occurrences of that element in elist.

If the first argument to RandomChoice or RandomSample is a list, elements are selected with equal probability. The weight specification defines a distribution on the set of the . The weights must be positive, but need not sum to 1. For weights the probability of in the initial distribution is . Since RandomSample samples without replacement, weights are updated internally based on the total remaining weight after each selection.

RandomChoice can be used for simulation of independent identically distributed events with a finite list of possible outcomes.

In[42]:= |

Out[42]= |

In[43]:= |

Out[43]= |

RandomChoice can be used to generate observations from any discrete distribution with finite support.

In[44]:= |

Out[44]= |

In[45]:= |

Out[46]= |

RandomSample can be used to simulate observations from a finite set of outcomes in which each element in the list of outcomes can only be observed once. There may be more than one occurrence of distinct values in the list.

In[47]:= |

Out[47]= |

Randomly sampling all elements in the list results in a random permutation.

In[48]:= |

Out[48]= |

Assigning weights to the elements results in a random permutation in which values with greater weight tend to appear earlier in the permutation than values with lesser weight.

In[49]:= |

Out[49]= |

For the same list of weighted or unweighted elements, RandomSample[#, 1]& is distributionally equivalent to RandomChoice.

In[50]:= |

In[51]:= |

Out[51]= |

In[52]:= |

Out[52]= |

In[53]:= |

Out[53]= |

The probabilities for the two examples are very close to each other and to the theoretical values.

In[54]:= |

Out[54]= |

RandomSample can also be used for random assignments to groups, such as in clinical trials. The following uses integers, but other identifying values such as name or identification number could be used instead.

In[55]:= |

Out[55]= |

RandomChoice and RandomSample can be affected by changes to the Method option to SeedRandom. Built-in methods are described in "Methods". Additionally, mechanisms for defining new methods are described in "Defining Your Own Generator".

## Seeding and Localization

Pseudorandom number generators algorithmically create numbers that have some apparent level of randomness. Methods for pseudorandom number generation typically use a recurrence relation to generate a number from the current state and to establish a new state from which the next number will be generated. The state can be set by seeding the generator with an integer that will be used to initialize the recurrence relation in the algorithm.

Given an initial starting point, called a seed, pseudorandom number generators are completely deterministic. In many cases it is desirable to locally or globally set the seed for a random number generator to obtain a constant sequence of "random" values. If set globally, the seed will affect future pseudorandom numbers unless a new seed is explicitly set. If set locally, the seed will only affect random number and element generation within the localized code.

BlockRandom[expr] | evaluate expr with all pseudorandom generators localized |

SeedRandom[n] | reset the pseudorandom generator using n as a seed |

SeedRandom[] | reset the generator using as a seed the time of day and certain attributes of the current Mathematica session |

Localization and seeding functions.

The SeedRandom function provides a means by which to seed the random generator. Used on its own, SeedRandom will globally set the seed for random generators. The BlockRandom function provides a means by which to locally set or change the seed for random generators without affecting the global state.

In[56]:= |

Out[56]= |

The following gives two different numbers because the first RandomReal is generated within BlockRandom, while the second is generated outside of BlockRandom.

In[57]:= |

Out[57]= |

SeedRandom also provides the mechanism for switching the random generator.

Option for SeedRandom.

An individual generator can be seeded directly by specifying that generator via the Method option. All generators can be seeded by setting Method->All.

In[58]:= |

Out[58]= |

In[59]:= |

Out[59]= |

### SeedRandom and BlockRandom in Parallel Computations

There are some subtleties to using the commands SeedRandom and BlockRandom in parallel computations. Within a parallel computation, these commands only affect the generator that is used in the current thread. Typically you will want to use these before or enclosing an entire parallel computation.

For parallel computations it is very advantageous to have a generator on each thread that produces random numbers independent from the generators on other threads. In *Mathematica* each thread used in a parallel computation will be given a unique index starting from zero (and typically going sequentially through $ProcessorCount) that will be used to give different seeds and generators on each thread.

The table below describes some of the differences between using these in serial and parallel.

command | serial | parallel |

SeedRandom[seed] | seed all current serial random generators with seed and the parallel generators with with seed + i, where i is the index for the parallel thread | seed only the random generator for the current thread with seed |

SeedRandom[seed,Method->"ParallelGenerator"] | seed the parallel generators with seed + i, where i is the index for the parallel thread | no effect |

SeedRandom[Method->method] | change the method for the serial random generator to method | change the method for only the random generator for the current thread to method |

BlockRandom[expr] | evaluate expr with all pseudorandom generators localized | evaluate expr with only the pseudorandom generator for the current thread localized |

SeedRandom and BlockRandom in serial and parallel computations.

In[60]:= |

In[61]:= |

Out[61]= |

In[62]:= |

Out[62]= |

The results are different in spite of having the same seed. Most of the difference is in the ordering, since the parallel scheduler may run one thread before another when a computation is repeated.

In[63]:= |

Out[63]= |

Many, but not all of the same results are found in both computations. This is because there is no guarantee that a given thread will be used exactly the same number of times when a computation is repeated.

In[64]:= |

Out[65]= |

Use of SeedRandom and BlockRandom inside a parallel computation should be done with care, since different parts done with the same thread may wind up with identical results.

In[66]:= |

In[67]:= |

Out[80]= |

You may notice that some of the results appear the same. This can be checked using Union.

In[81]:= |

Out[81]= |

So in this case, there are only eight distinct sums out of 20. If you run this, the length of the union will typically be equal to the number of processors you have on your machine. This is because the generator for each processor is reseeded before each use, and since the use of RandomReal in each case is the same, the results are identical.

In[91]:= |

In[92]:= |

Out[92]= |

In[93]:= |

Out[93]= |

One thing you can do with SeedRandom inside a parallel computation is to set the generator. Suppose that you want to set the generator on each thread to be the default generator with different seeds.

In[6]:= |

In[2]:= |

In[3]:= |

In[9]:= |

Out[9]= |

You can verify that these generators were used by comparing to a serial computation where the generator is set the same way.

In[10]:= |

Out[10]= |

The parallel result is just a permutation of this.

In[29]:= |

Out[31]= |

In[32]:= |

Out[32]= |

Setting up generators in this way is not advisable since just changing the seed with the same generator does not give any guarantee that the generated numbers are not correlated in some way.

An easier and more reliable way of setting up parallel generators is provided with the "ParallelGenerator" method described in "Methods".

## Methods

Five pseudorandom generator methods are available on all systems. Of those five, the Mersenne Twister method is provided in both a serial and parallel version. A sixth platform-dependent method is available on Intel-based systems. A method name is used for handling generators for parallel computations. A framework for defining new methods, described in "Defining Your Own Generator", is also included.

"Congruential" | linear congruential generator (low-quality randomness) |

"ExtendedCA" | extended cellular automaton generator (default) |

"Legacy" | default generators prior to Mathematica 6.0 |

"MersenneTwister" | Mersenne Twister shift register generator |

"MKL" | Intel MKL generator (Intel-based systems) |

"ParallelGenerator" | used for initializing and seeding generators for parallel computations. |

"ParallelMersenneTwister" | set of 1024 Mersenne Twister generators of period |

"Rule30CA" | Wolfram rule 30 generator |

In[54]:= |

Out[54]= |

In[55]:= |

Out[55]= |

### Congruential

uses a linear congruential generator. This is one of the simplest types of pseudorandom number generators, with pseudorandom numbers between 0 and 1 obtained from , where is given by the modular recurrence relation

for some fixed integers , , and called the multiplier, increment, and modulus, respectively. If the increment is 0, the generator is a multiplicative congruential generator. The values of , , and can be set via options to the method.

option name | default value | |

"Bits" | Automatic | specify range of bits to use for numbers constructed from bits |

"Multiplier" | 1283839219676404755 | multiplier value |

"Increment" | 0 | increment value |

"Modulus" | 2305843009213693951 | modulus value |

"ConvertToRealsDirectly" | True | whether reals should be constructed directly from the congruence relation |

Options for Method .

Linear congruential generators are periodic and tend to give a lower quality of randomness, especially when a large number of random values is needed. If reals are generated directly from the congruence relation, the period is less than or equal to .

The default option values are chosen to have a large period and for 64-bit efficiency. With the default options, the generator passes many standard tests of randomness despite the inherent issues with congruential number generators.

In[56]:= |

The period of a multiplicative congruential generator is bounded above by the number of positive integers less than or equal to the modulus that are relatively prime to the modulus. This upper bound is Euler's totient function of the modulus.

In[57]:= |

Out[57]= |

The actual period can be determined by finding the smallest integer such that mod *.*

In[58]:= |

Out[58]= |

In[59]:= |

Out[59]= |

The distinct numbers can also be seen graphically by plotting a sequence of generated numbers.

In[60]:= |

Out[60]= |

If is set to False, reals are generated by taking eight bits at a time from elements of the sequence to construct a 52-bit machine-precision number. Congruential numbers generated in this fashion will still cycle, but cycling will depend on repetition in the bit pattern rather than in the initial congruence relation.

The option can be Automatic, a nonzero integer, or a list of two nonzero integers specifying the range of bits in the modulus used for constructing numbers from bits. Automatic uses unless is a power of 2, in which case is used.

### ExtendedCA

The default method makes use of cellular automata to generate high-quality pseudorandom numbers. This generator uses a particular five-neighbor rule, so each new cell depends on five nonadjacent cells from the previous step.

Cellular-automata-based random number generators evolve a state vector of 0s and 1s according to a deterministic rule. For a given cellular automaton, an element (or cell) at a given position in the new state vector is determined by certain neighboring cells of that cell in the old state vector. A subset of cells in the state vectors is then output as random bits from which the pseudorandom numbers are generated.

The cellular automaton used by produces an extremely high level of randomness. It is so high that even using every single cell in output will give a stream of bits that passes many randomness tests, in spite of the obvious correlation between one cell and five previous ones.

Options are included for modifying the size of the state vector, the cells skipped, and the starting cell. The defaults are chosen for quality and speed and there is typically no need to modify these options.

option name | default value | |

"Size" | 80 | state vector size as a multiplier of 64 |

"Skip" | 4 | number of cells to skip |

"Start" | 0 | which cell to start from |

Options for Method .

The length of the state vectors used is by default set to cells. The multiple of 64 can be controlled by the option. Once a state vector is computed by evolving the cellular automaton using the five-neighbor rule, bits are selected for random numbers from bits .

In practice, using every fourth cell in each state vector proves to be sufficient to pass very stringent randomness tests. This is the default used for the option. For even faster random number generation, a setting of 2 or even 1 could be used, but the quality of the random numbers will then decline.

The option tied with a larger and is useful for setting up a family of independent generators that can be used in parallel computations.

In[61]:= |

Out[61]= |

In[62]:= |

Out[62]= |

### Legacy

The method uses the generator called by Random in versions of *Mathematica* prior to Version 6.0. A Marsaglia-Zaman subtract-with-borrow generator is used for reals. The integer generator is based on a Wolfram rule 30 cellular automaton generator. The rule 30 generator is used directly for small integers and used to generate certain bits for large integers.

In[63]:= |

Out[63]= |

In[64]:= |

Out[64]= |

To guarantee consistency with sequences generated prior to Version 6.0, seeds set for the Automatic method are also applied to the method.

### MersenneTwister

uses the Mersenne Twister generator due to Matsumoto and Nishimura [3][4]. The Mersenne Twister is a generalized feedback shift register generator with period .

In[65]:= |

Out[65]= |

### MKL

The method uses the random number generators provided in Intel's MKL libraries. The MKL libraries are platform dependent. The method is available on Microsoft Windows (32-bit, 64-bit), Linux x86 (32-bit, 64-bit), and Linux Itanium systems.

Option for Method .

"MCG31" | 31-bit multiplicative congruential generator |

"MCG59" | 59-bit multiplicative congruential generator |

"MRG32K3A" | combined multiple recursive generators with two components of order 3 |

"MersenneTwister" | Mersenne Twister shift register generator |

"R250" | generalized feedback shift register generator |

"WichmannHill" | Wichmann-Hill combined multiplicative congruential generators |

"Niederreiter" | Niederreiter low-discrepancy sequence |

"Sobol" | Sobol low-discrepancy sequence |

The first six methods are uniform generators. and generate Niederreiter and Sobol sequences. These sequences are nonuniform and have underlying structure which is sometimes useful in numerical methods. For instance, these sequences typically provide faster convergence in multidimensional Monte Carlo integration.

In[66]:= |

Out[66]= |

In[67]:= |

Out[67]= |

### Rule30CA

The method uses a Wolfram rule 30 cellular automaton generator. Bits are obtained by evolving a state vector of 0s and 1s using the relation

where is the value of cell at time .

Option for Method .

The length of the state vectors used is by default set to cells. The multiplier for 29 can be controlled by the option.

In[68]:= |

Out[68]= |

The method uses only the first bit from each state vector, making it slower than the method, which uses multiple bits from each state vector.

### ParallelMersenneTwister

uses a set of Mersenne Twister generators due to Matsumoto and Nishimura [3][4] with parameters chosen using their "Dynamic Creator" program dcmt [19]. The program computes parameters for the Mersenne Twister generator that are relatively prime and so should produce independent results. The parameters were computed to produce Mersenne Twister generalized feedback shift register generators with period .

An option is included to choose which of the set of generators to use.

Option for Method .

In[9]:= |

Out[10]= |

There are no apparent correlations between the numbers produced by the two generators. Because of the lack of correlation and the speed, this set of generators is used as the default generators for parallel computations.

### ParallelGenerator

is a controller method that allows you to seed and change the generators used for parallel computations.

An option is included to choose which of the set of generators to use.

Option for Method .

The value of the Method option given to the method can be a string specifying a built-in parametrized method or a function that will give a random generator specification for non-negative integers. Each thread used in a parallel computation will be given a unique index starting from zero (and typically going sequentially through $ProcessorCount) that will be used to give different seeds and generators on each thread.

"ParallelMersenneTwister" | parallel Mersenne twister generators with period |

"ExtendedCA" | extended CA generators with different starting positions |

f | generator used for the i thread |

"Default" | restores the default method |

Methods for parallel generators.

The string shortcuts are provided as convenient ways to get to two high-quality independent sets of generators.

Using is equivalent to using the function f=Function[{i}, {"ParallelMersenneTwister", "Index"->i}]. This is the default for parallel computations since the generators are fast and produce good quality random numbers.

Using is typically equivalent to using the function f defined below with the number of processors on your machine.

resets the method to the default method.

In[11]:= |

Out[11]= |

The parameters are chosen so that if you use all of the $ProcessorCount processors on your machine you will still get random numbers as good as the default serial random generator.

The method also does generator seeding in a slightly different way. Instead of just using the same seed on each processor, SeedRandom[seed, Method->"ParallelGenerator"] uses on each thread where i is the unique index for that thread. This allows you to get different numbers from different threads even if you set the generator on each thread to be the same (e.g. Method->Function[{i}, "ExtendedCA"]), though that is not advisable since even with different seeds the numbers could have unexpected correlations.

In general, the function f to give generator methods for different threads can return anything that is a legitimate random generator method.

In[1]:= |

In[2]:= |

In[3]:= |

Out[3]= |

In[4]:= |

In[12]:= |

In[13]:= |

Out[13]= |

In[14]:= |

Out[14]= |

In[15]:= |

Out[15]= |

To restore the parallel generators to their default method, you need to explicitly give a method option, otherwise, it just changes the seed.

In[16]:= |

### Defining Your Own Generator

Methods can be plugged into the random framework as long as they follow the correct template. A generator object is of the form where gsym is the symbol that identifies the generator and to which rules are attached. data is effectively private to the top-level evaluations associated with the generator definitions.

Generator initialization is handled by a call to .

Generator initialization function.

is expected to return a generator object gobj of the form .

Generators can support generation of random bit streams, random integers, and random reals. If the generator supports bit streams, reals and integers can be generated by conversion of the bit stream. At method setup time, properties are queried to determine what is supported and how.

GeneratesBitsQ | set to True if the method generates bits |

GeneratesIntegersQ | set to True if the method generates integers for a given range |

GeneratesRealsQ | set to True if the method generates reals for a given range and precision |

If bit streams are supported, then is expected to return an integer comprised of n random bits or a list of length nbits with entries that are 0 or 1.

If random integers are supported, then is expected to return a list of n random integers in the range . A warning message will be issued when results are out of range.

If random reals are supported, then is expected to return a list of n random reals with precision prec in the range . A warning message will be issued when results are out of range or of the wrong precision.

For any of the generation functions, the return can be , where res is the result of the correct type and gobj is a new generator object (reflecting any state change).

Seeding is done by for an integer seed. is expected to return a new generator object.

#### Example: Multiplicative Congruential Generator

In the following example a multiplicative congruential generator will be defined. A multiplicative congruential generator follows the recurrence relation

The generator, as defined below, will allow only for generation of real numbers.

In[69]:= |

Initialization of the generator will extract the values of the multiplier and modulus. Initialization will fail if either of these values is not a positive integer.

In[70]:= |

Calls from the kernel to are effectively wrapped in Catch. Throw can be used in the initialization code to easily exit in case of problems.

In[71]:= |

In[72]:= |

The real number generator will return the desired number of reals and a new generator. The seed for the new generator is updated based on the recurrence relation.

In[73]:= |

In[74]:= |

Out[74]= |

In[75]:= |

Out[75]= |

#### Example: Blum-Blum-Shub Generator

The Blum-Blum-Shub generator is a quadratic congruential method for generating pseudorandom bits for cryptographic purposes [5]. The congruence is mod × for specified primes and .

In[76]:= |

In[77]:= |

In[78]:= |

In[79]:= |

In[80]:= |

The generator initialization will extract option values and issue error messages if necessary before calling the actual generator.

In[81]:= |

In[82]:= |

In[83]:= |

In[84]:= |

In[85]:= |

In[86]:= |

Out[86]= |

## Statistical Distributions

The general idea behind generating random variates from a nonuniform statistical distribution is to generate a random uniform variate between 0 and 1 and then compute the inverse CDF of that random value in the desired distribution. In practice, however, following this recipe directly can be very computationally intensive if a large number of random variates is desired, particularly when the inverse CDF is complicated or cannot be expressed in a closed form.

In such cases, table lookups, direct construction based on distributional relationships, or acceptance-rejection methods are often more efficient alternatives to direct inversion of the CDF. On some level, these methodologies will all still rely on uniformly distributed RandomReal values, uniformly distributed RandomInteger values, observations from a weighted RandomChoice, or a combination of these values. As a result, methods set via SeedRandom will have an effect on random observations from statistical distributions.

Random observations from all built-in statistical distributions can be generated using RandomVariate. The methods used by RandomVariate for many of the distributions in *Mathematica* follow methods suggested or described in Gentle [6] or other literature.

RandomVariate[dist] | give a random number from the continuous distribution dist |

RandomVariate[dist,n] | give a list of n pseudorandom reals from dist |

RandomVariate[dist,{n_{1},n_{2},...}] | give an ××... array of pseudorandom reals from dist |

Generation of random values from statistical distributions.

Observations from statistical distributions are obtained via RandomVariate. This includes all built-in distributions and constructors including univariate and multivariate distributions, continuous and discrete distributions, parametric and derived distributions, and distributions defined by data.

In[2]:= |

Out[2]= |

WorkingPrecision can be used to get higher-precision values for continuous distributions just as it is for uniform numbers over ranges.

In[3]:= |

Out[3]= |

Random values from multivariate distributions can be generated in the same way.

In[90]:= |

Out[90]= |

In[91]:= |

Out[91]= |

In[8]:= |

Out[9]= |

In the following sections, methodologies for generating random variates are discussed with some specific examples of where such methods are employed in *Mathematica*.

### Continuous Distributions

For univariate distributions whose inverse CDFs contain only elementary functions, direct computation of the inverse CDF for a random uniform is generally used. This can be seen as a direct construction from a uniformly distributed random variable. Some continuous distributions falling in this category include CauchyDistribution, ExponentialDistribution, ExtremeValueDistribution, GumbelDistribution, LaplaceDistribution, LogisticDistribution, ParetoDistribution, RayleighDistribution, TriangularDistribution, and WeibullDistribution.

Direct construction of a single random variate from multiple uniform variates, or from variates other than the uniform distribution are also employed. Normal variates are generated in pairs from pairs of random uniforms using the Box-Müller method. HalfNormalDistribution and LogNormalDistribution, and MultinormalDistribution variates, for instance, are obtained by direct transformation of normal variates.

InverseGaussianDistribution uses an acceptance-complement method involving normal and uniform variates. The method is due to Michael, Schucany, and Haas and described in Gentle [6]. MaxwellDistribution variates are constructed from ChiDistribution variates. The chi variates themselves are obtained from ChiSquareDistribution variates, which are special cases of GammaDistribution variates.

In most cases FRatioDistribution constructs each random value from a single random beta variate. For small degrees of freedom, FRatioDistribution variates are instead generated from pairs of gamma variates to avoid possible divisions by 0 that may arise in the beta construction.

NoncentralChiSquareDistribution[, ], , variate generation uses additive properties of distributions to avoid expensive inverse CDF computations for nonintegral . The additive properties are given in, for instance, Johnson, Kotz, and Balakrishnan [7]. For a noncentral variate can be generated as the square of a normal variate with mean and variance 1. For noncentral variates are obtained as the sum of a central and a noncentral random variable. For , is distributed if and . This relationship cannot be used for . In that case the construction is with and , where is the limiting noncentral distribution as goes to 0. The limiting distribution is a mixture of Poisson and variables, which has a nonzero probability mass at 0 and a continuous density for positive values. NoncentralFRatioDistribution variates are obtained from one central and one noncentral variate.

For the WishartDistribution from the Multivariate Statistics Package, matrices are generated via Smith and Hocking's method [8]. This method constructs Wishart matrices from matrices with chi-distributed diagonal entries and normally distributed off-diagonal entries.

NoncentralStudentTDistribution, HotellingTSquareDistribution, and MultivariateTDistribution each use direct construction from univariate random variates.

GammaDistribution, BetaDistribution, and StudentTDistribution use acceptance-rejection methods to some extent.

For GammaDistribution[, ] exponential variates are generated when . Otherwise, methods due to Cheng and Feast [9] and Ahrens and Dieter [10] are used.

Beta variates are constructed by switching between multiple methods, depending on the values of the beta parameters and . If both parameters are 1, uniform random variates will be generated. If one of the beta parameters is 1, then a closed-form inverse CDF evaluation is used. Otherwise, RandomVariate switches between acceptance-rejection methods due to Jöhnk [11], Cheng [12], and Atkinson [13]. An example of the advantage of using an acceptance-rejection method over construction from two gammas can be seen in the following. The direct acceptance-rejection method is nearly twice as fast as the gamma-pair construction.

In[92]:= |

Out[92]= |

In[93]:= |

Out[93]= |

For StudentTDistribution the method used by RandomVariate is a polar rejection method due to Bailey [14]. This method is more efficient than direct construction from normal and variates, as can be seen in the following. The direct construction takes roughly 1.5 times as long as the polar method for a million Student variates.

In[94]:= |

Out[94]= |

In[95]:= |

Out[95]= |

### Discrete Distributions

GeometricDistribution, BetaBinomialDistribution, and BetaNegativeBinomialDistribution use direct construction. GeometricDistribution variates are generated as where follows UniformDistribution[0, 1]. BetaBinomialDistribution and BetaNegativeBinomialDistribution are constructed from BinomialDistribution and NegativeBinomialDistribution variates with probability parameters taken as random BetaDistribution variates.

When used, table lookups for random integer generation are implemented via RandomChoice using the distribution's probability mass function for the weights. Most discrete distributions switch to other methods whenever construction of the list of weights is expected to be expensive given the desired sampled size. For example, as approaches 1 LogSeriesDistribution[p] switches to the direct construction , where and are uniformly distributed on the interval [15]. Depending on parameters and sample size, NegativeBinomialDistribution[n, p] may switch to construction as a Poisson-gamma mixture, which is a Poisson variable with mean following a gamma distribution [6].

BinomialDistribution, HypergeometricDistribution, and PoissonDistribution rely on direct sampling from the density function if the computational overhead of computing the PDF values is small relative to the number of desired random values. Otherwise they switch to acceptance-rejection methods. The acceptance-rejection methods also allow for generation of variates when overflows or underflows would occur in directly computing the PDF values, thus extending the range of parameter values for which random numbers can be generated.

The binomial and hypergeometric distributions switch to acceptance-rejection methods due to Kachitvichyanukul and Schmeiser, with small modifications. The binomial method, based on the acceptance-rejection portion of their BTPE (Binomial, Triangle, Parallelogram, Exponential) algorithm [16], effectively uses a piecewise majorizing function with three regions and a triangular minorizing function for a quick acceptance test. The majorizing and minorizing functions create a two-parallelogram envelope around the center of the rescaled binomial density, and the tails of the majorizing function form exponential envelopes on the tails of the scaled binomial distribution. One case where it is clearly better to use BTPE rather than to construct a lookup table is when few observations are desired and the lookup table would be large.

The hypergeometric method, based on the acceptance-rejection portion of Kachitvichyanukul and Schmeiser's H2PE algorithm [17], uses a majorizing function with three regions around a scaled hypergeometric density. The middle portion of the density is enveloped by a rectangular region and the tails of the distribution are bounded by exponentials.

The acceptance-rejection method used by PoissonDistribution is due to Ahrens and Dieter [18]. The acceptance and rejection is carried out using discrete normal variates, taking advantage of the tendency of PoissonDistribution[] toward NormalDistribution[, ]as increases.

Random values from the ZipfDistribution are generated via an acceptance-rejection method described by Devroye [15]. The method uses pairs of uniform variates and a test involving only a Floor and noninteger powers, aside from basic arithmetic, to efficiently obtain Zipf-distributed values.

### Defining Distributional Generators

A number of distribution constructors are included in *Mathematica* which make it possible to define new distribution objects which can be treated like any other distribution. This includes random number generation. Suppose, however, that you are only interested in generating random values from a distribution and have an algorithm for doing so. In such cases it can be beneficial to just define the random number generator. Definitions for such distributional generators are supported through rules for .

Random`DistributionVector[expr,n,prec] | |

defines rules for generating n observations from expr with precision prec |

Function for defining random generation from distributions.

is expected to return a vector of the given length with numbers of the given precision. Because the expression expr is not a completely defined distribution object, the numbers will be generated via RandomReal or RandomInteger instead of RandomVariate. If the precision is Infinity, the values will be generated via RandomInteger. Otherwise, values will be generated via RandomReal.

Rules for generating random values from distributions are generally defined via a TagSet on the head of the distribution. The distribution itself may contain parameters. As a simple example, the following defines rules for , which represents a uniform distribution on the interval .

In[96]:= |

Random numbers from can now be generated via RandomReal.

In[97]:= |

Out[97]= |

Matrices and higher-dimensional tensors can also be generated directly via RandomReal. RandomReal uses the definition given to to generate the total number of random values desired, and partitions that total number into the specified dimensions.

In[98]:= |

Out[98]= |

Discrete distributional generators can be defined in a similar way. The main difference is that the precision argument to will now be Infinity. The discrete version of provides a simple example.

In[99]:= |

Random values from can now be obtained from RandomInteger.

In[100]:= |

Out[100]= |

While the previous examples show the basic framework for defining distributional generators, the distributions themselves are not particularly interesting. In fact, it would have been easier in these two cases to just generate values from RandomVariate and multiply the end result by instead of attaching definitions to a new symbol. The following examples will demonstrate slightly more complicated distributions, in which case attaching definitions to a symbol will be more useful.

#### Example: Normal Distribution by Inversion

The textbook definition for generating random values from a generic univariate statistical distribution involves two steps:

To demonstrate the process, use the normal distribution. To generate random normal variates using this method, start with the Quantile for the normal distribution at a point and replace with a random uniform between 0 and 1.

In[101]:= |

Out[101]= |

A new distribution object can now be used to define a normal random number generator that uses inversion of the cumulative distribution function.

In[102]:= |

In[103]:= |

Out[103]= |

In[104]:= |

Out[104]= |

In[105]:= |

Out[105]= |

The normal distribution is one example where methods other than direct inversion are generally preferred. While inversion of the CDF is a perfectly valid method for generating pseudorandom normal numbers, it is not particularly efficient. Numeric evaluation of InverseErf is computationally much more expensive than the sinusoid and logarithmic evaluations required by the Box-Müller method used by NormalDistribution.

In[106]:= |

Out[106]= |

In[107]:= |

Out[107]= |

In[108]:= |

#### Example: Uniform Distribution on a Disk

can also be used to define generators for multidimensional distributions. For instance, suppose a random point from a uniform distribution on the unit disk, the set of real points with , is desired. Such a random point can be constructed as follows:

The returned ordered pair can be multiplied by to generate points uniformly distributed on a disk of radius .

In[109]:= |

In[110]:= |

Out[110]= |

In[111]:= |

Out[111]= |

#### Example: Gibbs Sampler

Gibbs samplers can also be defined as distributional generators. As an example consider a Gibbs sampler that mixes beta and binomial distributions. A specific case of this sampler was explored in a previous example. Here, the distribution will be defined with two parameters m and .

In[114]:= |

For the specific Gibbs sampler constructed earlier, m was 16 and was 2.

In[115]:= |

Out[115]= |

## References

[1] Geman, S. and D. Geman. "Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images." *IEEE Transactions on Pattern Analysis and Machine Intelligence* 6, no. 6 (1984): 721-741.

[2] Casella, G. and E. I. George. "Explaining the Gibbs Sampler." *The American Statistician* 46, no. 3 (1992): 167-174.

[3] Matsumoto, M. and T. Nishimura. "Mersenne Twister: A 623-Dimensionally Equidistributed Uniform Pseudorandom Number Generator." *ACM Transactions on Modeling and Computer Simulation* 8, no. 1 (1998): 3-30.

[4] Nishimura, T. "Tables of 64-Bit Mersenne Twisters." *ACM Transactions on Modeling and Computer Simulation* 10, no. 4 (2000): 348-357.

[5] Junod, P. "Cryptographic Secure Pseudo-Random Bits Generation: The Blum-Blum-Shub Generator." August 1999. http://crypto.junod.info/bbs.pdf

[6] Gentle, J. E. *Random Number Generation and Monte Carlo Methods*, 2nd ed. Springer-Verlag, 2003.

[7] Johnson, N. L., S. Kotz, and N. Balakrishnan. *Continuous Univariate Distributions, Volume 2*, 2nd ed. John Wiley & Sons, 1995.

[8] Smith, W. B. and R. R. Hocking. "Algorithm AS 53: Wishart Variate Generator." *Applied Statistics* 21, no. 3 (1972): 341-345.

[9] Cheng, R. C. H. and G. M. Feast. "Some Simple Gamma Variate Generators." *Applied Statistics* 28, no. 3 (1979): 290-295.

[10] Johnson, M. E. *Multivariate Statistical Simulation*. John Wiley & Sons, 1987.

[11] Jöhnk, M. D. "Erzeugung von Betaverteilten und Gammaverteilten Zufallszahlen." *Metrika* 8 (1964): 5-15.

[12] Cheng, R. C. H. "Generating Beta Variables with Nonintegral Shape Parameters." *Communications of the ACM* 21, no. 4 (1978): 317-322.

[13] Atkinson, A. C. "A Family of Switching Algorithms for the Computer Generation of Beta Random Variables." *Biometrika* 66, no. 1 (1979): 141-145.

[14] Bailey, R. W. "Polar Generation of Random Variates with the *t*-Distribution." *Mathematics of Computation* 62, no. 206 (1994): 779-781.

[15] Devroye, L. *Non-Uniform Random Variate Generation*. Springer-Verlag, 1986.

[16] Kachitvichyanukul, V. and B. W. Schmeiser. "Binomial Random Variate Generation." *Communications of the ACM* 31, no. 2 (1988): 216-223.

[17] Kachitvichyanukul, V. and B. W. Schmeiser. "Computer Generation of Hypergeometric Random Variates." *Journal of Statistical Computation and Simulation* 22, no. 2 (1985): 127-145.

[18] Ahrens, J. H. and U. Dieter. "Computer Generation of Poisson Deviates from Modified Normal Distributions." *ACM Transactions on Mathematical Software* 8, no. 2 (1982): 163-179.

[19] Matsumoto, M. and T. Nishimura. "Dynamic Creation of Pseudorandom Number Generators." In *Proceedings of the Third International Conference on Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing*: *Monte Carlo and Quasi-Monte Carlo Methods 1998,* 56-69, 2000.