Computer Arithmetic Package

The arithmetic used by Mathematica is a mixture of variable-precision software arithmetic and whatever is provided by the manufacturer of the floating-point hardware (or the designer of the compiler, if there is no floating-point hardware). If you want to learn about the basic ideas of computer floating-point arithmetic in general or examine the machine arithmetic on your machine, you can use . This allows you to examine arithmetic with various bases, precisions, and rounding rules.

ComputerNumber[x]convert the ordinary number x to a computer number in the arithmetic currently in effect
ComputerNumber[sign,mantissa,exp]form the computer number with sign sign, mantissa mantissa, and exponent exp
ComputerNumber[sign,mantissa,exp,value,x]
the complete data object that makes up a computer number
NaNa nonrepresentable number in the current arithmetic

Computer numbers and non-numbers in .

Much of the information carried around in the data object that makes up a computer number is redundant. In particular, the first three arguments contain exactly the same information as the fourth argument. The redundancy exists partly for the sake of efficiency and partly to allow the user access to the various fields. The fifth argument has nothing to do with the computer number itself. It instead represents what the value of the number would be without the cumulative effects of all the roundoff errors that went into the computer number. It is computed using Mathematica's high-precision arithmetic and can generally be regarded as the correct value of the number. Comparing the computer number with this number gives the error in the computer number.

This loads the package.
In[1]:=
Click for copyable input
Here is the computer number representing the ordinary number 2.
In[2]:=
Click for copyable input
Out[2]=
This gives the computer number representation of .
In[3]:=
Click for copyable input
Out[3]=
Arithmetic works with computer numbers just as with ordinary numbers.
In[4]:=
Click for copyable input
Out[4]=
Here is the complete structure of .
In[5]:=
Click for copyable input
Out[5]//InputForm=
You can also enter just the sign, mantissa, and exponent.
In[6]:=
Click for copyable input
Out[6]=
But if your input does not make sense in the arithmetic currently in effect, you get NaN. Here the problem is that the mantissa is only three digits long.
In[7]:=
Click for copyable input
Out[7]=
SetArithmetic[n]set the arithmetic to be n digits, base 10
SetArithmetic[n,b]set the arithmetic to be n digits, base b
Arithmetic[]give the parameters of the arithmetic currently in effect

Changing the type of arithmetic to be used.

The default arithmetic is four digits in base 10 with a rounding rule of RoundToEven. Only numbers between and are allowed. Mixed-mode arithmetic is not allowed and division is not done with correct rounding. (It is performed as two operations: multiplication by the reciprocal. Each operation involves rounding errors.)
In[8]:=
Click for copyable input
Out[8]=
Now the arithmetic is set to be six digits in base 8.
In[9]:=
Click for copyable input
Out[9]=
The result is displayed in octal (with trailing zeros suppressed).
In[10]:=
Click for copyable input
Out[10]=

There are several options that can be used when setting the arithmetic. In addition to changing the precision and the base, you can control the type of rounding used, the magnitude of the numbers allowed, whether mixed-mode arithmetic is to be allowed, and whether division is to be done with a single rounding operation.

option name
default value
RoundingRuleRoundToEventhe type of rounding to be used
ExponentRange{-50,50}the range of exponents that is to be allowed
MixedModeFalsewhether mixed-mode arithmetic is to be allowed
IdealDivisionFalsewhether correctly rounded division is to be used

Options for SetArithmetic.

It should be noted that correctly rounded division is implemented by the function IdealDivide. This function can be used whether or not the arithmetic is set to automatically use correct division. It is difficult to get to give correctly rounded division since the is converted to before the package ever sees it. To get around this, $PreRead is defined to convert into x~IdealDivide~y before the parser has a chance to change it to . If you want to use $PreRead for your own purposes this will interfere with your definition.

RoundToEvenround to the nearest representable number and, in the case of a tie, round to the one represented by an even mantissa
RoundToInfinityround to the nearest representable number and, in the case of a tie, round away from 0
Truncationsimply discard excess digits, much as Floor does for positive numbers

Types of rounding available.

Mixed-mode arithmetic (any arithmetic operation involving an integer and a computer number) is not allowed.
In[11]:=
Click for copyable input
Out[11]=
This turns on mixed-mode arithmetic (and sets the arithmetic to six digits in hexadecimal).
In[12]:=
Click for copyable input
Out[12]=
Now the product is computed.
In[13]:=
Click for copyable input
Out[13]=

There are many things about computer arithmetic that are very different from ordinary arithmetic. Most of these differences are quite easy to demonstrate.

This sets the arithmetic back to the default.
In[14]:=
Click for copyable input
Out[14]=
Expressions are evaluated numerically before they become computer numbers.
In[15]:=
Click for copyable input
Out[15]=
If the separate parts are converted to computer numbers first, catastrophic cancellation of digits can occur.
In[16]:=
Click for copyable input
Out[16]=
Summing the terms from smallest to largest gives one answer.
In[17]:=
Click for copyable input
Out[17]//FullForm=
As a general rule, it is better to sum from smallest to largest. You can see what the error is by comparing the mantissa (the second argument) to the correct value (the last argument).
In[18]:=
Click for copyable input
Out[18]//FullForm=
Here is an example where summing from largest to smallest gives the better result.
In[19]:=
Click for copyable input
Out[19]//FullForm=
The difference is slight, and such examples are rare.
In[20]:=
Click for copyable input
Out[20]//FullForm=
Basic arithmetic is all that is implemented in the package. You could easily extend things to include elementary functions.
In[21]:=
Click for copyable input
Out[21]=
Here is the square root of 47.
In[22]:=
Click for copyable input
Out[22]=
It is a theorem that correctly rounded square roots of small integers will always square back to the original integer if the arithmetic is correct.
In[23]:=
Click for copyable input
Out[23]=
But a similar theorem for cube roots does not exist.
In[24]:=
Click for copyable input
Out[24]=
This changes the arithmetic to seven digits in base 10 with a rounding rule of RoundToEven and an exponent range of -50 to 50.
In[25]:=
Click for copyable input
Out[25]=
This number rounds down.
In[26]:=
Click for copyable input
Out[26]=
This number rounds up because it rounds toward the mantissa that is even (1000000) rather than the one that is odd (9999999).
In[27]:=
Click for copyable input
Out[27]=
Again it rounds toward the even mantissa (1000000 rather than 1000001).
In[28]:=
Click for copyable input
Out[28]=
The reciprocal of the reciprocal is not the original number; in fact it may be quite different.
In[29]:=
Click for copyable input
Out[29]=
This is multiplication by the reciprocal. It involves two rounding operations.
In[30]:=
Click for copyable input
Out[30]=
This is true division. It uses a single rounding operation. You can get this better type of division to work with by setting the option IdealDivide to True when you use SetArithmetic.
In[31]:=
Click for copyable input
Out[31]=
Normal converts a computer number back to an ordinary rational number with exactly the same value.
In[32]:=
Click for copyable input
Out[32]=
Now you change the arithmetic again.
In[33]:=
Click for copyable input
Out[33]=
This suppresses error messages that will result from plotting non-numeric values in the following examples.
In[34]:=
Click for copyable input
It is easy to plot the error in the computer number representation of each number.
In[35]:=
Click for copyable input
Out[35]=
You can zoom in to see the hole at zero.
In[36]:=
Click for copyable input
Out[36]=

Mathematica uses both the machine arithmetic that is provided by the machine and its own arbitrary-precision arithmetic, which is the same on all machines (except for the variations in the size of the largest representable number and the amount of memory on the machine). This package also deals with the floating-point arithmetic provided by the machine.

Numbers on a computer comprise a discrete set. There are gaps between the numbers and when you do arithmetic the result often is not representable. When the result falls between two representable numbers, the best that can be done is to use the closest representable number as the result instead of the correct result. The set of numbers that can be represented on a computer in floating-point format is commonly referred to as the set of machine numbers. This package allows you to vary certain fundamental parameters that control the arithmetic.

The difference between two consecutive machine numbers is called an ulp (unit in the last place, i.e., one digit in the least significant place). The size of an ulp varies depending on where you are in the set of machine numbers. Between 1 and 2 an ulp is equal to $MachineEpsilon. Between 2 and 4 it is equal to 2$MachineEpsilon. Ideally no function should ever return a result with error exceeding half of an ulp since the distance from the true result to the nearest machine number is always less than half of an ulp, the worst case being when it is exactly halfway between two machine numbers. It is relatively easy to achieve this ideal for the four arithmetic operations and for the square root function. For the elementary transcendental functions it becomes more difficult due to the fact that they cannot be evaluated in a single operation and the combined effects of all the rounding errors would have to be made less than half an ulp. Nevertheless, it is quite possible to design algorithms for these functions such that the error is never more than one ulp.

The Computer Arithmetic Package provides four functions that are useful in examining the machine floating-point arithmetic on your machine.

Ulp[a]give the size of an ulp near a
MachineError[f,x->a]give the error in evaluating f at using machine arithmetic
MicroscopePlot[f,{x,a}] plot f near using machine arithmetic
MicroscopicErrorPlot[f,{x,a}]plot the error in evaluating f near using machine arithmetic

Functions for examining machine floating-point arithmetic.

The default neighborhood of the point used in MicroscopePlot and MicroscopicErrorPlot is from 30 ulps to the left of a to 30 ulps to the right of a, where the ulp used is that defined at , that is, Ulp[a]. Near powers of 2 and near 0 it is not clear what should be done since the size of an ulp changes. Near powers of 2 it is not much of a problem in the abscissa since an ulp is chosen to be the smaller of the two values and the resulting machine numbers just get included multiple times due to rounding effects. You can change the default size of the neighborhood by including a third value in the neighborhood specification.

A problem arises with MicroscopePlot and MicroscopicErrorPlot near places where the function value becomes infinite. Since the function is changing by orders of magnitude very rapidly, it is difficult to choose a good scale that displays the information in which you are interested. In such cases you may have to examine the function several hundreds of ulps away from the point you really want.

MicroscopePlot[f,{x,a,n}]plot f from x=a-n Ulp[a] to x=a+n Ulp[a] using machine arithmetic
MicroscopicErrorPlot[f,{x,a,n}]plot the error in evaluating f from x=a-n Ulp[a] to x=a+n Ulp[a] using machine arithmetic

Controlling the size of the neighborhood.

The option Joined controls the way the plot is drawn. It cannot be overemphasized that the functions discussed here map the machine numbers to the machine numbers. For purposes here, the real numbers in between do not even exist. Thus the plots produced by MicroscopePlot and MicroscopicErrorPlot should be nothing but a set of plotted points. However, it is easier to see what is happening if the points are joined with straight line segments. Finally, if you want to see what a function does as a map from the real numbers to the real numbers, that is, rounding the real argument to the nearest machine number and applying the function to this result, you can do so, but when you see large errors, do not conclude that the function is to blame; it cannot do any better than deal with the machine numbers that it gets as arguments.

option name
default value
JoinedTruethe way the points are to be connected; either True, False, or Real

Option for MicroscopePlot and MicroscopicErrorPlot.

Here is the size of an ulp at 1.
In[37]:=
Click for copyable input
Out[37]=
The size of an ulp changes at powers of 2.
In[38]:=
Click for copyable input
Out[38]=
There is a large hole near 0.
In[39]:=
Click for copyable input
Out[39]=
Here are the combined errors of rounding to the nearest machine number and then taking the square root.
In[40]:=
Click for copyable input
In[41]:=
Click for copyable input
Out[41]=
Here is the error of just the square root.
In[42]:=
Click for copyable input
Out[42]=
For another number close to , the error in the square root function is less than half an ulp.
In[43]:=
Click for copyable input
Out[43]=
The function is not provided by the computer manufacturer, so it is evaluated as a nested sequence of operations that are provided. This shows how much error is involved in the sequence, but this information is rarely useful. For testing purposes you normally want to look at an individual function evaluated at a machine number to avoid the effect of an ill-conditioned function magnifying prior error.
In[44]:=
Click for copyable input
Out[44]=
Here is the Log function evaluated at machine numbers near 7.
In[45]:=
Click for copyable input
Out[45]=
Here is the Log function evaluated at 10+1+10 machine numbers near 7 with the points joined by straight lines.
In[46]:=
Click for copyable input
Out[46]=
And here are the combined effects of rounding real numbers to the nearest machine number and taking the Log of the result. This is not a valid test of the Log function.
In[47]:=
Click for copyable input
Out[47]=
This shows the error in evaluating the sine function at 31 machine numbers near 16. Note how the size of an ulp changes at 16. Note also that rounding has moved some points to coincide so that there appear to be fewer than 31 points. The scale on the vertical axis represents ulps.
In[48]:=
Click for copyable input
Out[48]=
Here is the error in evaluating sine at 21 machine numbers near 31 with the points joined by straight lines.
In[49]:=
Click for copyable input
Out[49]=
And here are the combined effects of rounding real numbers to the nearest machine number and taking the sine of the result. This is not a valid test of the sine function. The error in evaluating the sine function at a machine number near 31 is much smaller than the combination of errors resulting from first rounding a real number to the nearest machine number and then taking the sine of that machine number.
In[50]:=
Click for copyable input
Out[50]=
Here you see how bad the square root function really is. It appears that near its error is bounded by about ulps.
In[51]:=
Click for copyable input
Out[51]=
You cannot examine this function exactly at 1 since it is singular there. You can, however, examine it near 1. The label for is printed as because the label is limited to 6 characters.
In[52]:=
Click for copyable input
Out[52]=

One point needs to be made clear, since rounding errors are often misunderstood and blame is incorrectly placed on an algorithm that has nothing to do with the error. Algorithms are designed to map the set of machine numbers into the set of machine numbers. To consider other numbers does not make sense. As far as the computer is concerned there are no other numbers. So, for example, if you want to find the square root of , you might do it as follows.

Start with a good numerical approximation to .
In[53]:=
Click for copyable input
Out[53]=
This finds the nearest machine number to .
In[54]:=
Click for copyable input
Out[54]=
This finds the square root of the machine number using the machine-precision algorithm for square root.
In[55]:=
Click for copyable input
Out[55]=
Here the error in is measured in ulps. This result gives the error for the particular machine on which this example was run. Your machine may give a very different result.
In[56]:=
Click for copyable input
Out[56]=

Here the error is less than half an ulp. However, this error is due to both the square root function and the rounding error incurred in converting the number to the machine number . In fact, the square root function has an error of less than 0.1 ulps.

Here is the correct value for the square root of .
In[57]:=
Click for copyable input
Out[57]=
This shows the error in the square root function.
In[58]:=
Click for copyable input
Out[58]=
New to Mathematica? Find your learning path »
Have a question? Ask support »