1.12.4 The Software Engineering of Mathematica
Mathematica is one of the more complex software systems ever constructed. Its source code is written in a combination of C and Mathematica, and for Version 5, the code for the kernel consists of about 1.5 million lines of C and 150,000 lines of Mathematica. This corresponds to roughly 50 megabytes of data, or some 50,000 printed pages.
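The C code in Mathematica is actually written in a custom extension of C that supports certain memory management and object-oriented features. The Mathematica code is optimized using Share, which reduces memory use by sharing storage for identical subexpressions, and DumpSave, which writes definitions to a binary file that can be loaded quickly.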
In the Mathematica kernel the breakdown of different parts of the code is roughly as follows: language and system: 30%; numerical computation: 25%; algebraic computation: 25%; graphics and kernel output: 20%.
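As a rough worked example, these percentages can be turned into approximate line counts. The sketch below assumes the shares apply to the roughly 1.5 million lines of C in the Version 5 kernel; the text does not say which total the percentages refer to, so the resulting figures are illustrative only. The names `KERNEL_SHARES`, `KERNEL_C_LINES` and `estimate_lines` are invented for this example.

```python
# Approximate kernel code shares, as quoted in the text (percent).
KERNEL_SHARES = {
    "language and system": 30,
    "numerical computation": 25,
    "algebraic computation": 25,
    "graphics and kernel output": 20,
}

# Assumption: the shares are applied to the ~1.5 million lines of C in Version 5.
KERNEL_C_LINES = 1_500_000


def estimate_lines(shares, total_lines):
    """Return an approximate line count for each area of the kernel."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must sum to 100 percent")
    return {area: total_lines * pct // 100 for area, pct in shares.items()}


estimates = estimate_lines(KERNEL_SHARES, KERNEL_C_LINES)
for area, lines in estimates.items():
    print(f"{area}: about {lines:,} lines")
```

Under that assumption, each area of the kernel comes to several hundred thousand lines of C.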
Most of this code is fairly dense and algorithmic: those parts that are in effect simple procedures or tables use minimal code since they tend to be written at a higher level—often directly in Mathematica.
The source code for the kernel, save a fraction of a percent, is identical for all computer systems on which Mathematica runs.
For the front end, however, a significant amount of specialized code is needed to support each different type of user interface environment. The front end contains about 650,000 lines of system-independent C source code, of which roughly 150,000 lines are concerned with expression formatting. Then there are between 50,000 and 100,000 lines of specific code customized for each user interface environment.
Mathematica uses a client-server model of computing. The front end and kernel are connected via MathLink—the same system as is used to communicate with other programs.
Within the C code portion of the Mathematica kernel, modularity and consistency are achieved by having different parts communicate primarily by exchanging complete Mathematica expressions.
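The design idea can be illustrated with a minimal sketch, written in Python rather than the kernel's extended C. Two independent modules share no internal data structures and interact only by passing whole expressions. The `Expr` class and the `differentiate` and `simplify` functions are hypothetical names invented for illustration; they do not reflect the kernel's actual internals.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Atom = Union[int, str]


@dataclass(frozen=True)
class Expr:
    """A complete expression: a head applied to zero or more arguments."""
    head: str
    args: Tuple["Node", ...] = ()


Node = Union[Atom, Expr]


def differentiate(expr: Node, var: str) -> Node:
    """Hypothetical 'algebra' module: derivative of sums and binary products."""
    if isinstance(expr, int):
        return 0
    if isinstance(expr, str):
        return 1 if expr == var else 0
    if expr.head == "Plus":
        return Expr("Plus", tuple(differentiate(a, var) for a in expr.args))
    if expr.head == "Times" and len(expr.args) == 2:
        f, g = expr.args
        # Product rule: (f g)' = f' g + f g'
        return Expr("Plus", (
            Expr("Times", (differentiate(f, var), g)),
            Expr("Times", (f, differentiate(g, var))),
        ))
    raise ValueError(f"unsupported head: {expr.head}")


def simplify(expr: Node) -> Node:
    """Hypothetical 'simplifier' module: drops 0 terms and 1 factors."""
    if not isinstance(expr, Expr):
        return expr
    args = [simplify(a) for a in expr.args]
    if expr.head == "Times":
        if 0 in args:
            return 0
        args = [a for a in args if a != 1]
        identity = 1
    elif expr.head == "Plus":
        args = [a for a in args if a != 0]
        identity = 0
    else:
        return Expr(expr.head, tuple(args))
    if not args:
        return identity
    return args[0] if len(args) == 1 else Expr(expr.head, tuple(args))


# The only interface between the two modules is a complete expression.
product = Expr("Times", ("x", "x"))
derivative = simplify(differentiate(product, "x"))
print(derivative)
```

Here the derivative of `Times[x, x]` comes back as `Plus[x, x]` in Mathematica notation, which this toy simplifier leaves uncombined. Because each function accepts and returns an ordinary expression, either one could be replaced or extended without the other needing to change.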
Although the different parts of the system are quite independent at the level of source code, they have many algorithmic interdependencies. Thus, for example, it is common for numerical functions to make extensive use of algebraic algorithms, or for graphics code to use fairly advanced mathematical algorithms embodied in quite different Mathematica functions.
Since development began in 1986, the effort spent directly on creating the source code for Mathematica has amounted to a substantial fraction of a thousand man-years. In addition, a comparable or somewhat larger effort has been spent on testing and verification.
The source code of Mathematica has changed greatly since Version 1 was released. The total number of lines of code in the kernel grew from 150,000 in Version 1 to 350,000 in Version 2, 600,000 in Version 3, 800,000 in Version 4 and about 1.5 million in Version 5. In addition, at every stage existing code has been revised—so that Version 5 has only a few percent of its code in common with Version 1.
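The growth between versions can be made concrete with a little arithmetic. The sketch below uses only the line counts quoted above, with the Version 5 figure being approximate; `KERNEL_LINES` and `growth_factors` are names invented for this example.

```python
# Kernel line counts by version, as quoted in the text (Version 5 is approximate).
KERNEL_LINES = {1: 150_000, 2: 350_000, 3: 600_000, 4: 800_000, 5: 1_500_000}


def growth_factors(lines_by_version):
    """Return the ratio of each version's line count to the previous version's."""
    versions = sorted(lines_by_version)
    return {
        v: lines_by_version[v] / lines_by_version[prev]
        for prev, v in zip(versions, versions[1:])
    }


factors = growth_factors(KERNEL_LINES)
for version, factor in factors.items():
    print(f"Version {version}: {factor:.2f}x the previous version")

overall = KERNEL_LINES[5] / KERNEL_LINES[1]
print(f"Version 5 is about {overall:.0f}x the size of Version 1")
```

On these figures the kernel roughly doubled from Version 1 to Version 2, nearly doubled again from Version 4 to Version 5, and grew about tenfold overall.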
Despite these changes in internal code, however, the user-level design of Mathematica has remained compatible from Version 1 on. Much functionality has been added, but programs created for Mathematica Version 1 will almost always run absolutely unchanged under Version 5.