The Software Engineering of the Wolfram System

The Wolfram System is one of the more complex software systems ever constructed. It is built from several million lines of source code, written in C/C++, Java, and the Wolfram Language.

The C code in the Wolfram System is actually written in a custom extension of C that supports certain memory management and object-oriented features. The Wolfram Language code is optimized using Share and DumpSave.

In the Wolfram Language kernel the breakdown of different parts of the code is roughly as follows: language and system: 30%; numerical computation: 20%; algebraic computation: 20%; graphics and kernel output: 30%.

Most of this code is fairly dense and algorithmic: those parts that are in effect simple procedures or tables use minimal code since they tend to be written at a higher level, often directly in the Wolfram System.

The source code for the kernel, save a fraction of a percent, is identical for all computer systems on which the Wolfram System runs.

For the front end, however, a significant amount of specialized code is needed to support each different type of user interface environment. The front end contains about 700,000 lines of system-independent C++ source code, of which roughly 200,000 lines are concerned with expression formatting. Then there are between 50,000 and 100,000 lines of specific code customized for each user interface environment.
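To put the front-end figures in proportion, the short calculation below works out the share of the system-independent code devoted to expression formatting and the share of platform-specific code in a build for one user interface environment. It is only arithmetic on the approximate numbers quoted above; the variable names are illustrative.

```python
# Approximate line counts quoted in the text.
shared_lines = 700_000          # system-independent C++ code
formatting_lines = 200_000      # part of shared_lines devoted to expression formatting
specific_low = 50_000           # lower bound of code specific to one environment
specific_high = 100_000         # upper bound of code specific to one environment

# Fraction of the shared code that handles expression formatting.
formatting_share = formatting_lines / shared_lines

# Fraction of a single-environment front end that is platform-specific,
# at the two ends of the quoted range.
low_share = specific_low / (shared_lines + specific_low)
high_share = specific_high / (shared_lines + specific_high)

print(f"formatting share of shared code: {formatting_share:.1%}")
print(f"platform-specific share: {low_share:.1%} to {high_share:.1%}")
```

On these figures, well over four-fifths of the front-end code for any one environment is shared with every other environment.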

The Wolfram System uses a client-server model of computing. The front end and kernel are connected via the Wolfram Symbolic Transfer Protocol (WSTP), the same system that is used to communicate with other programs. WSTP supports multiple transport layers, including one based upon TCP/IP and one using shared memory.

The front end and kernel are connected via three independent WSTP connections. One is used for user-initiated evaluations. A second is used by the front end to resolve the values of Dynamic expressions. The third is used by the kernel to notify the front end of Dynamic objects which should be invalidated.
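The separation into three connections can be pictured with a toy model. The sketch below is not WSTP and uses none of its API; the class ToyLink, the link names, and the message strings are all hypothetical stand-ins. It only illustrates that a request on one connection does not have to wait behind traffic queued on another.

```python
from collections import deque


class ToyLink:
    """A simple message queue standing in for one connection (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Return None when nothing is waiting, rather than blocking.
        return self._messages.popleft() if self._messages else None

    def pending(self):
        return len(self._messages)


# One link per role described in the text.
evaluation_link = ToyLink("evaluation")      # user-initiated evaluations
dynamic_link = ToyLink("dynamic")            # front end resolves Dynamic values
invalidation_link = ToyLink("invalidation")  # kernel reports stale Dynamic objects

# The front end queues two user evaluations, then one Dynamic request.
evaluation_link.send("Integrate[f[x], x]")
evaluation_link.send("Plot[g[x], {x, 0, 1}]")
dynamic_link.send("Dynamic[x]")

# The Dynamic request is first in line on its own link, even though
# two evaluations were queued earlier on a different link.
next_dynamic = dynamic_link.receive()

# The kernel notifies the front end that a Dynamic object is out of date.
invalidation_link.send("invalidate: x")
notice = invalidation_link.receive()
```

In this model the two queued evaluations remain pending on their own link while the Dynamic request and the invalidation notice are handled, which is the practical benefit of keeping the connections independent.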

Within the C code portion of the Wolfram Language kernel, modularity and consistency are achieved by having different parts communicate primarily by exchanging complete Wolfram System expressions.
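The idea of components that share nothing but a common expression format can be sketched in a few lines. The example below is a minimal illustration in Python, not the kernel's C implementation: the Expr class and both functions are hypothetical, and real Wolfram System expressions are far richer. The two functions know nothing about each other; each simply accepts a complete expression.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """A toy expression: a head applied to a tuple of arguments (hypothetical)."""
    head: str
    args: tuple = ()


def numeric_eval(node):
    """Toy 'numerical' component: evaluates Plus and Times over numbers."""
    if not isinstance(node, Expr):
        return node  # atoms (plain numbers) evaluate to themselves
    values = [numeric_eval(arg) for arg in node.args]
    if node.head == "Plus":
        return sum(values)
    if node.head == "Times":
        return math.prod(values)
    raise ValueError(f"unsupported head: {node.head}")


def full_form(node):
    """Toy 'output' component: renders an expression as head[arg, ...]."""
    if not isinstance(node, Expr):
        return str(node)
    return f"{node.head}[{', '.join(full_form(arg) for arg in node.args)}]"


# Both components receive the same complete expression.
expr = Expr("Plus", (1, Expr("Times", (2, 3))))
rendered = full_form(expr)    # "Plus[1, Times[2, 3]]"
value = numeric_eval(expr)    # 7
```

Because the expression is the only interface, either function could be replaced or extended without touching the other.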

But it should be noted that even though different parts of the system are quite independent at the level of source code, they have many algorithmic interdependencies. Thus, for example, it is common for numerical functions to make extensive use of algebraic algorithms, or for graphics code to use fairly advanced mathematical algorithms embodied in quite different Wolfram System functions.

Since the beginning of its development in 1986, the effort spent directly on creating the source code for the Wolfram System is about a thousand developer-years. In addition, a comparable or somewhat larger effort has been spent on testing and verification.

The source code of the Wolfram System has changed greatly since Version 1 was released. The total number of lines of code in the kernel grew from 150,000 in Version 1 to 350,000 in Version 2, 600,000 in Version 3, 800,000 in Version 4, 1.5 million in Version 5, and 2.5 million in Version 6. In addition, at every stage existing code has been revised, so that Version 6 has only a small percentage of its code in common with Version 1.
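The growth implied by these figures can be made explicit with a short calculation. The snippet below uses only the line counts quoted above; the variable names are illustrative.

```python
# Kernel size in lines of code, by version, as quoted in the text.
kernel_lines = {
    1: 150_000,
    2: 350_000,
    3: 600_000,
    4: 800_000,
    5: 1_500_000,
    6: 2_500_000,
}

versions = sorted(kernel_lines)

# Growth factor of each version relative to the one before it.
growth = {
    later: kernel_lines[later] / kernel_lines[earlier]
    for earlier, later in zip(versions, versions[1:])
}

# Overall growth from Version 1 to Version 6.
overall = kernel_lines[versions[-1]] / kernel_lines[versions[0]]

for version, factor in growth.items():
    print(f"Version {version}: {factor:.2f}x the previous version")
print(f"Version 6 is {overall:.1f}x the size of Version 1")
```

On these numbers the kernel grew by a factor of between roughly 1.3 and 2.3 at each major version, and by a factor of nearly 17 overall.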

Despite these changes in internal code, however, the user-level design of the Wolfram System has remained compatible from Version 1 on. Much functionality has been added, but programs created for the Wolfram System Version 1 will almost always run absolutely unchanged under Version 6.