XML
XML is a general-purpose data format that is becoming increasingly important. Data formatted in XML can readily be used by any application that is able to process XML, so choosing an XML format can save considerable development effort. In addition, an increasing number of existing data formats are based on XML. Some of the more important for mathematical and scientific purposes include XHTML (an XML-compliant version of HTML), MathML (a way to store mathematical information), and SVG (a graphics format). A large list of XML applications is available at http://www.xml.org.
Mathematica contains a large number of features for working with XML, all of which are available in webMathematica. XML can be very useful for webMathematica, both through its support for specific XML applications and as a general format for data interchange. The use of MathML, SVG, and XHTML will be covered in their own sections. This section will give an overview of XML and the XML features of Mathematica. It will also give some examples of why this functionality is useful to webMathematica.
Introduction to XML
This section will give a very brief introduction to XML. For more information, see one of the many references, such as those listed at http://www.w3.org/XML/, for example, http://www.w3.org/XML/1999/XML-in-10-points.
A sample XML document is shown below.
<?xml version="1.0"?>
<library>
  <book>
    <title>A New Kind of Science</title>
    <author>Stephen Wolfram</author>
  </book>
  <book>
    <title>The Lord of the Rings</title>
    <author>J.R.R. Tolkien</author>
  </book>
</library>
The example above shows a data format for a library. The library contains books, and each book has a title and an author. This shows how XML is suited to structured data. XML looks a little like HTML, except that the tags (words bracketed by '<' and '>') are not restricted to a fixed set: new tags that are suitable for a particular application can be introduced. XML is also stricter than HTML, since an XML document must follow well-formedness rules that do not apply to HTML. This is demonstrated in the next section.
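As a rough illustration of how this structure maps onto program data, the following Python sketch parses the library document with the standard-library xml.etree.ElementTree module and collects each book as a record. The choice of Python and the helper name read_books are assumptions made for illustration; they are not part of webMathematica.

```python
import xml.etree.ElementTree as ET

LIBRARY_XML = """<?xml version="1.0"?>
<library>
  <book>
    <title>A New Kind of Science</title>
    <author>Stephen Wolfram</author>
  </book>
  <book>
    <title>The Lord of the Rings</title>
    <author>J.R.R. Tolkien</author>
  </book>
</library>
"""

def read_books(xml_text):
    """Return one {'title': ..., 'author': ...} dict per <book> element."""
    root = ET.fromstring(xml_text)  # root is the <library> element
    return [
        {"title": book.findtext("title"), "author": book.findtext("author")}
        for book in root.findall("book")
    ]

if __name__ == "__main__":
    for record in read_books(LIBRARY_XML):
        print(record["title"], "by", record["author"])
```

Because the element names carry the meaning, the same few lines would work unchanged if more books were added to the library.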
XML Compliance
One issue with XML is that documents must be well-formed, that is, they must follow the syntax rules of XML. Some basic examples of compliance are described in this section.
An XML document should begin with an XML declaration (the header); XML 1.0 recommends it but does not strictly require it. For example, a document should start with something like <?xml version="1.0"?>.
Empty elements must either have an end tag, or the start tag must end with />. Thus, an empty element written as <br/> or as <br></br> is legal. However, a bare <br> with no end tag is not legal.
For nonempty elements, the end tag is required. Thus, the following is legal.
<p>Here is a paragraph.</p><p>Here is another.</p>
However, this is not legal.
<p>Here is a paragraph.<p>Here is another.
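One way to check rules like these is to hand a document to an XML parser and see whether it is accepted. The Python sketch below does this with the standard-library xml.etree.ElementTree module. The helper name is_well_formed, the wrapping doc root element, and the br element are assumptions chosen for this illustration. The wrapper is needed because a well-formed document has exactly one root element, so the two-paragraph fragments cannot be parsed on their own.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Both fragments are wrapped in a single <doc> root element (a name chosen
# for this sketch), because a document may only have one root element.
LEGAL = "<doc><p>Here is a paragraph.</p><p>Here is another.</p></doc>"
ILLEGAL = "<doc><p>Here is a paragraph.<p>Here is another.</doc>"

if __name__ == "__main__":
    for sample in (LEGAL, ILLEGAL, "<doc><br/></doc>", "<doc><br></doc>"):
        print(is_well_formed(sample), sample)
```

A parser only reports whether the rules were followed; it does not repair the document, which is one practical difference from the forgiving way browsers treat HTML.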
Mathematica Support for XML
Mathematica provides some very convenient ways to work with XML. Many of these are based on the strong correspondence between structured XML documents and Mathematica expressions (the basic data type of Mathematica). This makes it easy to import XML data into Mathematica and then work with it. This section gives a very brief introduction to working with XML in Mathematica; more information is available in the online documentation.
The following is a simple example.
This XML can be imported into Mathematica, which represents it with symbolic XML. Because of the nature of Mathematica expressions, symbolic XML is a Mathematica native form of XML that is isomorphic to textual XML.
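The idea of a native form that is isomorphic to textual XML can be sketched outside Mathematica as well. The Python example below is only an analogy: it converts a parsed document into nested tuples whose layout loosely mirrors the XMLElement[tag, attributes, children] form that appears in the Processed.jsp code later in this section, and then converts the tuples back into text. The function names to_symbolic and to_element are assumptions for this sketch, and whitespace-only text is dropped, so the round trip is exact only for documents without significant whitespace.

```python
import xml.etree.ElementTree as ET

def to_symbolic(element):
    """Convert an Element into ("XMLElement", tag, attributes, children)."""
    children = []
    if element.text and element.text.strip():
        children.append(element.text.strip())
    for child in element:
        children.append(to_symbolic(child))
        # Text that follows a child element is stored in that child's tail.
        if child.tail and child.tail.strip():
            children.append(child.tail.strip())
    return ("XMLElement", element.tag, dict(element.attrib), children)

def to_element(symbolic):
    """Rebuild an Element from the nested tuple form."""
    _, tag, attributes, children = symbolic
    element = ET.Element(tag, attributes)
    last = None  # most recently appended child element, if any
    for child in children:
        if isinstance(child, str):
            if last is None:
                element.text = (element.text or "") + child
            else:
                last.tail = (last.tail or "") + child
        else:
            last = to_element(child)
            element.append(last)
    return element

if __name__ == "__main__":
    text = "<book><title>A New Kind of Science</title></book>"
    symbolic = to_symbolic(ET.fromstring(text))
    print(symbolic)
    print(ET.tostring(to_element(symbolic), encoding="unicode"))
```

Once the document is ordinary nested data, general-purpose list and pattern operations can be applied to it, which is the property the Mathematica tools rely on.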
You can use standard Mathematica programming features to process symbolic XML; for example, to extract all the authors.
This outputs the new XML expression.
This type of transformation can of course be done in other ways; XSLT stylesheet technology is one of them. However, there is an overhead to setting up an XSLT stylesheet to make the transformation. Mathematica, with its uniform programming principles, is often a quick and simple way to carry out the task.
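To give a feel for doing such a transformation in ordinary program code rather than in a stylesheet, here is a minimal Python sketch. The transformation it performs (collecting every author of a library document into a new authors document) and the element names authors and name are hypothetical choices for this illustration, not the transformation used in the Mathematica example.

```python
import xml.etree.ElementTree as ET

LIBRARY_XML = """<library>
  <book><title>A New Kind of Science</title><author>Stephen Wolfram</author></book>
  <book><title>The Lord of the Rings</title><author>J.R.R. Tolkien</author></book>
</library>"""

def authors_document(xml_text):
    """Build a new <authors> document with one <name> per <author> element."""
    root = ET.fromstring(xml_text)
    result = ET.Element("authors")
    # iter() walks the whole tree, so authors are found at any depth.
    for author in root.iter("author"):
        ET.SubElement(result, "name").text = author.text
    return ET.tostring(result, encoding="unicode")

if __name__ == "__main__":
    print(authors_document(LIBRARY_XML))
```

The extraction and the construction of the new document use the same language features as any other program, so no separate stylesheet has to be written or maintained.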
There are many more features of the Mathematica XML tools, for example, working with attributes, entities, namespaces, validation, and CDATA. More information is available from the Mathematica documentation.
webMathematica XML Applications
Many webMathematica applications involve generating HTML to be read by browsers. However, the output from a webMathematica site need not go to a browser; it may be data to be read by an application that will then do further processing. This section will study an example that shows how this can be done.
The source for this example is in webMathematica/Examples/XML/Phone.jsp and webMathematica/Examples/XML/Processed.jsp. It also uses an XML file, webMathematica/Examples/XML/phone.xml. If you installed webMathematica as described above, you should be able to connect to this JSP via http://localhost:8080/webMathematica/Examples/XML/Phone.jsp. (You may have some other URL for accessing your server.)
The contents of phone.xml are shown below.
<?xml version="1.0"?>
<EmployeeList>
  <Person Name="Tom Jones" Email="tomj" Phone="235-1231" />
  <Person Name="Janet Rogers" Email="jrogers" Phone="235-1129" />
  <Person Name="Bob Norris" Email="bobn" Phone="235-1237" />
  <Person Name="Kit Smithers" Email="ksmit" Phone="235-0729" />
  <Person Name="Jamie Lemay" Email="jlemay" Phone="235-6393" />
</EmployeeList>
The contents of Processed.jsp are shown below.
<%@ page contentType="text/xml"%>
<%@ taglib uri="http://www.wolfram.com/msp" prefix="msp" %>
<msp:evaluate>
    xml = Import[ToFileName[MSPPageDirectory[], "phone.xml"], "XML"];
    xml = First[Cases[xml, _XMLElement]];
    If[MSPValueQ[$$patt],
        xml = DeleteCases[xml,
            XMLElement["Person", {___,
                "Name"->n_/;!StringMatchQ[n, $$patt], ___}, _], Infinity]
    ];
    ExportString[xml, "XML"]
</msp:evaluate>
This example first imports the XML file into Mathematica. It uses the command MSPPageDirectory because the XML data is located in the same directory as Processed.jsp. It then checks to see if a parameter name was sent. If this is the case, then it uses this to discard XML elements that do not match this name. You should be able to see the operation of this parameter with a URL such as http://localhost:8080/webMathematica/Examples/XML/Processed.jsp?name=T. (You may have some other URL for accessing your server.) It ends by converting the symbolic XML into a string version of the XML and returning this.
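The same import, filter, and export sequence can be sketched in Python for readers who want to experiment without a server. This is an analogy, not the webMathematica implementation: the function name filter_people and its pattern argument are assumptions standing in for the request parameter tested in Processed.jsp, and shell-style wildcards from the standard-library fnmatch module stand in for Mathematica string patterns, whose syntax is not identical.

```python
import fnmatch
import xml.etree.ElementTree as ET

PHONE_XML = """<?xml version="1.0"?>
<EmployeeList>
  <Person Name="Tom Jones" Email="tomj" Phone="235-1231" />
  <Person Name="Janet Rogers" Email="jrogers" Phone="235-1129" />
  <Person Name="Bob Norris" Email="bobn" Phone="235-1237" />
  <Person Name="Kit Smithers" Email="ksmit" Phone="235-0729" />
  <Person Name="Jamie Lemay" Email="jlemay" Phone="235-6393" />
</EmployeeList>
"""

def filter_people(xml_text, pattern=None):
    """Return the XML as a string, keeping only Person elements whose
    Name attribute matches pattern; with no pattern, keep everything."""
    root = ET.fromstring(xml_text)
    if pattern is not None:
        # Copy the list first, because elements are removed while looping.
        for person in list(root.findall("Person")):
            if not fnmatch.fnmatchcase(person.get("Name", ""), pattern):
                root.remove(person)
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(filter_people(PHONE_XML, "T*"))
```

With the pattern "T*" only the entry for Tom Jones remains, and with no pattern the document is returned with all five entries.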
Of course, you may want to use this XML data for further processing. If you have a system that is XML-aware, this is quite straightforward. One useful application that is XML-aware is of course Mathematica. For example, the following will call your webMathematica site and retrieve the information.
You may even wish to use this in a Mathematica program.
Of course, your client could be written in some system other than Mathematica, such as Visual Basic, Python, or Java.
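As an example of such a client, the following Python sketch downloads the XML served by Processed.jsp and turns it into a dictionary of names and phone numbers. It uses only the standard library. The function names parse_employees and fetch_employees are assumptions for this illustration, and the URL you pass must match your own server.

```python
import urllib.request
import xml.etree.ElementTree as ET

def parse_employees(xml_text):
    """Return a {name: phone} dict from an EmployeeList document."""
    root = ET.fromstring(xml_text)
    return {
        person.get("Name"): person.get("Phone")
        for person in root.iter("Person")
    }

def fetch_employees(url):
    """Download the XML served at url and parse it.

    This needs a running webMathematica server, so it is defined here
    but not called by the sketch itself.
    """
    with urllib.request.urlopen(url) as response:
        return parse_employees(response.read())
```

With a server running at the default location, a call such as fetch_employees("http://localhost:8080/webMathematica/Examples/XML/Processed.jsp") would be expected to return the phone list. Keeping the parsing in its own function means it can be tried on a saved copy of the XML without any network access.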
Using XML as an interchange format for communication between two programs is discussed in more detail in the section on web services.