Reading Textual Data

With <<, you can read files that contain Wolfram Language expressions given in input form. Sometimes, however, you may instead need to read files of data in other formats. For example, you may have data generated by an external program which consists of a sequence of numbers separated by spaces. This data cannot be read directly as Wolfram Language input. However, the function ReadList can take such data from a file or input stream, and convert it to a Wolfram Language list.

ReadList["file",Number]    read a sequence of numbers from a file, and put them in a Wolfram Language list

Reading numbers from a file.

Here is a file of numbers.
This reads all the numbers in the file, and returns a list of them.
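
A minimal sketch of this step. The scratch file numbers.txt, its location in the temporary directory, and its contents are assumptions made for illustration; they are not taken from the original example.

```wolfram
(* Hypothetical scratch file; the name and contents are illustrative assumptions *)
file = FileNameJoin[{$TemporaryDirectory, "numbers.txt"}];
out = OpenWrite[file];
WriteString[out, "11.1 22.2 33.3\n44.4 55.5 66.6\n"];
Close[out];

(* Read every number in the file into one flat list *)
ReadList[file, Number]
(* expected: {11.1, 22.2, 33.3, 44.4, 55.5, 66.6} *)
```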

ReadList["file",{Number,Number}]    read numbers from a file, putting each successive pair into a separate list
ReadList["file",Table[Number,{n}]]    put each successive block of n numbers in a separate list
ReadList["file",Number,RecordLists->True]    put all the numbers on each line of the file into a separate list

Reading blocks of numbers.

This puts each successive pair of numbers from the file into a separate list.
This makes each line in the file into a separate list.
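
The two groupings described here can be sketched as follows, again assuming a hypothetical scratch file with two lines of three numbers each.

```wolfram
(* Hypothetical scratch file; the name and contents are illustrative assumptions *)
file = FileNameJoin[{$TemporaryDirectory, "numbers.txt"}];
out = OpenWrite[file];
WriteString[out, "11.1 22.2 33.3\n44.4 55.5 66.6\n"];
Close[out];

(* Group successive pairs of numbers, ignoring line boundaries *)
ReadList[file, {Number, Number}]
(* expected: {{11.1, 22.2}, {33.3, 44.4}, {55.5, 66.6}} *)

(* Group the numbers by line (record) instead *)
ReadList[file, Number, RecordLists -> True]
(* expected: {{11.1, 22.2, 33.3}, {44.4, 55.5, 66.6}} *)
```

The first form follows the shape of the second argument; the second form follows the layout of the file.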

ReadList can handle numbers that are given in Fortran-like "E" notation. Thus, for example, ReadList will read 2.5E+5 as 2.5×10^5. Note that ReadList can handle numbers with any number of digits of precision.

Here is a file containing numbers in Fortran-like "E" notation.
ReadList can handle numbers in this form.
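
As a small illustration, the sketch below writes a few numbers in "E" notation to a hypothetical scratch file (the name and values are assumptions) and reads them back.

```wolfram
(* Hypothetical scratch file holding numbers in Fortran-like "E" notation *)
file = FileNameJoin[{$TemporaryDirectory, "bignum.txt"}];
out = OpenWrite[file];
WriteString[out, "4.5E-5 7.8E4\n2.5E2 -8.9\n"];
Close[out];

(* The exponents are interpreted as powers of 10 *)
ReadList[file, Number]
(* expected: {0.000045, 78000., 250., -8.9} *)
```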

ReadList["file",type]    read a sequence of objects of a particular type
ReadList["file",type,n]    read at most n objects

Reading objects of various types.

ReadList can read not only numbers, but also a variety of other types of object. Each type of object is specified by a symbol such as Number.

Here is a file containing text.
This produces a list of the characters in the file, each given as a one-character string.
Here are the integer codes corresponding to each of the bytes in the file.
This puts the data from each line in the file into a separate list.
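
The three reads described above might look like this. The scratch file and its two-line contents are assumptions for illustration.

```wolfram
(* Hypothetical scratch file containing two short lines of text *)
file = FileNameJoin[{$TemporaryDirectory, "strings.txt"}];
out = OpenWrite[file];
WriteString[out, "ab\ncd\n"];
Close[out];

(* Every character, including newlines, as a one-character string *)
ReadList[file, Character]

(* The same data as integer byte codes; "a" is 97 and "b" is 98 *)
ReadList[file, Byte]

(* Characters grouped by line; the newline separators are dropped *)
ReadList[file, Character, RecordLists -> True]
(* expected: {{"a", "b"}, {"c", "d"}} *)
```

No expected output is shown for the first two reads because the byte sequence used for a newline can depend on the platform.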

Byte                single byte of data, returned as an integer
Character           single character, returned as a one-character string
Real                approximate number in Fortran-like notation
Number              exact or approximate number in Fortran-like notation
Word                sequence of characters delimited by word separators
Record              sequence of characters delimited by record separators
String              string terminated by a newline
Expression          complete Wolfram Language expression
Hold[Expression]    complete Wolfram Language expression, returned inside Hold

Types of objects to read.

This returns a list of the "words" in the file.

ReadList allows you to read "words" from a file. It considers a "word" to be any sequence of characters delimited by word separators. You can set the option WordSeparators to specify the strings you want to treat as word separators. The default is to include spaces and tabs, but not to include, for example, standard punctuation characters. Note that in all cases successive words can be separated by any number of word separators. These separators are never taken to be part of the actual words returned by ReadList.

option name         default value            
RecordLists         False                    whether to make a separate list for the objects in each record
RecordSeparators    {"\r\n", "\n", "\r"}     separators for records
WordSeparators      {" ", "\t"}              separators for words
NullRecords         False                    whether to keep zero-length records
NullWords           False                    whether to keep zero-length words
TokenWords          {}                       words to take as tokens

Options for ReadList.

This reads the text in the file as a sequence of words, using a custom set of word separators.
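
A sketch contrasting the default word separators with a custom setting. The scratch file, its contents, and the choice of "," and ";" as separators are assumptions for illustration.

```wolfram
(* Hypothetical scratch file; the name and contents are illustrative assumptions *)
file = FileNameJoin[{$TemporaryDirectory, "words.txt"}];
out = OpenWrite[file];
WriteString[out, "a,b;c d\n"];
Close[out];

(* Default separators: only spaces and tabs split words *)
ReadList[file, Word]
(* expected: {"a,b;c", "d"} *)

(* Custom separators: the space is now part of a word *)
ReadList[file, Word, WordSeparators -> {",", ";"}]
(* expected: {"a", "b", "c d"} *)
```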

The Wolfram Language considers any data file to consist of a sequence of records. By default, each line is considered to be a separate record. In general, you can set the option RecordSeparators to give a list of separators for records. Note that words can never cross record separators. As with word separators, any number of record separators can exist between successive records, and these separators are not considered to be part of the records themselves.

By default, each line of the file is considered to be a record.
Here is a file containing three "sentences" ending with periods.
This allows both periods and newlines as record separators.
This puts the words in each "sentence" into a separate list.
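
The record handling described above can be sketched as follows, assuming a hypothetical scratch file holding three short "sentences".

```wolfram
(* Hypothetical scratch file with three "sentences" ending in periods *)
file = FileNameJoin[{$TemporaryDirectory, "sentences.txt"}];
out = OpenWrite[file];
WriteString[out, "First one. Second one.\nThird one.\n"];
Close[out];

(* Treat both periods and newlines as record separators *)
ReadList[file, Record, RecordSeparators -> {".", "\n"}]
(* expected: {"First one", " Second one", "Third one"} *)

(* Split each record into words and keep one list per record *)
ReadList[file, Word, RecordLists -> True, RecordSeparators -> {".", "\n"}]
(* expected: {{"First", "one"}, {"Second", "one"}, {"Third", "one"}} *)
```

Note that the space after the first period survives in the Record result, because only the separators themselves are removed.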

ReadList["file",Record,RecordSeparators->{}]    read the whole of a file as a single string
ReadList["file",Record,RecordSeparators->{{"lsep1",…},{"rsep1",…}}]    make a list of those parts of a file that lie between the lsepi and the rsepi

Settings for the RecordSeparators option.

Here is a file containing some text.
This reads all the text in the file and returns it as a single string.
This gives a list of the parts of the file that lie between the specified left and right separators.
By choosing appropriate separators, you can pick out specific parts of files.
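
A sketch of both RecordSeparators settings. The scratch file and the "(: " and " :)" delimiters are assumptions chosen for illustration.

```wolfram
(* Hypothetical scratch file with annotations between "(: " and " :)" *)
file = FileNameJoin[{$TemporaryDirectory, "source.txt"}];
out = OpenWrite[file];
WriteString[out, "f[x] (: function f :)\ng[x] (: function g :)\n"];
Close[out];

(* No record separators: the result is a list holding one string *)
ReadList[file, Record, RecordSeparators -> {}]

(* Left and right separators: keep only the text between them *)
ReadList[file, Record, RecordSeparators -> {{"(: "}, {" :)"}}]
(* expected: {"function f", "function g"} *)
```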

The Wolfram Language usually allows any number of appropriate separators to appear between successive records or words. Sometimes, however, when several separators are present, you may want to assume that a "null record" or "null word" appears between each pair of adjacent separators. You can do this by setting the options NullRecords->True or NullWords->True.

Here is a file containing "words" separated by colons.
Here the repeated colons are treated as single separators.
Now repeated colons are taken to have null words in between.
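
The effect of NullWords can be sketched like this, assuming a hypothetical scratch file of colon-separated words.

```wolfram
(* Hypothetical scratch file with runs of one, two, and three colons *)
file = FileNameJoin[{$TemporaryDirectory, "colons.txt"}];
out = OpenWrite[file];
WriteString[out, "first:second::third:::fourth\n"];
Close[out];

(* Default: a run of colons counts as a single separator *)
ReadList[file, Word, WordSeparators -> {":"}]
(* expected: {"first", "second", "third", "fourth"} *)

(* NullWords -> True: an empty word appears between adjacent colons *)
ReadList[file, Word, WordSeparators -> {":"}, NullWords -> True]
(* expected: {"first", "second", "", "third", "", "", "fourth"} *)
```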

In most cases, you want words to be delimited by separators that are not themselves considered as words. Sometimes, however, it is convenient to allow words to be delimited by special "token words", which are themselves words. You can give a list of such token words as a setting for the option TokenWords.

Here is some text.
This reads the text, using the specified token words to delimit words in the text.
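
A sketch of token words, assuming a hypothetical scratch file containing a short arithmetic expression and taking "+" and "*" as the tokens.

```wolfram
(* Hypothetical scratch file; the name and contents are illustrative assumptions *)
file = FileNameJoin[{$TemporaryDirectory, "language.txt"}];
out = OpenWrite[file];
WriteString[out, "22*a*b+56*c\n"];
Close[out];

(* "+" and "*" delimit words and are also returned as words themselves *)
ReadList[file, Word, TokenWords -> {"+", "*"}]
(* expected: {"22", "*", "a", "*", "b", "+", "56", "*", "c"} *)
```

This is the main difference from WordSeparators, whose entries never appear in the result.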

You can use ReadList to read Wolfram Language expressions from files. In general, each expression must end with a newline, although a single expression may go on for several lines.

Here is a file containing text that can be used as Wolfram Language input.
This reads the text in as Wolfram Language expressions.
This prevents the expressions from being evaluated.
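
A sketch of reading expressions, assuming a hypothetical scratch file in which the first expression spans two lines and the symbols x, y, and z have no values.

```wolfram
(* Hypothetical scratch file; the first expression continues onto a second line *)
file = FileNameJoin[{$TemporaryDirectory, "exprs.txt"}];
out = OpenWrite[file];
WriteString[out, "x + y +\nz\n2^8\n"];
Close[out];

(* Each expression is read and evaluated *)
ReadList[file, Expression]
(* expected: {x + y + z, 256} *)

(* Wrapping in Hold keeps the expressions unevaluated *)
ReadList[file, Hold[Expression]]
(* expected: {Hold[x + y + z], Hold[2^8]} *)
```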

ReadList can insert the objects it reads into any Wolfram Language expression. The second argument to ReadList can consist of any expression containing symbols such as Number and Word specifying objects to read. Thus, for example, ReadList["file",{Number,Number}] inserts successive pairs of numbers that it reads into lists. Similarly, ReadList["file",Hold[Expression]] puts expressions that it reads inside Hold.

If ReadList reaches the end of your file before it has finished reading a particular set of objects you have asked for, then it inserts the special symbol EndOfFile in place of the objects it has not yet read.

Here is a file of numbers.
The symbol EndOfFile appears in place of numbers that were needed after the end of the file was reached.
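
The padding behavior can be sketched as follows, assuming a hypothetical scratch file of six numbers read in blocks of four.

```wolfram
(* Hypothetical scratch file with six numbers *)
file = FileNameJoin[{$TemporaryDirectory, "numbers.txt"}];
out = OpenWrite[file];
WriteString[out, "11.1 22.2 33.3\n44.4 55.5 66.6\n"];
Close[out];

(* Six numbers do not fill two blocks of four; the gaps become EndOfFile *)
ReadList[file, {Number, Number, Number, Number}]
(* expected: {{11.1, 22.2, 33.3, 44.4}, {55.5, 66.6, EndOfFile, EndOfFile}} *)
```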

ReadList["!command",type]    execute a command, and read its output
ReadList[stream,type]    read any input stream

Reading from commands and streams.

This executes a Unix command, and reads its output as a string.
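
A sketch of reading from a command. The choice of echo is an assumption for illustration, and the call relies on a system where the kernel is allowed to run shell commands.

```wolfram
(* Run an external command (assumed to be available) and read its output line by line *)
ReadList["!echo hello", String]
(* returns a list with one string per line of output *)
```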

OpenRead["file"]    open a file for reading
OpenRead["!command"]    open a pipe for reading
Read[stream,type]    read an object of the specified type from a stream
Skip[stream,type]    skip over an object of the specified type in an input stream
Skip[stream,type,n]    skip over n objects of the specified type in an input stream
Close[stream]    close an input stream

Functions for reading from input streams.

ReadList allows you to read all the data in a particular file or input stream. Sometimes, however, you want to get data a piece at a time, perhaps doing tests to find out what kind of data to expect next.

When you read individual pieces of data from a file, the Wolfram Language always remembers the "current point" that you are at in the file. When you call OpenRead, the Wolfram Language sets up an input stream from a file, and makes your current point the beginning of the file. Every time you read an object from the file using Read, the Wolfram Language sets your current point to be just after the object you have read. Using Skip, you can advance the current point past a sequence of objects without actually reading the objects.

Here is a file of numbers.
This opens an input stream from the file.
This reads the first number from the file.
This reads the next pair of numbers.
This skips the next number.
This reads the remaining numbers.
This closes the input stream.

You can use the options WordSeparators and RecordSeparators in Read and Skip just as you do in ReadList.

Note that if you try to read past the end of file, Read returns the symbol EndOfFile.
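
The piece-at-a-time workflow can be sketched as follows. The scratch file and its contents are assumptions for illustration; the comments trace how the current point moves through the file.

```wolfram
(* Hypothetical scratch file with six numbers *)
file = FileNameJoin[{$TemporaryDirectory, "numbers.txt"}];
out = OpenWrite[file];
WriteString[out, "11.1 22.2 33.3\n44.4 55.5 66.6\n"];
Close[out];

(* Open an input stream; the current point starts at the beginning *)
stream = OpenRead[file];

Read[stream, Number]            (* expected: 11.1 *)
Read[stream, {Number, Number}]  (* expected: {22.2, 33.3} *)
Skip[stream, Number];           (* moves past 44.4 without returning it *)
ReadList[stream, Number]        (* expected: {55.5, 66.6} *)
Read[stream, Number]            (* expected: EndOfFile, since nothing is left *)

(* Release the stream when done *)
Close[stream];
```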