This is documentation for Mathematica 3, which was
based on an earlier version of the Wolfram Language.

2.11.8 Searching Files


Finding lines that contain specified text.

  • Here is a file containing some text.
  • In[1]:= !!textfile

    Here is the first line of text.
    And the second.
    And the third. Here is the end.

  • This returns a list of all the lines in the file that contain the text "is".
  • In[2]:= FindList["textfile", "is"]

    Out[2]= {Here is the first line of text., And the third. Here is the end.}

  • The text "fourth" appears nowhere in the file.
  • In[3]:= FindList["textfile", "fourth"]

    Out[3]= {}

    By default, FindList scans successive lines of a file and returns those lines that contain the text you specify. More generally, you can have FindList scan successive records and return each complete record that contains the specified text. As in ReadList, the option RecordSeparators tells Mathematica which strings to treat as record separators.
    If you give a pair of lists as the setting for RecordSeparators, you can specify different left and right separators. FindList then searches only text that lies between such pairs of separators.

  • This finds all "sentences" ending with a period that contain the text "And".
  • In[4]:= FindList["textfile", "And", RecordSeparators -> {"."}]

    Out[4]=
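
    As a minimal sketch of the left and right separators described above, the following writes a small hypothetical file and searches only the text enclosed in angle brackets. The file name tagged.txt and its contents are assumptions made for illustration; they do not appear in the original text.

```mathematica
(* Hypothetical sample file; the name "tagged.txt" and its contents are assumptions. *)
strm = OpenWrite["tagged.txt"];
WriteString[strm, "alpha <first item> beta <second item> gamma <third>\n"];
Close[strm];

(* The first list gives left separators, the second gives right separators.
   Only text between a "<" and the following ">" is treated as a record. *)
FindList["tagged.txt", "item", RecordSeparators -> {{"<"}, {">"}}]
```

    With this setting, text outside the brackets (such as alpha or gamma) should not be searched, and a bracketed record is returned only if it contains the text item.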


    Options for FindList.

  • This finds only the occurrence of "Here" that is at the beginning of a line in the file.
  • In[5]:= FindList["textfile", "Here", AnchoredSearch -> True]

    Out[5]= {Here is the first line of text.}

    In general, FindList finds text that appears anywhere inside a record. By setting the option WordSearch -> True, however, you can require that the text appear as a separate word in the record. The option WordSeparators specifies the list of separators for words.

  • The text "th" does appear in the file, but not as a separate word. As a result, FindList returns an empty list.
  • In[6]:= FindList["textfile", "th", WordSearch -> True]

    Out[6]= {}
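
    For contrast, here is a sketch of word searches that can succeed. It recreates the sample file shown at the start of this section, and it assumes that the default word separators are spaces and tabs, so that punctuation stays attached to a word unless you add it to WordSeparators.

```mathematica
(* Recreate the sample file from this section (overwrites any existing "textfile"). *)
strm = OpenWrite["textfile"];
WriteString[strm,
  "Here is the first line of text.\nAnd the second.\nAnd the third. Here is the end.\n"];
Close[strm];

(* "the" occurs as a separate word, so a word search can match it. *)
FindList["textfile", "the", WordSearch -> True]

(* In the file, "third" is followed directly by a period. Treating "." as a
   word separator as well lets "third" count as a separate word. *)
FindList["textfile", "third", WordSearch -> True,
  WordSeparators -> {" ", "\t", "."}]
```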


    Searching in multiple files.

  • This searches for "third" in two copies of textfile.
  • In[7]:= FindList[{"textfile", "textfile"}, "third"]

    Out[7]= {And the third. Here is the end., And the third. Here is the end.}

    It is often useful to call FindList on lists of files generated by functions such as FileNames.
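
    A minimal sketch of that combination follows. The pattern "*.txt" is an assumption chosen for illustration; substitute whatever pattern matches the files you want to search.

```mathematica
(* Names of all files in the current directory ending in .txt.
   The "*.txt" pattern is an illustrative assumption. *)
files = FileNames["*.txt"];

(* Search every file in the list for the text "third".
   Matching lines from all the files are returned together in one list. *)
FindList[files, "third"]
```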


    Finding text in the output from an external program.

  • This runs the external Unix command date.
  • In[8]:= !date

    Thu Mar 13 18:23:41 CST 1997

    Out[8]=

  • This finds the time-of-day field in the date.
  • In[9]:= FindList["!date", ":", RecordSeparators -> {" "}]

    Out[9]= {18:23:41}


    Finding successive occurrences of text.

    FindList works by making one pass through a particular file, looking for occurrences of the text you specify. Sometimes, however, you may want to search incrementally for successive occurrences of a piece of text. You can do this using Find.
    To use Find, you must first explicitly open an input stream using OpenRead. Then, each time you call Find on this stream, it searches for the text you specify and leaves the current point in the file just after the record it finds. As a result, you can call Find several times to find successive occurrences of the text.

  • This opens an input stream for textfile.
  • In[10]:= stext = OpenRead["textfile"]

    Out[10]=

  • This finds the first line containing "And".
  • In[11]:= Find[stext, "And"]

    Out[11]= And the second.

  • Calling Find again gives you the next line containing "And".
  • In[12]:= Find[stext, "And"]

    Out[12]= And the third. Here is the end.

  • This closes the input stream.
  • In[13]:= Close[stext]

    Out[13]=
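
    The same incremental search can be written as a loop. This is a sketch rather than a prescribed idiom: it relies on Find returning the symbol EndOfFile when no further matching record exists, and it recreates the sample file from the start of this section so that it can be run on its own. The variable name line is an arbitrary choice.

```mathematica
(* Recreate the sample file from this section (overwrites any existing "textfile"). *)
strm = OpenWrite["textfile"];
WriteString[strm,
  "Here is the first line of text.\nAnd the second.\nAnd the third. Here is the end.\n"];
Close[strm];

stext = OpenRead["textfile"];

(* Each call to Find resumes just after the previously found record.
   The loop stops when Find returns EndOfFile. *)
While[(line = Find[stext, "And"]) =!= EndOfFile,
  Print[line]
];

Close[stext];
```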

    Once you have an input stream, you can mix calls to Find, Skip and Read. If you ever call FindList or ReadList, Mathematica will immediately read to the end of the input stream.

  • This opens the input stream.
  • In[14]:= stext = OpenRead["textfile"]

    Out[14]=

  • This finds the first line that contains "second", and leaves the current point in the file at the beginning of the next line.
  • In[15]:= Find[stext, "second"]

    Out[15]= And the second.

  • Read can then read the word that appears at the beginning of the line.
  • In[16]:= Read[stext, Word]

    Out[16]= And

  • This skips over the next three words.
  • In[17]:= Skip[stext, Word, 3]

  • Mathematica finds "is" in the remaining text, and prints the entire record as output.
  • In[18]:= Find[stext, "is"]

    Out[18]= And the third. Here is the end.

  • This closes the input stream.
  • In[19]:= Close[stext]

    Out[19]=


    Finding and setting the current point in a stream.

    Functions like Read, Skip and Find usually operate on streams in an entirely sequential fashion. Each time one of the functions is called, the current point in the stream moves on.
    Sometimes, you may need to know where the current point in a stream is, and be able to reset it. On most computer systems, StreamPosition returns the position of the current point as an integer giving the number of bytes from the beginning of the stream.

  • This opens the stream.
  • In[20]:= stext = OpenRead["textfile"]

    Out[20]=

  • When you first open the file, the current point is at the beginning, and StreamPosition returns 0.
  • In[21]:= StreamPosition[stext]

    Out[21]= 0

  • This reads the first line in the file.
  • In[22]:= Read[stext, Record]

    Out[22]= Here is the first line of text.

  • Now the current point has advanced.
  • In[23]:= StreamPosition[stext]

    Out[23]=

  • This sets the stream position back.
  • In[24]:= SetStreamPosition[stext, 5]

    Out[24]= 5

  • Now Read returns the remainder of the first line.
  • In[25]:= Read[stext, Record]

    Out[25]= is the first line of text.

  • This closes the stream.
  • In[26]:= Close[stext]

    Out[26]=
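
    A common use of these two functions is to remember a position and return to it later. The sketch below saves the current point after the first line, reads the second line, rewinds, and reads the same line again. The variable names pos, second and again are arbitrary choices, and the sample file is recreated so that the code can be run on its own.

```mathematica
(* Recreate the sample file from this section (overwrites any existing "textfile"). *)
strm = OpenWrite["textfile"];
WriteString[strm,
  "Here is the first line of text.\nAnd the second.\nAnd the third. Here is the end.\n"];
Close[strm];

stext = OpenRead["textfile"];

Read[stext, Record];              (* read and discard the first line *)
pos = StreamPosition[stext];      (* remember the current point *)

second = Read[stext, Record];     (* read the second line *)
SetStreamPosition[stext, pos];    (* go back to the remembered point *)
again = Read[stext, Record];      (* read the same line a second time *)

Close[stext];

(* Both reads started from the same position, so they should agree. *)
second === again
```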