Class SyntaxTokenizer

java.lang.Object
com.wolfram.jlink.ui.SyntaxTokenizer

public class SyntaxTokenizer
extends java.lang.Object
A utility class that breaks up Mathematica code into four syntax classes: strings, comments, symbols, and normal (meaning everything else). This class is used by MathSessionPane to implement its syntax coloring feature, but you can also use it directly in your own programs.

To use a SyntaxTokenizer, construct one and then call its setText() method, supplying the Mathematica input you want tokenized. You then call getNextRecord() repeatedly to get SyntaxRecords, each of which tells you the type of a syntax element, its start position, and its length in characters.

This process is very fast: you can iterate through 100,000 characters of Mathematica code in a small fraction of a second.

Here is some sample code that demonstrates how to use a SyntaxTokenizer:

        String input = "some Mathematica code here";
        SyntaxTokenizer tok = new SyntaxTokenizer();
        tok.setText(input);
        while(tok.hasMoreRecords()) {
                SyntaxTokenizer.SyntaxRecord rec = tok.getNextRecord();
                System.out.println("type: " + rec.type);
                System.out.println("text: " + input.substring(rec.start, rec.start + rec.length));
        }
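
The sample above reads three fields from each record: type, start, and length. As a rough sketch of what a caller might do with them, the following self-contained example collects the text of every element of one syntax type. So that it runs without J/Link on the class path, it uses a hypothetical stand-in class, Span, with the same three fields, hypothetical type codes, and hand-written spans rather than real tokenizer output. In a real program the spans would come from the getNextRecord() loop and the type codes from SyntaxTokenizer.STRING, SyntaxTokenizer.COMMENT, and so on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: nothing in this class is part of J/Link.
class SyntaxSpanDemo {

    // Hypothetical type codes. Real code should compare rec.type against
    // SyntaxTokenizer.NORMAL, COMMENT, SYMBOL, and STRING instead.
    static final int NORMAL = 0;
    static final int COMMENT = 1;
    static final int SYMBOL = 2;
    static final int STRING = 3;

    // Stand-in for SyntaxTokenizer.SyntaxRecord: same three fields
    // (type, start, length) that the sample code above reads.
    static final class Span {
        final int type;
        final int start;
        final int length;

        Span(int type, int start, int length) {
            this.type = type;
            this.start = start;
            this.length = length;
        }
    }

    // Hand-written spans for the input f["hi"] (* note *).
    // These are assumptions for the demo, not actual tokenizer output.
    static List<Span> sampleSpans() {
        return Arrays.asList(
            new Span(SYMBOL, 0, 1),    // f
            new Span(NORMAL, 1, 1),    // [
            new Span(STRING, 2, 4),    // "hi"
            new Span(NORMAL, 6, 2),    // ] and the following space
            new Span(COMMENT, 8, 10)); // (* note *)
    }

    // Returns the text of every span of the requested type, in input order.
    static List<String> textOfType(String input, List<Span> spans, int wantedType) {
        List<String> out = new ArrayList<>();
        for (Span s : spans) {
            if (s.type == wantedType) {
                out.add(input.substring(s.start, s.start + s.length));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "f[\"hi\"] (* note *)";
        List<Span> spans = sampleSpans();
        System.out.println("strings:  " + textOfType(input, spans, STRING));
        System.out.println("comments: " + textOfType(input, spans, COMMENT));
    }
}
```

The same start and length arithmetic would apply when assigning a color to each range in a text component, which is the use case described for MathSessionPane.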
Since:
2.0
See Also:
MathSessionPane
  • Nested Class Summary

    Nested Classes 
    Modifier and Type Class Description
    class  SyntaxTokenizer.SyntaxRecord
    A simple class that encapsulates information about a syntax element.
  • Field Summary

    Fields 
    Modifier and Type Field Description
    static int COMMENT
    A syntax type that corresponds to a Mathematica comment.
    static int NORMAL
    A syntax type that consists of everything other than STRING, COMMENT, or SYMBOL.
    static int STRING
    A syntax type that corresponds to a literal string.
    static int SYMBOL
    A syntax type that corresponds to a Mathematica symbol.
  • Constructor Summary

    Constructors 
    Constructor Description
    SyntaxTokenizer()  
  • Method Summary

    Modifier and Type Method Description
    SyntaxTokenizer.SyntaxRecord getNextRecord()
    Gets the next SyntaxRecord specifying the type of the element (SYMBOL, STRING, COMMENT or NORMAL), its start position, and length.
    boolean hasMoreRecords()
    Returns true if there are more records left in the text, or false if the end of the input has been reached.
    void reset()
    Resets the state of the tokenizer so that the next call to getNextRecord() will retrieve the first record in the text.
    void setText(java.lang.String text)
    Sets the Mathematica input text to tokenize.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

  • Constructor Details

  • Method Details

    • setText

      public void setText(java.lang.String text)
      Sets the Mathematica input text to tokenize.
      Parameters:
      text - the Mathematica input to tokenize
    • reset

      public void reset()
      Resets the state of the tokenizer so that the next call to getNextRecord() will retrieve the first record in the text.
    • getNextRecord

      public SyntaxTokenizer.SyntaxRecord getNextRecord()
      Gets the next SyntaxRecord specifying the type of the element (SYMBOL, STRING, COMMENT or NORMAL), its start position, and length.
    • hasMoreRecords

      public boolean hasMoreRecords()
      Returns true if there are more records left in the text, or false if the end of the input has been reached.