RegularExpression

RegularExpression["regex"]
represents the generalized regular expression specified by the string "regex".

Details

  • RegularExpression can be used to represent classes of strings in functions like StringMatchQ, StringReplace, StringCases, and StringSplit.
  • RegularExpression supports standard regular expression syntax of the kind used in typical string manipulation languages.
  • The following basic elements can be used in regular expression strings:
    c              the literal character c
    .              any character except newline
    [c1c2...]      any of the characters c1, c2, ...
    [c1-c2]        any character in the range c1 to c2
    [^c1c2...]     any character except c1, c2, ...
    p*             p repeated zero or more times
    p+             p repeated one or more times
    p?             zero or one occurrence of p
    p{m,n}         p repeated between m and n times
    p*?, p+?, p??  the shortest consistent strings that match
    (p1p2...)      strings matching the sequence p1, p2, ...
    p1|p2          strings matching p1 or p2
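
As a minimal sketch of a few of these elements (the test strings are illustrative assumptions, not taken from the reference page):

```wolfram
(* [c1-c2] combined with p+ : runs of one or more characters in the range a-c *)
StringCases["abcxxbcayyab", RegularExpression["[a-c]+"]]
(* {"abc", "bca", "ab"} *)

(* p+ is greedy and takes the longest consistent match *)
StringCases["<a><b>", RegularExpression["<.+>"]]
(* {"<a><b>"} *)

(* p+? takes the shortest consistent match *)
StringCases["<a><b>", RegularExpression["<.+?>"]]
(* {"<a>", "<b>"} *)

(* p1|p2 : alternation *)
StringCases["cat dog bird", RegularExpression["cat|bird"]]
(* {"cat", "bird"} *)
```
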
  • The following represent classes of characters:
    \\d            digit 0-9
    \\D            nondigit
    \\s            space, newline, tab, or other whitespace character
    \\S            non-whitespace character
    \\w            word character (letter, digit, or _)
    \\W            nonword character
    [[:class:]]    characters in a named class
    [^[:class:]]   characters not in a named class
  • The following named classes can be used: alnum, alpha, ascii, blank, cntrl, digit, graph, lower, print, punct, space, upper, word, xdigit.
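
A short sketch of the character classes in use, with made-up input strings. Note that each backslash is doubled inside a Wolfram Language string, so "\\d" passes \d to the regular expression engine:

```wolfram
(* \\d+ : runs of digits *)
StringCases["order 66, aisle 7", RegularExpression["\\d+"]]
(* {"66", "7"} *)

(* \\s+ : collapse any run of whitespace to a single space *)
StringReplace["a   b \t c", RegularExpression["\\s+"] -> " "]
(* "a b c" *)

(* [[:alpha:]] : a named class *)
StringCases["a1-b2", RegularExpression["[[:alpha:]]"]]
(* {"a", "b"} *)

(* [^[:alpha:]] : the complement of a named class *)
StringCases["a1-b2", RegularExpression["[^[:alpha:]]"]]
(* {"1", "-", "2"} *)
```
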
  • The following represent positions in strings:
    ^              the beginning of the string (or line)
    $              the end of the string (or line)
    \\b            word boundary
    \\B            anywhere except a word boundary
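
The position elements can be sketched as follows; the input strings are assumptions chosen only to show where each anchor matches:

```wolfram
(* \\b : match "cat" only as a whole word *)
StringCases["cat concat cats", RegularExpression["\\bcat\\b"]]
(* {"cat"} *)

(* \\B : match "cat" only where it does not start at a word boundary *)
StringCases["cat concat", RegularExpression["\\Bcat"]]
(* {"cat"}, taken from inside "concat" *)

(* ^ and $ : anchor to the beginning and end of the string *)
StringReplace["hello hello", RegularExpression["^hello"] -> "Hello"]
(* "Hello hello" *)
StringReplace["hello hello", RegularExpression["hello$"] -> "world"]
(* "hello world" *)
```
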
  • The following set options for all regular expression elements that follow them:
    (?i)           treat uppercase and lowercase as equivalent (ignore case)
    (?m)           make ^ and $ match start and end of lines (multiline mode)
    (?s)           allow . to match newline
    (?-c)          unset options
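
A hedged sketch of how these option settings change matching, using illustrative strings that do not come from the reference page:

```wolfram
(* (?i) : ignore case for everything that follows *)
StringCases["Cat cAT dog", RegularExpression["(?i)cat"]]
(* {"Cat", "cAT"} *)

(* (?m) : ^ and $ match at the start and end of each line *)
StringCases["one\ntwo", RegularExpression["(?m)^\\w+$"]]
(* {"one", "two"} *)

(* (?s) : without it, . does not match a newline *)
StringMatchQ["a\nb", RegularExpression["a.b"]]
(* False *)
StringMatchQ["a\nb", RegularExpression["(?s)a.b"]]
(* True *)

(* (?-i) : unset ignore-case for the remainder of the expression *)
StringCases["AB Ab", RegularExpression["(?i)a(?-i)b"]]
(* {"Ab"} *)
```
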
  • \\., \\[, etc. represent literal characters ., [, etc.
  • Analogs of named Wolfram Language patterns such as x:expr can be set up in regular expression strings using (regex).
  • Within a regular expression string, \\n represents the substring matched by the nth parenthesized regular expression object (regex).
  • For the purpose of functions such as StringReplace and StringCases, any "$n" appearing in the right-hand side of a rule RegularExpression["regex"]->rhs is taken to correspond to the substring matched by the nth parenthesized regular expression object in regex. "$0" represents the whole matched string.
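
Parenthesized groups, back-references, and "$n" substitution can be sketched as below. The dates and words are invented inputs for illustration only:

```wolfram
(* \\1 refers back to the first parenthesized group: find doubled characters *)
StringCases["hello moon, see the book", RegularExpression["(\\w)\\1"]]
(* {"ll", "oo", "ee", "oo"} *)

(* "$1", "$2", "$3" on the right-hand side pick out the matched groups *)
StringReplace["2004-11-02",
  RegularExpression["(\\d+)-(\\d+)-(\\d+)"] -> "$3/$2/$1"]
(* "02/11/2004" *)

(* "$0" stands for the whole matched string *)
StringReplace["a1b22", RegularExpression["\\d+"] -> "[$0]"]
(* "a[1]b[22]" *)
```
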

Examples

Basic Examples  (2)

Find words involving the characters a, b, c, d, e:

In[1]:=
Out[1]=

Equivalent form using string patterns:

In[2]:=
Out[2]=
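
As an illustrative sketch of this example (the test string is an assumption, not the page's original input), the regular expression form and one possible string pattern form might look like:

```wolfram
(* Regular expression form: whole words made up only of the characters a-e *)
StringCases["abed ace cab dog", RegularExpression["\\b[a-e]+\\b"]]
(* {"abed", "ace", "cab"} *)

(* A string pattern form with the same intent *)
StringCases["abed ace cab dog",
  WordBoundary ~~ (("a" | "b" | "c" | "d" | "e") ..) ~~ WordBoundary]
(* {"abed", "ace", "cab"} *)
```
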

Decide whether the string consists of words and whitespace:

In[1]:=
Out[1]=

Equivalent form using string patterns:

In[2]:=
Out[2]=
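
A hedged sketch of this test, again with an assumed input string. The string pattern version is close but not identical: \\w also matches an underscore, while WordCharacter covers only letters and digits.

```wolfram
(* Regular expression form: the whole string is word characters and whitespace *)
StringMatchQ["this is a test", RegularExpression["(\\w|\\s)+"]]
(* True *)

(* A string pattern form with the same intent *)
StringMatchQ["this is a test", (WordCharacter | WhitespaceCharacter) ..]
(* True *)

(* Punctuation makes the match fail *)
StringMatchQ["this is a test!", RegularExpression["(\\w|\\s)+"]]
(* False *)
```
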
Introduced in 2004 (5.1)