This is documentation for Mathematica 6, which was
based on an earlier version of the Wolfram Language.

RegularExpression

RegularExpression["regex"]
represents the generalized regular expression specified by the string "regex".
  • RegularExpression supports standard regular expression syntax, of the kind used in typical string manipulation languages.
  • The following basic elements can be used in regular expression strings:
    c                the literal character c
    .                any character except newline
    [c1c2...]        any of the characters ci
    [c1-c2]          any character in the range c1-c2
    [^c1c2...]       any character except the ci
    p*               p repeated zero or more times
    p+               p repeated one or more times
    p?               zero or one occurrence of p
    p{m,n}           p repeated between m and n times
    p*?, p+?, p??    the shortest consistent strings that match
    (p1p2...)        strings matching the sequence p1, p2, ...
    p1|p2            strings matching p1 or p2
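The basic elements above can be tried out with StringCases and StringMatchQ. The following is a minimal sketch; the sample strings are made up for illustration, and the results in the comments are what these patterns should give.

```mathematica
(* character range with + : runs of digits *)
StringCases["abc123def45", RegularExpression["[0-9]+"]]
(* {"123", "45"} *)

(* p? : the u is optional *)
StringMatchQ["color", RegularExpression["colou?r"]]
(* True *)

(* p* is greedy and takes the longest match; p*? takes the shortest *)
StringCases["aaa<b><c>", RegularExpression["<.*>"]]
(* {"<b><c>"} *)
StringCases["aaa<b><c>", RegularExpression["<.*?>"]]
(* {"<b>", "<c>"} *)

(* p1|p2 : alternatives *)
StringCases["cat dog bird", RegularExpression["cat|bird"]]
(* {"cat", "bird"} *)
```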
  • The following represent classes of characters:
    \\d              digit 0-9
    \\D              nondigit
    \\s              space, newline, tab or other whitespace character
    \\S              nonwhitespace character
    \\w              word character (letter, digit or _)
    \\W              nonword character
    [[:class:]]      characters in a named class
    [^[:class:]]     characters not in a named class
  • The following named classes can be used: alnum, alpha, ascii, blank, cntrl, digit, graph, lower, print, punct, space, upper, word, xdigit.
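As a rough illustration of the character classes, the sketch below uses StringCases and StringSplit on made-up input. Note that the backslash is doubled because the regular expression is written inside a Mathematica string.

```mathematica
(* \\d+ : runs of digits *)
StringCases["Order 66, item 7", RegularExpression["\\d+"]]
(* {"66", "7"} *)

(* \\s+ : split on any run of whitespace (spaces, tab, newline) *)
StringSplit["a  b\tc\nd", RegularExpression["\\s+"]]
(* {"a", "b", "c", "d"} *)

(* \\w+ : runs of letters, digits and underscores *)
StringCases["x=1, y_2=30", RegularExpression["\\w+"]]
(* {"x", "1", "y_2", "30"} *)

(* named class: uppercase letters *)
StringCases["Hello World", RegularExpression["[[:upper:]]"]]
(* {"H", "W"} *)
```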
  • The following represent positions in strings:
    ^                the beginning of the string (or line)
    $                the end of the string (or line)
    \\b              word boundary
    \\B              anywhere except a word boundary
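The position elements match locations rather than characters. A small sketch on illustrative strings:

```mathematica
(* \\b : match cat only as a whole word, not inside concat *)
StringCases["cat concat cat", RegularExpression["\\bcat\\b"]]
(* {"cat", "cat"} *)

(* \\B : match cat only where it does not start at a word boundary *)
StringReplace["cat concat", RegularExpression["\\Bcat"] -> "CAT"]
(* "cat conCAT" *)

(* ^ and $ : anchor to the start and end of the string *)
StringCases["cat concat", RegularExpression["^cat"]]
(* {"cat"} *)
StringCases["report.txt", RegularExpression["\\.txt$"]]
(* {".txt"} *)
```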
  • The following set options for all regular expression elements that follow them:
    (?i)             treat upper and lower case as equivalent (ignore case)
    (?m)             make ^ and $ match start and end of lines (multiline mode)
    (?s)             allow . to match newline
    (?-c)            unset options
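The option settings can be sketched as follows, again on made-up strings. In the last example, c in (?-c) is taken to be i, so that (?-i) switches case-insensitive matching back off for the rest of the expression.

```mathematica
(* (?i) : ignore case *)
StringCases["Cat cAT dog", RegularExpression["(?i)cat"]]
(* {"Cat", "cAT"} *)

(* (?m) : ^ matches at the start of every line, not just the string *)
StringCases["one\ntwo\nthree", RegularExpression["^t\\w+"]]
(* {} *)
StringCases["one\ntwo\nthree", RegularExpression["(?m)^t\\w+"]]
(* {"two", "three"} *)

(* (?s) : let . match a newline *)
StringMatchQ["a\nb", RegularExpression["a.b"]]
(* False *)
StringMatchQ["a\nb", RegularExpression["(?s)a.b"]]
(* True *)

(* (?-i) : a is matched ignoring case, b must be lowercase *)
StringCases["ab AB aB Ab", RegularExpression["(?i)a(?-i)b"]]
(* {"ab", "Ab"} *)
```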
  • \\., \\[, etc. represent literal characters ., [, etc.
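To illustrate the difference between an escaped and an unescaped special character, here is a short sketch with an invented input string. Without the escape, . matches any character; with it, only a literal period matches.

```mathematica
(* unescaped . matches any character *)
StringCases["v1.2 and v1x2", RegularExpression["v1.2"]]
(* {"v1.2", "v1x2"} *)

(* \\. matches only a literal period *)
StringCases["v1.2 and v1x2", RegularExpression["v1\\.2"]]
(* {"v1.2"} *)
```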
  • Analogs of named Mathematica patterns such as x:expr can be set up in regular expression strings using (regex).
  • Within a regular expression string, \\n represents the substring matched by the nth parenthesized regular expression object (regex).
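A back-reference of this kind can be used, for instance, to find doubled letters. This is only a sketch on a made-up string: (\\w) captures one word character and \\1 requires the same character to follow immediately.

```mathematica
(* (\\w)\\1 : any word character followed by itself *)
StringCases["hello book keeper", RegularExpression["(\\w)\\1"]]
(* {"ll", "oo", "ee"} *)
```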
  • For the purpose of functions such as StringReplace and StringCases, any $n appearing in the right-hand side of a rule RegularExpression["regex"]->rhs is taken to correspond to the substring matched by the nth parenthesized regular expression object in regex. $0 represents the whole matched string.
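The $n substitution can be sketched with string right-hand sides as follows. The date and key/value strings are invented for illustration.

```mathematica
(* reorder the three captured groups of a date *)
StringReplace["2024-01-15",
  RegularExpression["(\\d+)-(\\d+)-(\\d+)"] -> "$3/$2/$1"]
(* "15/01/2024" *)

(* $0 stands for the whole match *)
StringReplace["abc", RegularExpression["b"] -> "[$0]"]
(* "a[b]c" *)

(* the same substitution works in StringCases *)
StringCases["key=value; a=b",
  RegularExpression["(\\w+)=(\\w+)"] -> "$2 <- $1"]
(* {"value <- key", "b <- a"} *)
```

Here the parenthesized groups play the role that named patterns such as x:expr play in ordinary Mathematica string patterns.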
New in 5.1