StringCases
StringCases["string",patt]
gives a list of the substrings in "string" that match the string expression patt.
StringCases["string",lhs:>rhs]
gives a list of the values of rhs corresponding to the substrings that match the string expression lhs.
StringCases["string",p,n]
includes only the first n substrings that match.
StringCases["string",{p1,p2,…}]
gives substrings that match any of the patterns p1, p2, ….
StringCases[{s1,s2,…},p]
gives the list of results for each of the strings s1, s2, ….
StringCases[patt]
represents an operator form of StringCases that can be applied to an expression.
Details and Options
- String expressions can contain any of the objects specified in the notes for StringExpression.
- With the default option setting Overlaps->False, StringCases includes only substrings that do not overlap. With Overlaps->True, it includes substrings that overlap.
- With Overlaps->All, multiple substrings that match the same string expression are all included. With Overlaps->True, only the first such matching substring at a given position is included.
- Setting the option IgnoreCase->True makes StringCases treat lowercase and uppercase letters as equivalent.
- StringCases["string",RegularExpression["regex"]] gives substrings matching the specified regular expression.
- StringCases[s,lhs:>rhs] evaluates rhs only when the pattern is found.
- StringCases[patt][expr] is equivalent to StringCases[expr, patt].
- StringCases[BioSequence["type","seq"],patt,…] finds cases of patt in the string "seq" yielding a list of biomolecular sequences. In this case, degenerate letters in patt are interpreted as wildcard patterns based on the type of biomolecular sequence. Use Verbatim["patt"] to match degenerate letters literally.
- The documentation for BioSequence lists the degenerate letters supported by each type of biomolecular sequence.
- If the biomolecular sequence operated upon by StringCases is circular, wraparound matches are possible.
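The option settings described in the notes above can be illustrated with a minimal sketch. The input strings and variable names here are arbitrary choices for illustration and do not come from the original page.

```wolfram
(* Default Overlaps -> False: matches may not share characters *)
noOverlap = StringCases["aaaa", "aa"]
(* → {"aa", "aa"} *)

(* Overlaps -> True: at most one match per starting position, overlaps allowed *)
withOverlap = StringCases["aaaa", "aa", Overlaps -> True]
(* → {"aa", "aa", "aa"} *)

(* Overlaps -> All: every match at every starting position *)
allMatches = StringCases["abcd", __, Overlaps -> All];
Length[allMatches]
(* → 10, one entry per nonempty substring of "abcd" *)

(* IgnoreCase -> True: letter case is disregarded when matching *)
anyCase = StringCases["Hello hello HELLO", "hello", IgnoreCase -> True]
(* → {"Hello", "hello", "HELLO"} *)
```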
Background & Context
- StringCases["string",{patt1,…,pattk}] returns a list whose elements are the substrings of "string" matching any of the patterns patt1, …, pattk. The form StringCases["string",patt] is equivalent to StringCases["string",{patt}]. StringCases["string",lhs:>rhs] gives a list of the values of rhs corresponding to the substrings of "string" matching lhs; StringCases["string",patt,n] includes only the first n substrings that match; StringCases["string",RegularExpression["regex"]] gives substrings matching the regular expression regex; and StringCases[patt] is an operator form, so StringCases[patt]["string"] is equivalent to StringCases["string",patt]. StringCases automatically threads over lists, with StringCases[{"string1",…,"stringn"},…] returning the list of results for each of the strings.
- The string patterns may contain any valid StringExpression objects, such as AnyOrder, FixedOrder, Condition, Whitespace, NumberString and DatePattern.
- The default behavior of StringCases is equivalent to StringCases[…,IgnoreCase->False,Overlaps->False]: lowercase and uppercase letters are treated as distinct, and overlapping substrings are omitted. Specifying IgnoreCase->True makes StringCases treat lowercase and uppercase letters as equivalent. With Overlaps->True, overlapping substrings are included, but only the first match at a given position is returned. With Overlaps->All, every substring that matches the string expression is returned, including multiple matches at the same position.
- StringCases is related to a number of other symbols. StringCount gives the number of substrings of a given string that match a particular pattern, and StringPosition gives the positions of the matching substrings. StringCount["string",patt] is equivalent to Length[StringCases["string",patt]]. StringTake["string",StringPosition["string",patt]] is equivalent to StringCases["string",patt] when the matches do not overlap; note that StringPosition uses Overlaps->True by default, whereas StringCases uses Overlaps->False. StringContainsQ and StringFreeQ test whether a given string contains or fails to contain a substring matching a particular pattern, respectively. In particular, StringContainsQ["string",patt] returns False, and StringFreeQ["string",patt] returns True, if and only if StringCases["string",patt] returns the empty list {}. StringCases is the string analog of the qualitatively similar functions TextCases and Cases and is also related to TextWords, StringReplace, StringReplaceList and StringReplacePart.
Examples
Basic Examples (3)
Find the substrings matching a pattern:
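A minimal illustration, using an arbitrary input string chosen for this sketch. The pattern "a" ~~ _ ~~ "c" matches an "a", then any single character, then a "c".

```wolfram
(* Three-character substrings that start with "a" and end with "c" *)
matches = StringCases["abcadcacb", "a" ~~ _ ~~ "c"]
(* → {"abc", "adc"} *)
```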
Return only the named wildcard character in each substring:
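A small sketch of this form, with an illustrative input string and the pattern name x chosen here for demonstration. Naming the wildcard x_ lets the right-hand side of the rule return just that character.

```wolfram
(* x_ names the middle character; the rule returns x instead of the whole match *)
middles = StringCases["abcadcacb", "a" ~~ x_ ~~ "c" -> x]
(* → {"b", "d"} *)
```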
Use the operator form of StringCases:
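As a sketch (the input string and the name findAxC are illustrative assumptions), the operator form can be stored and applied later, or mapped over several strings.

```wolfram
(* Build the operator once, then apply it to a string *)
findAxC = StringCases["a" ~~ _ ~~ "c"];
opResult = findAxC["abcadcacb"]
(* → {"abc", "adc"} *)
```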
Scope (11)
Use pattern matching for dates:
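One possible sketch, assuming year/month/day dates written with "/" separators. The input string and the variable name dates are made up for illustration, and the output is deliberately not shown here.

```wolfram
(* DatePattern takes a list of date elements in the order they appear in the text *)
dates = StringCases["Meetings on 2021/03/14 and 2022/11/05.",
  DatePattern[{"Year", "Month", "Day"}]]
```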
Mixed regular expressions and string patterns:
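A brief sketch with an illustrative input string: a RegularExpression object can be joined to ordinary string patterns with ~~.

```wolfram
(* One lowercase letter (regex) followed by one or more digits (string pattern) *)
mixed = StringCases["a1b22c333", RegularExpression["[a-z]"] ~~ DigitCharacter ..]
(* → {"a1", "b22", "c333"} *)
```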
Rules to extract values corresponding to matching substrings:
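A minimal sketch, with an input string and pattern names (l, d) chosen for illustration. The delayed rule builds a value from the named parts of each match.

```wolfram
(* Pair each letter with the integer formed by the digits that follow it *)
pairs = StringCases["a1b22c333",
  l : LetterCharacter ~~ d : (DigitCharacter ..) :> {l, FromDigits[d]}]
(* → {{"a", 1}, {"b", 22}, {"c", 333}} *)
```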
Include only the first two substrings that match:
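For instance, with an illustrative input string, a third argument of 2 stops after two matches.

```wolfram
(* Three digit runs are present, but only the first two are returned *)
firstTwo = StringCases["a1b22c333", DigitCharacter .., 2]
(* → {"1", "22"} *)
```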
Find occurrences of either substring:
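A short sketch with a made-up sentence: giving a list of patterns returns matches of any of them, in the order they occur in the string.

```wolfram
(* Matches of either "cat" or "hat" *)
either = StringCases["the cat in the hat", {"cat", "hat"}]
(* → {"cat", "hat"} *)
```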
StringCases automatically threads over lists of strings:
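As an illustration (the input strings are arbitrary), a list of strings yields one list of matches per string.

```wolfram
(* One result list per input string *)
perString = StringCases[{"a1", "b22c3"}, DigitCharacter ..]
(* → {{"1"}, {"22", "3"}} *)
```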
Find codon-length subsequences in a DNA sequence:
Use a wildcard in the pattern found in a given biomolecular sequence:
The degenerate letter "Y" acts as a wildcard only in biomolecular sequences:
Additional wraparound matches may be found in circular biomolecular sequences:
Match only literal degenerate letter occurrences using Verbatim:
Options (3)
Applications (3)
Properties & Relations (2)
StringCount gives the number of matching substrings:
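A minimal sketch with an illustrative input string, comparing StringCount with the number of matches returned by StringCases.

```wolfram
(* StringCount returns the number of non-overlapping matches *)
count = StringCount["abcadcacb", "a" ~~ _ ~~ "c"]
(* → 2 *)

(* The same number is the length of the list StringCases returns *)
sameCount = Length[StringCases["abcadcacb", "a" ~~ _ ~~ "c"]]
(* → 2 *)
```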
The length of matching substrings:
Use StringPosition to get the position of matching substrings:
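A sketch with an illustrative input string whose matches do not overlap. StringPosition returns {start, end} pairs, and StringTake with those pairs recovers the substrings.

```wolfram
(* Start and end positions of each match *)
positions = StringPosition["abcadcacb", "a" ~~ _ ~~ "c"]
(* → {{1, 3}, {4, 6}} *)

(* Taking those spans gives back the matching substrings *)
taken = StringTake["abcadcacb", positions]
(* → {"abc", "adc"} *)
```

Because StringPosition includes overlapping matches by default, the two functions can disagree on strings such as "aaaa" with the pattern "aa" unless Overlaps->False is passed to StringPosition.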
Possible Issues (1)
Text
Wolfram Research (2004), StringCases, Wolfram Language function, https://reference.wolfram.com/language/ref/StringCases.html (updated 2020).