StringCount
StringCount["string","sub"]
gives a count of the number of times "sub" appears as a substring of "string".
StringCount["string",patt]
gives the number of substrings in "string" that match the general string expression patt.
StringCount["string",{patt1,patt2,…}]
counts the number of occurrences of any of the patt_i.
StringCount[{s1,s2,…},p]
gives the list of results for each of the s_i.
Details and Options
- The string expression patt can contain any of the objects specified in the notes for StringExpression.
- With the default option setting Overlaps->False, overlapping substrings are not treated as separate. With the setting Overlaps->True, StringCount counts substrings that overlap as separate.
- With Overlaps->All, multiple substrings that match the same string expression are all counted as separate. With Overlaps->True, only the first such matching substring at a given position is counted as separate.
- Setting the option IgnoreCase->True makes StringCount treat lowercase and uppercase letters as equivalent.
- StringCount["string",RegularExpression["regex"]] gives the number of substrings matching the specified regular expression.
- StringCount[BioSequence["type","seq"],patt] counts the matches of patt in the string "seq". In this case, degenerate letters in patt are interpreted as wildcard patterns based on the type of biomolecular sequence. Use Verbatim["patt"] to match degenerate letters literally.
- The documentation for BioSequence lists the degenerate letters supported by each type of biomolecular sequence.
- If the biomolecular sequence operated upon by StringCount is circular, wraparound matches are possible.
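The IgnoreCase option and the RegularExpression form described in the notes above can be sketched as follows. The input strings are illustrative assumptions chosen for this sketch, not examples taken from this page.

```wolfram
(* Case-sensitive by default: only the four lowercase "a" characters match *)
StringCount["Abracadabra", "a"]
(* 4 *)

(* IgnoreCase -> True also counts the leading "A" *)
StringCount["Abracadabra", "a", IgnoreCase -> True]
(* 5 *)

(* A regular expression for runs of digits: "1", "22", "333" *)
StringCount["a1b22c333", RegularExpression["[0-9]+"]]
(* 3 *)
```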
Examples
Basic Examples (2)
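A minimal sketch of the four calling forms listed at the top of this page. The input strings are illustrative assumptions; the page's original example cells are not preserved here.

```wolfram
(* Literal substring: "bb" occurs twice *)
StringCount["abbaabbaa", "bb"]
(* 2 *)

(* General string pattern: maximal runs of digits "1", "22", "333" *)
StringCount["a1b22c333", DigitCharacter ..]
(* 3 *)

(* List of patterns: one "b" plus two "n" characters *)
StringCount["banana", {"b", "n"}]
(* 3 *)

(* List of strings: one count per string *)
StringCount[{"banana", "bandana"}, "an"]
(* {2, 2} *)
```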
Scope (8)
Mixed regular expressions and string patterns:
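One possible illustration, assuming an input string made up for this sketch: a RegularExpression object for a single lowercase letter is combined with the symbolic pattern DigitCharacter.

```wolfram
(* A lowercase letter (regex) followed by one or more digits (string pattern);
   the matches are "a1", "b22", and "c333" *)
StringCount["a1b22c333", RegularExpression["[a-z]"] ~~ (DigitCharacter ..)]
(* 3 *)
```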
Count occurrences of either of two substrings:
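A short sketch with an assumed input string; a match of any pattern in the list contributes to the total.

```wolfram
(* "ab" occurs twice and "cd" three times *)
StringCount["abcdabcdcd", {"ab", "cd"}]
(* 5 *)
```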
StringCount automatically threads over lists of strings:
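A sketch of the threading behavior, using an illustrative list of words that is not from this page:

```wolfram
(* Each string in the list is counted separately *)
StringCount[{"banana", "bandana", "cabana"}, "an"]
(* {2, 2, 1} *)
```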
Count codon-length subsequences in a DNA sequence:
Use a wildcard in the pattern counted in a given biomolecular sequence:
In an ordinary string, "Y" is matched literally; it acts as a degenerate-letter wildcard only in biomolecular sequences:
Additional wraparound matches may be found in circular biomolecular sequences:
Count only literal degenerate letter occurrences using Verbatim:
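The biomolecular sequence cases above might look like the following sketch. The sequences are made up for illustration, and outputs are omitted because they have not been verified here. The sketch assumes the sequence types "DNA" and "CircularDNA" and the IUPAC reading of "Y" as a pyrimidine (C or T); see the BioSequence documentation for the supported types and degenerate letters.

```wolfram
dna = BioSequence["DNA", "ATGCATTACGAT"];  (* illustrative sequence *)

(* Codon-length (three-letter) subsequences *)
StringCount[dna, Repeated[_, {3}]]

(* "Y" acts as a degenerate letter when counting in a DNA sequence *)
StringCount[dna, "AY"]

(* In a plain string, "Y" is only the literal character "Y" *)
StringCount["ATGCATTACGAT", "AY"]

(* In a circular sequence, a match may wrap from the end to the start *)
StringCount[BioSequence["CircularDNA", "TGCA"], "AT"]

(* Verbatim restricts the count to literal occurrences of the letter "Y" *)
StringCount[BioSequence["DNA", "ATGCAYTAC"], Verbatim["Y"]]
```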
Options (3)
Overlaps (2)
All substrings in "the cat in the hat" starting and ending with "t":
StringCount does not include overlaps by default:
This includes overlaps starting at the same position:
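The three Overlaps settings can be compared on the same string. In this sketch the pattern "t" ~~ __ ~~ "t" is an assumed way of writing "starts and ends with t"; the letter "t" sits at positions 1, 7, 12, and 18 of the string.

```wolfram
s = "the cat in the hat";
patt = "t" ~~ __ ~~ "t";

(* Every substring that starts and ends with "t" *)
StringCases[s, patt, Overlaps -> All]

(* Default Overlaps -> False: the single longest match spans the whole string *)
StringCount[s, patt]
(* 1 *)

(* Overlaps -> True: one match per starting position (positions 1, 7, 12) *)
StringCount[s, patt, Overlaps -> True]
(* 3 *)

(* Overlaps -> All: every start/end pair of "t" characters is counted *)
StringCount[s, patt, Overlaps -> All]
(* 6 *)
```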
Applications (3)
Properties & Relations (1)
StringCount gives the number of matching substrings:
This equals the length of the list of matching substrings returned by StringCases:
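A minimal check of this relation, using an illustrative string that is not from this page:

```wolfram
s = "abracadabra";

(* Number of matches *)
StringCount[s, "a"]
(* 5 *)

(* The matches themselves; the list has the same length *)
Length[StringCases[s, "a"]]
(* 5 *)

StringCount[s, "a"] === Length[StringCases[s, "a"]]
(* True *)
```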
Text
Wolfram Research (2004), StringCount, Wolfram Language function, https://reference.wolfram.com/language/ref/StringCount.html (updated 2020).