StringCount

StringCount["string","sub"]

gives the number of times "sub" appears as a substring of "string".

StringCount["string",patt]

gives the number of substrings in "string" that match the general string expression patt.

StringCount["string",{patt1,patt2,…}]

counts the number of occurrences of any of the patterns patt1, patt2, and so on.

StringCount[{s1,s2,…},p]

gives the list of results for each of the strings s1, s2, and so on.

Details and Options

  • The string expression patt can contain any of the objects specified in the notes for StringExpression.
  • With the default option setting Overlaps->False, overlapping substrings are not treated as separate. With the setting Overlaps->True, StringCount counts substrings that overlap as separate.
  • With Overlaps->All, multiple substrings that match the same string expression are all counted as separate. With Overlaps->True, only the first such matching substring at a given position is counted as separate.
  • Setting the option IgnoreCase->True makes StringCount treat lowercase and uppercase letters as equivalent.
  • StringCount["string",RegularExpression["regex"]] gives the number of substrings matching the specified regular expression.
  • StringCount[BioSequence["type","seq"],patt] counts the matches of patt in the string "seq". In this case, degenerate letters in patt are interpreted as wildcard patterns based on the type of biomolecular sequence. Use Verbatim["patt"] to match degenerate letters literally.
  • The documentation for BioSequence lists the degenerate letters supported by each type of biomolecular sequence.
  • If the biomolecular sequence operated upon by StringCount is circular, wraparound matches are possible.

Examples


Basic Examples  (2)

The number of occurrences of "bb" in the string "abbaabbaa":
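A minimal sketch of the call this caption describes; the input is reconstructed from the caption rather than taken from an original cell:

```wolfram
(* Count non-overlapping occurrences of the literal substring "bb" *)
StringCount["abbaabbaa", "bb"]
(* → 2 *)
```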

Count the number of substrings of the form "axc" for different x characters:
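One way to express this, assuming an illustrative input string chosen here (it is not from the original page). The blank `_` stands for any single character:

```wolfram
(* "a", then any one character, then "c" *)
StringCount["abcadcaxc", "a" ~~ _ ~~ "c"]
(* → 3, for "abc", "adc" and "axc" *)
```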

Scope  (8)

Use string patterns:
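As a sketch, with a sample string picked for illustration, a repeated character class counts each maximal run of digits once under the default Overlaps->False:

```wolfram
(* DigitCharacter .. matches one or more consecutive digits *)
StringCount["a1b22c333", DigitCharacter ..]
(* → 3, for "1", "22" and "333" *)
```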

Regular expressions:
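The same kind of count can be written with a regular expression. The input string and the regex below are illustrative choices, not taken from the original page:

```wolfram
(* [0-9]+ matches one or more consecutive digits *)
StringCount["a1b22c333", RegularExpression["[0-9]+"]]
(* → 3 *)
```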

Mixed regular expressions and string patterns:
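A hedged sketch of mixing the two forms: a RegularExpression object can be joined to ordinary string pattern objects with `~~`. The sample input is an assumption made for illustration:

```wolfram
(* A lowercase letter given as a regex, followed by a run of digits given as a string pattern *)
StringCount["a1b22c333", RegularExpression["[a-z]"] ~~ DigitCharacter ..]
(* → 3, for "a1", "b22" and "c333" *)
```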

Count occurrences of either substring:
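For instance, with a list of alternatives (the strings here are illustrative), a match of any element adds to the total:

```wolfram
(* Occurrences of "cat" or "hat" *)
StringCount["the cat in the hat", {"cat", "hat"}]
(* → 2 *)
```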

StringCount automatically threads over lists of strings:
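A small sketch with made-up input words, showing one count per string in the list:

```wolfram
(* The count of "a" is computed separately for each string *)
StringCount[{"ability", "argument", "banana"}, "a"]
(* → {1, 1, 3} *)
```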

Count codon-length subsequences in a DNA sequence:
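A possible form of this count, assuming that "codon-length" means any three consecutive letters and using a short sequence invented for illustration. No output is shown because the result was not checked against a running kernel:

```wolfram
(* Assumed sample sequence; Repeated[_, {3}] matches exactly three characters *)
seq = BioSequence["DNA", "ATGGCATTT"];
StringCount[seq, Repeated[_, {3}]]
```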

Use a wildcard in the pattern counted in a given biomolecular sequence:
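As a sketch, assuming the wildcard meant here is a degenerate letter such as "N" (any base in the IUPAC nucleotide code) and using an invented sequence. The output is omitted because it was not verified:

```wolfram
(* In a BioSequence, the "N" in the pattern is interpreted as a wildcard for any base *)
StringCount[BioSequence["DNA", "ACGTACCT"], "ACN"]
```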

The degenerate letter "Y" acts as a wildcard only in biomolecular sequences; in an ordinary string it is matched literally:
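The contrast can be sketched with one invented sequence used both as a plain string and as a BioSequence. Only the plain-string result is stated, since the BioSequence result was not verified:

```wolfram
(* Plain string: "Y" is just the letter Y, which does not occur in the input *)
StringCount["ACGTACCT", "CY"]
(* → 0 *)

(* BioSequence: "Y" in the pattern is a degenerate letter (a pyrimidine, C or T, in the IUPAC code) *)
StringCount[BioSequence["DNA", "ACGTACCT"], "CY"]
```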

Additional wraparound matches may be found in circular biomolecular sequences:
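A hedged illustration with a four-base sequence chosen so that "CG" can only occur across the join between the last and first letters. It assumes the "CircularDNA" sequence type, and outputs are left out because they were not checked:

```wolfram
(* Linear sequence: no "CG" inside "GTAC" *)
StringCount[BioSequence["DNA", "GTAC"], "CG"]

(* Circular sequence: the final "C" is adjacent to the initial "G" *)
StringCount[BioSequence["CircularDNA", "GTAC"], "CG"]
```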

Count only literal degenerate letter occurrences using Verbatim:
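A sketch of the literal form, using an invented sequence that contains one actual "Y". The result is not shown because it was not verified against a kernel:

```wolfram
(* Verbatim prevents "Y" from being expanded into a wildcard *)
StringCount[BioSequence["DNA", "ACYTACCT"], Verbatim["Y"]]
```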

Options  (3)

IgnoreCase  (1)

The number of occurrences of "a" in "abAB":
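A minimal sketch of the case-sensitive default, reconstructed from the caption:

```wolfram
(* Only the lowercase "a" is counted by default *)
StringCount["abAB", "a"]
(* → 1 *)
```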

Ignore case:
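The corresponding call with the option set, again reconstructed from the caption:

```wolfram
(* "a" and "A" are treated as equivalent *)
StringCount["abAB", "a", IgnoreCase -> True]
(* → 2 *)
```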

Overlaps  (2)

All substrings in "the cat in the hat" starting and ending with "t":
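One way to list these substrings is StringCases with Overlaps->All. The pattern below is an assumption about how "starting and ending with t" was expressed; `__` requires at least one character between the two "t"s:

```wolfram
(* Returns the list of every matching substring, including overlapping ones *)
StringCases["the cat in the hat", "t" ~~ __ ~~ "t", Overlaps -> All]
```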

StringCount does not include overlaps by default:
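With the same assumed pattern, the default setting finds a single greedy match that spans the whole string:

```wolfram
(* Default Overlaps -> False *)
StringCount["the cat in the hat", "t" ~~ __ ~~ "t"]
(* → 1 *)
```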

This includes the overlaps:
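A sketch with the same assumed pattern: Overlaps->True counts one match per starting position, here the "t"s at positions 1, 7 and 12:

```wolfram
StringCount["the cat in the hat", "t" ~~ __ ~~ "t", Overlaps -> True]
(* → 3 *)
```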

This includes overlaps starting at the same position:
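With Overlaps->All and the same assumed pattern, every pair of a starting "t" and a later ending "t" is counted:

```wolfram
StringCount["the cat in the hat", "t" ~~ __ ~~ "t", Overlaps -> All]
(* → 6 *)
```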

Count subsequences in a circular DNA sequence:

Allow overlaps between the subsequences:
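A hedged sketch of both counts on an invented circular sequence, assuming the "CircularDNA" sequence type. Outputs are omitted because they were not verified:

```wolfram
(* Assumed sample: a short circular sequence with a repeating motif *)
seq = BioSequence["CircularDNA", "ATATAT"];

(* Default: overlapping matches are not counted separately *)
StringCount[seq, "ATA"]

(* Overlapping matches are counted separately *)
StringCount[seq, "ATA", Overlaps -> True]
```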

Applications  (3)

A 10-million-base random DNA string:

The number of sequences with adenine symmetrically placed:
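One possible reading, sketched under two assumptions: the random string is built from the four letters A, C, G, T, and "symmetrically placed" means an "A" on each side of a single base. The variable name `dna` is hypothetical, and the count varies from run to run:

```wolfram
(* Build a random string of 10^7 bases *)
dna = StringJoin[RandomChoice[{"A", "C", "G", "T"}, 10^7]];

(* Count "A", any one base, "A" (assumed interpretation) *)
StringCount[dna, "A" ~~ _ ~~ "A"]
```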

Find how many words occur in the US Constitution:

The number of occurrences of the word "president":
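A sketch of both counts, assuming the text comes from ExampleData[{"Text", "USConstitution"}] and that a word is a maximal run of word characters. IgnoreCase->True is an added assumption so that "President" is counted too. The variable name `text` is hypothetical:

```wolfram
text = ExampleData[{"Text", "USConstitution"}];

(* Number of words, taken as runs of letters and digits *)
StringCount[text, WordCharacter ..]

(* Occurrences of "president" in any capitalization, including inside longer words *)
StringCount[text, "president", IgnoreCase -> True]
```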

All strings of length 4 over a two-character alphabet that overlap themselves:
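A possible formulation, assuming that a string "overlaps itself" when it occurs inside its own doubling somewhere other than at the two trivial positions. The alphabet {"a", "b"} is an illustrative choice:

```wolfram
(* s always occurs twice in s <> s; a third overlapping occurrence signals self-overlap *)
Select[StringJoin /@ Tuples[{"a", "b"}, 4],
  StringCount[# <> #, #, Overlaps -> True] > 2 &]
(* → {"aaaa", "abab", "baba", "bbbb"} *)
```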

Properties & Relations  (1)

StringCount gives the number of matching substrings:
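For example, with an illustrative input string:

```wolfram
StringCount["a1b22c333", DigitCharacter ..]
(* → 3 *)
```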

The same number is the length of the list of matching substrings obtained from StringCases:
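A short sketch with an illustrative input, taking the length of the list that StringCases returns:

```wolfram
(* StringCases returns {"1", "22", "333"} here, so the length is 3 *)
Length[StringCases["a1b22c333", DigitCharacter ..]]
(* → 3 *)
```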

Wolfram Research (2004), StringCount, Wolfram Language function, https://reference.wolfram.com/language/ref/StringCount.html (updated 2020).
