Wolfram Language & System Documentation Center

WordCounts

WordCounts["string"]

gives an association whose keys are the distinct words identified in string, and whose values give the number of times those words appear in string.

WordCounts["string",n]

gives counts of the distinct n-grams consisting of runs of n words in string.

WordCounts[{"string₁","string₂",…},…]

gives the counts for each of the stringᵢ.

Details and Options

WordCounts[string,…] identifies words in string in the same way as TextWords.
In WordCounts[string,n], words that are considered part of an n-gram must appear consecutively in string, not separated by nonword characters other than whitespace.
WordCounts has the option IgnoreCase. With the setting IgnoreCase->True, all letters are in effect converted to lowercase before being counted.
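The rule for n-grams can be checked on a small input. The following is a minimal sketch with a made-up sample string. By the description above, "green" and "blue" are separated by a comma, so they should not be counted together as a 2-gram.

```wolfram
(* Made-up sample string, used only for illustration *)
s = "red green, blue yellow";

(* The words as identified by TextWords *)
TextWords[s]

(* 2-grams: only runs of words separated by whitespace should qualify *)
WordCounts[s, 2]
```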

Examples


Basic Examples (3)

Count the distinct words in a string:
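A minimal sketch of this case, using a made-up sample sentence:

```wolfram
(* Made-up sample string *)
counts = WordCounts["the cat in the hat saw the other cat"]

(* Look up the count for a single word *)
counts["the"]
```

Because the result is an association, individual counts can be read off with ordinary key lookup.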

Count the distinct 2-gram word sequences in a string:
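The same made-up sentence can serve as a sketch for the 2-gram form, where the second argument gives the length of the word runs to count:

```wolfram
(* Count runs of 2 consecutive words in a made-up sample string *)
WordCounts["the cat in the hat saw the other cat", 2]
```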

Count the distinct words in each of a list of strings:
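A short sketch with two illustrative strings. Each string in the list is counted separately:

```wolfram
(* Two illustrative strings; the result has one set of counts per string *)
WordCounts[{"to be or not to be", "that is the question"}]
```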

Scope (1)

Words can include digits and hyphens but not most punctuation:
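One way to explore this is with a made-up sentence that mixes hyphenated words, numbers and punctuation. This is only a sketch, and the exact tokenization is whatever TextWords produces for the input:

```wolfram
(* Made-up sample containing hyphens, digits, parentheses and an exclamation mark *)
WordCounts["A well-known 24-hour cafe (open since 1999) serves well-known coffee!"]
```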

Options (2)

IgnoreCase (2)

The default setting IgnoreCase->False treats uppercase and lowercase characters as distinct:
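A minimal sketch with a made-up string in which the same words appear in different cases. Under the default setting, "The" and "the" should appear as separate keys, as should "cat" and "Cat":

```wolfram
(* Default: IgnoreCase -> False *)
WordCounts["The cat chased the other Cat"]
```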

IgnoreCase->True treats words that differ only in case as the same:
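The same made-up string can be counted again with the option set, as a sketch of how the differently cased forms are merged:

```wolfram
(* Words differing only in case are counted together *)
WordCounts["The cat chased the other Cat", IgnoreCase -> True]
```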

Count n-grams regardless of case:
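The option combines with the n-gram form. A sketch with a made-up string in which the same 2-word run occurs with different capitalization:

```wolfram
(* 2-grams, ignoring case: "The cat" and "the Cat" should be treated alike *)
WordCounts["The cat saw the Cat", 2, IgnoreCase -> True]
```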

Applications (2)

Find the number of times the main characters Sherlock Holmes and John Watson are mentioned in several novels by Arthur Conan Doyle:
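The source of the novel texts is not shown here, so the following is only a sketch of the approach. The association `novels` and its short strings are hypothetical stand-ins, to be replaced with the full texts keyed by title.

```wolfram
(* Hypothetical stand-ins for the novel texts; substitute the real texts *)
novels = <|
   "Sample A" -> "Holmes smiled. Watson nodded, and Holmes went on.",
   "Sample B" -> "Watson wrote while Holmes paced."
   |>;

(* For each text, keep only the counts for the two surnames *)
mentions = Map[KeyTake[WordCounts[#], {"Holmes", "Watson"}] &, novels]
```

Counting surnames alone is a simplification: it also picks up any other character who shares the name.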

Visualize the results:
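One possible visualization is a grouped bar chart with one group per text. The counts below are made-up placeholders rather than real results, and the names `mentions`, "Sample A" and "Sample B" are hypothetical.

```wolfram
(* Made-up placeholder counts, for illustration only *)
mentions = <|
   "Sample A" -> <|"Holmes" -> 2, "Watson" -> 1|>,
   "Sample B" -> <|"Holmes" -> 1, "Watson" -> 1|>
   |>;

(* One group of bars per text, one bar per character *)
BarChart[Values /@ Values[mentions],
 ChartLabels -> {Keys[mentions], None},
 ChartLegends -> {"Holmes", "Watson"}]
```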

Retrieve Miguel de Cervantes's novel Don Quixote from ExampleData to test the empirical Zipf law:
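A sketch of the retrieval step. The entry name "DonQuixoteIEnglish" is an assumption about which item of the ExampleData "Text" collection is meant. ExampleData["Text"] lists the available names.

```wolfram
(* Assumed entry name; check ExampleData["Text"] for the available texts *)
text = ExampleData[{"Text", "DonQuixoteIEnglish"}];

(* Quick sanity check on the size of the text *)
StringLength[text]
```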

Generate the frequency table of all words in this text:
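A minimal sketch of this step, again assuming the "DonQuixoteIEnglish" entry name. The explicit sort makes the ordering by decreasing count independent of the order in which WordCounts returns its keys.

```wolfram
(* Assumed entry name for the text *)
text = ExampleData[{"Text", "DonQuixoteIEnglish"}];

(* Word counts, sorted by decreasing count *)
counts = Sort[WordCounts[text], Greater];

(* Inspect the top of the table *)
Take[counts, UpTo[10]]
```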

Zipf's law asserts that the frequency of a word plotted against its rank in the frequency table follows an approximately linear relation on a log-log scale. Test this statement on the 1,000 most frequent words:
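One way to carry out the test is a least-squares line through the points (log rank, log frequency). This is a sketch under the same assumed entry name, and the variable names are illustrative.

```wolfram
(* Assumed entry name for the text *)
text = ExampleData[{"Text", "DonQuixoteIEnglish"}];

(* Frequencies of the 1,000 most frequent words, in decreasing order *)
freqs = Take[Sort[Values[WordCounts[text]], Greater], UpTo[1000]];

(* Pairs {log rank, log frequency} *)
data = Transpose[{Log[N[Range[Length[freqs]]]], Log[N[freqs]]}];

(* Straight-line fit a + b x on the log-log scale *)
fit = Fit[data, {1, x}, x]
```

The coefficient of x in the fit is the estimated exponent of the power law.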

Compare the result with the linear relation predicted by Zipf's law. Visualize the fit together with the actual data:
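A sketch of the plot, recomputing the data and the fit so that the block runs on its own. The entry name and variable names are assumptions, as before.

```wolfram
(* Assumed entry name for the text *)
text = ExampleData[{"Text", "DonQuixoteIEnglish"}];
freqs = Take[Sort[Values[WordCounts[text]], Greater], UpTo[1000]];
data = Transpose[{Log[N[Range[Length[freqs]]]], Log[N[freqs]]}];
fit = Fit[data, {1, x}, x];

(* Data points together with the fitted line *)
Show[
 ListPlot[data],
 Plot[Evaluate[fit], {x, 0, Log[N[Length[freqs]]]}, PlotStyle -> Red],
 AxesLabel -> {"log rank", "log frequency"}]
```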

Neat Examples (1)

Find the 20 most frequently occurring words in a body of text:
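The body of text is not specified here. As a sketch, the following uses the "AliceInWonderland" entry from ExampleData, which is an assumption, and ignores case so that capitalized forms at the start of sentences are merged with the lowercase ones.

```wolfram
(* Assumed choice of text; any string works here *)
text = ExampleData[{"Text", "AliceInWonderland"}];

(* The 20 entries with the largest counts *)
TakeLargest[WordCounts[text, IgnoreCase -> True], 20]
```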

Do the same for 2-word sequences:
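The same sketch extends to 2-word sequences by passing 2 as the second argument. The choice of "AliceInWonderland" is again an assumption.

```wolfram
(* Assumed choice of text; any string works here *)
text = ExampleData[{"Text", "AliceInWonderland"}];

(* The 20 most frequent runs of 2 consecutive words *)
TakeLargest[WordCounts[text, 2, IgnoreCase -> True], 20]
```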
