TextWords["string"] gives a list of the runs of characters identified as words in string.
TextWords["string", n] gives the first n words in string.
- TextWords drops characters in string that are not identified as being part of a word.
- TextWords[ContentObject[…]] gives words from the plain text contents of the ContentObject.
Examples
Basic Examples (3)
Segment a string into a list of words:
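A minimal sketch of the basic call. The sample sentence is an assumption chosen for illustration and does not come from the original page:

```wolfram
(* Illustrative input string; any plain-text string can be used here *)
words = TextWords["The quick brown fox jumps over the lazy dog."]
```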
TextWords separates words by punctuation as well as whitespace:
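One way to see this is to pass text in which words are followed directly by punctuation. The input below is a made-up example, not the one from the original page:

```wolfram
(* Hypothetical input mixing commas, a semicolon, question and exclamation marks *)
mixed = TextWords["Stop! Who goes there? A friend, I hope; at least for now."]
```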
Get the first 10 words in a block of text:
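A sketch of the two-argument form. It assumes the text is taken from ExampleData (which may need to download the data on first use); any other string would work equally well:

```wolfram
(* Assumed source text: the "AliceInWonderland" entry of ExampleData *)
text = ExampleData[{"Text", "AliceInWonderland"}];

(* The second argument limits the result to the first 10 words *)
first10 = TextWords[text, 10]
```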
TextWords preserves hyphenation:
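A short sketch for trying this out. The sentence is an illustrative assumption containing several hyphenated compounds:

```wolfram
(* Hypothetical input with hyphenated compounds *)
hyphenated = TextWords["A well-known, state-of-the-art method for part-of-speech tagging."]
```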
Titles, currencies and other syntactic units are segmented as separate words:
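The following sketch uses an invented sentence containing a title and a currency amount, so the segmentation of such units can be inspected directly:

```wolfram
(* Hypothetical input with an abbreviated title and a currency amount *)
units = TextWords["Dr. Smith paid $3.50 for the book."]
```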
Get a list of words from a ContentObject:
Make a WordCloud of words from a poem:
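A minimal sketch, assuming the poem is supplied as an inline string. The stanza below (from William Blake's "The Tyger") is used only as sample input and is not necessarily the poem shown on the original page:

```wolfram
(* Sample poem text, supplied inline for illustration *)
poem = "Tyger Tyger, burning bright, In the forests of the night; What immortal hand or eye, Could frame thy fearful symmetry?";

(* WordCloud accepts a list of strings and sizes each word by its frequency *)
cloud = WordCloud[TextWords[poem]]
```

Wrapping the word list in DeleteStopwords before passing it to WordCloud is one option if common words such as "the" should not dominate the cloud.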
Properties & Relations (2)
TextWords is equivalent to TextCases[…,"Word"]:
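The equivalence can be checked on any string by comparing the two results. The sentence below is an illustrative assumption; if the stated equivalence holds for it, the final comparison should evaluate to True:

```wolfram
(* Illustrative input string *)
text = "Time flies like an arrow; fruit flies like a banana.";

viaTextWords = TextWords[text];
viaTextCases = TextCases[text, "Word"];

(* Structural comparison of the two word lists *)
viaTextWords === viaTextCases
```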
TextStructure splits texts into the same words:
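A sketch for comparing the two functions side by side on the same made-up sentence. TextStructure returns a nested linguistic structure rather than a flat list, so the comparison here is by inspection of the words at its leaves:

```wolfram
(* Illustrative input string *)
sentence = "The cat sat on the mat.";

(* Flat list of words *)
flatWords = TextWords[sentence]

(* Nested structure whose leaves are the words of the sentence *)
structure = TextStructure[sentence]
```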
Possible Issues (1)
Words returned by TextWords are identified structurally, and may not be dictionary words:
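One way to illustrate this is to pair each returned word with the result of DictionaryWordQ. The tokens "blorft" and "gleeped" are invented for this sketch and are assumed not to be dictionary words:

```wolfram
(* "blorft" and "gleeped" are made-up tokens used only for illustration *)
tokens = TextWords["The blorft gleeped across the lawn."];

(* Pair each token with whether it is recognized as a dictionary word *)
checked = {#, DictionaryWordQ[#]} & /@ tokens
```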
Wolfram Research (2015), TextWords, Wolfram Language function, https://reference.wolfram.com/language/ref/TextWords.html (updated 2016).