StringSplit
StringSplit["string"]
splits "string" into a list of substrings separated by whitespace.
StringSplit["string",patt]
splits into substrings separated by delimiters matching the string expression patt.
StringSplit["string",{p1,p2,…}]
splits at any of the pi.
StringSplit["string",patt->val]
inserts val at the position of each delimiter.
StringSplit["string",{p1->v1,…}]
inserts vi at the position of each delimiter pi.
StringSplit["string",patt,n]
splits into at most n substrings.
StringSplit[{s1,s2,…},p]
gives the list of results for each of the si.
Details and Options
- StringSplit[s] does not return the whitespace characters that delimit the substrings it returns.
- Whitespace includes any number of spaces, tabs, and newlines.
- The string expression patt can contain any of the objects specified in the notes for StringExpression.
- StringSplit[s] is equivalent to StringSplit[s,Whitespace].
- If s contains two adjacent delimiters, StringSplit considers there to be a zero‐length substring "" between them.
- StringSplit[s,patt] by default gives the list of substrings of s that occur between delimiters defined by patt; it does not include the delimiters themselves.
- StringSplit[s,patt->val] includes val at the position of each delimiter.
- StringSplit[s,patt:>val] evaluates val only when the pattern is found.
- StringSplit["string",{p1->v1,…,pa,…}] includes v1 at the position of delimiters matching p1, but omits delimiters matching pa.
- By default, StringSplit[s,patt] drops zero‐length substrings associated with delimiters that appear at the beginning or end of s.
- StringSplit[s,patt,All] returns all substrings, including zero‐length ones at the beginning or end.
- Setting the option IgnoreCase->True makes StringSplit treat lowercase and uppercase letters as equivalent.
- StringSplit["string",RegularExpression["regex"]] splits at delimiters matching the specified regular expression.
- StringSplit[BioSequence["type","seq"],patt,…] will split the string "seq" by patt yielding a list of biomolecular sequences. In this case, degenerate letters in patt are interpreted as wildcard patterns based on the type of biomolecular sequence. Use Verbatim["patt"] to match degenerate letters literally.
- The documentation for BioSequence lists the degenerate letters supported by each type of biomolecular sequence.
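The notes above can be illustrated with a short sketch. The input strings are illustrative choices, not taken from the original page; the outputs shown in comments follow from the rules stated in the notes.

```wolfram
(* Two adjacent delimiters produce a zero-length substring between them *)
StringSplit["a,,b", ","]
(* {"a", "", "b"} *)

(* Zero-length substrings at the beginning or end are dropped by default *)
StringSplit[",a,b,", ","]
(* {"a", "b"} *)

(* All keeps the zero-length substrings at the beginning and end *)
StringSplit[",a,b,", ",", All]
(* {"", "a", "b", ""} *)

(* Split into at most 2 substrings; the last one holds the remainder *)
StringSplit["a:b:c:d", ":", 2]
(* {"a", "b:c:d"} *)

(* A rule inserts its value; a bare pattern in the same list is omitted *)
StringSplit["a-b+c", {"-" -> "minus", "+"}]
(* {"a", "minus", "b", "c"} *)

(* IgnoreCase -> True treats "x" and "X" as the same delimiter *)
StringSplit["aXbxc", "x", IgnoreCase -> True]
(* {"a", "b", "c"} *)

(* Delimiters given as a regular expression *)
StringSplit["a1b22c", RegularExpression["[0-9]+"]]
(* {"a", "b", "c"} *)
```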
Examples
Basic Examples (2)
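The two basic usages can be sketched as follows. The input strings are illustrative and may differ from the ones on the original page.

```wolfram
(* Split at whitespace; runs of spaces count as a single delimiter *)
StringSplit["a bbb  cccc aa   d"]
(* {"a", "bbb", "cccc", "aa", "d"} *)

(* Split at a specified delimiter string *)
StringSplit["a--bbb--cc", "--"]
(* {"a", "bbb", "cc"} *)
```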
Scope (11)
Mixed regular expressions and string patterns:
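A minimal sketch, assuming an illustrative input string: the delimiter is a run of digits (given as a regular expression) followed by a literal hyphen (given as a string pattern).

```wolfram
(* RegularExpression objects can be combined with other string patterns via ~~ *)
StringSplit["a1-b22-c", RegularExpression["[0-9]+"] ~~ "-"]
(* {"a", "b", "c"} *)
```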
Split into substrings separated by either delimiter:
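For instance, with an illustrative string that uses both ":" and "-" as delimiters:

```wolfram
(* A list of patterns splits at any of them *)
StringSplit["a-b:c-d:e-f-g", {":", "-"}]
(* {"a", "b", "c", "d", "e", "f", "g"} *)
```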
Insert a value at the position of a delimiter:
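A small sketch with an illustrative input, where each "::" delimiter is replaced in the result by the string "--":

```wolfram
(* The rule's right-hand side appears where each delimiter was found *)
StringSplit["a b::c d::e f g", "::" -> "--"]
(* {"a b", "--", "c d", "--", "e f g"} *)
```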
Include the delimiters in the output:
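One way to do this, sketched with an illustrative input, is to name the delimiter pattern and return the matched text itself. The pattern name x is an arbitrary choice.

```wolfram
(* x is bound to the matched delimiter and inserted back into the result *)
StringSplit["a--b c--d e", x : "--" :> x]
(* {"a", "--", "b c", "--", "d e"} *)
```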
StringSplit automatically threads over lists of strings:
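A brief sketch with two illustrative strings, each split at ":":

```wolfram
(* Each string in the list is split separately *)
StringSplit[{"a:b:c:d", "listable:element"}, ":"]
(* {{"a", "b", "c", "d"}, {"listable", "element"}} *)
```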
Split a DNA sequence by a particular substring:
Use a wildcard in the pattern to split the biomolecular sequence:
The "N" is a degenerate letter only in biomolecular sequences:
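The plain-string side of this contrast can be sketched as follows, with an illustrative sequence. The BioSequence counterpart is not reproduced here.

```wolfram
(* In an ordinary string, "N" matches only the literal character N *)
StringSplit["ACGTNACGT", "N"]
(* {"ACGT", "ACGT"} *)
```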
Split only on literal degenerate letters using Verbatim:
Generalizations & Extensions (1)
Applications (4)
Make a nested array by applying StringSplit twice:
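A minimal sketch, assuming an illustrative string in which "//" separates rows and ":" separates entries. The inner call produces the rows; the outer call threads over them.

```wolfram
(* First split into rows, then split every row into entries *)
StringSplit[StringSplit["11:12:13//21:22:23//31:32:33", "//"], ":"]
(* {{"11", "12", "13"}, {"21", "22", "23"}, {"31", "32", "33"}} *)
```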
Sequences with adenine symmetrically placed:
Text analysis with some right and left context:
Use StringSplit to find all occurrences of the word "power":
Compute part of the left and right contexts in which each word occurs:
List extensions of files in a directory and its subdirectories:
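The sketch below uses a hard-coded list of hypothetical file names so that it does not depend on any particular file system. In practice the list could come from a call such as FileNames["*", dir, Infinity], where dir is a directory of your choosing.

```wolfram
(* Hypothetical file names standing in for a directory listing *)
files = {"notes.txt", "data/run1.csv", "src/main.c"};

(* Split each name at "." and keep the last piece as the extension *)
Last /@ StringSplit[files, "."]
(* {"txt", "csv", "c"} *)
```

A name that contains no "." comes back whole from this approach, so real listings may need extra filtering.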
Properties & Relations (4)
Splitting at whitespace is equivalent to cases of non-whitespace sequences:
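A short sketch of the equivalence, using an illustrative string that mixes spaces and a tab. The variable name text is an arbitrary choice.

```wolfram
text = "a bbb  cccc\tdd";

(* Default split at whitespace *)
StringSplit[text]
(* {"a", "bbb", "cccc", "dd"} *)

(* Maximal runs of non-whitespace characters *)
StringCases[text, Except[WhitespaceCharacter] ..]
(* {"a", "bbb", "cccc", "dd"} *)
```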
StringSplit with a rule is equivalent to StringReplace:
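One way to see the correspondence, sketched with an illustrative string, is to join the pieces returned by StringSplit. This assumes the inserted value is itself a string.

```wolfram
(* Joining the split result reproduces the replacement *)
StringJoin[StringSplit["a-b-c", "-" -> "+"]]
(* "a+b+c" *)

StringReplace["a-b-c", "-" -> "+"]
(* "a+b+c" *)
```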
A null delimiter splits at every character:
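For example, with an illustrative string, the result matches what Characters gives:

```wolfram
(* The empty string as delimiter separates every character *)
StringSplit["abcdefg", ""]
(* {"a", "b", "c", "d", "e", "f", "g"} *)

Characters["abcdefg"]
(* {"a", "b", "c", "d", "e", "f", "g"} *)
```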
Using StringSplit on a comma-separated values string:
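A minimal sketch with an illustrative two-line string: split into lines first, then split every line at commas. The entries remain strings.

```wolfram
(* Rows are separated by newlines, fields by commas *)
StringSplit[StringSplit["1,2,3\n4,5,6", "\n"], ","]
(* {{"1", "2", "3"}, {"4", "5", "6"}} *)
```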
In many cases Import and ImportString provide direct functionality:
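For instance, an illustrative comma-separated string can be parsed directly with ImportString, which also converts numeric fields to numbers:

```wolfram
(* The "CSV" format handles row and field splitting in one step *)
ImportString["1,2,3\n4,5,6", "CSV"]
(* {{1, 2, 3}, {4, 5, 6}} *)
```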
Possible Issues (1)
StringSplit by default splits only at whitespace:
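A short sketch of the issue with an illustrative sentence: punctuation stays attached to the words unless the delimiter pattern covers it. Using Except[WordCharacter] .. as the delimiter is one possible remedy, not necessarily the one shown on the original page.

```wolfram
(* Default: only whitespace is treated as a delimiter *)
StringSplit["Hello, world! Bye."]
(* {"Hello,", "world!", "Bye."} *)

(* Treat any run of non-word characters as a delimiter *)
StringSplit["Hello, world! Bye.", Except[WordCharacter] ..]
(* {"Hello", "world", "Bye"} *)
```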
Text
Wolfram Research (2004), StringSplit, Wolfram Language function, https://reference.wolfram.com/language/ref/StringSplit.html (updated 2020).