Character Operations
The Wolfram Language has efficient systemwide 32-bit Unicode support, allowing a full range of international, technical, and other character sets and character encodings.
Characters — the list of characters in a string
StringJoin — join characters to form a string
ToCharacterCode — convert a string to a list of character codes
FromCharacterCode — convert from a list of character codes to a string
CharacterRange — give a range of characters with successive character codes
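As a minimal sketch of how the functions above fit together, the following converts between strings, character lists, and character codes. The sample strings are illustrative only.

```wolfram
(* Split a string into single-character strings *)
Characters["abc"]              (* {"a", "b", "c"} *)

(* Join characters back into one string *)
StringJoin[{"a", "b", "c"}]    (* "abc" *)

(* Convert between strings and character codes *)
ToCharacterCode["ABC"]         (* {65, 66, 67} *)
FromCharacterCode[{72, 105}]   (* "Hi" *)

(* Characters with successive character codes *)
CharacterRange["a", "e"]       (* {"a", "b", "c", "d", "e"} *)
```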
Character Types
DigitQ ▪ LetterQ ▪ UpperCaseQ ▪ LowerCaseQ ▪ PrintableASCIIQ
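The predicates above test every character of a string, so one character of the wrong type makes the result False. A short sketch with illustrative inputs:

```wolfram
DigitQ["12345"]      (* True *)
DigitQ["12a45"]      (* False: contains a letter *)
LetterQ["hello"]     (* True *)
UpperCaseQ["ABC"]    (* True *)
LowerCaseQ["abC"]    (* False: contains an uppercase letter *)

(* \[EAcute] is the accented letter e, which lies outside printable ASCII *)
PrintableASCIIQ["caf\[EAcute]"]   (* False *)
```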
Letter Transformations
ToUpperCase ▪ ToLowerCase ▪ RemoveDiacritics ▪ Transliterate ▪ CharacterNormalize ▪ ...
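A brief sketch of the letter transformations listed above. The inputs are illustrative, and outputs are shown only where they are unambiguous; the Transliterate result depends on the default target script, so it is left unannotated.

```wolfram
ToUpperCase["hello"]               (* "HELLO" *)
ToLowerCase["Hello"]               (* "hello" *)

(* Strip accents; \[EAcute] is the accented letter e *)
RemoveDiacritics["caf\[EAcute]"]   (* "cafe" *)

(* Transliterate non-Latin text using the default target script *)
Transliterate["Москва"]

(* Decompose a precomposed character into base letter plus combining accent *)
StringLength[CharacterNormalize["\[EAcute]", "NFD"]]   (* 2 *)
```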
Character Patterns
LetterCharacter ▪ DigitCharacter ▪ WordCharacter ▪ ...
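The character pattern objects above are meant for use inside string patterns. A minimal sketch with StringCases, using made-up input strings:

```wolfram
(* Runs of one or more digits *)
StringCases["abc 123 x9", DigitCharacter ..]    (* {"123", "9"} *)

(* Individual letters *)
StringCases["a1 b2", LetterCharacter]           (* {"a", "b"} *)

(* Runs of letters or digits; the hyphen and space act as separators *)
StringCases["foo-bar 42", WordCharacter ..]     (* {"foo", "bar", "42"} *)
```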
Character Encoding
CharacterEncoding — specify a character encoding ("ANSI", "UTF8", "ShiftJIS", …)
$CharacterEncodings — list of installed character encodings on a computer system
ToString, ToExpression — convert between arbitrary expressions and strings
$CharacterEncoding ▪ URLEncode ▪ URLDecode ▪ ByteArrayToString ▪ StringToByteArray
ImportString ▪ ExportString ▪ ImportByteArray ▪ ExportByteArray
CharacterName — readable names for all Unicode characters
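The encoding-related functions above can be sketched as follows. The variable name bytes is illustrative, the byte-array functions are called with their default encoding, and outputs are annotated only where they are certain.

```wolfram
(* UTF-8 byte values of an accented letter e *)
ToCharacterCode["\[EAcute]", "UTF8"]    (* {195, 169} *)

(* Round trip through a byte array using the default encoding *)
bytes = StringToByteArray["caf\[EAcute]"];
Normal[bytes]                           (* list of byte values *)
ByteArrayToString[bytes]                (* the original string *)

(* Percent-encode a string for use in a URL, then decode it again *)
URLDecode[URLEncode["a b&c"]]           (* "a b&c" *)

(* Convert between expressions and strings *)
ToString[1 + x^2, InputForm]
ToExpression["1 + 2"]                   (* 3 *)

(* Encodings available on this system, and the current default *)
$CharacterEncodings
$CharacterEncoding

(* Readable name of a character *)
CharacterName["a"]
```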
Alphabet Functions
Alphabet ▪ LetterNumber ▪ FromLetterNumber
CharacterCounts — character and character n-gram counts
LetterCounts — letter and letter n-gram counts
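A short sketch of the alphabet and counting functions above, assuming the default English alphabet and an illustrative input string:

```wolfram
Alphabet[]               (* {"a", "b", "c", ..., "z"} for the default alphabet *)
LetterNumber["c"]        (* 3 *)
FromLetterNumber[3]      (* "c" *)

(* Counts are returned as an association keyed by character *)
CharacterCounts["banana"]["a"]    (* 3 *)
LetterCounts["banana 123"]["n"]   (* 2 *)
```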