Character Operations

The Wolfram Language has efficient systemwide 32-bit Unicode support, allowing a full range of international, technical, and other character sets and character encodings.

Characters the list of characters in a string

StringJoin join characters to form a string

ToCharacterCode convert a string to a list of character codes

FromCharacterCode convert from a list of character codes to a string

CharacterRange give a range of characters with successive character codes
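
As a brief illustration of the functions listed above, the following sketch round-trips a string through its characters and character codes. The sample string "abc" is an arbitrary choice for illustration, and the comments show the results expected for plain ASCII input.

```wolfram
(* Split a string into single-character strings, then join them again *)
chars = Characters["abc"]              (* {"a", "b", "c"} *)
StringJoin[chars]                      (* "abc" *)

(* Convert to character codes and back *)
codes = ToCharacterCode["abc"]         (* {97, 98, 99} *)
FromCharacterCode[codes]               (* "abc" *)

(* Characters with successive character codes *)
CharacterRange["a", "e"]               (* {"a", "b", "c", "d", "e"} *)
```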

Character Types

DigitQ  ▪  LetterQ  ▪  UpperCaseQ  ▪  LowerCaseQ  ▪  PrintableASCIIQ
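
The predicates above test whether every character in a string belongs to a given class. A minimal sketch with arbitrary sample strings:

```wolfram
DigitQ["12345"]          (* True: all characters are digits *)
LetterQ["abc1"]          (* False: "1" is not a letter *)
UpperCaseQ["ABC"]        (* True *)
LowerCaseQ["Abc"]        (* False: "A" is uppercase *)
PrintableASCIIQ["abc"]   (* True *)
PrintableASCIIQ["é"]     (* False: "é" is outside the ASCII range *)
```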

Letter Transformations

ToUpperCase  ▪  ToLowerCase  ▪  RemoveDiacritics  ▪  Transliterate  ▪  CharacterNormalize  ▪  ...
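
A short sketch of some of the transformations listed above, using arbitrary sample strings. The last line assumes that the code point 233 (the precomposed "é") decomposes into a base letter plus a combining accent under the "NFD" normalization form.

```wolfram
ToUpperCase["hello"]          (* "HELLO" *)
ToLowerCase["HELLO"]          (* "hello" *)

(* Strip accents while keeping the base letters *)
RemoveDiacritics["café"]      (* "cafe" *)

(* Decompose a precomposed character into base letter + combining mark *)
StringLength[CharacterNormalize[FromCharacterCode[233], "NFD"]]   (* 2 *)
```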

Character Patterns

LetterCharacter  ▪  DigitCharacter  ▪  WordCharacter  ▪  ...
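
The pattern objects above are meant to be used inside string patterns. The sketch below uses them with StringCases, which does not appear in the list above but is a standard string function; the sample strings are arbitrary.

```wolfram
(* Pick out individual digits *)
StringCases["abc123", DigitCharacter]        (* {"1", "2", "3"} *)

(* Pick out maximal runs of letters *)
StringCases["abc123", LetterCharacter ..]    (* {"abc"} *)

(* WordCharacter matches letters and digits, so runs break at the space *)
StringCases["a1 b2", WordCharacter ..]       (* {"a1", "b2"} *)
```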

Character Encoding

CharacterEncoding specify a character encoding ("ASCII", "UTF8", "ShiftJIS", ...)

$CharacterEncodings list of installed character encodings on a computer system

ToString, ToExpression convert between arbitrary expressions and strings

$CharacterEncoding  ▪  URLEncode  ▪  URLDecode  ▪  ByteArrayToString  ▪  StringToByteArray

ImportString  ▪  ExportString  ▪  ImportByteArray  ▪  ExportByteArray

CharacterName readable names for all Unicode characters
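
The following sketch shows how a few of the encoding-related functions above fit together. The sample strings and the choice of "UTF8" are illustrative assumptions.

```wolfram
(* Encode a non-ASCII character as UTF-8 bytes and decode it again *)
bytes = ToCharacterCode["é", "UTF8"]     (* {195, 169} *)
FromCharacterCode[bytes, "UTF8"]         (* "é" *)

(* Percent-encoding for URLs *)
URLEncode["a b"]                         (* "a%20b" *)
URLDecode["a%20b"]                       (* "a b" *)

(* Strings and byte arrays *)
Normal[StringToByteArray["abc"]]         (* {97, 98, 99} *)
ByteArrayToString[ByteArray[{97, 98, 99}]]   (* "abc" *)

(* Expressions and strings *)
ToString[1 + 2]                          (* "3" *)
ToExpression["1 + 2"]                    (* 3 *)
```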

Alphabet Functions

Alphabet  ▪  LetterNumber  ▪  FromLetterNumber

CharacterCounts character and character n-gram counts

LetterCounts letter and letter n-gram counts
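
A minimal sketch of the alphabet and counting functions, assuming the default English alphabet. The word "banana" is an arbitrary sample; the counts are looked up by key, so the example does not depend on the ordering of the returned association.

```wolfram
(* Default alphabet and letter positions *)
Length[Alphabet[]]              (* 26 *)
LetterNumber["c"]               (* 3 *)
FromLetterNumber[3]             (* "c" *)

(* Counts are returned as an association keyed by character *)
counts = CharacterCounts["banana"];
counts["a"]                     (* 3 *)
counts["n"]                     (* 2 *)

LetterCounts["banana"]["b"]     (* 1 *)
```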