Raw Character Encodings
Mathematica always allows you to refer to special characters by using names such as \[Alpha] or explicit hexadecimal codes such as \:03b1. And when Mathematica writes out files, it by default uses these names or hexadecimal codes.
But sometimes you may find it convenient to use raw encodings for at least some special characters. What this means is that rather than representing special characters by names or explicit hexadecimal codes, you instead represent them by raw bit patterns appropriate for a particular computer system or particular font.
Setting up raw character encodings.
When you press a key or combination of keys on your keyboard, the operating system of your computer sends a certain bit pattern to Mathematica. How this bit pattern is interpreted as a character within Mathematica will depend on the character encoding that has been set up.
The notebook front end for Mathematica typically takes care of setting up the appropriate character encoding automatically for whatever font you are using. But if you use Mathematica with a text-based interface or via files or pipes, then you may need to set $CharacterEncoding explicitly.
By specifying an appropriate value for $CharacterEncoding you will typically be able to get Mathematica to handle raw text generated by whatever language-specific text editor or operating system you use.
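A minimal sketch of inspecting and changing the setting in a kernel session. The choice of "ISOLatin1" is only an illustration, and the values reported by the first line depend on your system and version.

```mathematica
(* Show the encoding currently in effect and the default for this system. *)
{$CharacterEncoding, $SystemCharacterEncoding}

(* Switch to a raw 8-bit encoding for subsequent input and output.
   "ISOLatin1" is an arbitrary choice for illustration. *)
$CharacterEncoding = "ISOLatin1";

(* Return to the portable representation, in which special characters
   are written as names or hexadecimal codes. *)
$CharacterEncoding = "PrintableASCII";
```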
You should realize, however, that while the standard representation of special characters used in Mathematica is completely portable across different computer systems, any representation that involves raw character encodings will inevitably not be.
|"PrintableASCII"||printable ASCII characters only (default)|
|"ASCII"||all ASCII including control characters|
|"ISOLatin1"||characters for common western European languages|
|"ISOLatin2"||characters for central and eastern European languages|
|"ISOLatin3"||characters for additional European languages (e.g. Catalan, Turkish)|
|"ISOLatin4"||characters for other additional European languages (e.g. Estonian, Lappish)|
|"ISOLatinCyrillic"||English and Cyrillic characters|
|"AdobeStandard"||Adobe standard PostScript font encoding|
|"MacintoshRoman"||Macintosh roman font encoding|
|"WindowsANSI"||Windows standard font encoding|
|"Symbol"||symbol font encoding|
|"ZapfDingbats"||Zapf dingbats font encoding|
|"ShiftJIS"||shift-JIS for Japanese (mixture of 8- and 16-bit)|
|"EUC"||extended Unix code for Japanese (mixture of 8- and 16-bit)|
|"UTF8"||Unicode transformation format encoding|
|"Unicode"||raw 16-bit Unicode bit patterns|
Some raw character encodings supported by Mathematica.
Mathematica knows about various raw character encodings, appropriate for different computer systems and different languages.
Copying of characters between the Mathematica notebook interface and the user interface environment on your computer generally uses the native character encoding for that environment. Mathematica characters that are not included in the native encoding will be written out using standard Mathematica full names or hexadecimal codes.
The Mathematica kernel can use any character encoding you specify when it writes or reads text files. By default, Put and PutAppend produce an ASCII representation for reliable portability of Mathematica language files from one system to another.
This writes a string to the file tmp.
Special characters are written out using full names or explicit hexadecimal codes.
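The two steps above can be sketched as follows. The sample string and the use of $TemporaryDirectory are assumptions made for illustration; FilePrint displays the raw contents of the file.

```mathematica
(* Assumed location: a scratch file named tmp in the system temporary directory. *)
file = FileNameJoin[{$TemporaryDirectory, "tmp"}];

(* Put writes the expression in the default portable ASCII form. *)
Put["a b c \[EAcute] \[Alpha] \[Pi]", file];

(* Show the raw file contents; the special characters should appear
   as \[...] names rather than as raw bytes. *)
FilePrint[file]
```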
Mathematica supports both 8- and 16-bit raw character encodings. In an encoding such as "ISOLatin1", all characters are represented by bit patterns containing 8 bits. But in an encoding such as "ShiftJIS", some characters instead involve bit patterns containing 16 bits.
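A rough way to see the difference is to count the codes produced for a single character. This sketch assumes that ToCharacterCode returns one code per byte for a multibyte encoding such as "ShiftJIS"; the hiragana character \:3042 is an arbitrary choice.

```mathematica
(* One 8-bit pattern for e-acute in ISO Latin-1. *)
ToCharacterCode["\[EAcute]", "ISOLatin1"]   (* {233} *)

(* A Japanese character in shift-JIS; under the stated assumption this
   should report more than one code. *)
Length[ToCharacterCode["\:3042", "ShiftJIS"]]
```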
Most of the raw character encodings supported by Mathematica include basic ASCII as a subset. This means that even when you are using such encodings, you can still give ordinary Mathematica input in the usual way, and you can specify special characters using \[ and \: sequences.
Some raw character encodings, however, do not include basic ASCII as a subset. An example is the "Symbol" encoding, in which the character codes normally used for a and b are instead used for α and β.
This gives the usual ASCII character codes for a few English letters.
In the "Symbol" encoding, these character codes are used for Greek letters.
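Both statements can be checked with a short sketch. The string "abcdefgh" and the variable name codes are chosen here for illustration.

```mathematica
(* Standard character codes for a few English letters. *)
codes = ToCharacterCode["abcdefgh"]   (* {97, 98, 99, 100, 101, 102, 103, 104} *)

(* Interpret the same codes in the Symbol font encoding; the result
   should consist of Greek letters rather than English ones. *)
FromCharacterCode[codes, "Symbol"]
```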
|ToCharacterCode["string"]||generate codes for characters using the standard Mathematica encoding|
|ToCharacterCode["string", "encoding"]||generate codes for characters using the specified encoding|
|FromCharacterCode[{n1, n2, ...}]||generate characters from codes using the standard Mathematica encoding|
|FromCharacterCode[{n1, n2, ...}, "encoding"]||generate characters from codes using the specified encoding|
Handling character codes with different encodings.
This gives the codes assigned to various characters by Mathematica.
Here are the codes assigned to the same characters in the Macintosh roman encoding.
Here are the codes in the Windows standard encoding. There is no code for \[Pi] in that encoding.
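The three cases above can be compared side by side. The test string is an assumption chosen to include both an accented letter and \[Pi]; only the output of the first call is shown, since it follows directly from the Unicode code points.

```mathematica
str = "abc\[EAcute]\[Pi]";

(* Codes in the standard Mathematica encoding. *)
ToCharacterCode[str]                      (* {97, 98, 99, 233, 960} *)

(* Codes for the same characters in the Macintosh roman encoding. *)
ToCharacterCode[str, "MacintoshRoman"]

(* Codes in the Windows standard encoding, which has no code for \[Pi]. *)
ToCharacterCode[str, "WindowsANSI"]
```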
The character codes used internally by Mathematica are based on Unicode. But externally Mathematica by default always uses plain ASCII sequences such as \[Name] or \:xxxx to refer to special characters. By telling it to use the raw "Unicode" character encoding, however, you can get Mathematica to read and write characters in raw 16-bit Unicode form.
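One way to try this without changing the global setting is to give the encoding as an option when opening a stream. This is a sketch under assumptions: the file name tmp-unicode and the sample string are hypothetical, and the exact bytes written (including byte order) may vary between systems, so no output is shown.

```mathematica
(* Assumed location: a scratch file in the system temporary directory. *)
file = FileNameJoin[{$TemporaryDirectory, "tmp-unicode"}];

(* Write two special characters as raw Unicode bit patterns. *)
stream = OpenWrite[file, CharacterEncoding -> "Unicode"];
WriteString[stream, "\[Alpha]\[Pi]"];
Close[stream];

(* Inspect the bytes that were actually written to the file. *)
BinaryReadList[file]
```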