Raw Character Encodings
Mathematica always allows you to refer to special characters by using names of the form \[Name] or explicit hexadecimal codes of the form \:xxxx. And when Mathematica writes out files, it by default uses these names or hexadecimal codes.
But sometimes you may find it convenient to use raw encodings for at least some special characters. What this means is that rather than representing special characters by names or explicit hexadecimal codes, you instead represent them by raw bit patterns appropriate for a particular computer system or particular font.
Setting up raw character encodings.
When you press a key or combination of keys on your keyboard, the operating system of your computer sends a certain bit pattern to Mathematica. How this bit pattern is interpreted as a character within Mathematica will depend on the character encoding that has been set up.
The notebook front end for Mathematica typically takes care of setting up the appropriate character encoding automatically for whatever font you are using. But if you use Mathematica with a text-based interface or via files or pipes, then you may need to set $CharacterEncoding explicitly.
By specifying an appropriate value for $CharacterEncoding you will typically be able to get Mathematica to handle raw text generated by whatever language-specific text editor or operating system you use.
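As a minimal sketch of setting the encoding explicitly in a text-based session: the choice of "ISOLatin1" below is only an illustrative assumption about the text your editor or terminal produces, and $SystemCharacterEncoding is used as the value to restore afterwards.

```mathematica
(* Show the raw character encoding currently in effect *)
$CharacterEncoding

(* Assumption for illustration: incoming raw text is ISO Latin-1 *)
$CharacterEncoding = "ISOLatin1";

(* Return to the default raw encoding for this computer system *)
$CharacterEncoding = $SystemCharacterEncoding;
```

Any name from the table of supported encodings can be substituted for "ISOLatin1".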
You should realize, however, that while the standard representation of special characters used in Mathematica is completely portable across different computer systems, any representation that involves raw character encodings will inevitably not be.
| "PrintableASCII" | printable ASCII characters only |
| "ASCII" | all ASCII including control characters |
| "ISOLatin1" | characters for common western European languages |
| "ISOLatin2" | characters for central and eastern European languages |
| "ISOLatin3" | characters for additional European languages (e.g. Catalan, Turkish) |
| "ISOLatin4" | characters for other additional European languages (e.g. Estonian, Lappish) |
| "ISOLatinCyrillic" | English and Cyrillic characters |
| "AdobeStandard" | Adobe standard PostScript font encoding |
| "MacintoshRoman" | Macintosh roman font encoding |
| "WindowsANSI" | Windows standard font encoding |
| "Symbol" | symbol font encoding |
| "ZapfDingbats" | Zapf dingbats font encoding |
| "ShiftJIS" | shift-JIS for Japanese (mixture of 8- and 16-bit) |
| "EUC" | extended Unix code for Japanese (mixture of 8- and 16-bit) |
| "UTF8" | Unicode transformation format encoding |
| "Unicode" | raw 16-bit Unicode bit patterns |
Some raw character encodings supported by Mathematica.
Mathematica knows about various raw character encodings, appropriate for different computer systems and different languages. Copying of characters between the Mathematica notebook interface and the user interface environment on your computer generally uses the native character encoding for that environment. Mathematica characters that are not included in the native encoding will be written out using standard Mathematica full names or hexadecimal codes.
The Mathematica kernel can use any character encoding you specify when it writes or reads text files. By default, Put and PutAppend produce an ASCII representation for reliable portability of Mathematica language files from one system to another.
This writes a string to a file.
Special characters are written out using full names or explicit hexadecimal codes.
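The behavior described here can be checked with a short sketch. The file name and the sample string are hypothetical, chosen only for illustration; FilePrint is used to display the raw contents of the file.

```mathematica
(* Hypothetical scratch file; any writable path would do *)
file = FileNameJoin[{$TemporaryDirectory, "raw-encoding-demo.m"}];

(* Put writes the expression in a portable ASCII form *)
Put["a b c \[EAcute] \[Alpha] \[Pi]", file];

(* Display the raw file contents; the special characters should
   appear as full names rather than as raw bit patterns *)
FilePrint[file]

(* Clean up the scratch file *)
DeleteFile[file];
```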
Mathematica supports both 8- and 16-bit raw character encodings. In an encoding such as "ISOLatin1", all characters are represented by bit patterns containing 8 bits. But in an encoding such as "ShiftJIS" some characters instead involve bit patterns containing 16 bits.
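One way to see the difference is to compare the codes produced for a single character in each kind of encoding. This is a sketch only: the sample characters are assumptions picked for illustration, and no output is claimed for the second call.

```mathematica
(* In an 8-bit encoding, one character gives one 8-bit code *)
ToCharacterCode["\[EAcute]", "ISOLatin1"]   (* {233} *)

(* Hiragana "a", entered by its hexadecimal code. In a mixed
   8- and 16-bit encoding this one character is expected to
   need more than a single 8-bit code *)
ToCharacterCode["\:3042", "ShiftJIS"]
```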
Most of the raw character encodings supported by Mathematica include basic ASCII as a subset. This means that even when you are using such encodings, you can still give ordinary Mathematica input in the usual way, and you can specify special characters using \[Name] and \:xxxx sequences.
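A small sketch of the two ASCII-only ways of writing a special character, using \[Alpha] as an assumed example:

```mathematica
(* The full name and the hexadecimal form denote the same character *)
"\[Alpha]" === "\:03b1"        (* True *)

(* Its character code is the number given by the hexadecimal digits *)
ToCharacterCode["\[Alpha]"]    (* {945}, which is 16^^03b1 *)
```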
Some raw character encodings, however, do not include basic ASCII as a subset. An example is the "Symbol" encoding, in which the character codes normally used for a and b are instead used for \[Alpha] and \[Beta].
This gives the usual ASCII character codes for a few English letters.
In the "Symbol" encoding, these character codes are used for Greek letters.
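The two steps just described can be sketched as follows. The sample string "abcdefgh" is an assumption chosen for illustration.

```mathematica
(* Usual ASCII character codes for a few English letters *)
codes = ToCharacterCode["abcdefgh"]
(* {97, 98, 99, 100, 101, 102, 103, 104} *)

(* The same codes interpreted in the symbol font encoding;
   the result should be a string of Greek letters that
   starts with \[Alpha] and \[Beta] *)
FromCharacterCode[codes, "Symbol"]
```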
| ToCharacterCode["string"] | generate codes for characters using the standard Mathematica encoding |
| ToCharacterCode["string","encoding"] | generate codes for characters using the specified encoding |
| FromCharacterCode[{n1,n2,...}] | generate characters from codes using the standard Mathematica encoding |
| FromCharacterCode[{n1,n2,...},"encoding"] | generate characters from codes using the specified encoding |
Handling character codes with different encodings.
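As a brief sketch of how the two functions pair up, converting to codes and back with the same encoding should return the original string. The string and the choice of "MacintoshRoman" are assumptions for illustration.

```mathematica
str = "abc\[EAcute]";

(* Codes in a specified raw encoding *)
codes = ToCharacterCode[str, "MacintoshRoman"];

(* Decoding with the same encoding recovers the string *)
FromCharacterCode[codes, "MacintoshRoman"] === str   (* True *)

(* Without an encoding, the standard Mathematica codes are used *)
FromCharacterCode[{97, 98, 99}]                      (* "abc" *)
```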
This gives the codes assigned to various characters by Mathematica.
Here are the codes assigned to the same characters in the Macintosh roman encoding.
Here are the codes in the Windows standard encoding. There is no code for \[Pi] in that encoding.
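The three comparisons above can be reproduced with a sketch like the following. The test string is an assumption; any string containing \[EAcute] and \[Pi] would show the same effect.

```mathematica
str = "abc\[EAcute]\[Pi]";

(* Standard Mathematica codes, based on Unicode *)
ToCharacterCode[str]                     (* {97, 98, 99, 233, 960} *)

(* Macintosh roman font encoding *)
ToCharacterCode[str, "MacintoshRoman"]   (* {97, 98, 99, 142, 185} *)

(* Windows standard font encoding: \[Pi] has no code here, so
   its entry is expected to be None rather than a number *)
ToCharacterCode[str, "WindowsANSI"]
```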
The character codes used internally by Mathematica are based on Unicode. But externally Mathematica by default always uses plain ASCII sequences such as \[Name] or \:xxxx to refer to special characters. By telling it to use the raw "Unicode" character encoding, however, you can get Mathematica to read and write characters in raw 16-bit Unicode form.
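A hedged sketch of writing raw Unicode to a file, assuming that the CharacterEncoding option of OpenWrite is the mechanism used. The file name and sample string are hypothetical. The exact bytes written, including byte order, may depend on the system, so no output is claimed.

```mathematica
(* Hypothetical scratch file *)
file = FileNameJoin[{$TemporaryDirectory, "raw-unicode-demo.txt"}];

(* Open the file so that text is written in the raw "Unicode" encoding *)
stream = OpenWrite[file, CharacterEncoding -> "Unicode"];
WriteString[stream, "\[Alpha]\[Pi]"];
Close[stream];

(* Inspect the bytes actually written *)
BinaryReadList[file]

(* Clean up the scratch file *)
DeleteFile[file];
```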