"Characters" (Net Encoder)

NetEncoder["Characters"]

represents an encoder that converts characters in an ASCII string to a sequence of integer codes.

NetEncoder[{"Characters",table}]

represents an encoder that converts characters in a string composed of characters in the list table.

NetEncoder[{"Characters",table,form}]

represents an encoder that converts characters in a string to the output type form.

NetEncoder[{"Characters",…,"param"->value,…}]

represents an encoder in which additional parameters have been specified.

Details

  • NetEncoder[…][input] applies the encoder to a string to produce an output.
  • NetEncoder[…][{input1,input2,…}] applies the encoder to a list of strings to produce a list of outputs.
  • The mapping from characters to codes specified by table can have the following forms:
      "c1c2…"               map each character ci to successive available codes
      "c1c2…"->n            map all characters ci to code n
      "c1c2…"->Automatic    map all characters ci to the next available code
      n;;m->spec            map characters between n and m to spec
      {spec1,spec2,…}       assign codes in sequence from the speci
  • The following symbolic character groups can be used in the table:
      Automatic               all printable ASCII characters, plus space, tab and newline
      LetterCharacter         the letters a through z and A through Z
      DigitCharacter          the digits 0 through 9
      WordCharacter           the union of LetterCharacter and DigitCharacter
      PunctuationCharacter    all visible ASCII punctuation characters
      WhitespaceCharacter     space, tab and newline
      StartOfString           virtual character that occurs before the beginning of the string
      EndOfString             virtual character that occurs after the end of the string
      _                       any otherwise unassigned character
  • NetEncoder["Characters"] is suitable for typical English prose; it recognizes all printable ASCII characters, as well as tab, space and newline.
  • NetEncoder["Characters"] is equivalent to NetEncoder[{"Characters",{"\t","\n",FromCharacterCode[Range[32,126]]}}].
  • When form is "Index" (the default), the output of the encoder consists of integer codes corresponding to characters in the input string.
  • When form is "UnitVector", the output of the encoder consists of n-dimensional unit vectors: the i-th vector points in the pi-th direction, where pi is the code corresponding to the i-th character.
  • An encoder can be attached to an input port of a net by specifying "port"->NetEncoder[…] when constructing the net.
  • NetEncoder[{"Characters",…}][["Alphabet"]] produces a list of the characters recognized by the encoder.
  • NetDecoder[NetEncoder[{"Characters",…}]] produces a NetDecoder[{"Characters",…}] with the same encoding as the given encoder.
  Parameters
  • With the parameter "IgnoreCase"->True, uppercase and lowercase letters will be encoded to the same value. The default value is "IgnoreCase"->False.
  • With the default parameter setting "TargetLength"->All, all characters found in the input string are encoded.
  • With the parameter "TargetLength"->n, the first n characters found in the input string are encoded, with padding applied if fewer than n characters are found. If EndOfString is present in the table, the padding value is the integer code associated with it; otherwise, the code associated with the last character is used.
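
The table, form and parameter settings described above can be combined as in the following sketch. It has not been run here: the table {"a","b","c",_}, the variable names and the setting "TargetLength"->6 are illustrative assumptions, and no outputs are shown.

```wolfram
(* custom table: "a", "b", "c" get successive codes; _ catches any other character *)
enc = NetEncoder[{"Characters", {"a", "b", "c", _}}];
enc["abcz"]

(* same table with the "UnitVector" form: one unit vector per input character *)
encVec = NetEncoder[{"Characters", {"a", "b", "c", _}, "UnitVector"}];
Dimensions[encVec["abcz"]]

(* parameters (assumed to be accepted directly after the table):
   fold case and fix the output length to 6 codes *)
encFixed = NetEncoder[{"Characters", {"a", "b", "c", _},
    "IgnoreCase" -> True, "TargetLength" -> 6}];
encFixed["ABC"]

(* apply an encoder to a list of strings *)
enc[{"ab", "cab"}]

(* build a decoder that uses the same encoding *)
dec = NetDecoder[enc]
```

Including _ in the table is a deliberate choice in this sketch, so that characters outside "a", "b" and "c" still receive a code.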

Examples


Basic Examples  (1)

Create a character encoder:
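
A minimal sketch of this step, assuming the default table described in the Details; the variable name enc is an illustrative choice:

```wolfram
(* default character encoder: tab, newline and the printable ASCII characters *)
enc = NetEncoder["Characters"]
```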


Encode a string of characters:
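
A minimal sketch, with enc again an illustrative name. If codes are assigned in sequence starting from 1 over the table {"\t","\n",FromCharacterCode[Range[32,126]]} given in the Details, a printable ASCII character with character code c should receive the code c - 29. The last line checks that assumption rather than asserting a fixed output:

```wolfram
enc = NetEncoder["Characters"];

(* encode a string; the result is a list of integer codes, one per character *)
codes = enc["hello world"]

(* assumption: tab -> 1, newline -> 2, then character codes 32..126 -> 3..97 *)
Normal[codes] == ToCharacterCode["hello world"] - 29
```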


Scope  (7)

Properties & Relations  (2)