"Characters" (Net Encoder)


NetEncoder["Characters"]
represents an encoder that converts characters in an ASCII string to a sequence of integer codes.


NetEncoder[{"Characters",table}]
represents an encoder that converts characters in a string composed of characters in the list table.


NetEncoder[{"Characters",table,form}]
represents an encoder that converts characters in a string to the output type form.


NetEncoder[{"Characters",…,"param"->value}]
represents an encoder in which additional parameters have been specified.


  • NetEncoder[…][input] applies the encoder to an input to produce an output.
  • NetEncoder[…][{input1,input2,…}] applies the encoder to a list of inputs to produce a list of outputs.
  • Each input inputi to the encoder is a string.
  • The mapping from characters to codes specified by table can have the following forms:
  • "c1c2…"               map each character ci to successive available codes
    "c1c2…"->n            map all characters ci to code n
    "c1c2…"->Automatic    map all characters ci to the next available code
    n;;m->spec            map characters between n and m to spec
    {spec1,spec2,…}       assign codes in sequence from the speci
  • The following symbolic character groups can be used in the table:
  • Automatic               all printable ASCII characters, plus space, tab and newline
    LetterCharacter         the letters a through z and A through Z
    DigitCharacter          the digits 0 through 9
    WordCharacter           the union of LetterCharacter and DigitCharacter
    PunctuationCharacter    all visible ASCII punctuation characters
    WhitespaceCharacter     space, tab and newline
    StartOfString           virtual character that occurs before the beginning of the string
    EndOfString             virtual character that occurs after the end of the string
    _                       any otherwise unassigned character
  • NetEncoder["Characters"] is suitable for typical English prose and consists of all printable ASCII characters, as well as tab, space and newline.
  • NetEncoder["Characters"] is equivalent to NetEncoder[{"Characters",{"\t","\n",FromCharacterCode[Range[32,126]]}}].
  • When form is "Index" (the default), the output of the encoder consists of integer codes corresponding to characters in the input string.
  • When form is "UnitVector", the output of the encoder consists of n-dimensional unit vectors: the i-th vector points in the direction pi, where pi is the code corresponding to the i-th character.
  • An encoder can be attached to an input port of a net by specifying "port"->NetEncoder[…] when constructing the net.
  • NetEncoder[{"Characters",…}][["Alphabet"]] produces a list of the characters recognized by the encoder.
  • NetDecoder[NetEncoder[{"Characters",…}]] produces a NetDecoder[{"Characters",…}] with the same encoding as the given encoder.
  • Parameters
  • With the parameter "IgnoreCase"->True, uppercase and lowercase letters will be encoded to the same value. The default value is "IgnoreCase"->False.
  • With the default parameter setting "TargetLength"->All, all characters found in the input string are encoded.
  • With the parameter "TargetLength"->n, the first n characters found in the input string are encoded, with padding applied if fewer than n characters are found. The padding value is d+1, where d is the number of tokens in the vocabulary.
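
As a rough illustration of the two parameters above, the following sketch uses a small three-character table chosen only for this example. It assumes the parameter rules are given after the table inside the encoder specification. No outputs are shown, since the exact codes depend on the table in use.

```wolfram
(* Hypothetical three-character table, used only for illustration *)
table = {"a", "b", "c"};

(* Case-insensitive encoder: "A" and "a" should receive the same code *)
encIgnore = NetEncoder[{"Characters", table, "IgnoreCase" -> True}];
encIgnore["aAbB"]

(* Fixed-length encoder: every output has exactly 5 codes *)
encFixed = NetEncoder[{"Characters", table, "TargetLength" -> 5}];
encFixed["ab"]        (* fewer than 5 characters, so padding is applied *)
encFixed["abcabca"]   (* more than 5 characters, so only the first 5 are encoded *)
```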



Basic Examples  (1)

Create a character encoder:
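
A minimal sketch of this step, using the default form named in the notes above:

```wolfram
(* Default character encoder covering printable ASCII, tab and newline *)
enc = NetEncoder["Characters"]
```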


Encode a string of characters:
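
A minimal sketch, assuming the default encoder; the sample strings are arbitrary. The encoder is rebuilt here so that the snippet runs on its own:

```wolfram
enc = NetEncoder["Characters"];

(* Encode a single string: one integer code per character *)
enc["Hello world"]

(* Encode a list of strings: one list of codes per input string *)
enc[{"abc", "de"}]
```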


Scope  (7)
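
The following sketch illustrates a few of the specification forms described in the notes above. The tables and the small net are illustrative choices only and are not taken from the original examples; the layer sizes are arbitrary.

```wolfram
(* Each letter gets its own code; every other character shares one fallback code *)
encLetters = NetEncoder[{"Characters", {LetterCharacter, _}}];
encLetters["ab c!"]

(* Explicit three-character table with "UnitVector" output:
   one unit vector per input character *)
encOneHot = NetEncoder[{"Characters", {"a", "b", "c"}, "UnitVector"}];
encOneHot["cab"]

(* Attach the default encoder to the input port of a small, uninitialized net *)
net = NetChain[
  {EmbeddingLayer[8], SequenceLastLayer[], LinearLayer[2]},
  "Input" -> NetEncoder["Characters"]
]
```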

Properties & Relations  (2)
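
A sketch of how an encoder relates to its alphabet and to the matching decoder. The three-character table is a hypothetical example, and the decoder is assumed to take one probability vector per character, so one-hot rows are passed to it here.

```wolfram
enc = NetEncoder[{"Characters", {"a", "b", "c"}}];

(* List the characters recognized by the encoder *)
enc[["Alphabet"]]

(* Build a decoder that uses the same character-to-code mapping *)
dec = NetDecoder[enc];

(* Decode one row per character; each row selects one entry of the alphabet *)
dec[{{0, 0, 1}, {1, 0, 0}, {0, 1, 0}}]
```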

See Also

NetEncoder  NetDecoder  NetChain  NetGraph  ToCharacterCode  Characters

