"CTCBeamSearch" (Net Decoder)


NetDecoder[{"CTCBeamSearch",alphabet}] represents a decoder that interprets a sequence of probability vectors and gives the most likely sequence decoding.


NetDecoder[{"CTCBeamSearch",alphabet,"BeamSize"->n}] represents a decoder with the specified beam size n.


  • NetDecoder[…][input] applies the decoder to an input to produce an output.
  • NetDecoder[…][{input1,input2,…}] applies the decoder to a list of inputs to produce a list of outputs.
  • The input tensor to the "CTCBeamSearch" decoder must be a sequence of vectors, each of size n+1, where n is the size of the alphabet. The last element of each vector represents the special blank class.
  • The output of "CTCBeamSearch" is a sequence of elements from the alphabet whose maximum length is equal to the length of the input sequence. Fewer elements are typically returned.
  • A decoder can be attached to an output port of a net by specifying "port"->NetDecoder[…] when constructing the net.
  • Parameters
  • With the parameter "BeamSize"->n, the "CTCBeamSearch" decoder will maintain a set of n candidate decodings during processing. The default is 100.
  • A "BeamSize" of 1 is equivalent to greedy search, in which the most probable class is chosen at each element of the sequence.
  • Properties
  • NetDecoder[…][data,prop] can be used to calculate a specific property for the input data.
  • When a "CTCBeamSearch" decoder is attached to a net, net[data,prop] or net[data,"oport"->prop] can be used to calculate a specific property of the decoded output.
  • The "CTCBeamSearch" decoder supports the following properties:
    "Decoding"                         the single most probable sequence found (default)
    "Decodings"                        all the most probable sequences found
    {"TopDecodings",n}                 gives the n most probable sequences
    "NegativeLogLikelihoods"           gives the negative log likelihood of each decoding, returned as a list of rules
    {"TopNegativeLogLikelihoods",n}    gives the negative log likelihoods of the top n decodings
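
The notes above on "BeamSize" and on attaching a decoder to a net can be sketched as follows. This is an illustrative sketch only: the three-letter alphabet, the probability values, the beam size of 50 and the small randomly initialized net are all assumptions made for the example and do not come from this page.

```wolfram
(* Hypothetical three-letter alphabet; each input vector has 3 + 1 = 4 entries *)
alphabet = {"a", "b", "c"};

(* A beam of size 1, which the notes describe as equivalent to greedy search *)
greedy = NetDecoder[{"CTCBeamSearch", alphabet, "BeamSize" -> 1}];

(* A wider beam that keeps up to 50 candidate decodings during processing *)
wide = NetDecoder[{"CTCBeamSearch", alphabet, "BeamSize" -> 50}];

(* Made-up probabilities; the last column is the blank class *)
probs = {{0.6, 0.1, 0.1, 0.2}, {0.2, 0.1, 0.1, 0.6}, {0.1, 0.5, 0.2, 0.2}};

(* Compare the two decoders on the same input *)
{greedy[probs], wide[probs]}

(* Attach the decoder to the output port of a small, randomly initialized net
   (assumed architecture: a per-step linear layer followed by a softmax) *)
net = NetInitialize @ NetChain[
   {NetMapOperator[LinearLayer[4]], SoftmaxLayer[]},
   "Input" -> {"Varying", 2},
   "Output" -> wide];

(* Request a property of the decoded output directly from the net *)
net[{{0.1, 0.9}, {0.8, 0.2}, {0.3, 0.3}}, {"TopDecodings", 2}]
```

Because the net is randomly initialized, its decodings vary from run to run. The point is only the calling pattern net[data,prop].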


Basic Examples  (1)

Create a CTC beam search decoder:
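
A minimal sketch, assuming a hypothetical three-letter alphabet given as a list of strings (the original input cell is not reproduced here):

```wolfram
(* Decoder over the assumed alphabet {"a", "b", "c"}; the blank class is implicit *)
dec = NetDecoder[{"CTCBeamSearch", {"a", "b", "c"}}]
```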


Use the decoder on a sequence of probability vectors:
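
One possible input, with made-up probabilities: each row has four entries (three alphabet classes plus the final blank class) and sums to 1. The alphabet and values are assumptions for illustration.

```wolfram
(* Decoder over an assumed three-letter alphabet *)
dec = NetDecoder[{"CTCBeamSearch", {"a", "b", "c"}}];

(* Sequence of three probability vectors; the last entry of each is the blank class *)
probs = {{0.7, 0.1, 0.1, 0.1}, {0.1, 0.1, 0.1, 0.7}, {0.1, 0.6, 0.2, 0.1}};

(* Apply the decoder; the default property is "Decoding" *)
dec[probs]
```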


Obtain the top three decoded sequences and their negative log-likelihoods:
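
A sketch using the property forms listed above, again with an assumed alphabet and made-up probabilities:

```wolfram
(* Decoder over an assumed three-letter alphabet *)
dec = NetDecoder[{"CTCBeamSearch", {"a", "b", "c"}}];

(* Illustrative probability vectors; the last entry of each is the blank class *)
probs = {{0.7, 0.1, 0.1, 0.1}, {0.1, 0.1, 0.1, 0.7}, {0.1, 0.6, 0.2, 0.1}};

(* The three most probable decoded sequences *)
dec[probs, {"TopDecodings", 3}]

(* Negative log-likelihoods of the top three decodings *)
dec[probs, {"TopNegativeLogLikelihoods", 3}]
```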


See Also

NetDecoder  CTCLossLayer  NetEncoder  NetChain  NetGraph

