CTCLossLayer
CTCLossLayer[] represents a net layer that computes the connectionist temporal classification (CTC) loss by comparing a sequence of class probability vectors with a sequence of indices representing the target classes.
Details and Options
- CTCLossLayer[] represents a net that takes an input matrix representing a sequence of vectors and a target vector representing a sequence of integers and outputs a real value.
- CTCLossLayer is typically used inside NetGraph.
- CTCLossLayer exposes the following ports for use in NetGraph etc.:
  "Input"    a sequence of probability vectors of size c+1
  "Target"   a sequence of integers between 1 and c
  "Output"   a real number
- The layer definition is based on Graves et al., "Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks", 2006.
- The input should be a sequence of probability vectors of size c+1 where each vector sums to 1. The last element of each vector represents the probability of a special blank class, with the remaining elements representing the probability of the indexed classes 1 to c. The target is a sequence of integers between 1 and c. The target sequence cannot be longer than the input sequence.
- CTCLossLayer[…][<|"Input"->in,"Target"->target|>] explicitly computes the output from applying the layer.
- CTCLossLayer[…][<|"Input"->{in1,in2,…},"Target"->{target1,target2,…}|>] explicitly computes outputs for each of the in_i and target_i.
- When given a NumericArray as input, the output will be a NumericArray.
- The size of the input is usually inferred automatically within a NetGraph.
- CTCLossLayer[n,"Input"->ishape,"Target"->tshape] allows the shape of the input and target to be specified. Possible forms for ishape are:
  NetEncoder[…]            encoder producing a sequence of vectors
  {len,c+1}                sequence of len length-(c+1) vectors
  {len,Automatic}          sequence of len vectors whose length is inferred
  {"Varying",c+1}          varying number of vectors each of length c+1
  {"Varying",Automatic}    varying number of vectors each of inferred length
- Possible forms for tshape are:
  NetEncoder[…]                              encoder producing a sequence of integers
  {len2}                                     sequence of len2 integers
  {"Varying"}                                varying number of integers
  RepeatingElement[Restricted[Integer,c]]    varying number of integers in the range 1 to c
- Options[CTCLossLayer] gives the list of default options to construct the layer. Options[CTCLossLayer[…]] gives the list of default options to evaluate the layer on some data.
- Information[CTCLossLayer[…]] gives a report about the layer.
- Information[CTCLossLayer[…],prop] gives the value of the property prop of CTCLossLayer[…]. Possible properties are the same as for NetGraph.
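The input and target conventions above can be checked on a toy case. The following is a minimal sketch, not taken from the original page: it assumes c = 2 classes with the blank stored last (index 3), three time steps, and made-up probabilities. The names collapse, pathProbability, alignments and bruteForce are hypothetical helpers. Following the definition in Graves et al., the sketch sums the probability of every alignment that reduces to the target once repeated labels are merged and blanks are removed, then takes the negative natural logarithm. If the layer follows that definition, the final line should return approximately the same value.

```wolfram
(* Assumed toy input: 3 time steps, c = 2 classes, blank stored last (index 3) *)
input = {{0.6, 0.3, 0.1}, {0.2, 0.2, 0.6}, {0.1, 0.7, 0.2}};
target = {1, 2};
blank = 3;

(* Collapse an alignment: merge repeated labels, then drop blanks *)
collapse[path_] := DeleteCases[First /@ Split[path], blank];

(* Probability of one alignment: product of the selected entry at each time step *)
pathProbability[path_] := Product[input[[t, path[[t]]]], {t, Length[path]}];

(* Enumerate all 3^3 alignments and keep those that collapse to the target *)
alignments = Select[Tuples[Range[3], 3], collapse[#] === target &];

(* Five alignments qualify; their probabilities total 0.458 *)
bruteForce = -Log[Total[pathProbability /@ alignments]]

(* Value from the layer itself, for comparison *)
CTCLossLayer["Input" -> {3, 3}, "Target" -> {2}][<|"Input" -> input, "Target" -> target|>]
```

The enumeration grows exponentially with the sequence length, so it is only useful as a sanity check on very short inputs.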
Examples
Basic Examples (2)
Create a CTCLossLayer object:
Create a CTCLossLayer where the input is a matrix whose rows are probability vectors and the target is a vector of indices:
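The original code cells are not reproduced here. As a hedged sketch of what such a layer might look like, the following assumes four time steps and c = 2 classes plus the blank, so each row has three entries that sum to 1. The shapes, the probability values and the names loss, in1 and in2 are illustrative assumptions.

```wolfram
(* Assumed shapes: 4 probability vectors of length c+1 = 3, and a target of 2 indices *)
loss = CTCLossLayer["Input" -> {4, 3}, "Target" -> {2}];

(* Each row is a probability vector; the last entry is the blank class *)
in1 = {{0.7, 0.2, 0.1}, {0.1, 0.1, 0.8}, {0.2, 0.6, 0.2}, {0.3, 0.3, 0.4}};
in2 = {{0.1, 0.8, 0.1}, {0.2, 0.2, 0.6}, {0.5, 0.3, 0.2}, {0.6, 0.1, 0.3}};

(* Apply the layer to a single input and target pair *)
loss[<|"Input" -> in1, "Target" -> {1, 2}|>]

(* Apply the layer to a batch of two pairs; one loss value per pair is expected *)
loss[<|"Input" -> {in1, in2}, "Target" -> {{1, 2}, {2, 1}}|>]
```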
Applications (1)
Train a net that recognizes the sequence of characters in an image. First generate training and test data, which consist of images of words and the corresponding word strings:
Split the dataset into a test and a training set:
Take a RandomSample of the training set:
The decoder is a beam search decoder with a beam size of 50:
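A decoder of this kind can be sketched as follows. This is an assumption about the stripped cell, not a reproduction of it: it assumes the class labels are the 26 lowercase letters returned by Alphabet[], so the decoder expects rows of length 27 with the blank last. The random matrix probs is a stand-in for real net output and is used only to show the calling convention.

```wolfram
(* Beam search decoder over the lowercase alphabet with a beam size of 50 *)
decoder = NetDecoder[{"CTCBeamSearch", Alphabet[], "BeamSize" -> 50}];

(* Stand-in for net output: 10 time steps, 26 letters plus blank, rows normalized to sum to 1 *)
probs = Normalize[#, Total] & /@ RandomReal[1, {10, 27}];

(* Most likely decoding found by the beam search *)
decoder[probs]

(* The 5 most likely decodings together with their negative log-likelihoods *)
decoder[probs, {"TopNegativeLogLikelihoods", 5}]
```

A larger beam explores more candidate decodings at the cost of more computation per input.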
Define a net that takes an image and then treats the width dimension as a sequence dimension. A matrix whose rows are probability vectors over the width dimension is produced:
Define a CTCLossLayer with a character NetEncoder attached to the target port:
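One way such a layer might be written is sketched below. It assumes the lowercase alphabet from Alphabet[] as the character set (so c = 26 and each input vector has 27 entries) and a variable-length input; the "Input" specification, the random matrix probs and the target word are illustrative assumptions. With the encoder attached, the target can be given directly as a string.

```wolfram
(* Character encoder on the target port maps a string to indices between 1 and 26 *)
loss = CTCLossLayer[
   "Input" -> {"Varying", 27},
   "Target" -> NetEncoder[{"Characters", Alphabet[]}]];

(* Stand-in for net output: 10 time steps of normalized probability vectors *)
probs = Normalize[#, Total] & /@ RandomReal[1, {10, 27}];

(* The target string must not be longer than the input sequence *)
loss[<|"Input" -> probs, "Target" -> "cat"|>]
```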
Train the net using the CTC loss:
Evaluate the trained net on images from the test set:
Obtain the top-5 decodings for an image, along with the negative log likelihood of each decoding:
Text
Wolfram Research (2018), CTCLossLayer, Wolfram Language function, https://reference.wolfram.com/language/ref/CTCLossLayer.html.