Encoding and Decoding Data for Neural Networks

Neural networks in the Wolfram Language can interface with many types of data, including numerical, categorical, textual, image and audio. The functions NetEncoder and NetDecoder are used to automatically and efficiently translate non-numerical data to and from net-compatible NumericArray objects.

Encoders

NetEncoder convert images, categories, etc. to net-compatible numerical arrays

Audio Encoders

"Audio" encode audio as a sequence of waveform amplitudes

"AudioMelSpectrogram" encode audio as a mel spectrogram

"AudioMFCC" encode audio as a sequence of MFCC vectors

"AudioSpectrogram" encode audio as a spectrogram

"AudioSTFT" encode audio as a sequence of Fourier transforms

Text Encoders

"BPESubwordTokens" encode tokens in a string as a sequence of integer codes

"Characters" encode characters in a string as a sequence of integer codes or one-hot vectors

"Tokens" encode tokens in a string as a sequence of integer codes

"UTF8" encode strings as their UTF8 bytes

Image Encoders

"Image" encode a 2D image as a rank-3 array

"Image3D" encode a 3D image as a rank-4 array

Other Encoders

"Boolean" encode True and False as 1 and 0

"Class" encode a class label as an integer code or a one-hot vector

"Function" use a custom function to encode an input

Decoders

NetDecoder interpret net-generated numerical arrays as images, probabilities, etc.

Text Decoders

"BPESubwordTokens" decode probability vectors as a string of subword tokens

"Characters" decode probability vectors as a string of characters

"Tokens" decode probability vectors as a string of tokens

Image Decoders

"Image" decode a rank-3 array as a 2D image

"Image3D" decode a rank-4 array as a 3D image

Other Decoders

"Boolean" decode 1 and 0 as True and False

"Class" decode probability arrays as class labels

"CTCBeamSearch" decode sequences of probability vectors trained with a CTCLossLayer

"Function" decode using a custom function