---
title: "BatchNormalizationLayer"
language: "en"
type: "Symbol"
summary: "BatchNormalizationLayer[] represents a trainable net layer that normalizes its input data by learning the data mean and variance."
keywords: 
- batch normalization
- regularization
- regularizing neural nets
- batch norm
canonical_url: "https://reference.wolfram.com/language/ref/BatchNormalizationLayer.html"
source: "Wolfram Language Documentation"
related_guides: 
  - 
    title: "Neural Network Layers"
    link: "https://reference.wolfram.com/language/guide/NeuralNetworkLayers.en.md"
related_functions: 
  - 
    title: "DropoutLayer"
    link: "https://reference.wolfram.com/language/ref/DropoutLayer.en.md"
  - 
    title: "NetEvaluationMode"
    link: "https://reference.wolfram.com/language/ref/NetEvaluationMode.en.md"
  - 
    title: "ConvolutionLayer"
    link: "https://reference.wolfram.com/language/ref/ConvolutionLayer.en.md"
  - 
    title: "PoolingLayer"
    link: "https://reference.wolfram.com/language/ref/PoolingLayer.en.md"
  - 
    title: "NormalizationLayer"
    link: "https://reference.wolfram.com/language/ref/NormalizationLayer.en.md"
  - 
    title: "LocalResponseNormalizationLayer"
    link: "https://reference.wolfram.com/language/ref/LocalResponseNormalizationLayer.en.md"
  - 
    title: "NetChain"
    link: "https://reference.wolfram.com/language/ref/NetChain.en.md"
  - 
    title: "NetGraph"
    link: "https://reference.wolfram.com/language/ref/NetGraph.en.md"
  - 
    title: "NetInitialize"
    link: "https://reference.wolfram.com/language/ref/NetInitialize.en.md"
  - 
    title: "NetTrain"
    link: "https://reference.wolfram.com/language/ref/NetTrain.en.md"
  - 
    title: "NetExtract"
    link: "https://reference.wolfram.com/language/ref/NetExtract.en.md"
related_tutorials: 
  - 
    title: "Neural Networks in the Wolfram Language"
    link: "https://reference.wolfram.com/language/tutorial/NeuralNetworksOverview.en.md"
---
[EXPERIMENTAL]

# BatchNormalizationLayer

BatchNormalizationLayer[] represents a trainable net layer that normalizes its input data by learning the data mean and variance.

## Details and Options

* ``BatchNormalizationLayer`` is typically used inside ``NetChain``, ``NetGraph``, etc. to regularize and speed up network training.

* The following optional parameters can be included:

|              |       |                                       |
| ------------ | ----- | ------------------------------------- |
| "Epsilon"    | 0.001 | stability parameter                   |
| Interleaving | False | the position of the channel dimension |
| "Momentum"   | 0.9   | momentum used during training         |

* With the setting ``Interleaving -> False``, the channel dimension is taken to be the first dimension of the input and output arrays.

* With the setting ``Interleaving -> True``, the channel dimension is taken to be the last dimension of the input and output arrays.

* The following learnable arrays can be included:

|                  |           |                                 |
| ---------------- | --------- | ------------------------------- |
| "Biases"         | Automatic | learnable bias array            |
| "MovingMean"     | Automatic | moving estimate of the mean     |
| "MovingVariance" | Automatic | moving estimate of the variance |
| "Scaling"        | Automatic | learnable scaling array         |

* With ``Automatic`` settings, the biases, scaling, moving mean and moving variance arrays are initialized automatically when ``NetInitialize`` or ``NetTrain`` is used.

* The following training parameter can be included:

|                                                                                                          |                                                                  |                                          |
| -------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------- | ---------------------------------------- |
| [`LearningRateMultipliers`](https://reference.wolfram.com/language/ref/LearningRateMultipliers.en.md)      | [`Automatic`](https://reference.wolfram.com/language/ref/Automatic.en.md) | learning rate multipliers for the arrays |

* ``BatchNormalizationLayer`` freezes the values of ``"MovingVariance"`` and ``"MovingMean"`` during training with ``NetTrain`` if ``LearningRateMultipliers`` is 0 or ``"Momentum"`` is 1.

* If biases, scaling, moving variance and moving mean have been set, ``BatchNormalizationLayer[…][input]`` explicitly computes the output from applying the layer.

* ``BatchNormalizationLayer[…][{input1, input2, …}]`` explicitly computes outputs for each of the ``inputi``.

* When given a ``NumericArray`` as input, the output will be a ``NumericArray``.

* ``BatchNormalizationLayer`` exposes the following ports for use in ``NetGraph`` etc.:

|          |                                       |
| -------- | ------------------------------------- |
| "Input"  | a vector, matrix or higher-rank array |
| "Output" | a vector, matrix or higher-rank array |

* When it cannot be inferred from other layers in a larger net, the option ``"Input" -> {n1, n2, …}`` can be used to fix the input dimensions of ``BatchNormalizationLayer``.

* ``NetExtract`` can be used to extract biases, scaling, moving variance and moving mean arrays from a ``BatchNormalizationLayer`` object.

* ``Options[BatchNormalizationLayer]`` gives the list of default options to construct the layer. ``Options[BatchNormalizationLayer[…]]`` gives the list of default options to evaluate the layer on some data.

* ``Information[BatchNormalizationLayer[…]]`` gives a report about the layer.

* ``Information[BatchNormalizationLayer[…], prop]`` gives the value of the property ``prop`` of ``BatchNormalizationLayer[…]``. [Possible properties](https://reference.wolfram.com/language/ref/NetGraph.en.md#495200340) are the same as for ``NetGraph``.
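The channel-dimension convention selected by ``Interleaving`` (described in the details above) can be sketched in NumPy. This is an illustration of the broadcasting convention and the inference-mode formula only, not the layer's internal implementation:

```python
import numpy as np

def apply_batchnorm(x, moving_mean, moving_var, scaling, biases,
                    epsilon=0.001, interleaving=False):
    """Inference-mode batch normalization, sketched in NumPy.

    interleaving=False treats the FIRST dimension as the channel
    dimension; interleaving=True treats the LAST dimension as the
    channel dimension, mirroring the Interleaving option.
    """
    axis = -1 if interleaving else 0
    shape = [1] * x.ndim
    shape[axis] = -1  # broadcast the per-channel parameters along `axis`
    mm = np.reshape(moving_mean, shape)
    mv = np.reshape(moving_var, shape)
    g = np.reshape(scaling, shape)
    b = np.reshape(biases, shape)
    return g * (x - mm) / np.sqrt(mv + epsilon) + b

# channel-first input of shape {2, 3, 3} with freshly initialized arrays
x = np.ones((2, 3, 3))
out = apply_batchnorm(x, np.zeros(2), np.ones(2), np.ones(2), np.zeros(2))
```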

---

## Examples (13)

### Basic Examples (2)

Create a ``BatchNormalizationLayer``:

```wl
In[1]:= BatchNormalizationLayer[]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> NeuralNetworks`TensorT[NeuralNetworks`ListT[1, 
       NeuralNetworks`SizeT], NeuralNetworks`RealT], 
    "Biases" -> NeuralNetworks`TensorT ... Output" -> NeuralNetworks`TensorT[{NeuralNetworks`SizeT}, 
      NeuralNetworks`TensorT[NeuralNetworks`ListT[NeuralNetworks`NaturalT, NeuralNetworks`SizeT], 
       NeuralNetworks`RealT]]]], Association["Version" -> "14.1.1", "Unstable" -> False]]
```

---

Create an initialized ``BatchNormalizationLayer`` that takes a vector and returns a vector:

```wl
In[1]:= batchnorm = NetInitialize@BatchNormalizationLayer["Input" -> 3]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> RawArray["Real32", {1., 1., 1.}], 
    "Biases" -> RawArray["Real32", {0., 0., 0.}], "MovingMean" -> RawArray["Real32", {0., 0., 0.}], 
     ...  {}], 
  "Inputs" -> Association["Input" -> NeuralNetworks`TensorT[{3}, NeuralNetworks`AtomT]], 
  "Outputs" -> Association["Output" -> NeuralNetworks`TensorT[{3}, NeuralNetworks`RealT]]], 
 Association["Version" -> "14.1.1", "Unstable" -> False]]
```

Apply the layer to an input vector:

```wl
In[2]:= batchnorm[{1, 2, 3}]

Out[2]= {0.9995, 1.999, 2.9985}
```
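The numbers above follow from the inference-mode formula scaling\*(input - movingMean)/Sqrt[movingVariance + epsilon] + biases, using the freshly initialized arrays (moving mean 0, moving variance 1, scaling 1, biases 0). A quick numeric check, sketched in Python:

```python
import math

# Inference-mode batch norm with the freshly initialized arrays:
# movingMean = 0, movingVariance = 1, scaling = 1, biases = 0.
eps = 0.001
out = [x / math.sqrt(1.0 + eps) for x in [1.0, 2.0, 3.0]]
# out is approximately [0.9995, 1.999, 2.9985], matching Out[2] above
```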

### Scope (4)

#### Ports (2)

Create an initialized ``BatchNormalizationLayer`` that takes a rank-3 array and returns a rank-3 array:

```wl
In[1]:= batchnorm = NetInitialize@BatchNormalizationLayer["Input" -> {2, 3, 3}]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> RawArray["Real32", {1., 1.}], 
    "Biases" -> RawArray["Real32", {0., 0.}], "MovingMean" -> RawArray["Real32", {0., 0.}], 
    "MovingVaria ... puts" -> Association["Input" -> NeuralNetworks`TensorT[{2, 3, 3}, NeuralNetworks`AtomT]], 
  "Outputs" -> Association["Output" -> NeuralNetworks`TensorT[{2, 3, 3}, NeuralNetworks`RealT]]], 
 Association["Version" -> "14.1.1", "Unstable" -> False]]

In[2]:= batchnorm[RandomReal[1, {2, 3, 3}]]//Normal//MatrixForm

Out[2]//MatrixForm= (a 2×3×3 array of normalized real values, displayed in MatrixForm)
```

---

Create an initialized ``BatchNormalizationLayer`` that takes a vector and returns a vector:

```wl
In[1]:= batchnorm = NetInitialize@BatchNormalizationLayer["Input" -> 3]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> RawArray["Real32", {1., 1., 1.}], 
    "Biases" -> RawArray["Real32", {0., 0., 0.}], "MovingMean" -> RawArray["Real32", {0., 0., 0.}], 
     ...  {}], 
  "Inputs" -> Association["Input" -> NeuralNetworks`TensorT[{3}, NeuralNetworks`AtomT]], 
  "Outputs" -> Association["Output" -> NeuralNetworks`TensorT[{3}, NeuralNetworks`RealT]]], 
 Association["Version" -> "14.1.1", "Unstable" -> False]]
```

Apply the layer to a batch of input vectors:

```wl
In[2]:= batchnorm[{{1, 2, 3}, {4, 0.2, 3}}]

Out[2]= {{0.9995, 1.999, 2.9985}, {3.998, 0.1999, 2.9985}}
```

Use ``NetEvaluationMode`` to apply the training behavior of ``BatchNormalizationLayer``:

```wl
In[3]:= batchnorm[{{1, 2, 3}, {4, 0.2, 3}}, NetEvaluationMode -> "Train"]

Out[3]= {{-0.999778, 0.999383, 0.}, {0.999778, -0.999383, 0.}}
```
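In training mode, each channel is normalized with the mean and variance of the current batch rather than the moving estimates. The output above can be reproduced by a short Python sketch (population variance per channel, plus the default epsilon of 0.001):

```python
import math

batch = [[1.0, 2.0, 3.0], [4.0, 0.2, 3.0]]
eps = 0.001
n = len(batch)
out = []
for i in range(n):
    row = []
    for j in range(3):
        col = [batch[k][j] for k in range(n)]           # one channel
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n     # population variance
        row.append((batch[i][j] - mean) / math.sqrt(var + eps))
    out.append(row)
# first channel: values {1, 4} give mean 2.5 and variance 2.25,
# so 1 normalizes to about -0.999778; the constant third channel
# has variance 0 and normalizes to exactly 0, matching Out[3] above
```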

#### Parameters (2)

##### "Biases" (1)

Create a ``BatchNormalizationLayer`` with an initial value for the ``"Biases"`` parameter:

```wl
In[1]:= batchnorm = BatchNormalizationLayer["Biases" -> {-1, 3.4}]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> NeuralNetworks`TensorT[{2}, NeuralNetworks`RealT], 
    "Biases" -> RawArray["Real32", {-1., 3.4000000953674316}], 
    "MovingMean" -> Neur ...  
   Association["Output" -> NeuralNetworks`TensorT[{2}, NeuralNetworks`TensorT[
       NeuralNetworks`ListT[NeuralNetworks`NaturalT, NeuralNetworks`SizeT], 
       NeuralNetworks`RealT]]]], Association["Version" -> "14.1.1", "Unstable" -> False]]
```

Extract the ``"Biases"`` parameter:

```wl
In[2]:= NetExtract[batchnorm, "Biases"]

Out[2]= RawArray["Real32", {-1., 3.4000000953674316}]
```

The default value for ``"Biases"`` chosen by ``NetInitialize`` is a zero vector:

```wl
In[3]:=
batchnorm = NetInitialize@BatchNormalizationLayer["Input" -> 2];
NetExtract[batchnorm, "Biases"]

Out[4]= RawArray["Real32", {0., 0.}]
```

##### "Scaling" (1)

Create an initialized ``BatchNormalizationLayer`` with the ``"Scaling"`` parameter set to zero and the ``"Biases"`` parameter set to a custom value:

```wl
In[1]:= batchnorm = NetInitialize@BatchNormalizationLayer["Scaling" -> {0, 0, 0}, "Biases" -> {1.3, -22.1, 1.2}]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> RawArray["Real32", {0., 0., 0.}], 
    "Biases" -> RawArray["Real32", {1.2999999523162842, -22.100000381469727, 1.2000000476837158}], 
    " ...  
   Association["Output" -> NeuralNetworks`TensorT[{3}, NeuralNetworks`TensorT[
       NeuralNetworks`ListT[NeuralNetworks`NaturalT, NeuralNetworks`SizeT], 
       NeuralNetworks`RealT]]]], Association["Version" -> "14.1.1", "Unstable" -> False]]
```

Applying the layer to any input returns the value for the ``"Biases"`` parameter:

```wl
In[2]:= batchnorm[{1, 2, 3}]

Out[2]= {1.3, -22.1, 1.2}

In[3]:= batchnorm[{-3.4, 2.3, 100}]

Out[3]= {1.3, -22.1, 1.2}
```

The default value for ``"Scaling"`` chosen by ``NetInitialize`` is a vector of 1s:

```wl
In[4]:=
batchnorm = NetInitialize@BatchNormalizationLayer["Input" -> 2];
NetExtract[batchnorm, "Scaling"]

Out[5]= RawArray["Real32", {1., 1.}]
```

---

### Options (2)

#### "Epsilon" (1)

Create a ``BatchNormalizationLayer`` with the ``"Epsilon"`` parameter explicitly specified:

```wl
In[1]:= batchnorm = BatchNormalizationLayer["Epsilon" -> 0.1]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> NeuralNetworks`TensorT[NeuralNetworks`ListT[1, 
       NeuralNetworks`SizeT], NeuralNetworks`RealT], 
    "Biases" -> NeuralNetworks`TensorT ... Output" -> NeuralNetworks`TensorT[{NeuralNetworks`SizeT}, 
      NeuralNetworks`TensorT[NeuralNetworks`ListT[NeuralNetworks`NaturalT, NeuralNetworks`SizeT], 
       NeuralNetworks`RealT]]]], Association["Version" -> "14.1.1", "Unstable" -> False]]
```

Extract the ``"Epsilon"`` parameter:

```wl
In[2]:= NetExtract[batchnorm, "Epsilon"]

Out[2]= 0.1
```
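Since epsilon is added to the variance under the square root, a larger setting visibly damps the normalized values and guards against division by a near-zero variance. A minimal Python sketch of this effect:

```python
import math

def normalized(x, variance, epsilon):
    # one channel of the inference-mode formula, with mean 0,
    # scaling 1 and biases 0
    return x / math.sqrt(variance + epsilon)

default_eps = normalized(1.0, 1.0, 0.001)  # close to 1
large_eps = normalized(1.0, 1.0, 0.1)      # noticeably damped
```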

#### "Momentum" (1)

Create a ``BatchNormalizationLayer`` with the ``"Momentum"`` parameter explicitly specified:

```wl
In[1]:= batchnorm = BatchNormalizationLayer["Momentum" -> 0.1]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> NeuralNetworks`TensorT[NeuralNetworks`ListT[1, 
       NeuralNetworks`SizeT], NeuralNetworks`RealT], 
    "Biases" -> NeuralNetworks`TensorT ... Output" -> NeuralNetworks`TensorT[{NeuralNetworks`SizeT}, 
      NeuralNetworks`TensorT[NeuralNetworks`ListT[NeuralNetworks`NaturalT, NeuralNetworks`SizeT], 
       NeuralNetworks`RealT]]]], Association["Version" -> "14.1.1", "Unstable" -> False]]
```

Extract the ``"Momentum"`` parameter:

```wl
In[2]:= NetExtract[batchnorm, "Momentum"]

Out[2]= 0.1
```
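``"Momentum"`` controls how quickly the moving statistics track each batch's statistics. A conventional exponential-moving-average update rule is sketched below in Python (an assumed rule for illustration; the internal update is not shown on this page). A momentum of 1 leaves the estimates unchanged, which is why that setting freezes ``"MovingMean"`` and ``"MovingVariance"``:

```python
def update_moving(moving, batch_stat, momentum=0.9):
    # exponential moving average: keep `momentum` of the old estimate,
    # blend in (1 - momentum) of the current batch statistic
    return momentum * moving + (1.0 - momentum) * batch_stat

m = 0.0
for batch_mean in [1.0, 1.0, 1.0]:
    m = update_moving(m, batch_mean)           # drifts toward 1.0
frozen = update_moving(5.0, 123.0, momentum=1.0)  # momentum 1: no change
```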

### Applications (1)

``BatchNormalizationLayer`` is commonly inserted between a ``ConvolutionLayer`` and its activation function in order to stabilize and speed up training:

```wl
In[1]:= NetChain[{ConvolutionLayer[3, {3, 3}], BatchNormalizationLayer["Input" -> {3, 28, 28}], ElementwiseLayer[Ramp]}]

Out[1]=
NetChain[Association["Type" -> "Chain", 
  "Nodes" -> Association["1" -> Association["Type" -> "Convolution", 
      "Arrays" -> Association["Weights" -> NeuralNetworks`TensorT[{3, NeuralNetworks`SizeT, 3, 3}, 
          NeuralNetworks`RealT], "Bia ... rks`SizeT, 
       NeuralNetworks`SizeT, NeuralNetworks`SizeT}, NeuralNetworks`AtomT]], 
  "Outputs" -> Association["Output" -> NeuralNetworks`TensorT[{3, 28, 28}, NeuralNetworks`RealT]]], 
 Association["Version" -> "14.1.1", "Unstable" -> False]]
```

### Properties & Relations (1)

During ordinary evaluation, ``BatchNormalizationLayer`` computes the following function:

```wl
In[1]:=
batchNormFunction = Function[Block[{sd = Sqrt[#MovingVariance + #Epsilon]}, 
	(#2 * #Scaling / sd ) + (#Biases - (#Scaling * #MovingMean) / sd)]];
```

Evaluate a ``BatchNormalizationLayer`` on an example vector containing a single channel:

```wl
In[2]:=
params = <|"Scaling" -> {3}, "Biases" -> {2}, "MovingMean" -> {1}, "MovingVariance" -> {2}, "Epsilon" -> 0.001|>;
layer = BatchNormalizationLayer@@Normal[params]

Out[2]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> RawArray["Real32", {3.}], 
    "Biases" -> RawArray["Real32", {2.}], "MovingMean" -> RawArray["Real32", {1.}], 
    "MovingVariance" -> RawA ...  
   Association["Output" -> NeuralNetworks`TensorT[{1}, NeuralNetworks`TensorT[
       NeuralNetworks`ListT[NeuralNetworks`NaturalT, NeuralNetworks`SizeT], 
       NeuralNetworks`RealT]]]], Association["Version" -> "14.1.1", "Unstable" -> False]]

In[3]:= layer[{5}]//Normal

Out[3]= {10.4832}
```

Manually compute the same result:

```wl
In[4]:= batchNormFunction[params, {5}]

Out[4]= {10.4832}
```
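The same arithmetic, transcribed to Python as an independent check of batchNormFunction:

```python
import math

# scaling*(x - movingMean)/sqrt(movingVariance + epsilon) + biases
scaling, biases, mean, var, eps = 3.0, 2.0, 1.0, 2.0, 0.001
out = scaling * (5.0 - mean) / math.sqrt(var + eps) + biases
# out is approximately 10.4832, matching Out[3] and Out[4] above
```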

### Possible Issues (3)

Specifying negative values for the ``"MovingVariance"`` parameter causes numerical errors during evaluation:

```wl
In[1]:= batchnorm = NetInitialize[BatchNormalizationLayer["Input" -> {1, 2, 2}, "MovingVariance" -> {-2}]]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> RawArray["Real32", {1.}], 
    "Biases" -> RawArray["Real32", {0.}], "MovingMean" -> RawArray["Real32", {0.}], 
    "MovingVariance" -> RawA ... puts" -> Association["Input" -> NeuralNetworks`TensorT[{1, 2, 2}, NeuralNetworks`AtomT]], 
  "Outputs" -> Association["Output" -> NeuralNetworks`TensorT[{1, 2, 2}, NeuralNetworks`RealT]]], 
 Association["Version" -> "14.1.1", "Unstable" -> False]]

In[2]:= batchnorm[RandomReal[1, {1, 2, 2}]]
```

BatchNormalizationLayer::netnan: A floating-point overflow, underflow, or division by zero occurred while evaluating the net.

```wl
Out[2]= $Failed
```
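The failure is a consequence of the formula itself: the variance plus epsilon sits under a square root, which has no real value for a negative argument. A minimal Python sketch:

```python
import math

variance, epsilon = -2.0, 0.001
try:
    sd = math.sqrt(variance + epsilon)  # sqrt of a negative number
except ValueError:
    sd = float("nan")  # the net reports a floating-point error instead
```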

---

``BatchNormalizationLayer`` cannot be initialized until all its input and output dimensions are known:

```wl
In[1]:= NetInitialize@BatchNormalizationLayer[]
```

NetInitialize::nninit: Cannot initialize net: unspecified or partially specified shape for array "Scaling".

```wl
Out[1]= $Failed

In[2]:= NetInitialize@BatchNormalizationLayer["Input" -> 3]

Out[2]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> RawArray["Real32", {1., 1., 1.}], 
    "Biases" -> RawArray["Real32", {0., 0., 0.}], "MovingMean" -> RawArray["Real32", {0., 0., 0.}], 
     ...  {}], 
  "Inputs" -> Association["Input" -> NeuralNetworks`TensorT[{3}, NeuralNetworks`AtomT]], 
  "Outputs" -> Association["Output" -> NeuralNetworks`TensorT[{3}, NeuralNetworks`RealT]]], 
 Association["Version" -> "14.1.1", "Unstable" -> False]]
```

---

The ``"MovingMean"`` and ``"MovingVariance"`` arrays of ``BatchNormalizationLayer`` cannot be shared:

```wl
In[1]:= BatchNormalizationLayer["MovingMean" -> NetArray["MovingMean"]]
```

BatchNormalizationLayer::noauxsa: Array MovingMean cannot be shared.

```wl
Out[1]= $Failed
```

Create a ``BatchNormalizationLayer`` with shared arrays:

```wl
In[2]:= sharedBatchNorm = NetInsertSharedArrays[BatchNormalizationLayer[]]

Out[2]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> NetArray[Association["Array" -> Automatic, 
       "Dimensions" -> Automatic, "Name" -> "Scaling"]], 
    "Biases" -> NetArray[Association[" ... alNetworks`ListT[1, NeuralNetworks`SizeT], 
      NeuralNetworks`RealT], "Biases" -> NeuralNetworks`TensorT[NeuralNetworks`ListT[1, 
       NeuralNetworks`SizeT], NeuralNetworks`RealT]]], Association["Version" -> "14.1.1", 
  "Unstable" -> False]]
```

Train it on some data:

```wl
In[3]:= net = NetTrain[NetChain[{2, sharedBatchNorm, 2, sharedBatchNorm, 2}], {{0, 1} -> {1, 0}, {1, 0} -> {0, 1}}]

Out[3]=
NetChain[Association["Type" -> "Chain", 
  "Nodes" -> Association["1" -> Association["Type" -> "Linear", 
      "Arrays" -> Association["Weights" -> RawArray["Real32", 
          {{0.13277174532413483, 0.12089180946350098}, {-0.08585479855537415, 
 ... rays" -> Association["Biases" -> RawArray["Real32", {-0.11405453830957413, 
       0.27602145075798035}], "Scaling" -> RawArray["Real32", {1.0624027252197266, 
       0.9114811420440674}]]], Association["Version" -> "14.1.1", "Unstable" -> False]]
```

Extract the trained batch normalization layers:

```wl
In[4]:= {batchnorm1, batchnorm2} = NetExtract[net, {{2}, {4}}]

Out[4]=
{BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> NetArray[Association["Array" -> Automatic, 
       "Dimensions" -> Automatic, "Name" -> "Scaling"]], 
    "Biases" -> NetArray[Association[ ... ays" -> Association["Biases" -> RawArray["Real32", {-0.11405453830957413, 
       0.27602145075798035}], "Scaling" -> RawArray["Real32", {1.0624027252197266, 
       0.9114811420440674}]]], Association["Version" -> "14.1.1", "Unstable" -> False]]}
```

The ``"Scaling"`` and ``"Biases"`` arrays were shared, but not ``"MovingMean"`` or ``"MovingVariance"``:

```wl
In[5]:= Normal /@ Information[batchnorm1, "Arrays"]

Out[5]= <|{"MovingMean"} -> {0.125791, -0.169852}, {"MovingVariance"} -> {0.000035283, 0.00706169}, NetArray["Biases"] -> {-0.114055, 0.276021}, NetArray["Scaling"] -> {1.0624, 0.911481}|>

In[6]:= Normal /@ Information[batchnorm2, "Arrays"]

Out[6]= <|{"MovingMean"} -> {0.423157, -0.183322}, {"MovingVariance"} -> {2.00721, 0.00282327}, NetArray["Biases"] -> {-0.114055, 0.276021}, NetArray["Scaling"] -> {1.0624, 0.911481}|>
```

## See Also

* [`DropoutLayer`](https://reference.wolfram.com/language/ref/DropoutLayer.en.md)
* [`NetEvaluationMode`](https://reference.wolfram.com/language/ref/NetEvaluationMode.en.md)
* [`ConvolutionLayer`](https://reference.wolfram.com/language/ref/ConvolutionLayer.en.md)
* [`PoolingLayer`](https://reference.wolfram.com/language/ref/PoolingLayer.en.md)
* [`NormalizationLayer`](https://reference.wolfram.com/language/ref/NormalizationLayer.en.md)
* [`LocalResponseNormalizationLayer`](https://reference.wolfram.com/language/ref/LocalResponseNormalizationLayer.en.md)
* [`NetChain`](https://reference.wolfram.com/language/ref/NetChain.en.md)
* [`NetGraph`](https://reference.wolfram.com/language/ref/NetGraph.en.md)
* [`NetInitialize`](https://reference.wolfram.com/language/ref/NetInitialize.en.md)
* [`NetTrain`](https://reference.wolfram.com/language/ref/NetTrain.en.md)
* [`NetExtract`](https://reference.wolfram.com/language/ref/NetExtract.en.md)

## Tech Notes

* [Neural Networks in the Wolfram Language](https://reference.wolfram.com/language/tutorial/NeuralNetworksOverview.en.md)

## Related Guides

* [Neural Network Layers](https://reference.wolfram.com/language/guide/NeuralNetworkLayers.en.md)

## History

* [Introduced in 2016 (11.0)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn110.en.md) \| [Updated in 2020 (12.1)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn121.en.md)