BatchNormalizationLayer

BatchNormalizationLayer[]

represents a trainable net layer that normalizes its input data by learning the data mean and variance.

Details and Options

  • BatchNormalizationLayer is typically used inside NetChain, NetGraph, etc. to regularize and speed up network training.
  • The following optional parameters can be included:
  • "Epsilon" → 0.001   stability parameter
    Interleaving → False   the position of the channel dimension
    "Momentum" → 0.9   momentum used during training
  • With the setting Interleaving → False, the channel dimension is taken to be the first dimension of the input and output arrays.
  • With the setting Interleaving → True, the channel dimension is taken to be the last dimension of the input and output arrays.
  • The following learnable arrays can be included:
  • "Biases" → Automatic   learnable bias array
    "MovingMean" → Automatic   moving estimate of the mean
    "MovingVariance" → Automatic   moving estimate of the variance
    "Scaling" → Automatic   learnable scaling array
  • With Automatic settings, the biases, scaling, moving mean and moving variance arrays are initialized automatically when NetInitialize or NetTrain is used.
  • The following training parameter can be included:
  • LearningRateMultipliers → Automatic   learning rate multipliers for the arrays
  • BatchNormalizationLayer freezes the values of "MovingVariance" and "MovingMean" during training with NetTrain if LearningRateMultipliers is set to 0 or "Momentum" is set to 1.
  • If biases, scaling, moving variance and moving mean have been set, BatchNormalizationLayer[][input] explicitly computes the output from applying the layer.
  • BatchNormalizationLayer[][{input1,input2,…}] explicitly computes outputs for each of the inputi.
  • When given a NumericArray as input, the output will be a NumericArray.
  • BatchNormalizationLayer exposes the following ports for use in NetGraph etc.:
  • "Input"   a vector, matrix or higher-rank array
    "Output"   a vector, matrix or higher-rank array
  • When it cannot be inferred from other layers in a larger net, the option "Input" → {n1,n2,…} can be used to fix the input dimensions of BatchNormalizationLayer.
  • NetExtract can be used to extract biases, scaling, moving variance and moving mean arrays from a BatchNormalizationLayer object.
  • Options[BatchNormalizationLayer] gives the list of default options to construct the layer. Options[BatchNormalizationLayer[]] gives the list of default options to evaluate the layer on some data.
  • Information[BatchNormalizationLayer[]] gives a report about the layer.
  • Information[BatchNormalizationLayer[],prop] gives the value of the property prop of BatchNormalizationLayer[]. Possible properties are the same as for NetGraph.
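The effect of "Momentum" on the moving statistics can be illustrated with the standard exponential-moving-average update rule that batch normalization applies during training (a sketch of the update arithmetic, not a Wolfram Language API call; the values are illustrative):

```wolfram
(* with momentum 1, the moving estimate would never change, freezing the statistics *)
momentum = 0.9;
movingMean = 0.;   (* current moving estimate *)
batchMean = 2.;    (* mean of the current training batch *)
momentum*movingMean + (1 - momentum)*batchMean  (* 0.2 *)
```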

Examples


Basic Examples  (2)

Create a BatchNormalizationLayer:

Create an initialized BatchNormalizationLayer that takes a vector and returns a vector:

Apply the layer to an input vector:
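The three steps above might look like the following (a sketch; the exact numeric output depends on the initialized arrays):

```wolfram
(* an uninitialized layer *)
BatchNormalizationLayer[]

(* an initialized layer that takes length-2 vectors *)
layer = NetInitialize@BatchNormalizationLayer["Input" -> 2];

(* apply it to a vector; with the default initialization
   (zero mean, unit variance) this is roughly input/Sqrt[1.001] *)
layer[{1., 2.}]
```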

Scope  (2)

Create an initialized BatchNormalizationLayer that takes a rank-3 array and returns a rank-3 array:

Create an initialized BatchNormalizationLayer that takes a vector and returns a vector:

Apply the layer to a batch of input vectors:

Use NetEvaluationMode to evaluate the layer with its training behavior:
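These Scope examples might be written as follows (a sketch; numeric outputs depend on the initialized arrays):

```wolfram
(* rank-3 input gives a rank-3 output of the same dimensions *)
layer3 = NetInitialize@BatchNormalizationLayer["Input" -> {2, 3, 3}];
Dimensions@layer3[RandomReal[1, {2, 3, 3}]]  (* {2, 3, 3} *)

(* apply a vector layer to a batch of input vectors *)
vlayer = NetInitialize@BatchNormalizationLayer["Input" -> 2];
vlayer[{{1., 2.}, {3., 4.}}]

(* training behavior: normalize using the statistics of the batch itself *)
vlayer[{{1., 2.}, {3., 4.}}, NetEvaluationMode -> "Train"]
```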

Options  (4)

"Biases"  (1)

Create a BatchNormalizationLayer with an initial value for the "Biases" parameter:

Extract the "Biases" parameter:

The default value for "Biases" chosen by NetInitialize is a zero vector:
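A sketch of these steps (note that in recent versions NetExtract may return the array as a NumericArray):

```wolfram
(* set an initial value for the biases *)
layer = NetInitialize@BatchNormalizationLayer["Biases" -> {1., 2.}, "Input" -> 2];
NetExtract[layer, "Biases"]

(* with Automatic settings, NetInitialize produces a zero bias vector *)
default = NetInitialize@BatchNormalizationLayer["Input" -> 2];
NetExtract[default, "Biases"]
```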

"Epsilon"  (1)

Create a BatchNormalizationLayer with the "Epsilon" parameter explicitly specified:

Extract the "Epsilon" parameter:
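A sketch of these steps:

```wolfram
(* specify the stability parameter explicitly *)
layer = BatchNormalizationLayer["Epsilon" -> 0.01];
NetExtract[layer, "Epsilon"]  (* 0.01 *)
```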

"Momentum"  (1)

Create a BatchNormalizationLayer with the "Momentum" parameter explicitly specified:

Extract the "Momentum" parameter:
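A sketch of these steps:

```wolfram
(* specify the training momentum explicitly *)
layer = BatchNormalizationLayer["Momentum" -> 0.5];
NetExtract[layer, "Momentum"]  (* 0.5 *)
```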

"Scaling"  (1)

Create an initialized BatchNormalizationLayer with the "Scaling" parameter set to zero and the "Biases" parameter set to a custom value:

Applying the layer to any input returns the value for the "Biases" parameter:

The default value for "Scaling" chosen by NetInitialize is a vector of 1s:
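A sketch of these steps (the array values are illustrative):

```wolfram
(* zero scaling makes the normalized term vanish, so the output is the biases *)
layer = NetInitialize@BatchNormalizationLayer[
   "Scaling" -> {0., 0.}, "Biases" -> {3., 4.}, "Input" -> 2];
layer[{1., 2.}]  (* {3., 4.} *)

(* with Automatic settings, NetInitialize produces a scaling vector of 1s *)
NetExtract[NetInitialize@BatchNormalizationLayer["Input" -> 2], "Scaling"]
```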

Applications  (1)

BatchNormalizationLayer is commonly inserted between a ConvolutionLayer and its activation function in order to stabilize and speed up training:
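A typical construction along these lines (the layer sizes and input shape are illustrative):

```wolfram
(* convolution -> batch normalization -> activation *)
net = NetChain[{
   ConvolutionLayer[16, 3],
   BatchNormalizationLayer[],
   Ramp},
  "Input" -> {3, 32, 32}]
```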

Properties & Relations  (1)

During ordinary evaluation, BatchNormalizationLayer computes the following function:

Evaluate a BatchNormalizationLayer on an example vector containing a single channel:

Manually compute the same result:
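With explicit arrays, the evaluation out = scaling*(in - mean)/Sqrt[var + eps] + biases can be checked directly (the array values here are illustrative):

```wolfram
mean = {0.}; var = {4.}; scaling = {2.}; biases = {0.5}; eps = 0.001;
layer = BatchNormalizationLayer["MovingMean" -> mean, "MovingVariance" -> var,
   "Scaling" -> scaling, "Biases" -> biases, "Epsilon" -> eps, "Input" -> 1];
layer[{3.}]

(* manual computation of the same function *)
scaling*({3.} - mean)/Sqrt[var + eps] + biases  (* ≈ {3.49963} *)
```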

Possible Issues  (3)

Specifying negative values for the "MovingVariance" parameter causes numerical errors during evaluation:

BatchNormalizationLayer cannot be initialized until all its input and output dimensions are known:
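A sketch of the failure mode and its fix:

```wolfram
(* fails: the input dimensions cannot be inferred *)
NetInitialize[BatchNormalizationLayer[]]

(* succeeds once the input shape is fixed *)
NetInitialize[BatchNormalizationLayer["Input" -> 3]]
```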

The "MovingMean" and "MovingVariance" arrays of BatchNormalizationLayer cannot be shared:

Create a BatchNormalizationLayer with shared arrays:

Train it on some data:

Extract the trained batch normalization layers:

The "Scaling" and "Biases" arrays were shared, but not "MovingMean" or "MovingVariance":
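A construction along these lines, using NetSharedArray (the array names are illustrative), shows which arrays can be shared between two layers:

```wolfram
(* "Biases" and "Scaling" can be shared; "MovingMean" and "MovingVariance" cannot *)
net = NetChain[{
   BatchNormalizationLayer["Biases" -> NetSharedArray["b"],
    "Scaling" -> NetSharedArray["s"]],
   BatchNormalizationLayer["Biases" -> NetSharedArray["b"],
    "Scaling" -> NetSharedArray["s"]]},
  "Input" -> 2]
```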

Introduced in 2016 (11.0) | Updated in 2020 (12.1)