---
title: "BatchNormalizationLayer"
language: "en"
type: "Symbol"
summary: "BatchNormalizationLayer[] represents a trainable net layer that normalizes its input data by learning the data mean and variance."
keywords: 
- batch normalization
- regularization
- regularizing neural nets
- batch norm
canonical_url: "https://reference.wolfram.com/language/ref/BatchNormalizationLayer.html"
source: "Wolfram Language Documentation"
related_guides: 
  - 
    title: "Neural Network Layers"
    link: "https://reference.wolfram.com/language/guide/NeuralNetworkLayers.en.md"
related_functions: 
  - 
    title: "DropoutLayer"
    link: "https://reference.wolfram.com/language/ref/DropoutLayer.en.md"
  - 
    title: "NetEvaluationMode"
    link: "https://reference.wolfram.com/language/ref/NetEvaluationMode.en.md"
  - 
    title: "ConvolutionLayer"
    link: "https://reference.wolfram.com/language/ref/ConvolutionLayer.en.md"
  - 
    title: "PoolingLayer"
    link: "https://reference.wolfram.com/language/ref/PoolingLayer.en.md"
  - 
    title: "NormalizationLayer"
    link: "https://reference.wolfram.com/language/ref/NormalizationLayer.en.md"
  - 
    title: "LocalResponseNormalizationLayer"
    link: "https://reference.wolfram.com/language/ref/LocalResponseNormalizationLayer.en.md"
  - 
    title: "NetChain"
    link: "https://reference.wolfram.com/language/ref/NetChain.en.md"
  - 
    title: "NetGraph"
    link: "https://reference.wolfram.com/language/ref/NetGraph.en.md"
  - 
    title: "NetInitialize"
    link: "https://reference.wolfram.com/language/ref/NetInitialize.en.md"
  - 
    title: "NetTrain"
    link: "https://reference.wolfram.com/language/ref/NetTrain.en.md"
  - 
    title: "NetExtract"
    link: "https://reference.wolfram.com/language/ref/NetExtract.en.md"
related_tutorials: 
  - 
    title: "Neural Networks in the Wolfram Language"
    link: "https://reference.wolfram.com/language/tutorial/NeuralNetworksOverview.en.md"
---
[EXPERIMENTAL]

# BatchNormalizationLayer

BatchNormalizationLayer[] represents a trainable net layer that normalizes its input data by learning the data mean and variance.

## Details and Options

* ``BatchNormalizationLayer`` is typically used inside ``NetChain``, ``NetGraph``, etc. to regularize and speed up network training.

* The following optional parameters can be included:

|              |       |                                       |
| ------------ | ----- | ------------------------------------- |
| "Epsilon"    | 0.001 | stability parameter                   |
| Interleaving | False | the position of the channel dimension |
| "Momentum"   | 0.9   | momentum used during training         |

* With the setting ``Interleaving -> False``, the channel dimension is taken to be the first dimension of the input and output arrays.

* With the setting ``Interleaving -> True``, the channel dimension is taken to be the last dimension of the input and output arrays.

* The following learnable arrays can be included:

|                  |           |                                 |
| ---------------- | --------- | ------------------------------- |
| "Biases"         | Automatic | learnable bias array            |
| "MovingMean"     | Automatic | moving estimate of the mean     |
| "MovingVariance" | Automatic | moving estimate of the variance |
| "Scaling"        | Automatic | learnable scaling array         |

* With ``Automatic`` settings, the biases, scaling, moving mean and moving variance arrays are initialized automatically when ``NetInitialize`` or ``NetTrain`` is used.

* The following training parameter can be included:

|                                                                                                          |                                                                  |                                          |
| -------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------- | ---------------------------------------- |
| [`LearningRateMultipliers`](https://reference.wolfram.com/language/ref/LearningRateMultipliers.en.md)      | [`Automatic`](https://reference.wolfram.com/language/ref/Automatic.en.md) | learning rate multipliers for the arrays |

* ``BatchNormalizationLayer`` freezes the values of ``"MovingVariance"`` and ``"MovingMean"`` during training with ``NetTrain`` if ``LearningRateMultipliers`` is 0 or ``"Momentum"`` is 1.

* If biases, scaling, moving variance and moving mean have been set, ``BatchNormalizationLayer[…][input]`` explicitly computes the output from applying the layer.

* ``BatchNormalizationLayer[…][{input1, input2, …}]`` explicitly computes outputs for each of the ``inputi``.

* When given a ``NumericArray`` as input, the output will be a ``NumericArray``.

* ``BatchNormalizationLayer`` exposes the following ports for use in ``NetGraph`` etc.:

|          |                                       |
| -------- | ------------------------------------- |
| "Input"  | a vector, matrix or higher-rank array |
| "Output" | a vector, matrix or higher-rank array |

* When it cannot be inferred from other layers in a larger net, the option ``"Input" -> {n1, n2, …}`` can be used to fix the input dimensions of ``BatchNormalizationLayer``.

* ``NetExtract`` can be used to extract biases, scaling, moving variance and moving mean arrays from a ``BatchNormalizationLayer`` object.

* ``Options[BatchNormalizationLayer]`` gives the list of default options to construct the layer. ``Options[BatchNormalizationLayer[…]]`` gives the list of default options to evaluate the layer on some data.

* ``Information[BatchNormalizationLayer[…]]`` gives a report about the layer.

* ``Information[BatchNormalizationLayer[…], prop]`` gives the value of the property ``prop`` of ``BatchNormalizationLayer[…]``. [Possible properties](https://reference.wolfram.com/language/ref/NetGraph.en.md#495200340) are the same as for ``NetGraph``.
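The channel-dimension convention selected by ``Interleaving`` (described in the details above) can be sketched in NumPy. This is an illustration of the broadcasting convention and the inference-mode formula only, not the layer's internal implementation:

```python
import numpy as np

def apply_batchnorm(x, moving_mean, moving_var, scaling, biases,
                    epsilon=0.001, interleaving=False):
    """Inference-mode batch normalization, sketched in NumPy.

    interleaving=False treats the FIRST dimension as the channel
    dimension; interleaving=True treats the LAST dimension as the
    channel dimension, mirroring the Interleaving option.
    """
    axis = -1 if interleaving else 0
    shape = [1] * x.ndim
    shape[axis] = -1  # broadcast the per-channel parameters along `axis`
    mm = np.reshape(moving_mean, shape)
    mv = np.reshape(moving_var, shape)
    g = np.reshape(scaling, shape)
    b = np.reshape(biases, shape)
    return g * (x - mm) / np.sqrt(mv + epsilon) + b

# channel-first input of shape {2, 3, 3} with freshly initialized arrays
x = np.ones((2, 3, 3))
out = apply_batchnorm(x, np.zeros(2), np.ones(2), np.ones(2), np.zeros(2))
```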

---

## Examples (13)

### Basic Examples (2)

Create a ``BatchNormalizationLayer``:

```wl
In[1]:= BatchNormalizationLayer[]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> NeuralNetworks`TensorT[NeuralNetworks`ListT[1, 
       NeuralNetworks`SizeT], NeuralNetworks`RealT], 
    "Biases" -> NeuralNetworks`TensorT ... Output" -> NeuralNetworks`TensorT[{NeuralNetworks`SizeT}, 
      NeuralNetworks`TensorT[NeuralNetworks`ListT[NeuralNetworks`NaturalT, NeuralNetworks`SizeT], 
       NeuralNetworks`RealT]]]], Association["Version" -> "14.1.1", "Unstable" -> False]]
```

---

Create an initialized ``BatchNormalizationLayer`` that takes a vector and returns a vector:

```wl
In[1]:= batchnorm = NetInitialize@BatchNormalizationLayer["Input" -> 3]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> RawArray["Real32", {1., 1., 1.}], 
    "Biases" -> RawArray["Real32", {0., 0., 0.}], "MovingMean" -> RawArray["Real32", {0., 0., 0.}], 
     ...  {}], 
  "Inputs" -> Association["Input" -> NeuralNetworks`TensorT[{3}, NeuralNetworks`AtomT]], 
  "Outputs" -> Association["Output" -> NeuralNetworks`TensorT[{3}, NeuralNetworks`RealT]]], 
 Association["Version" -> "14.1.1", "Unstable" -> False]]
```

Apply the layer to an input vector:

```wl
In[2]:= batchnorm[{1, 2, 3}]

Out[2]= {0.9995, 1.999, 2.9985}
```
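The numbers above follow from the inference-mode formula scaling\*(input - movingMean)/Sqrt[movingVariance + epsilon] + biases, using the freshly initialized arrays (moving mean 0, moving variance 1, scaling 1, biases 0). A quick numeric check, sketched in Python:

```python
import math

# Inference-mode batch norm with the freshly initialized arrays:
# movingMean = 0, movingVariance = 1, scaling = 1, biases = 0.
eps = 0.001
out = [x / math.sqrt(1.0 + eps) for x in [1.0, 2.0, 3.0]]
# out is approximately [0.9995, 1.999, 2.9985], matching Out[2] above
```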

### Scope (4)

#### Ports (2)

Create an initialized ``BatchNormalizationLayer`` that takes a rank-3 array and returns a rank-3 array:

```wl
In[1]:= batchnorm = NetInitialize@BatchNormalizationLayer["Input" -> {2, 3, 3}]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> RawArray["Real32", {1., 1.}], 
    "Biases" -> RawArray["Real32", {0., 0.}], "MovingMean" -> RawArray["Real32", {0., 0.}], 
    "MovingVaria ... puts" -> Association["Input" -> NeuralNetworks`TensorT[{2, 3, 3}, NeuralNetworks`AtomT]], 
  "Outputs" -> Association["Output" -> NeuralNetworks`TensorT[{2, 3, 3}, NeuralNetworks`RealT]]], 
 Association["Version" -> "14.1.1", "Unstable" -> False]]

In[2]:= batchnorm[RandomReal[1, {2, 3, 3}]]//Normal//MatrixForm

Out[2]//MatrixForm= (a 2×3×3 array of normalized real values, displayed in MatrixForm)
```

---

Create an initialized ``BatchNormalizationLayer`` that takes a vector and returns a vector:

```wl
In[1]:= batchnorm = NetInitialize@BatchNormalizationLayer["Input" -> 3]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> RawArray["Real32", {1., 1., 1.}], 
    "Biases" -> RawArray["Real32", {0., 0., 0.}], "MovingMean" -> RawArray["Real32", {0., 0., 0.}], 
     ...  {}], 
  "Inputs" -> Association["Input" -> NeuralNetworks`TensorT[{3}, NeuralNetworks`AtomT]], 
  "Outputs" -> Association["Output" -> NeuralNetworks`TensorT[{3}, NeuralNetworks`RealT]]], 
 Association["Version" -> "14.1.1", "Unstable" -> False]]
```

Apply the layer to a batch of input vectors:

```wl
In[2]:= batchnorm[{{1, 2, 3}, {4, 0.2, 3}}]

Out[2]= {{0.9995, 1.999, 2.9985}, {3.998, 0.1999, 2.9985}}
```

Use ``NetEvaluationMode`` to apply the training behavior of ``BatchNormalizationLayer``:

```wl
In[3]:= batchnorm[{{1, 2, 3}, {4, 0.2, 3}}, NetEvaluationMode -> "Train"]

Out[3]= {{-0.999778, 0.999383, 0.}, {0.999778, -0.999383, 0.}}
```
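In training mode, each channel is normalized with the mean and variance of the current batch rather than the moving estimates. The output above can be reproduced by a short Python sketch (population variance per channel, plus the default epsilon of 0.001):

```python
import math

batch = [[1.0, 2.0, 3.0], [4.0, 0.2, 3.0]]
eps = 0.001
n = len(batch)
out = []
for i in range(n):
    row = []
    for j in range(3):
        col = [batch[k][j] for k in range(n)]           # one channel
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n     # population variance
        row.append((batch[i][j] - mean) / math.sqrt(var + eps))
    out.append(row)
# first channel: values {1, 4} give mean 2.5 and variance 2.25,
# so 1 normalizes to about -0.999778; the constant third channel
# has variance 0 and normalizes to exactly 0, matching Out[3] above
```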

#### Parameters (2)

##### "Biases" (1)

Create a ``BatchNormalizationLayer`` with an initial value for the ``"Biases"`` parameter:

```wl
In[1]:= batchnorm = BatchNormalizationLayer["Biases" -> {-1, 3.4}]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> NeuralNetworks`TensorT[{2}, NeuralNetworks`RealT], 
    "Biases" -> RawArray["Real32", {-1., 3.4000000953674316}], 
    "MovingMean" -> Neur ...  
   Association["Output" -> NeuralNetworks`TensorT[{2}, NeuralNetworks`TensorT[
       NeuralNetworks`ListT[NeuralNetworks`NaturalT, NeuralNetworks`SizeT], 
       NeuralNetworks`RealT]]]], Association["Version" -> "14.1.1", "Unstable" -> False]]
```

Extract the ``"Biases"`` parameter:

```wl
In[2]:= NetExtract[batchnorm, "Biases"]

Out[2]= RawArray["Real32", {-1., 3.4000000953674316}]
```

The default value for ``"Biases"`` chosen by ``NetInitialize`` is a zero vector:

```wl
In[3]:=
batchnorm = NetInitialize@BatchNormalizationLayer["Input" -> 2];
NetExtract[batchnorm, "Biases"]

Out[4]= RawArray["Real32", {0., 0.}]
```

##### "Scaling" (1)

Create an initialized ``BatchNormalizationLayer`` with the ``"Scaling"`` parameter set to zero and the ``"Biases"`` parameter set to a custom value:

```wl
In[1]:= batchnorm = NetInitialize@BatchNormalizationLayer["Scaling" -> {0, 0, 0}, "Biases" -> {1.3, -22.1, 1.2}]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> RawArray["Real32", {0., 0., 0.}], 
    "Biases" -> RawArray["Real32", {1.2999999523162842, -22.100000381469727, 1.2000000476837158}], 
    " ...  
   Association["Output" -> NeuralNetworks`TensorT[{3}, NeuralNetworks`TensorT[
       NeuralNetworks`ListT[NeuralNetworks`NaturalT, NeuralNetworks`SizeT], 
       NeuralNetworks`RealT]]]], Association["Version" -> "14.1.1", "Unstable" -> False]]
```

Applying the layer to any input returns the value for the ``"Biases"`` parameter:

```wl
In[2]:= batchnorm[{1, 2, 3}]

Out[2]= {1.3, -22.1, 1.2}

In[3]:= batchnorm[{-3.4, 2.3, 100}]

Out[3]= {1.3, -22.1, 1.2}
```

The default value for ``"Scaling"`` chosen by ``NetInitialize`` is a vector of 1s:

```wl
In[4]:=
batchnorm = NetInitialize@BatchNormalizationLayer["Input" -> 2];
NetExtract[batchnorm, "Scaling"]

Out[5]= RawArray["Real32", {1., 1.}]
```

---

### Options (2)

#### "Epsilon" (1)

Create a ``BatchNormalizationLayer`` with the ``"Epsilon"`` parameter explicitly specified:

```wl
In[1]:= batchnorm = BatchNormalizationLayer["Epsilon" -> 0.1]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> NeuralNetworks`TensorT[NeuralNetworks`ListT[1, 
       NeuralNetworks`SizeT], NeuralNetworks`RealT], 
    "Biases" -> NeuralNetworks`TensorT ... Output" -> NeuralNetworks`TensorT[{NeuralNetworks`SizeT}, 
      NeuralNetworks`TensorT[NeuralNetworks`ListT[NeuralNetworks`NaturalT, NeuralNetworks`SizeT], 
       NeuralNetworks`RealT]]]], Association["Version" -> "14.1.1", "Unstable" -> False]]
```

Extract the ``"Epsilon"`` parameter:

```wl
In[2]:= NetExtract[batchnorm, "Epsilon"]

Out[2]= 0.1
```
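Since epsilon is added to the variance under the square root, a larger setting visibly damps the normalized values and guards against division by a near-zero variance. A minimal Python sketch of this effect:

```python
import math

def normalized(x, variance, epsilon):
    # one channel of the inference-mode formula, with mean 0,
    # scaling 1 and biases 0
    return x / math.sqrt(variance + epsilon)

default_eps = normalized(1.0, 1.0, 0.001)  # close to 1
large_eps = normalized(1.0, 1.0, 0.1)      # noticeably damped
```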

#### "Momentum" (1)

Create a ``BatchNormalizationLayer`` with the ``"Momentum"`` parameter explicitly specified:

```wl
In[1]:= batchnorm = BatchNormalizationLayer["Momentum" -> 0.1]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> NeuralNetworks`TensorT[NeuralNetworks`ListT[1, 
       NeuralNetworks`SizeT], NeuralNetworks`RealT], 
    "Biases" -> NeuralNetworks`TensorT ... Output" -> NeuralNetworks`TensorT[{NeuralNetworks`SizeT}, 
      NeuralNetworks`TensorT[NeuralNetworks`ListT[NeuralNetworks`NaturalT, NeuralNetworks`SizeT], 
       NeuralNetworks`RealT]]]], Association["Version" -> "14.1.1", "Unstable" -> False]]
```

Extract the ``"Momentum"`` parameter:

```wl
In[2]:= NetExtract[batchnorm, "Momentum"]

Out[2]= 0.1
```
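``"Momentum"`` controls how quickly the moving statistics track each batch's statistics. A conventional exponential-moving-average update rule is sketched below in Python (an assumed rule for illustration; the internal update is not shown on this page). A momentum of 1 leaves the estimates unchanged, which is why that setting freezes ``"MovingMean"`` and ``"MovingVariance"``:

```python
def update_moving(moving, batch_stat, momentum=0.9):
    # exponential moving average: keep `momentum` of the old estimate,
    # blend in (1 - momentum) of the current batch statistic
    return momentum * moving + (1.0 - momentum) * batch_stat

m = 0.0
for batch_mean in [1.0, 1.0, 1.0]:
    m = update_moving(m, batch_mean)           # drifts toward 1.0
frozen = update_moving(5.0, 123.0, momentum=1.0)  # momentum 1: no change
```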

### Applications (1)

``BatchNormalizationLayer`` is commonly inserted between a ``ConvolutionLayer`` and its activation function in order to stabilize and speed up training:

```wl
In[1]:= NetChain[{ConvolutionLayer[3, {3, 3}], BatchNormalizationLayer["Input" -> {3, 28, 28}], ElementwiseLayer[Ramp]}]

Out[1]=
NetChain[Association["Type" -> "Chain", 
  "Nodes" -> Association["1" -> Association["Type" -> "Convolution", 
      "Arrays" -> Association["Weights" -> NeuralNetworks`TensorT[{3, NeuralNetworks`SizeT, 3, 3}, 
          NeuralNetworks`RealT], "Bia ... rks`SizeT, 
       NeuralNetworks`SizeT, NeuralNetworks`SizeT}, NeuralNetworks`AtomT]], 
  "Outputs" -> Association["Output" -> NeuralNetworks`TensorT[{3, 28, 28}, NeuralNetworks`RealT]]], 
 Association["Version" -> "14.1.1", "Unstable" -> False]]
```

### Properties & Relations (1)

During ordinary evaluation, ``BatchNormalizationLayer`` computes the following function:

```wl
In[1]:=
batchNormFunction = Function[Block[{sd = Sqrt[#MovingVariance + #Epsilon]}, 
	(#2 * #Scaling / sd ) + (#Biases - (#Scaling * #MovingMean) / sd)]];
```

Evaluate a ``BatchNormalizationLayer`` on an example vector containing a single channel:

```wl
In[2]:=
params = <|"Scaling" -> {3}, "Biases" -> {2}, "MovingMean" -> {1}, "MovingVariance" -> {2}, "Epsilon" -> 0.001|>;
layer = BatchNormalizationLayer@@Normal[params]

Out[2]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> RawArray["Real32", {3.}], 
    "Biases" -> RawArray["Real32", {2.}], "MovingMean" -> RawArray["Real32", {1.}], 
    "MovingVariance" -> RawA ...  
   Association["Output" -> NeuralNetworks`TensorT[{1}, NeuralNetworks`TensorT[
       NeuralNetworks`ListT[NeuralNetworks`NaturalT, NeuralNetworks`SizeT], 
       NeuralNetworks`RealT]]]], Association["Version" -> "14.1.1", "Unstable" -> False]]

In[3]:= layer[{5}]//Normal

Out[3]= {10.4832}
```

Manually compute the same result:

```wl
In[4]:= batchNormFunction[params, {5}]

Out[4]= {10.4832}
```
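The same arithmetic, transcribed to Python as an independent check of batchNormFunction:

```python
import math

# scaling*(x - movingMean)/sqrt(movingVariance + epsilon) + biases
scaling, biases, mean, var, eps = 3.0, 2.0, 1.0, 2.0, 0.001
out = scaling * (5.0 - mean) / math.sqrt(var + eps) + biases
# out is approximately 10.4832, matching Out[3] and Out[4] above
```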

### Possible Issues (3)

Specifying negative values for the ``"MovingVariance"`` parameter causes numerical errors during evaluation:

```wl
In[1]:= batchnorm = NetInitialize[BatchNormalizationLayer["Input" -> {1, 2, 2}, "MovingVariance" -> {-2}]]

Out[1]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> RawArray["Real32", {1.}], 
    "Biases" -> RawArray["Real32", {0.}], "MovingMean" -> RawArray["Real32", {0.}], 
    "MovingVariance" -> RawA ... puts" -> Association["Input" -> NeuralNetworks`TensorT[{1, 2, 2}, NeuralNetworks`AtomT]], 
  "Outputs" -> Association["Output" -> NeuralNetworks`TensorT[{1, 2, 2}, NeuralNetworks`RealT]]], 
 Association["Version" -> "14.1.1", "Unstable" -> False]]

In[2]:= batchnorm[RandomReal[1, {1, 2, 2}]]
```

BatchNormalizationLayer::netnan: A floating-point overflow, underflow, or division by zero occurred while evaluating the net.

```wl
Out[2]= $Failed
```
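The failure is a consequence of the formula itself: the variance plus epsilon sits under a square root, which has no real value for a negative argument. A minimal Python sketch:

```python
import math

variance, epsilon = -2.0, 0.001
try:
    sd = math.sqrt(variance + epsilon)  # sqrt of a negative number
except ValueError:
    sd = float("nan")  # the net reports a floating-point error instead
```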

---

``BatchNormalizationLayer`` cannot be initialized until all its input and output dimensions are known:

```wl
In[1]:= NetInitialize@BatchNormalizationLayer[]
```

NetInitialize::nninit: Cannot initialize net: unspecified or partially specified shape for array "Scaling".

```wl
Out[1]= $Failed

In[2]:= NetInitialize@BatchNormalizationLayer["Input" -> 3]

Out[2]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> RawArray["Real32", {1., 1., 1.}], 
    "Biases" -> RawArray["Real32", {0., 0., 0.}], "MovingMean" -> RawArray["Real32", {0., 0., 0.}], 
     ...  {}], 
  "Inputs" -> Association["Input" -> NeuralNetworks`TensorT[{3}, NeuralNetworks`AtomT]], 
  "Outputs" -> Association["Output" -> NeuralNetworks`TensorT[{3}, NeuralNetworks`RealT]]], 
 Association["Version" -> "14.1.1", "Unstable" -> False]]
```

---

The ``"MovingMean"`` and ``"MovingVariance"`` arrays of ``BatchNormalizationLayer`` cannot be shared:

```wl
In[1]:= BatchNormalizationLayer["MovingMean" -> NetArray["MovingMean"]]
```

BatchNormalizationLayer::noauxsa: Array MovingMean cannot be shared.

```wl
Out[1]= $Failed
```

Create a ``BatchNormalizationLayer`` with shared arrays:

```wl
In[2]:= sharedBatchNorm = NetInsertSharedArrays[BatchNormalizationLayer[]]

Out[2]=
BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> NetArray[Association["Array" -> Automatic, 
       "Dimensions" -> Automatic, "Name" -> "Scaling"]], 
    "Biases" -> NetArray[Association[" ... alNetworks`ListT[1, NeuralNetworks`SizeT], 
      NeuralNetworks`RealT], "Biases" -> NeuralNetworks`TensorT[NeuralNetworks`ListT[1, 
       NeuralNetworks`SizeT], NeuralNetworks`RealT]]], Association["Version" -> "14.1.1", 
  "Unstable" -> False]]
```

Train it on some data:

```wl
In[3]:= net = NetTrain[NetChain[{2, sharedBatchNorm, 2, sharedBatchNorm, 2}], {{0, 1} -> {1, 0}, {1, 0} -> {0, 1}}]

Out[3]=
NetChain[Association["Type" -> "Chain", 
  "Nodes" -> Association["1" -> Association["Type" -> "Linear", 
      "Arrays" -> Association["Weights" -> RawArray["Real32", 
          {{0.13277174532413483, 0.12089180946350098}, {-0.08585479855537415, 
 ... rays" -> Association["Biases" -> RawArray["Real32", {-0.11405453830957413, 
       0.27602145075798035}], "Scaling" -> RawArray["Real32", {1.0624027252197266, 
       0.9114811420440674}]]], Association["Version" -> "14.1.1", "Unstable" -> False]]
```

Extract the trained batch normalization layers:

```wl
In[4]:= {batchnorm1, batchnorm2} = NetExtract[net, {{2}, {4}}]

Out[4]=
{BatchNormalizationLayer[Association["Type" -> "BatchNormalization", 
  "Arrays" -> Association["Scaling" -> NetArray[Association["Array" -> Automatic, 
       "Dimensions" -> Automatic, "Name" -> "Scaling"]], 
    "Biases" -> NetArray[Association[ ... ays" -> Association["Biases" -> RawArray["Real32", {-0.11405453830957413, 
       0.27602145075798035}], "Scaling" -> RawArray["Real32", {1.0624027252197266, 
       0.9114811420440674}]]], Association["Version" -> "14.1.1", "Unstable" -> False]]}
```

The ``"Scaling"`` and ``"Biases"`` arrays were shared, but not ``"MovingMean"`` or ``"MovingVariance"``:

```wl
In[5]:= Normal /@ Information[batchnorm1, "Arrays"]

Out[5]= <|{"MovingMean"} -> {0.125791, -0.169852}, {"MovingVariance"} -> {0.000035283, 0.00706169}, NetArray["Biases"] -> {-0.114055, 0.276021}, NetArray["Scaling"] -> {1.0624, 0.911481}|>

In[6]:= Normal /@ Information[batchnorm2, "Arrays"]

Out[6]= <|{"MovingMean"} -> {0.423157, -0.183322}, {"MovingVariance"} -> {2.00721, 0.00282327}, NetArray["Biases"] -> {-0.114055, 0.276021}, NetArray["Scaling"] -> {1.0624, 0.911481}|>
```

## See Also

* [`DropoutLayer`](https://reference.wolfram.com/language/ref/DropoutLayer.en.md)
* [`NetEvaluationMode`](https://reference.wolfram.com/language/ref/NetEvaluationMode.en.md)
* [`ConvolutionLayer`](https://reference.wolfram.com/language/ref/ConvolutionLayer.en.md)
* [`PoolingLayer`](https://reference.wolfram.com/language/ref/PoolingLayer.en.md)
* [`NormalizationLayer`](https://reference.wolfram.com/language/ref/NormalizationLayer.en.md)
* [`LocalResponseNormalizationLayer`](https://reference.wolfram.com/language/ref/LocalResponseNormalizationLayer.en.md)
* [`NetChain`](https://reference.wolfram.com/language/ref/NetChain.en.md)
* [`NetGraph`](https://reference.wolfram.com/language/ref/NetGraph.en.md)
* [`NetInitialize`](https://reference.wolfram.com/language/ref/NetInitialize.en.md)
* [`NetTrain`](https://reference.wolfram.com/language/ref/NetTrain.en.md)
* [`NetExtract`](https://reference.wolfram.com/language/ref/NetExtract.en.md)

## Tech Notes

* [Neural Networks in the Wolfram Language](https://reference.wolfram.com/language/tutorial/NeuralNetworksOverview.en.md)

## Related Guides

* [Neural Network Layers](https://reference.wolfram.com/language/guide/NeuralNetworkLayers.en.md)

## History

* [Introduced in 2016 (11.0)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn110.en.md) \| [Updated in 2020 (12.1)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn121.en.md)