---
title: "NetInitialize"
language: "en"
type: "Symbol"
summary: "NetInitialize[net] gives a net in which all uninitialized learnable parameters in net have been given initial values. NetInitialize[net, All] gives a net in which all learnable parameters have been given initial values."
keywords: 
- layer initialization
- random initialization
- xavier initialization
- glorot initialization
- orthogonal initialization
- random initialization of net
canonical_url: "https://reference.wolfram.com/language/ref/NetInitialize.html"
source: "Wolfram Language Documentation"
related_guides: 
  - 
    title: "Neural Network Operations"
    link: "https://reference.wolfram.com/language/guide/NeuralNetworkOperations.en.md"
  - 
    title: "Neural Networks"
    link: "https://reference.wolfram.com/language/guide/NeuralNetworks.en.md"
related_functions: 
  - 
    title: "NetChain"
    link: "https://reference.wolfram.com/language/ref/NetChain.en.md"
  - 
    title: "NetGraph"
    link: "https://reference.wolfram.com/language/ref/NetGraph.en.md"
  - 
    title: "NetTrain"
    link: "https://reference.wolfram.com/language/ref/NetTrain.en.md"
  - 
    title: "NetExtract"
    link: "https://reference.wolfram.com/language/ref/NetExtract.en.md"
  - 
    title: "RandomVariate"
    link: "https://reference.wolfram.com/language/ref/RandomVariate.en.md"
related_tutorials: 
  - 
    title: "Neural Networks in the Wolfram Language"
    link: "https://reference.wolfram.com/language/tutorial/NeuralNetworksOverview.en.md"
---
[EXPERIMENTAL]

# NetInitialize

NetInitialize[net] gives a net in which all uninitialized learnable parameters in net have been given initial values.

NetInitialize[net, All] gives a net in which all learnable parameters have been given initial values.

## Details and Options

* ``NetInitialize[net, All]`` overwrites any existing trained or preset learnable parameters in ``net``.

* ``NetInitialize`` typically assigns random values to parameters representing weights and zero to parameters representing biases.
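
* As a quick sketch of this default behavior (the weight values themselves depend on the method and seed), the bias vector of a freshly initialized ``LinearLayer`` can be inspected:

```wl
(* initialize a small linear layer using the defaults *)
layer = NetInitialize[LinearLayer[3, "Input" -> 2]];

(* by default the bias vector is initialized to zero *)
Normal@NetExtract[layer, "Biases"]  (* {0., 0., 0.} *)
```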

* The following optional parameters can be included:

| option        | default   | description                              |
| ------------- | --------- | ---------------------------------------- |
| Method        | "Kaiming" | which initialization method to use       |
| RandomSeeding | 1234      | seeding of pseudorandom number generator |

* Possible settings for ``Method`` include:

| setting | description |
| --- | --- |
| "Kaiming" | choose weights to preserve the variance of arrays propagated through layers, using the method introduced by Kaiming He et al. (2015) |
| "Xavier" | choose weights to preserve the variance of arrays propagated through layers, using the method introduced by Xavier Glorot et al. (2010) |
| "Orthogonal" | choose weights to be orthogonal matrices |
| "Random" | choose weights from a given univariate distribution |
| "Identity" | choose weights so as to preserve the components of arrays propagated through affine layers |
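
* As an illustration of the ``"Orthogonal"`` setting, the rows of an initialized weight matrix can be checked for orthonormality (a sketch; ``w`` is just a local name used here):

```wl
(* initialize a 3x5 weight matrix with the "Orthogonal" method *)
w = Normal@NetExtract[
    NetInitialize[LinearLayer[3, "Input" -> 5], Method -> "Orthogonal"],
    "Weights"];

(* the rows are (approximately) orthonormal, so this is close to IdentityMatrix[3] *)
Chop[w . Transpose[w]]
```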

* Suboptions for specific methods can be specified using ``Method -> {"method", opt1 -> val1, …}``.

* For the methods ``"Kaiming"`` and ``"Xavier"``, the following suboption is supported:

| suboption | default | description |
| --- | --- | --- |
| "Distribution" | "Normal" | either ``"Normal"`` or ``"Uniform"`` |

* For the method ``"Random"``, the following suboptions are supported:

| suboption | default                  | description                                              |
| --------- | ------------------------ | -------------------------------------------------------- |
| "Weights" | NormalDistribution[0, 1] | random distribution used to initialize weight matrices   |
| "Biases"  | None                     | random distribution used to initialize bias vectors      |

* For the method ``"Identity"``, the following suboption is supported:

| suboption | default | description |
| --- | --- | --- |
| "Distribution" | [`NormalDistribution`](https://reference.wolfram.com/language/ref/NormalDistribution.en.md)[0, 0.01] | random distribution used to add noise to the initial identity matrices in order to break symmetries |

* For any suboption that expects a distribution, a numeric value ``stddev`` can be specified and is taken to mean ``NormalDistribution[0, stddev]``.
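
* For instance, the two calls below should be equivalent, since the numeric value 0.5 is shorthand for ``NormalDistribution[0, 0.5]`` (a sketch of the shorthand described above):

```wl
(* explicit distribution *)
NetInitialize[LinearLayer[10, "Input" -> 10],
 Method -> {"Random", "Weights" -> NormalDistribution[0, 0.5]}]

(* numeric shorthand for the same distribution *)
NetInitialize[LinearLayer[10, "Input" -> 10],
 Method -> {"Random", "Weights" -> 0.5}]
```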

* By default, all methods initialize bias vectors to zero.

* Possible settings for ``RandomSeeding`` include:

| setting   | description                                            |
| --------- | ------------------------------------------------------ |
| Automatic | automatically reseed every time the function is called |
| Inherited | use externally seeded random numbers                   |
| seed      | use an explicit integer or string as a seed            |
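
* For example, an explicit string seed gives a reproducible initialization that is distinct from the one produced by the default seed 1234 (a sketch; the particular weight value will vary):

```wl
(* the same string seed always yields the same weights *)
Normal@NetExtract[
 NetInitialize[LinearLayer[1, "Input" -> 1], RandomSeeding -> "my seed"],
 "Weights"]
```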

---

## Examples (8)

### Basic Examples (1)

Create an uninitialized layer:

```wl
In[1]:= dot = LinearLayer[2, "Input" -> 5]

Out[1]= LinearLayer[«uninitialized; Input: vector (size 5), Output: vector (size 2)»]
```

Initialize the layer with random weights:

```wl
In[2]:= dot = NetInitialize[dot]

Out[2]= LinearLayer[«initialized; Input: vector (size 5), Output: vector (size 2)»]
```

Extract the new initialized weights:

```wl
In[3]:= NetExtract[dot, "Weights"]

Out[3]=
RawArray["Real32", {{-0.22733454406261444, -0.03157411143183708, -0.7128121852874756, 
   0.6871856451034546, 1.1976466178894043}, {-0.5889045596122742, -0.48950430750846863, 
   0.06392824649810791, 0.04658425226807594, -0.7482372522354126}}]
```

### Scope (1)

Specify ``"Random"`` initialization, using normal distributions with a standard deviation of 2 for both weights and biases:

```wl
In[1]:= net = NetInitialize[LinearLayer[200, "Input" -> 2], Method -> {"Random", "Weights" -> 2, "Biases" -> 2}]

Out[1]= LinearLayer[«initialized; Input: vector (size 2), Output: vector (size 200)»]
```

Extract and plot the initialized weights and biases:

```wl
In[2]:= Histogram[Flatten@NetExtract[net, "Weights"]]

Out[2]= [image]

In[3]:= Histogram[Flatten@NetExtract[net, "Biases"]]

Out[3]= [image]
```

### Options (1)

#### Method (1)

Define a network:

```wl
In[1]:= net = NetChain[{LinearLayer[200], ElementwiseLayer[Ramp], LinearLayer[200]}, "Input" -> 50]

Out[1]= NetChain[«uninitialized; Input: vector (size 50), Output: vector (size 200)»]
```

Initialize the net using the ``"Xavier"`` initialization:

```wl
In[2]:= net2 = NetInitialize[net, Method -> "Xavier"]

Out[2]= NetChain[«initialized; Input: vector (size 50), Output: vector (size 200)»]
```

Specify that the ``"Xavier"`` method will sample from a uniform distribution:

```wl
In[3]:= net3 = NetInitialize[net, Method -> {"Xavier", "Distribution" -> "Uniform"}];
```

Plot a histogram of the weights in the first layer:

```wl
In[4]:= Histogram[Flatten@NetExtract[net3, {1, "Weights"}]]

Out[4]= [image]
```

Specify that the ``"Xavier"`` method will sample from a normal distribution:

```wl
In[5]:= net3 = NetInitialize[net, Method -> {"Xavier", "Distribution" -> "Normal"}];
```

Plot a histogram of the weights in the first layer:

```wl
In[6]:= Histogram[Flatten@NetExtract[net3, {1, "Weights"}]]

Out[6]= [image]
```

### Properties & Relations (2)

``NetTrain`` automatically calls ``NetInitialize`` before training begins. Before training, the weights and biases of a simple layer are uninitialized:

```wl
In[1]:= net = LinearLayer[1, "Input" -> 1]

Out[1]= LinearLayer[«uninitialized; Input: vector (size 1), Output: vector (size 1)»]

In[2]:= NetExtract[net, "Weights"]

Out[2]= Automatic

In[3]:= NetExtract[net, "Biases"]

Out[3]= Automatic
```

Extract the weights and biases after training:

```wl
In[4]:= net = NetTrain[net, {{1} -> {2}, {2} -> {3}}]

Out[4]= LinearLayer[«trained; Input: vector (size 1), Output: vector (size 1)»]

In[5]:= NetExtract[net, "Weights"]

Out[5]= RawArray["Real32", {{1.000191330909729}}]

In[6]:= NetExtract[net, "Biases"]

Out[6]= RawArray["Real32", {0.9996939301490784}]
```

---

Create a net that maps vectors of length 1 to vectors of length 1:

```wl
In[1]:= net = NetChain[{5, 100, 5, 1}, "Input" -> 1];
```

Initialize the net using the ``"Identity"`` method, which results in a net that attempts to preserve the components of arrays as they pass through linear layers:

```wl
In[2]:= net1 = NetInitialize[net, Method -> "Identity"]

Out[2]= NetChain[«initialized; Input: vector (size 1), Output: vector (size 1)»]
```

Visualize the output of the net as a function of its input:

```wl
In[3]:= Plot[net1[x]//Normal, {x, -1, 1}, PlotRange -> {-1, 1}]

Out[3]= [image]
```

Initializing the net using other methods produces a random linear function:

```wl
In[4]:=
net2 = NetInitialize[net, Method -> "Xavier"];
Plot[net2[x]//Normal, {x, -1, 1}, PlotRange -> {-1, 1}]

Out[4]= [image]
```

### Possible Issues (2)

Parameters belonging to certain layers have a fixed initialization method that is independent of the ``Method`` option given to ``NetInitialize``:

```wl
In[1]:= batch = BatchNormalizationLayer["Input" -> {2, 3, 3}];

In[2]:= AssociationMap[Normal@NetExtract[NetInitialize[batch, Method -> #], "MovingVariance"]&, {"Xavier", "Orthogonal", "Identity"}]

Out[2]= <|"Xavier" -> {1., 1.}, "Orthogonal" -> {1., 1.}, "Identity" -> {1., 1.}|>
```

---

By default, ``NetInitialize`` uses ``RandomSeeding -> 1234``, which will use the same random seed to initialize the net when ``NetInitialize`` is called repeatedly:

```wl
In[1]:= Normal@NetExtract[NetInitialize[LinearLayer[1, "Input" -> 1]], "Weights"]

Out[1]= {{-0.508336}}

In[2]:= Normal@NetExtract[NetInitialize[LinearLayer[1, "Input" -> 1]], "Weights"]

Out[2]= {{-0.508336}}
```

Use ``RandomSeeding -> Automatic`` to ensure that repeated calls produce different initializations:

```wl
In[3]:= Normal@NetExtract[NetInitialize[LinearLayer[1, "Input" -> 1], RandomSeeding -> Automatic], "Weights"]

Out[3]= {{-1.59834}}

In[4]:= Normal@NetExtract[NetInitialize[LinearLayer[1, "Input" -> 1], RandomSeeding -> Automatic], "Weights"]

Out[4]= {{-1.63972}}
```

### Neat Examples (1)

Explore how the magnitude of the values used for the weights and biases affects a simple nonlinear net that maps single values to vectors of length 8:

```wl
In[1]:=
Manipulate[
 Module[{net = NetChain[{30, Tanh, 8, Tanh}, "Input" -> "Real"], initNet},
  initNet = NetInitialize[net, Method -> {"Random", "Weights" -> weights, "Biases" -> biases}];
  ListLinePlot[
   Transpose@initNet@Range[-2, 2, .01],
   PlotRange -> {-1, 1}, Ticks -> {None, Automatic},
   ImageSize -> Medium]],
 {{weights, 1}, 0, 4},
 {{biases, 0}, 0, 4}]

Out[1]= DynamicModule[«8»]
```

## See Also

* [`NetChain`](https://reference.wolfram.com/language/ref/NetChain.en.md)
* [`NetGraph`](https://reference.wolfram.com/language/ref/NetGraph.en.md)
* [`NetTrain`](https://reference.wolfram.com/language/ref/NetTrain.en.md)
* [`NetExtract`](https://reference.wolfram.com/language/ref/NetExtract.en.md)
* [`RandomVariate`](https://reference.wolfram.com/language/ref/RandomVariate.en.md)

## Tech Notes

* [Neural Networks in the Wolfram Language](https://reference.wolfram.com/language/tutorial/NeuralNetworksOverview.en.md)

## Related Guides

* [Neural Network Operations](https://reference.wolfram.com/language/guide/NeuralNetworkOperations.en.md)
* [Neural Networks](https://reference.wolfram.com/language/guide/NeuralNetworks.en.md)

## History

* [Introduced in 2016 (11.0)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn110.en.md) \| [Updated in 2020 (12.1)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn121.en.md) ▪ [2022 (13.1)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn131.en.md)