---
title: "NetInitialize"
language: "en"
type: "Symbol"
summary: "NetInitialize[net] gives a net in which all uninitialized learnable parameters in net have been given initial values. NetInitialize[net, All] gives a net in which all learnable parameters have been given initial values."
keywords: 
- layer initialization
- random initialization
- xavier initialization
- glorot initialization
- orthogonal initialization
- random initialization of net
canonical_url: "https://reference.wolfram.com/language/ref/NetInitialize.html"
source: "Wolfram Language Documentation"
related_guides: 
  - 
    title: "Neural Network Operations"
    link: "https://reference.wolfram.com/language/guide/NeuralNetworkOperations.en.md"
  - 
    title: "Neural Networks"
    link: "https://reference.wolfram.com/language/guide/NeuralNetworks.en.md"
related_functions: 
  - 
    title: "NetChain"
    link: "https://reference.wolfram.com/language/ref/NetChain.en.md"
  - 
    title: "NetGraph"
    link: "https://reference.wolfram.com/language/ref/NetGraph.en.md"
  - 
    title: "NetTrain"
    link: "https://reference.wolfram.com/language/ref/NetTrain.en.md"
  - 
    title: "NetExtract"
    link: "https://reference.wolfram.com/language/ref/NetExtract.en.md"
  - 
    title: "RandomVariate"
    link: "https://reference.wolfram.com/language/ref/RandomVariate.en.md"
related_tutorials: 
  - 
    title: "Neural Networks in the Wolfram Language"
    link: "https://reference.wolfram.com/language/tutorial/NeuralNetworksOverview.en.md"
---
[EXPERIMENTAL]

# NetInitialize

NetInitialize[net] gives a net in which all uninitialized learnable parameters in net have been given initial values.

NetInitialize[net, All] gives a net in which all learnable parameters have been given initial values.

## Details and Options

* ``NetInitialize[net, All]`` overwrites any existing trained or preset learnable parameters in ``net``.

* ``NetInitialize`` typically assigns random values to parameters representing weights and zero to parameters representing biases.
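
* As a quick sketch of this default behavior (the weight values themselves depend on the method and seed), the bias vector of a freshly initialized ``LinearLayer`` can be inspected:

```wl
(* initialize a small linear layer using the defaults *)
layer = NetInitialize[LinearLayer[3, "Input" -> 2]];

(* by default the bias vector is initialized to zero *)
Normal@NetExtract[layer, "Biases"]  (* {0., 0., 0.} *)
```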

* The following optional parameters can be included:

| option        | default   | description                              |
| ------------- | --------- | ---------------------------------------- |
| Method        | "Kaiming" | which initialization method to use       |
| RandomSeeding | 1234      | seeding of pseudorandom number generator |

* Possible settings for ``Method`` include:

| setting | description |
| --- | --- |
| "Kaiming" | choose weights to preserve the variance of arrays propagated through layers, using the method introduced by Kaiming He et al. (2015) |
| "Xavier" | choose weights to preserve the variance of arrays propagated through layers, using the method introduced by Xavier Glorot et al. (2010) |
| "Orthogonal" | choose weights to be orthogonal matrices |
| "Random" | choose weights from a given univariate distribution |
| "Identity" | choose weights so as to preserve the components of arrays propagated through affine layers |
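
* As an illustration of the ``"Orthogonal"`` setting, the rows of an initialized weight matrix can be checked for orthonormality (a sketch; ``w`` is just a local name used here):

```wl
(* initialize a 3x5 weight matrix with the "Orthogonal" method *)
w = Normal@NetExtract[
    NetInitialize[LinearLayer[3, "Input" -> 5], Method -> "Orthogonal"],
    "Weights"];

(* the rows are (approximately) orthonormal, so this is close to IdentityMatrix[3] *)
Chop[w . Transpose[w]]
```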

* Suboptions for specific methods can be specified using ``Method -> {"method", opt1 -> val1, …}``.

* For the methods ``"Kaiming"`` and ``"Xavier"``, the following suboption is supported:

| suboption | default | description |
| --- | --- | --- |
| "Distribution" | "Normal" | either ``"Normal"`` or ``"Uniform"`` |

* For the method ``"Random"``, the following suboptions are supported:

| suboption | default                  | description                                              |
| --------- | ------------------------ | -------------------------------------------------------- |
| "Weights" | NormalDistribution[0, 1] | random distribution used to initialize weight matrices   |
| "Biases"  | None                     | random distribution used to initialize bias vectors      |

* For the method ``"Identity"``, the following suboption is supported:

| suboption | default | description |
| --- | --- | --- |
| "Distribution" | [`NormalDistribution`](https://reference.wolfram.com/language/ref/NormalDistribution.en.md)[0, 0.01] | random distribution used to add noise to the initial identity matrices in order to break symmetries |

* For any suboption that expects a distribution, a numeric value ``stddev`` can be specified and is taken to mean ``NormalDistribution[0, stddev]``.
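
* For instance, the two calls below should be equivalent, since the numeric value 0.5 is shorthand for ``NormalDistribution[0, 0.5]`` (a sketch of the shorthand described above):

```wl
(* explicit distribution *)
NetInitialize[LinearLayer[10, "Input" -> 10],
 Method -> {"Random", "Weights" -> NormalDistribution[0, 0.5]}]

(* numeric shorthand for the same distribution *)
NetInitialize[LinearLayer[10, "Input" -> 10],
 Method -> {"Random", "Weights" -> 0.5}]
```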

* By default, all methods initialize bias vectors to zero.

* Possible settings for ``RandomSeeding`` include:

| setting   | description                                            |
| --------- | ------------------------------------------------------ |
| Automatic | automatically reseed every time the function is called |
| Inherited | use externally seeded random numbers                   |
| seed      | use an explicit integer or string as a seed            |
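
* For example, an explicit string seed gives a reproducible initialization that is distinct from the one produced by the default seed 1234 (a sketch; the particular weight value will vary):

```wl
(* the same string seed always yields the same weights *)
Normal@NetExtract[
 NetInitialize[LinearLayer[1, "Input" -> 1], RandomSeeding -> "my seed"],
 "Weights"]
```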

---

## Examples (8)

### Basic Examples (1)

Create an uninitialized layer:

```wl
In[1]:= dot = LinearLayer[2, "Input" -> 5]

Out[1]= LinearLayer[«uninitialized; Input: vector (size 5), Output: vector (size 2)»]
```

Initialize the layer with random weights:

```wl
In[2]:= dot = NetInitialize[dot]

Out[2]= LinearLayer[«initialized; Input: vector (size 5), Output: vector (size 2)»]
```

Extract the new initialized weights:

```wl
In[3]:= NetExtract[dot, "Weights"]

Out[3]=
RawArray["Real32", {{-0.22733454406261444, -0.03157411143183708, -0.7128121852874756, 
   0.6871856451034546, 1.1976466178894043}, {-0.5889045596122742, -0.48950430750846863, 
   0.06392824649810791, 0.04658425226807594, -0.7482372522354126}}]
```

### Scope (1)

Specify ``"Random"`` initialization, using normal distributions with a standard deviation of 2 for both weights and biases:

```wl
In[1]:= net = NetInitialize[LinearLayer[200, "Input" -> 2], Method -> {"Random", "Weights" -> 2, "Biases" -> 2}]

Out[1]= LinearLayer[«initialized; Input: vector (size 2), Output: vector (size 200)»]
```

Extract and plot the initialized weights and biases:

```wl
In[2]:= Histogram[Flatten@NetExtract[net, "Weights"]]

Out[2]= [image]

In[3]:= Histogram[Flatten@NetExtract[net, "Biases"]]

Out[3]= [image]
```

### Options (1)

#### Method (1)

Define a network:

```wl
In[1]:= net = NetChain[{LinearLayer[200], ElementwiseLayer[Ramp], LinearLayer[200]}, "Input" -> 50]

Out[1]= NetChain[«uninitialized; Input: vector (size 50), Output: vector (size 200)»]
```

Initialize the net using the ``"Xavier"`` initialization:

```wl
In[2]:= net2 = NetInitialize[net, Method -> "Xavier"]

Out[2]= NetChain[«initialized; Input: vector (size 50), Output: vector (size 200)»]
```

Specify that the ``"Xavier"`` method will sample from a uniform distribution:

```wl
In[3]:= net3 = NetInitialize[net, Method -> {"Xavier", "Distribution" -> "Uniform"}];
```

Plot a histogram of the weights in the first layer:

```wl
In[4]:= Histogram[Flatten@NetExtract[net3, {1, "Weights"}]]

Out[4]= [image]
```

Specify that the ``"Xavier"`` method will sample from a normal distribution:

```wl
In[5]:= net3 = NetInitialize[net, Method -> {"Xavier", "Distribution" -> "Normal"}];
```

Plot a histogram of the weights in the first layer:

```wl
In[6]:= Histogram[Flatten@NetExtract[net3, {1, "Weights"}]]

Out[6]= [image]
```

### Properties & Relations (2)

``NetTrain`` automatically calls ``NetInitialize`` before training begins. Before training, the weights and biases of a simple layer are uninitialized:

```wl
In[1]:= net = LinearLayer[1, "Input" -> 1]

Out[1]= LinearLayer[«uninitialized; Input: vector (size 1), Output: vector (size 1)»]

In[2]:= NetExtract[net, "Weights"]

Out[2]= Automatic

In[3]:= NetExtract[net, "Biases"]

Out[3]= Automatic
```

Extract the weights and biases after training:

```wl
In[4]:= net = NetTrain[net, {{1} -> {2}, {2} -> {3}}]

Out[4]= LinearLayer[«trained; Input: vector (size 1), Output: vector (size 1)»]

In[5]:= NetExtract[net, "Weights"]

Out[5]= RawArray["Real32", {{1.000191330909729}}]

In[6]:= NetExtract[net, "Biases"]

Out[6]= RawArray["Real32", {0.9996939301490784}]
```

---

Create a net that maps vectors of length 1 to vectors of length 1:

```wl
In[1]:= net = NetChain[{5, 100, 5, 1}, "Input" -> 1];
```

Initialize the net using the ``"Identity"`` method, which results in a net that attempts to preserve the components of arrays as they pass through linear layers:

```wl
In[2]:= net1 = NetInitialize[net, Method -> "Identity"]

Out[2]= NetChain[«initialized; Input: vector (size 1), Output: vector (size 1)»]
```

Visualize the output of the net as a function of its input:

```wl
In[3]:= Plot[net1[x]//Normal, {x, -1, 1}, PlotRange -> {-1, 1}]

Out[3]= [image]
```

Initializing the net using other methods produces a random linear function:

```wl
In[4]:=
net2 = NetInitialize[net, Method -> "Xavier"];
Plot[net2[x]//Normal, {x, -1, 1}, PlotRange -> {-1, 1}]

Out[4]= [image]
```

### Possible Issues (2)

Parameters belonging to certain layers have a fixed initialization method that is independent of the ``Method`` option given to ``NetInitialize``:

```wl
In[1]:= batch = BatchNormalizationLayer["Input" -> {2, 3, 3}];

In[2]:= AssociationMap[Normal@NetExtract[NetInitialize[batch, Method -> #], "MovingVariance"]&, {"Xavier", "Orthogonal", "Identity"}]

Out[2]= <|"Xavier" -> {1., 1.}, "Orthogonal" -> {1., 1.}, "Identity" -> {1., 1.}|>
```

---

By default, ``NetInitialize`` uses ``RandomSeeding -> 1234``, which will use the same random seed to initialize the net when ``NetInitialize`` is called repeatedly:

```wl
In[1]:= Normal@NetExtract[NetInitialize[LinearLayer[1, "Input" -> 1]], "Weights"]

Out[1]= {{-0.508336}}

In[2]:= Normal@NetExtract[NetInitialize[LinearLayer[1, "Input" -> 1]], "Weights"]

Out[2]= {{-0.508336}}
```

Use ``RandomSeeding -> Automatic`` to ensure that repeated calls produce different initializations:

```wl
In[3]:= Normal@NetExtract[NetInitialize[LinearLayer[1, "Input" -> 1], RandomSeeding -> Automatic], "Weights"]

Out[3]= {{-1.59834}}

In[4]:= Normal@NetExtract[NetInitialize[LinearLayer[1, "Input" -> 1], RandomSeeding -> Automatic], "Weights"]

Out[4]= {{-1.63972}}
```

### Neat Examples (1)

Explore how the magnitude of the values used for the weights and biases affects a simple nonlinear net that maps single values to vectors of length 8:

```wl
In[1]:=
Manipulate[
 Module[{net = NetChain[{30, Tanh, 8, Tanh}, "Input" -> "Real"], initNet},
  initNet = NetInitialize[net, Method -> {"Random", "Weights" -> weights, "Biases" -> biases}];
  ListLinePlot[
   Transpose@initNet@Range[-2, 2, .01],
   PlotRange -> {-1, 1}, Ticks -> {None, Automatic},
   ImageSize -> Medium]],
 {{weights, 1}, 0, 4},
 {{biases, 0}, 0, 4}]

Out[1]= DynamicModule[«8»]
```

## See Also

* [`NetChain`](https://reference.wolfram.com/language/ref/NetChain.en.md)
* [`NetGraph`](https://reference.wolfram.com/language/ref/NetGraph.en.md)
* [`NetTrain`](https://reference.wolfram.com/language/ref/NetTrain.en.md)
* [`NetExtract`](https://reference.wolfram.com/language/ref/NetExtract.en.md)
* [`RandomVariate`](https://reference.wolfram.com/language/ref/RandomVariate.en.md)

## Tech Notes

* [Neural Networks in the Wolfram Language](https://reference.wolfram.com/language/tutorial/NeuralNetworksOverview.en.md)

## Related Guides

* [Neural Network Operations](https://reference.wolfram.com/language/guide/NeuralNetworkOperations.en.md)
* [Neural Networks](https://reference.wolfram.com/language/guide/NeuralNetworks.en.md)

## History

* [Introduced in 2016 (11.0)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn110.en.md) \| [Updated in 2020 (12.1)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn121.en.md) ▪ [2022 (13.1)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn131.en.md)