LearningRateMultipliers

LearningRateMultipliers is an option for NetTrain that specifies learning rate multipliers to apply to specific layers within a NetChain, NetGraph, etc.

Details

  • With the default value of LearningRateMultipliers->Automatic, all layers learn at the same rate.
  • LearningRateMultipliers->{rule1,rule2,…} specifies a set of rules that will be used to determine learning rate multipliers for every trainable array in the net.
  • In LearningRateMultipliers->{rule1,rule2,…}, each of the rulei can be of the following forms:
  • "layer"->r	use multiplier r for a named layer or subnetwork
    n->r	use multiplier r for the nth layer
    m;;n->r	use multiplier r for layers m through n
    {layer,"array"}->r	use multiplier r for a particular array within a layer
    {part1,part2,…}->r	use multiplier r for a nested layer
    _->r	use multiplier r for all layers
  • If r is a positive number, it specifies a multiplier to apply to the global learning rate chosen by the training method to determine the learning rate for the given layer or array.
  • If r is zero or None, it specifies that the layer or array should not undergo training and will be left unchanged by NetTrain.
  • For each trainable array, the multiplier used is given by the first matching rule, or 1 if no rule matches.
  • Rules that specify a subnet (e.g. a nested NetChain or NetGraph) apply to all layers and arrays within that subnet.
  • LearningRateMultipliers->{layer->None} can be used to "freeze" a specific layer.
  • LearningRateMultipliers->{layer->1,_->None} can be used to "freeze" all layers except for a specific layer, as in the sketch after this list.
  • The hierarchical specification used by LearningRateMultipliers to refer to parts of a net is equivalent to that used by NetExtract and NetReplacePart.
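
  A minimal sketch of such a specification, using a hypothetical net with named layers; the layer names, sizes, and synthetic data below are illustrative assumptions, not taken from this page:

    net = NetInitialize@NetChain[
        <|"encoder" -> LinearLayer[16], "ramp" -> ElementwiseLayer[Ramp],
          "out" -> LinearLayer[1]|>, "Input" -> 4];
    (* synthetic regression data, purely illustrative *)
    data = Table[With[{x = RandomReal[1, 4]}, x -> {Total[x]}], 64];
    (* freeze "encoder", train "out" at the full rate; any other trainable
       layer falls through to the 0.1 rule *)
    NetTrain[net, data,
     LearningRateMultipliers -> {"encoder" -> None, "out" -> 1, _ -> 0.1}]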

Examples


Basic Examples  (1)

Create and initialize a net with three layers, but train only the last layer:

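A minimal sketch of what these input cells might contain; the original cells are not reproduced on this page, so the layer sizes and training data are illustrative assumptions:

  net = NetInitialize@NetChain[
      {LinearLayer[16], ElementwiseLayer[Ramp], LinearLayer[1]}, "Input" -> 1]

  (* train only the last layer: everything else is frozen *)
  data = Table[{x} -> {Sin[x]}, {x, RandomReal[{-3, 3}, 256]}];
  trained = NetTrain[net, data, LearningRateMultipliers -> {3 -> 1, _ -> None}]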

Evaluate the trained net on an input:

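Continuing the hypothetical sketch above, this might read:

  trained[{0.5}]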

The first layer of the initial net started with zero biases:

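Under the same assumptions, the biases of the first layer of the untrained net can be read off with NetExtract:

  NetExtract[net, {1, "Biases"}]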

The biases of the first layer remain zero in the trained net:

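Likewise, for the trained net from the sketch:

  NetExtract[trained, {1, "Biases"}]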

The biases of the third layer have been trained:

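And for the third layer of the trained net:

  NetExtract[trained, {3, "Biases"}]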

See Also

NetTrain  NetChain  NetGraph  NetReplacePart  NetExtract  NetPort

Introduced in 2017 (11.1)