NetPortGradient

NetPortGradient["port"]

represents the gradient of the output of a net with respect to the value of the specified input port.

NetPortGradient["param"]

represents the gradient of the output with respect to a learned parameter named param.

NetPortGradient[{layer1,layer2,…,"param"}]

represents the gradient with respect to a parameter at a specific position in a net.

Details

  • net[data,NetPortGradient[iport]] can be used to obtain the gradient with respect to the input port iport for a net applied to the specified data.
  • net[<|…,NetPortGradient[oport]->ograd|>,NetPortGradient[iport]] can be used to impose a gradient ograd at an output port oport, which is then backpropagated to calculate the gradient at iport.
  • NetPortGradient can be used to calculate gradients with respect to both learned parameters of a network and inputs to the network.
  • For a net with a single scalar output port, the gradient returned when using NetPortGradient is the gradient in the ordinary mathematical sense: for a net computing f(x,θ), where x is the array whose gradient is being calculated and θ represents all other arrays, the gradient is an array of the same rank as x, whose components are given by ∂f/∂x_i, with i indexing the elements of x. Intuitively, the gradient at a specific value of x is the "best direction" in which to perturb x if the goal is to increase f, where the magnitude of the gradient is proportional to the sensitivity of f to changes in x.
  • For a net with vector or array outputs, the gradient returned when using NetPortGradient is the ordinary gradient of the scalar sum of all outputs. Imposing a gradient at the output using the syntax <|…,NetPortGradient[oport]->ograd|> is equivalent to replacing this scalar sum with a dot product between the output and ograd.
  • Using NetPortGradient to calculate the gradient with respect to the learned parameters of the net will return the sum of the gradient over the input batch.
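
The calling forms described above can be sketched as follows. This is a minimal illustration, assuming a randomly initialized LinearLayer whose ports are named "Input" and "Output" and whose learned parameters are named "Weights" and "Biases"; the layer sizes, the data and the variable names net, x and w are chosen here purely for illustration.

```wolfram
(* a small net with three inputs and two outputs; the sizes are arbitrary *)
net = NetInitialize[LinearLayer[2, "Input" -> 3]];
x = {1., 2., 3.};

(* gradient of the sum of the outputs with respect to the input port *)
net[x, NetPortGradient["Input"]]

(* impose the output gradient {1., 0.}, so that only the first output contributes *)
net[<|"Input" -> x, NetPortGradient["Output"] -> {1., 0.}|>, NetPortGradient["Input"]]

(* for a linear layer, the two results above should agree, up to machine
   precision, with the column totals and the first row of the weight matrix *)
w = Normal[NetExtract[net, "Weights"]];
{Total[w], First[w]}

(* gradient with respect to a learned parameter, applied to a batch of two
   inputs; the result is the sum of the per-example gradients *)
net[{x, 2 x}, NetPortGradient["Weights"]]
```

Because a linear layer computes W.x + b, the gradient of the summed output with respect to x depends only on W, which makes it a convenient case for checking the "scalar sum" and "dot product with ograd" descriptions by hand.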

Examples


Basic Examples  (3)

Create an elementwise layer and return the derivative of the output with respect to the input:

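
A sketch of what these inputs might look like. The elementwise function Ramp is an assumption, chosen because it is consistent with the zero derivative for negative values noted below; the port name "Input" and the test value 2. are likewise assumed.

```wolfram
(* Ramp[x] is x for positive x and 0 otherwise *)
elem = ElementwiseLayer[Ramp]

(* derivative of the output with respect to the input at a positive value *)
elem[2., NetPortGradient["Input"]]
```

Mathematically, the slope of Ramp at any positive input is 1, so that is the value to expect here.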

The derivative for negative values is zero:

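
The same check at a negative input, again assuming an ElementwiseLayer built from Ramp (the function and the value -2. are illustrative choices):

```wolfram
elem = ElementwiseLayer[Ramp];

(* Ramp is constant (zero) for negative inputs, so its derivative there is zero *)
elem[-2., NetPortGradient["Input"]]
```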

Randomly initialize a LinearLayer and return the derivative of the output with respect to the input for a specific input value:

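
A minimal sketch of this step, assuming a layer with a two-element input and a single output; the sizes and the input value {2., 3.} are illustrative assumptions.

```wolfram
(* randomly initialize a linear layer mapping 2 inputs to 1 output *)
linear = NetInitialize[LinearLayer[1, "Input" -> 2]]

(* derivative of the output with respect to each input component *)
linear[{2., 3.}, NetPortGradient["Input"]]
```

Since the layer is linear, the gradient does not depend on the particular input value; it should simply reproduce the layer's weights.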

Compare with a naive numeric calculation:

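
One way such a comparison could be written is with forward differences. The step size h, the point x0 and the helper name numeric are assumptions made for this sketch, and the layer is re-created here so that the block runs on its own.

```wolfram
linear = NetInitialize[LinearLayer[1, "Input" -> 2]];
x0 = {2., 3.};
h = 0.01;

(* forward difference along each of the two input coordinates *)
numeric = Table[
  First[linear[x0 + h UnitVector[2, i]] - linear[x0]]/h,
  {i, 2}]

(* gradient from NetPortGradient, for comparison *)
linear[x0, NetPortGradient["Input"]]
```

The two results should agree only approximately, since nets are evaluated in limited floating-point precision and the finite difference amplifies rounding error by 1/h.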

Create a binary cross-entropy loss layer:


Evaluate the loss for an input and target:

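
A sketch of creating the layer and evaluating it, assuming the layer infers scalar port shapes from the data it is given and uses the port names "Input" and "Target"; the values 0.1 and 0.5 are illustrative.

```wolfram
(* binary cross-entropy: the input and target are both probabilities *)
loss = CrossEntropyLossLayer["Binary"]

(* loss for an input probability of 0.1 against a target of 0.5 *)
loss[<|"Input" -> 0.1, "Target" -> 0.5|>]
```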

The derivative with respect to the input is negative because the input is below the target (increasing the input would lower the loss):


The derivative with respect to the input is positive when the input is above the target:


The derivative is zero when the input and target are equal:

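
The three cases (input below, above and equal to the target) can be sketched together. The helper grad and the specific probabilities are hypothetical, introduced only for this illustration, and the layer is assumed to infer scalar shapes from the data.

```wolfram
loss = CrossEntropyLossLayer["Binary"];

(* hypothetical helper: derivative of the loss with respect to the input p for target t *)
grad[p_, t_] := loss[<|"Input" -> p, "Target" -> t|>, NetPortGradient["Input"]]

(* input below, above and equal to the target *)
{grad[0.1, 0.5], grad[0.9, 0.5], grad[0.5, 0.5]}
```

For the binary cross-entropy -t Log[p] - (1 - t) Log[1 - p], the derivative with respect to p is (p - t)/(p (1 - p)), which is negative for p < t, positive for p > t and zero at p = t, matching the signs described above.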

Scope  (2)

Properties & Relations  (1)

Possible Issues  (1)

Introduced in 2017 (11.1) | Updated in 2018 (11.3)