NetPortGradient["port"] represents the gradient of the output of a net with respect to the value of the specified input port.

NetPortGradient["param"] represents the gradient of the output with respect to a learned parameter named param.

NetPortGradient[{layer1,layer2,…,"param"}] represents the gradient with respect to a parameter at a specific position in a net.

# Details

• net[data,NetPortGradient[iport]] can be used to obtain the gradient with respect to the input port iport for a net applied to the specified data.
• NetPortGradient can be used to calculate gradients with respect to both learned parameters of a network and inputs to the network.
• For a net with a single scalar output port, the gradient returned when using NetPortGradient is the gradient in the ordinary mathematical sense: for a net computing $f(x,\theta)$, where $x$ is the array whose gradient is being calculated and $\theta$ represents all other arrays, the gradient is an array of the same rank as $x$, whose components are given by $\partial f/\partial x_{i_1 \ldots i_n}$. Intuitively, the gradient at a specific value of $x$ is the "best direction" in which to perturb $x$ if the goal is to increase $f$, where the magnitude of the gradient is proportional to the sensitivity of $f$ to changes in $x$.
• For a net with vector or array outputs, the gradient returned when using NetPortGradient is the ordinary gradient of the scalar sum of all outputs. Imposing a gradient at the output using the syntax <|…,NetPortGradient[oport]->ograd|> is equivalent to replacing this scalar sum with a dot product between the output and ograd.
• Using NetPortGradient to calculate the gradient with respect to the learned parameters of the net will return the sum of the gradient over the input batch.
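
The three conventions above can be written out explicitly. This is only a sketch in notation chosen here for illustration: $y = f(x,\theta)$ is the net output with components $y_j$, $g$ stands for the array supplied as ograd, and $x^{(b)}$ is the $b$-th example in a batch of $B$ inputs.

```latex
\begin{aligned}
\text{default (array output):}\quad
  & \frac{\partial}{\partial x_i} \sum_j y_j \\
\text{imposed output gradient } g:\quad
  & \frac{\partial}{\partial x_i} \sum_j g_j\, y_j \\
\text{learned parameter, batch of } B \text{ inputs:}\quad
  & \sum_{b=1}^{B} \frac{\partial}{\partial \theta_k} \sum_j y_j\big(x^{(b)}, \theta\big)
\end{aligned}
```

Setting every $g_j = 1$ in the second line recovers the first, which is why an output gradient of all 1s behaves like the default.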

# Examples

## Basic Examples (3)

Create an elementwise layer and return the derivative of the output with respect to the input:

The derivative for negative values is zero:
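
The code for this example is not shown, so the following is a minimal sketch rather than the original cell. It assumes the elementwise function is Ramp (consistent with a zero derivative for negative values) and uses a hypothetical length-2 vector input so that a positive and a negative value are evaluated together; the names `ramp` and `rampGrad` are illustrative.

```wolfram
(* Assumption: the layer applies Ramp, i.e. max(0, x), to each element. *)
ramp = ElementwiseLayer[Ramp, "Input" -> 2];

(* Gradient of the summed output with respect to the input port.
   The first element is positive, the second negative. *)
rampGrad = ramp[{2., -3.}, NetPortGradient["Input"]]
```

Under these assumptions the second component of `rampGrad` corresponds to the negative input, where Ramp is flat.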

Randomly initialize a LinearLayer and return the derivative of the output with respect to the input for a specific input value:

Compare with a naive numeric calculation:
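
A sketch of such a comparison, under assumed sizes (3 inputs, 2 outputs) and an assumed step size `eps`; all variable names are illustrative. Because the layer has a vector output, the gradient is taken of the sum of the outputs, so the finite-difference estimate differentiates `Total` of the output.

```wolfram
(* Hypothetical sizes: 3 inputs, 2 outputs. *)
linear = NetInitialize[LinearLayer[2, "Input" -> 3]];
x = {1., 2., 3.};

(* Gradient of the summed output with respect to the input. *)
analytic = linear[x, NetPortGradient["Input"]];

(* Naive forward difference of the summed output, one coordinate at a time. *)
eps = 0.1;
numeric = Table[
   (Total[linear[x + eps*UnitVector[3, i]]] - Total[linear[x]])/eps,
   {i, 3}];

(* Largest disagreement between the two estimates. *)
Max[Abs[analytic - numeric]]
```

Since a linear layer has a constant derivative, the two results should agree up to floating-point rounding regardless of the step size.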

Create a binary cross-entropy loss layer:

Evaluate the loss for an input and target:

The derivative with respect to the input is negative because the input is below the target (increasing the input would lower the loss):

The derivative with respect to the input is positive when the input is above the target:

The derivative is zero when the input and target are equal:
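
The three sign cases above follow from the usual binary cross-entropy formula. The derivation below assumes the layer computes the standard expression for an input probability $p$ and a target $t$ (symbols introduced here for illustration), with $0 < p < 1$.

```latex
\begin{aligned}
L(p, t) &= -\,t \log p - (1 - t)\log(1 - p) \\
\frac{\partial L}{\partial p}
  &= -\frac{t}{p} + \frac{1 - t}{1 - p}
   = \frac{p - t}{p\,(1 - p)}
\end{aligned}
```

The denominator is positive, so the derivative has the sign of $p - t$: negative below the target, positive above it, and zero when the two are equal.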

## Scope (2)

Randomly initialize a chain of linear layers:

Return the derivative of the output with respect to the weights and biases of the first layer, for a specific input value:

Calculate the gradient at the input:

Impose a gradient at the output:
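
The steps of this example might look like the sketch below. The layer sizes, the input value, and the imposed output gradient `{1., -1.}` are assumptions made for illustration, and the port names "Input" and "Output" are assumed to be the default names of a NetChain.

```wolfram
(* Hypothetical sizes: 4 inputs -> 3 hidden units -> 2 outputs. *)
chain = NetInitialize[
   NetChain[{LinearLayer[3], LinearLayer[2]}, "Input" -> 4]];
x = {1., 2., 3., 4.};

(* Gradients with respect to the parameters of the first layer. *)
weightGrad = chain[x, NetPortGradient[{1, "Weights"}]];
biasGrad = chain[x, NetPortGradient[{1, "Biases"}]];

(* Gradient with respect to the input. *)
inputGrad = chain[x, NetPortGradient["Input"]];

(* Impose a gradient at the output in place of the default. *)
imposedGrad = chain[
   <|"Input" -> x, NetPortGradient["Output"] -> {1., -1.}|>,
   NetPortGradient["Input"]];

(* Each gradient has the shape of the array it refers to. *)
Dimensions /@ {weightGrad, biasGrad, inputGrad, imposedGrad}
```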

Chain a PoolingLayer to a PartLayer that extracts the red color channel:

Calculate the input gradient on a test image:

Visualize the gradient, which shows that the total output of this net would increase only if pixels far from the existing red or white areas were reddened:

## Properties & Relations (1)

The default gradient assumed at an output array is an array consisting of 1s.

Randomly initialize a chain of linear layers:

Calculate the gradient at the input:

This is equivalent to imposing a gradient at the output that consists of all 1s:

Impose a different gradient at the output:
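
One way to check both statements is sketched below, with assumed layer sizes and an assumed output gradient `ograd`. For a chain of two linear layers, the input gradient under an imposed output gradient can also be written by hand as the transposed weight matrices applied to `ograd`, which gives an independent comparison. All variable names are illustrative.

```wolfram
(* Hypothetical sizes: 4 inputs -> 3 hidden units -> 2 outputs. *)
chain = NetInitialize[
   NetChain[{LinearLayer[3], LinearLayer[2]}, "Input" -> 4]];
x = {1., 2., 3., 4.};

(* Default gradient versus an explicitly imposed gradient of all 1s. *)
defaultGrad = chain[x, NetPortGradient["Input"]];
onesGrad = chain[
   <|"Input" -> x, NetPortGradient["Output"] -> {1., 1.}|>,
   NetPortGradient["Input"]];
diffOnes = Max[Abs[defaultGrad - onesGrad]];

(* A different imposed gradient, compared with the hand-derived
   expression Transpose[w1] . Transpose[w2] . ograd. *)
ograd = {2., -1.};
imposed = chain[
   <|"Input" -> x, NetPortGradient["Output"] -> ograd|>,
   NetPortGradient["Input"]];
w1 = Normal[NetExtract[chain, {1, "Weights"}]];
w2 = Normal[NetExtract[chain, {2, "Weights"}]];
diffImposed = Max[Abs[imposed - Transpose[w1] . Transpose[w2] . ograd]];

{diffOnes, diffImposed}
```

Both differences should be zero up to floating-point rounding if the conventions described above hold.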

## Possible Issues (1)

The gradient for any learned parameters in the net will be totaled over a batch of input examples:
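
A sketch of this behavior, using an assumed LinearLayer with 3 inputs and 2 outputs and a hypothetical batch of two examples. The batched parameter gradient is compared with the sum of the gradients computed one example at a time; the variable names are illustrative.

```wolfram
(* Hypothetical sizes: 3 inputs, 2 outputs. *)
linear = NetInitialize[LinearLayer[2, "Input" -> 3]];
batch = {{1., 2., 3.}, {4., 5., 6.}};

(* One weight gradient is returned for the whole batch,
   not one per example. *)
batchGrad = linear[batch, NetPortGradient["Weights"]];

(* Sum of the per-example weight gradients, for comparison. *)
summed = Total[linear[#, NetPortGradient["Weights"]] & /@ batch];

{Dimensions[batchGrad], Max[Abs[batchGrad - summed]]}
```

If a per-example parameter gradient is needed, evaluating the net on each example separately, as in `summed` before the `Total`, is one option.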

Introduced in 2017 (11.1) | Updated in 2018 (11.3)