SpatialTransformationLayer[{h,w}]
represents a net layer that applies an affine transformation to an input of size c×h0×w0 and returns an output of size c×h×w.

Details and Options

  • SpatialTransformationLayer exposes the following ports for use in NetGraph etc.:
    "Input"         a rank-3 numerical tensor
    "Parameters"    a numerical vector of length 6
    "Output"        a rank-3 numerical tensor
  • SpatialTransformationLayer[…][<|"Input"->in,"Parameters"->param|>] explicitly computes the output from applying the layer.
  • SpatialTransformationLayer[…][<|"Input"->{in1,in2,…},"Parameters"->{param1,param2,…}|>] explicitly computes outputs for each of the ini and parami.
  • SpatialTransformationLayer is typically used inside NetGraph to focus the attention of a later convolutional network on the best part of the image to perform a specific task.
  • When it cannot be inferred from other layers in a larger net, the option "Input"->{d1,d2,d3} can be used to fix the input dimensions of SpatialTransformationLayer.
  • The six components of the vector provided to the port "Parameters", {zh,sh,th,sv,zv,tv}, represent the parameters in the affine transformation matrix, where zi represents zoom, si skewness and ti translation, and the subscripts h and v indicate horizontal and vertical. The identity transformation is obtained when "Parameters" is {1,0,0,0,1,0}.
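
The role of the six parameters can be written out as a 2×3 matrix. This is a sketch under an assumption: the vector is taken to fill the matrix row by row, which is consistent with the identity {1,0,0,0,1,0} stated above. The symbols x, y, x', y' are introduced here only for illustration, and the coordinate convention the layer uses is not specified in this text.

```latex
\begin{pmatrix} x' \\ y' \end{pmatrix}
=
\begin{pmatrix}
  z_h & s_h & t_h \\
  s_v & z_v & t_v
\end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
```

With z_h = z_v = 1 and all other entries 0, this reduces to x' = x and y' = y, matching the identity transformation.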


Basic Examples  (2)

Create a SpatialTransformationLayer with output size 30×30:
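
A minimal sketch of such a call, assuming the output size is passed as the first argument in the form {h, w}:

```wolfram
(* Layer whose output has spatial size 30×30; input dimensions are left unspecified *)
SpatialTransformationLayer[{30, 30}]
```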

Create a SpatialTransformationLayer that expects an input of size 1×3×3 and returns an output of size 1×2×2:
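
A sketch of one way to write this, assuming the output size {h, w} is the first argument and using the "Input" option described under Details and Options to fix the input dimensions. The variable name layer is chosen here for illustration.

```wolfram
(* Output is 1×2×2: the channel count 1 comes from the input, {2, 2} is the spatial size *)
layer = SpatialTransformationLayer[{2, 2}, "Input" -> {1, 3, 3}]
```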

Apply the layer to an input:
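
A sketch of applying the 1×3×3 to 1×2×2 layer described above, using the association form given under Details and Options. The input values are illustrative, the variable name layer is an assumption, and the layer is redefined so the snippet runs on its own.

```wolfram
(* Layer taking a 1×3×3 input and returning a 1×2×2 output *)
layer = SpatialTransformationLayer[{2, 2}, "Input" -> {1, 3, 3}];

(* Illustrative 1×3×3 input, with the identity parameters {1,0,0,0,1,0} *)
layer[<|
  "Input" -> {{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}},
  "Parameters" -> {1, 0, 0, 0, 1, 0}
|>]
```

Changing the "Parameters" vector (for example the zoom or translation entries) selects a different region of the input to resample into the 2×2 output.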

Scope  (1)

Applications  (1)

Properties & Relations  (1)

See Also

ConvolutionLayer  PoolingLayer  ResizeLayer  ImageAugmentationLayer  NetChain  NetGraph  NetTrain  AffineTransform  ImageResize  ImageTransformation

Introduced in 2017