BatchSize

BatchSize is an option for NetTrain and related functions that specifies the size of a batch of examples to process together.

Details

  • Setting BatchSize->n specifies that n examples should be processed together.
  • The default setting BatchSize->Automatic chooses the batch size based on factors such as the available GPU or system memory.
  • BatchSize can be specified when evaluating a net by writing net[input,BatchSize->n]. This can be important when GPU computation is also specified via TargetDevice->"GPU", as memory is typically more limited in this case.
  • For nets that contain dynamic dimensions (usually specified as "Varying"), the batch size is typically chosen automatically to be 16.
  • The BatchSize used when training can be obtained from a NetTrainResultsObject via the "BatchSize" property.
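
The evaluation-time and results-object usages described above can be sketched as follows. The net, inputs and training data here are made up for illustration, and TargetDevice is left as "CPU" on the assumption that a GPU may not be available; switch it to "GPU" where one is supported.

```wolfram
(* Hypothetical net and inputs, for illustration only *)
net = NetInitialize[LinearLayer[1, "Input" -> 2]];
inputs = RandomReal[1, {1000, 2}];

(* Evaluate the 1000 inputs in batches of 100; use TargetDevice -> "GPU" if available *)
outputs = net[inputs, BatchSize -> 100, TargetDevice -> "CPU"];

(* Train with an explicit batch size and read it back from the results object *)
data = Table[x -> 2. x, {x, RandomReal[1, 500]}];
results = NetTrain[LinearLayer[], data, All, BatchSize -> 64, MaxTrainingRounds -> 5];
results["BatchSize"]
```

Lowering the batch size at evaluation time trades throughput for a smaller peak memory footprint, which is usually the relevant constraint on a GPU.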

Examples


Basic Examples  (1)

Define a single-layer neural network and train this network with a BatchSize of 300:
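
A minimal sketch of such a definition, using synthetic data generated here purely for illustration (a noisy line with slope 2); the data and the name trained are assumptions, not part of the reference page:

```wolfram
(* Synthetic training data: 1000 examples of x -> 2x plus a little noise *)
data = Table[x -> 2. x + RandomReal[{-0.1, 0.1}], {x, RandomReal[{0, 5}, 1000]}];

(* Train a single linear layer, processing 300 examples per batch *)
trained = NetTrain[LinearLayer[], data, BatchSize -> 300]
```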

Predict the value of a new input:
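
A self-contained sketch of this step, again assuming illustrative synthetic data with slope 2 and a hypothetical net name trained:

```wolfram
(* Synthetic training data, for illustration only *)
data = Table[x -> 2. x + RandomReal[{-0.1, 0.1}], {x, RandomReal[{0, 5}, 1000]}];
trained = NetTrain[LinearLayer[], data, BatchSize -> 300];

(* Apply the trained net to an input it has not seen *)
trained[2.5]
```

With this data the prediction should be roughly 2 × 2.5 if training has converged, though the exact value varies from run to run.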

Properties & Relations  (1)

NetTrain typically processes more inputs per second when larger batch sizes are used, at the cost of extra memory usage. Training a simple net with a BatchSize of 1:
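
One way this might look, assuming a made-up dataset of 5000 examples and a short run of two training rounds to keep the timing manageable:

```wolfram
(* Illustrative data: 5000 examples of x -> 2x *)
data = Table[x -> 2. x, {x, RandomReal[1, 5000]}];

(* One example per batch: 5000 parameter updates per round *)
NetTrain[LinearLayer[], data, BatchSize -> 1, MaxTrainingRounds -> 2]
```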

Using a BatchSize of 1000:
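
The same kind of run with the larger batch, sketched under the same assumptions (illustrative data, two training rounds):

```wolfram
(* Illustrative data: 5000 examples of x -> 2x *)
data = Table[x -> 2. x, {x, RandomReal[1, 5000]}];

(* 1000 examples per batch: only 5 parameter updates per round *)
NetTrain[LinearLayer[], data, BatchSize -> 1000, MaxTrainingRounds -> 2]
```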

This can also be seen by returning the mean examples per second processed by NetTrain:
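
A sketch of such a comparison, assuming the training property is named "MeanExamplesPerSecond" (the name used in recent versions; older versions may use a different name) and using illustrative data:

```wolfram
(* Illustrative data: 5000 examples of x -> 2x *)
data = Table[x -> 2. x, {x, RandomReal[1, 5000]}];

(* Return the throughput, rather than the trained net, for each batch size *)
Table[
  b -> NetTrain[LinearLayer[], data, "MeanExamplesPerSecond",
    BatchSize -> b, MaxTrainingRounds -> 2],
  {b, {1, 1000}}
]
```

The absolute numbers depend on the hardware, so only the ratio between the two settings is meaningful.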

Depending on the task, larger batch sizes provide only marginal benefit to final net quality and may exhaust the available memory when training on a GPU. Furthermore, a given amount of training time may be better spent on making more frequent updates using smaller batches, as long as the batch size is still large enough to produce a low-variance estimate of the gradient.

Introduced in 2016 (11.0) | Updated in 2018 (11.3)