TrainingProgressFunction

TrainingProgressFunction is an option for NetTrain that specifies a function to run periodically during training.

Details

  • With the default value of TrainingProgressFunction->None, no function is run.
  • Setting TrainingProgressFunction->f specifies that f[assoc] is evaluated after every training round, where assoc is an association with the following keys:
    "AbsoluteBatch"                   total number of batches processed so far
    "Batch"                           current batch number within this round
    "BatchData"                       the most recent batch of data used to train the net
    "BatchesPerRound"                 the number of batches contained in a single round
    "BatchesPerSecond"                the current training rate in batches per second
    "BatchLoss"                       average loss of most recent batch
    "BatchLossList"                   list of batch losses for each batch update so far
    "BatchMeasurements"               association of training measurements after the last batch update
    "BatchMeasurementsLists"          list of training measurements associations for each batch update so far
    "BatchSize"                       the number of inputs contained in a batch
    "BestValidationRound"             the training round corresponding to the current best net
    "CheckpointingFiles"              list of checkpointing files generated so far
    "Event"                           the last event that occurred
    "ExampleLosses"                   losses taken by each example during training
    "ExamplesPerSecond"               the training rate in input examples per second
    "ExamplesProcessed"               total number of examples processed so far
    "Gradients"                       association between weight position within net and current gradient
    "GradientsRMS"                    root mean square of the weight gradients
    "GradientsVector"                 vector formed by flattening the current value of all weight gradients together
    "InitialLearningRate"             the learning rate at the start of training
    "LearningRate"                    the current learning rate
    "MeanBatchesPerSecond"            the mean number of batches processed per second
    "MeanExamplesPerSecond"           the mean number of input examples processed per second
    "Net"                             current, partially trained network
    "OptimizationMethod"              the name of the optimization method used
    "ProgressFraction"                progress represented as a number between 0 and 1
    "Round"                           current round number
    "RoundLoss"                       average loss of most recent round
    "RoundLossList"                   list of round losses for each round so far
    "RoundMeasurements"               association of training measurements for the training set after the last training round
    "RoundMeasurementsLists"          list of training measurements associations for each round so far
    "TargetDevice"                    the device used for training
    "TimeElapsed"                     time elapsed since training began, in seconds
    "TimeRemaining"                   estimated time remaining, in seconds
    "TotalBatches"                    maximum number of training batches
    "TotalRounds"                     maximum number of training rounds
    "ValidationLoss"                  most recent validation loss
    "ValidationLossList"              list of validation losses for each validation measurement so far
    "ValidationMeasurements"          association of training measurements for the validation set
    "ValidationMeasurementsLists"     list of training measurements associations for each validation measurement so far
    "Weights"                         association of current value of all weights
    "WeightsRMS"                      root mean square of the weights
    "WeightsLearningRateMultipliers"  an association of the learning rate multiplier used for each weight
    "WeightsVector"                   vector formed by flattening the current value of all weights together
  • The keys "ValidationLoss", "LowestValidationLoss", etc. are available only when the option ValidationSet is specified to NetTrain.
  • Setting TrainingProgressFunction->{f,"Interval"->Quantity[n,"unit"]} specifies the interval at which to apply f. Possible forms of "unit" include:
    "Rounds"                          net training rounds
    "Batches"                         training data batches
    "Seconds", "Minutes", "Hours"     absolute time
  • The suboption {f,…,"MinimumInterval"->n} specifies that f should not be applied more frequently than once every n seconds. If unspecified, there is no limit on how frequently f is applied.
  • Setting TrainingProgressFunction->{spec1,spec2,…} specifies multiple functions to evaluate, which can have different intervals.
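
The specification forms described in the details above can be sketched as follows. This is an illustration only: the toy data, the LinearLayer net, the setting MaxTrainingRounds->20 and the names data, losses, printRound, batchCount and countBatch are assumptions introduced here, and only the option syntax is taken from this page.

```wolfram
(* Toy data, chosen only for illustration *)
data = {1 -> 1.9, 2 -> 4.1, 3 -> 6.0, 4 -> 8.1};

(* Form 1: a single function f; here it collects the most recent batch loss on each call *)
losses = {};
NetTrain[LinearLayer[], data, MaxTrainingRounds -> 20,
  TrainingProgressFunction -> (AppendTo[losses, #BatchLoss] &)];

(* Form 2: a function applied once every 5 rounds *)
printRound = Function[info, Print[info["Round"], ": ", info["RoundLoss"]]];
NetTrain[LinearLayer[], data, MaxTrainingRounds -> 20,
  TrainingProgressFunction -> {printRound, "Interval" -> Quantity[5, "Rounds"]}];

(* Form 3: several specifications with different intervals; *)
(* the second is also limited to at most one call every 0.5 seconds *)
batchCount = 0;
countBatch = Function[info, batchCount = info["AbsoluteBatch"]];
NetTrain[LinearLayer[], data, MaxTrainingRounds -> 20,
  TrainingProgressFunction -> {
    {printRound, "Interval" -> Quantity[10, "Rounds"]},
    {countBatch, "Interval" -> Quantity[1, "Batches"], "MinimumInterval" -> 0.5}}];
```

Each function receives the association described above, so any listed key can be read either as info["Key"] or, in a pure function, as #Key.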

Examples

Basic Examples  (1)

Use TrainingProgressFunction to append information about the state of training to a file. Create a log file:

In[1]:=
Out[1]=

Define functions to append the batch number and loss to the log file:

In[2]:=

Define the training data and perform training:

In[3]:=
In[4]:=

Read the log file:

In[5]:=
Out[5]=

Put the saved data into a Dataset:

In[6]:=
Out[6]=

Plot the loss value over time:

In[7]:=
Out[7]=
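
The input cells of this example are not preserved in this text. The following is a minimal sketch of the same workflow, not the original code: the file name, the toy data, the net, the interval of 5 batches and the names logFile, logProgress, trained and records are all assumptions made for illustration.

```wolfram
(* Hypothetical log file in the temporary directory; remove any leftover from an earlier run *)
logFile = FileNameJoin[{$TemporaryDirectory, "training-log.m"}];
If[FileExistsQ[logFile], DeleteFile[logFile]];

(* Append {batch number, batch loss} to the log file on each call *)
logProgress = Function[info,
   PutAppend[{info["AbsoluteBatch"], info["BatchLoss"]}, logFile]];

(* Toy training data and net, chosen only for illustration *)
data = {1 -> 1.9, 2 -> 4.1, 3 -> 6.0, 4 -> 8.1};
trained = NetTrain[LinearLayer[], data, MaxTrainingRounds -> 50,
   TrainingProgressFunction -> {logProgress, "Interval" -> Quantity[5, "Batches"]}];

(* Read the logged expressions back as a list of {batch, loss} pairs *)
records = ReadList[logFile];

(* Put the saved data into a Dataset with named columns *)
Dataset[AssociationThread[{"Batch", "Loss"} -> #] & /@ records]

(* Plot the logged loss against the batch number *)
ListLinePlot[records, AxesLabel -> {"batch", "loss"}]
```

Writing with PutAppend keeps each record as a separate Wolfram Language expression, so ReadList can recover the pairs without any parsing.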
Introduced in 2017 (11.1) | Updated in 2019 (12.0)