10.1.2 UnsupervisedNetFit
Unsupervised networks are trained with UnsupervisedNetFit. You can either submit an already-existing unsupervised model or have a new network initialized by indicating the number of codebook vectors. You can also indicate the number of training iterations. If left out, the default number of iterations (30) is applied.
Train an unsupervised network.
An existing network can be submitted for more training by setting net equal to the network or its training record. The advantage of submitting the training record is that the information about the first training is combined with the additional training.
UnsupervisedNetFit returns a list of two variables. The first output is the trained unsupervised network. It consists of an object with head UnsupervisedNet. The second output, the training record with head UnsupervisedNetRecord, contains logged information about the training. It can be used to analyze the progress of the training, and to validate the model using the command NetPlot. You can also extract intermediate information from the training as described in Section 7.8, The Training Record.
During the training, intermediate results are displayed in a separate notebook, which is created automatically. After each training iteration the mean distance between the data vectors and the closest codebook vector is written out. Using the options of UnsupervisedNetFit, as described in Section 7.7, Options Controlling Training Results Presentation, you can change the way the training results are presented.
The necessary number of training iterations is strongly dependent on the particular problem. Depending on the number of data vectors, their distribution, and the number of codebook vectors, you might need more iterations. At the end of the training, the decrease of the mean distance is shown in a plot. You can use this plot to decide if more training iterations are necessary.
Sometimes you also receive a warning at the end of the training saying that there is at least one codebook vector that is not used by the data. This indicates that there are nuisance codebook vectors, or dead neurons, that do not have any effect on the training data. In general you do not want any dead codebook vectors, and there are various measures you can take. For example, you can:
Reinitialize the unsupervised network using the option UseSOM -> True. This usually gives a better initialization, as described later.
Repeat the training from a different initialization. The initialization and training contain some randomness and by repeating these commands you obtain a new realization that might be better.
Change the size of the unsupervised network by changing the number of codebook vectors in the initialization.
Identify the unused codebook vectors with UnUsedNeurons and remove them using NeuronDelete.
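The idea of a dead codebook vector can be illustrated outside the package. The following Python sketch is not the implementation of UnUsedNeurons or NeuronDelete; the function names, the squared Euclidean distance, and the 0-based indices are assumptions made for illustration. It finds the codebook vectors that are closest to no data vector and removes them.

```python
def nearest_index(x, codebook):
    """Return the 0-based index of the codebook vector closest to x."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))

def unused_codebook_vectors(data, codebook):
    """Return the indices of codebook vectors that win for no data vector."""
    winners = {nearest_index(x, codebook) for x in data}
    return [i for i in range(len(codebook)) if i not in winners]

def delete_codebook_vectors(codebook, indices):
    """Return a copy of the codebook without the listed indices."""
    drop = set(indices)
    return [w for i, w in enumerate(codebook) if i not in drop]

# Hypothetical data: two small clusters, plus one codebook vector far from both.
data = [[0.0, 0.1], [0.2, 0.0], [1.0, 1.1], [0.9, 1.0]]
codebook = [[0.1, 0.0], [1.0, 1.0], [5.0, 5.0]]

dead = unused_codebook_vectors(data, codebook)
pruned = delete_codebook_vectors(codebook, dead)
print(dead, pruned)
```

Here the third codebook vector is never the closest one, so it is reported as unused and dropped.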
UnsupervisedNetFit takes basically the same options as InitializeUnsupervisedNet, but the default values are different.
Options of UnsupervisedNetFit.
The options CriterionPlot, CriterionLog, CriterionLogExtN, ReportFrequency, and MoreTrainingPrompt are common with the other training commands in the Neural Networks package, and they are described in Section 7.7, Options Controlling Training Results Presentation.
By giving new values to SOM and Connect in the call to UnsupervisedNetFit, it is possible to change the neighbor map of an existing unsupervised network. Examples of how this is done can be found in Section 10.3.3, Adding a SOM to an Existing Unsupervised Network.
The options NeighborStrength and Neighbor only influence the algorithm if the unsupervised network has a neighbor map attached to it. Examples illustrating these options are given in Section 10.4, Change Step Length and Neighbor Influence.
The options Recursive, StepLength, NeighborStrength, and Neighbor are used to modify the training algorithm. They are of a more advanced nature and are further described in this section.
An unsupervised network can be evaluated on one data vector, or a list of data vectors, using the function evaluation rule. The output is a list containing the number of the codebook vector closest to the data vector. This evaluation rule is actually all you need to start using the unsupervised network.
Function evaluation of an unsupervised network.
The input argument x can be a vector containing one input sample or a matrix containing one input sample in each row.
The function evaluation rule also has an option.
Option of the evaluation of an unsupervised network.
The default Automatic is changed to True or False depending on whether or not the unsupervised network has a SOM feature. If so, then the default gives the position of the winning codebook vector within the SOM structure. If you supply the option SOM -> False then the SOM feature is not used in the evaluation, and you receive the number of the winning codebook vector. This is illustrated in Section 10.3.1, Mapping from Two to One Dimensions.
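The evaluation rule can be pictured with a short Python sketch. This is only an illustration of the described behavior, not the package's code: the function names are hypothetical, distances are assumed to be Euclidean, and the row-by-row numbering used to turn a codebook number into a SOM position is an assumption.

```python
def winner(x, codebook):
    """Return the 1-based number of the codebook vector closest to x."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, w)) for w in codebook]
    return dists.index(min(dists)) + 1

def evaluate(x, codebook, som_shape=None):
    """Evaluate one sample (a vector) or several samples (one per row)."""
    samples = x if isinstance(x[0], (list, tuple)) else [x]
    numbers = [winner(s, codebook) for s in samples]
    if som_shape is None:
        return numbers  # number of the winning codebook vector
    rows, cols = som_shape
    # Assumed row-major layout of the codebook vectors in the SOM grid.
    return [((n - 1) // cols + 1, (n - 1) % cols + 1) for n in numbers]

codebook = [[0, 0], [0, 1], [1, 0], [1, 1]]
single = evaluate([0.9, 0.1], codebook)
several = evaluate([[0.1, 0.9], [0.9, 0.9]], codebook)
position = evaluate([0.9, 0.1], codebook, som_shape=(2, 2))
print(single, several, position)
```

Without a grid shape the result is the number of the winning codebook vector; with a grid shape it is the winner's position in the grid.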
Details and Algorithms
This section describes the more advanced options for UnsupervisedNetFit. They can be used to modify the training algorithm from the default version in a way that might better suit your problem.
The codebook vectors can either be adapted in a recursive manner, considering one data sample in each update, or in batch mode where all data is used at each step. The algorithm to be used is indicated by the Recursive option. Also, the algorithm will vary depending on whether or not a neighbor feature is applied.
The recursive algorithm for unsupervised networks (Standard competitive learning rule):
Given N data vectors {x_k}, k=1,...,N, in each update, the following steps are performed.
1. k is chosen randomly from a uniform integer distribution between 1 and N, where the whole range is considered each time this step is executed.
2. The codebook vector closest to x_k, called the winning neuron, or the winning codebook vector, is identified. Its index is indicated by i.
3. The winning codebook vector w_i is changed according to
$$w_i \leftarrow w_i + \mathrm{SL}[n]\,(x_k - w_i)$$
where n is the iteration number.
4. The described steps are repeated N times in each iteration.
Abbreviations have been used; SL[n] is the StepLength function, and it can be changed by the option with the same name.
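The four steps can be sketched in Python as follows. This is a minimal illustration of the standard competitive learning rule under stated assumptions, not the package's implementation: the function names are hypothetical, the iteration counter is assumed to start at 1, and the step length is a port of the default recursive StepLength function quoted in this section.

```python
import math
import random

def step_length(n):
    """Default recursive step length quoted in this section."""
    return 0.01 if n < 5 else 2.0 / (3 + n)

def mean_distance(data, codebook):
    """Mean distance from each data vector to its closest codebook vector."""
    return sum(min(math.dist(x, w) for w in codebook) for x in data) / len(data)

def train_recursive(data, codebook, iterations=30, seed=0):
    """Competitive learning: move only the winner toward a random sample."""
    rng = random.Random(seed)
    codebook = [list(w) for w in codebook]  # do not modify the caller's list
    n_data = len(data)
    for n in range(1, iterations + 1):
        sl = step_length(n)
        for _ in range(n_data):                      # step 4: N updates
            x = data[rng.randrange(n_data)]          # step 1: random k
            i = min(range(len(codebook)),            # step 2: winner
                    key=lambda j: math.dist(x, codebook[j]))
            codebook[i] = [w + sl * (a - w)          # step 3: update winner
                           for a, w in zip(x, codebook[i])]
    return codebook

# Hypothetical data: two well-separated clusters.
data = [[0, 0], [0, 1], [1, 0], [1, 1], [4, 4], [4, 5], [5, 4], [5, 5]]
start = [[2.0, 2.0], [3.0, 3.0]]
trained = train_recursive(data, start)
print(mean_distance(data, start), mean_distance(data, trained))
```

The mean distance printed here is the same quantity that is reported after each training iteration, so the second value should be clearly smaller than the first.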
If the unsupervised network contains a neighbor feature, then the following recursive algorithm applies.
The recursive algorithm for SOM (Kohonen's algorithm):
Given N data vectors {x_k}, k=1,...,N, in each update, the following steps are performed.
1. k is chosen randomly from a uniform integer distribution between 1 and N, where the whole range is considered each time this step is executed.
2. The codebook vector closest to x_k, called the winning neuron, or the winning codebook vector, is identified. Its index is indicated by {i_win, j_win}.
3. All the codebook vectors w_{i,j} are changed according to
$$w_{i,j} \leftarrow w_{i,j} + \mathrm{SL}[n]\,\exp\!\bigl(-\mathrm{NS}[n]\,\mathrm{NM}[[\,c_1 - i_{win} + i,\; c_2 - j_{win} + j\,]]\bigr)\,(x_k - w_{i,j})$$
where n is the iteration number and {c_1, c_2} is the center position of the neighbor matrix, the value given with the SOM option.
4. The described steps are repeated N times in each iteration.
Abbreviations have been used; SL[n] is the StepLength function and NS[n] is the NeighborStrength function. They can both be changed by the options with the same names. NM is the neighbor matrix dictating which codebook vectors are neighbors, and it can also be chosen by the user. All codebook vectors are changed toward the data vector. The neighbor matrix NM should have its minimum at its center element {c_1, c_2}, so that the winning neuron update is most pronounced. Typically, the elements of NM further away from the center take larger values, so that codebook vectors further away from the winner are changed less. One iteration of the stochastic algorithm (that is, n incremented by 1) consists of N updates via the rule in step 3. Note that because k is chosen independently in each update, it is not determined in advance which data vectors are used, or how often, within one iteration.
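A single update of the neighbor-based algorithm can be sketched in Python for a one-dimensional map. The sketch rests on assumptions: each codebook vector is moved by the fraction SL * exp(-NS * NM[offset]), which matches the description above (zero at the center gives the full step, larger entries give smaller steps), and the neighbor matrix is taken to be symmetric with entries equal to the absolute offset from the center. All names are hypothetical.

```python
import math

def symmetric_neighbor_matrix(c):
    """Assumed 1-D neighbor matrix of length 2c-1 with zero at the center."""
    return [abs(d) for d in range(-(c - 1), c)]

def som_update(codebook, x, sl, ns, nm):
    """Apply one update to every codebook vector of a 1-D map."""
    c = len(codebook)
    center = c - 1                                   # 0-based center of nm
    win = min(range(c), key=lambda j: math.dist(x, codebook[j]))
    updated = []
    for j, w in enumerate(codebook):
        gain = sl * math.exp(-ns * nm[center - win + j])
        updated.append([wi + gain * (xi - wi) for xi, wi in zip(x, w)])
    return win, updated

codebook = [[0.0], [1.0], [2.0]]
nm = symmetric_neighbor_matrix(len(codebook))        # [2, 1, 0, 1, 2]

# Moderate neighbor strength: the neighbors follow the winner, but less.
win, soft = som_update(codebook, [1.2], sl=0.5, ns=0.1, nm=nm)
# Very large neighbor strength: only the winner moves.
_, hard = som_update(codebook, [1.2], sl=0.5, ns=1000.0, nm=nm)
print(win, soft, hard)
```

With ns=0.1 all three vectors move toward the sample; with ns=1000 the exponential underflows to zero for the neighbors, so only the winner changes.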
With Recursive -> False all data is used in each iteration, and the training follows a deterministic scheme where the mean of the update in step 3 of the applicable algorithm, taken over all data {x_k}, k=1,...,N, is used. In this case, the unsupervised training without a neighbor feature becomes equivalent to what is called k-means clustering.
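The connection to k-means can be seen in a small Python sketch of one batch step without a neighbor feature. It is an interpretation, not the package's code: the mean of the update is assumed to be taken over the data vectors for which a given codebook vector wins, and the function name is hypothetical. With step length 1 each codebook vector then lands exactly on the mean of its data, which is the k-means update.

```python
def batch_step(data, codebook, sl=1.0):
    """One batch update: move each codebook vector toward its cluster mean."""
    groups = [[] for _ in codebook]
    for x in data:
        i = min(range(len(codebook)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(x, codebook[j])))
        groups[i].append(x)
    updated = []
    for w, group in zip(codebook, groups):
        if not group:                     # unused codebook vector stays put
            updated.append(list(w))
            continue
        mean = [sum(col) / len(group) for col in zip(*group)]
        updated.append([wi + sl * (mi - wi) for wi, mi in zip(w, mean)])
    return updated

data = [[0, 0], [0, 1], [1, 0], [1, 1], [4, 4], [4, 5], [5, 4], [5, 5]]
codebook = [[2.0, 2.0], [3.0, 3.0]]
once = batch_step(data, codebook)
twice = batch_step(data, once)
print(once, twice)
```

For this data the first step already places the codebook vectors at the two cluster means, and a second step leaves them unchanged.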
The intended use of UnsupervisedNetFit is to employ the recursive algorithm at the beginning of the training, and then, possibly, take a few steps with the batch algorithm to fine-tune the neurons. When UnsupervisedNetFit is used in other ways you should consider changing the two options StepLength and NeighborStrength.
The StepLength option:
The StepLength option has the default value Automatic. Depending on the value of the Recursive option this is changed into one out of two different functions.
Recursive -> True: Function[n, If[n<5, 0.01, 2./(3+n)]]
Recursive -> False: Function[n, 1]
In the recursive case, the step length is small during the first iterations so that the codebook vectors find a good orientation. Then, the step length is increased to speed up the convergence. From this higher value, the step length is then slowly decreased again. Convergence can only be guaranteed if the step length converges toward zero.
For a batch update, the step length is set to one for all iterations. This is a good choice if the codebook vectors are close to their optimal values, so that the step becomes small anyway. The equation may become unstable if such a large step length is used when this is not the case.
You can choose other step lengths in two different ways. A constant step length is obtained by giving StepLength a numerical value in the range {0,1}. The other possibility is to set the option to a function that takes the iteration number as input, and delivers a numerical value as output. In Section 10.4, Change Step Length and Neighbor Influence, you find an example showing how the step length may be changed.
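To make the shape of the default schedule concrete, the two default functions can be ported to Python and tabulated. The ports follow the expressions quoted above; the helper that accepts either a constant or a function mirrors the two ways of setting the option and is a hypothetical name.

```python
def step_length_recursive(n):
    """Port of Function[n, If[n<5, 0.01, 2./(3+n)]]."""
    return 0.01 if n < 5 else 2.0 / (3 + n)

def step_length_batch(n):
    """Port of Function[n, 1]."""
    return 1.0

def as_step_function(step):
    """Accept a constant step length or a function of the iteration number."""
    return step if callable(step) else (lambda n: step)

schedule = [step_length_recursive(n) for n in range(1, 11)]
constant = as_step_function(0.1)
print(schedule)
print(constant(7))
```

The printed schedule starts at 0.01, jumps to 0.25 at n=5, and then decreases slowly, as described for the recursive case.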
The NeighborStrength option:
The NeighborStrength option works similarly to StepLength, but it is active only if there is a neighbor feature attached to the network. Depending on the Recursive option the default Automatic is changed into:
Recursive -> True: Function[n, If[n<5, 0.1, (n-4)/10.]]
Recursive -> False: Function[n, 1000.]
In the recursive case, during the first iterations (n<5), the neighbors of the winning neuron are influenced strongly (the neighbor strength has a low value > 0). This helps the network to conform to a nice structure and avoid "knots." Subsequently, the influence on the neighbors gradually decreases (the value increases).
When the algorithm is applied in batch mode, the neighbor strength function has a constant value of 1000, which imparts only a negligible influence on the neighboring codebook vectors. Therefore, in batch mode, only the winning neurons are adapted. This is good when the batch mode is used to fine-tune the final positions of the codebook vectors, after the recursive training has been applied.
A positive constant neighbor strength can be specified using NeighborStrength. You can also use any function that takes the iteration number as input and gives the neighbor strength as output. In Section 10.4, Change Step Length and Neighbor Influence, you find an example showing how the NeighborStrength option can be changed.
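The effect of the neighbor strength can be illustrated numerically in Python. The default functions are ported from the expressions above; the weight exp(-NS * NM) applied to a neighbor is an assumption consistent with the description that a larger value means less neighbor influence. The function names are hypothetical.

```python
import math

def neighbor_strength_recursive(n):
    """Port of Function[n, If[n<5, 0.1, (n-4)/10.]]."""
    return 0.1 if n < 5 else (n - 4) / 10.0

def neighbor_strength_batch(n):
    """Port of Function[n, 1000.]."""
    return 1000.0

def neighbor_weight(ns, distance):
    """Assumed relative update weight for a neighbor-matrix entry `distance`."""
    return math.exp(-ns * distance)

# Weight of the nearest neighbor (neighbor-matrix entry 1) at different stages.
early = neighbor_weight(neighbor_strength_recursive(1), 1)
late = neighbor_weight(neighbor_strength_recursive(30), 1)
batch = neighbor_weight(neighbor_strength_batch(1), 1)
print(early, late, batch)
```

Under this assumption the nearest neighbor receives most of the winner's step early in the recursive training, a small fraction late in the training, and nothing in batch mode.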
The Neighbor option:
As is the case with NeighborStrength, the Neighbor option also has no meaning unless a neighbor feature is attached to the unsupervised network. The Neighbor option lets you specify which neurons, or codebook vectors, are neighbors. There are two standard possibilities that are specified by setting the Neighbor option to NonSymmetric (default) or Symmetric. The nonsymmetric choice gives a stronger connection to the neighbors on one side than on the other side. This should make it easier to avoid "knots" on the map. With these standard choices, neighbor matrices of the correct dimensions are computed internally. The symmetric option gives a neighbor matrix
for a one-dimensional network with three codebook vectors. The zero is at the center position, and it corresponds to the winning codebook vector. The matrix has larger values away from the center position in both directions. Since the size of the matrix elements indicates the distance between the codebook vector and the winning neuron, a larger value means that the distance is also larger. The nonsymmetric alternative gives
in a one-dimensional network with three codebook vectors. For a two-dimensional network of size {3,4} you obtain the following neighbor matrices with the nonsymmetric alternative
and with the symmetric alternative
You can use the Neighbor option and submit your own neighbor matrix. It should then have dimensions {2 c_1 - 1, 2 c_2 - 1}, where {c_1, c_2} are the dimensions of the SOM map. In Section 10.4, Change Step Length and Neighbor Influence, you will find an example showing how the Neighbor option can be changed.
The Connect option:
If Connect is changed to True, then the neighbor matrix is changed so that the SOM network is connected into a ring in the one-dimensional case, and into a cylinder in the two-dimensional case (in the first of the two dimensions). This holds only if you use one of the two values NonSymmetric or Symmetric for the Neighbor option. If you instead supply your own neighbor matrix, then the Connect option does not have any meaning, and you have to specify the neighbor matrix directly so that it corresponds to a ring. The neighbor matrix generated by setting Connect -> True and Neighbor -> Symmetric for a one-dimensional SOM network with six codebook vectors is
The element values indicate the distance from the winning neuron, which is in the center, that is, it has distance zero from itself. The codebook vectors five positions away from the winner have distance one, the same as the codebook vector in the position next to the winner. Therefore, one end of the line is connected to the other end.
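The ring connection can be sketched in Python for the one-dimensional case. The sketch assumes that the symmetric neighbor matrix holds integer distances and that connecting the ends replaces each distance by the shorter way around the ring; the function names are hypothetical and the code is not taken from the package.

```python
def line_neighbor_matrix(c):
    """Assumed symmetric 1-D neighbor matrix for c codebook vectors."""
    return [abs(d) for d in range(-(c - 1), c)]

def ring_neighbor_matrix(c):
    """Same matrix with the two ends of the line connected into a ring."""
    return [min(abs(d), c - abs(d)) for d in range(-(c - 1), c)]

line = line_neighbor_matrix(6)
ring = ring_neighbor_matrix(6)
print(line)  # [5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5]
print(ring)  # [1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1]
```

In the ring version the outermost entries have distance one, the same as the entries next to the center, which is what joins the two ends of the line.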
