Next: batchMode.m and batchModeNewton.m
Up: Base Modules
Previous: TODO.m
Training functions and WeightUpdateGlob.m
There is not a single module called WeightUpdate.m; rather, a number of
modules act as training functions that update the weights. The user selects
among these in ProbSpec.m by setting the variable trainType (see Section
4.1.1), and MainLoop.m then picks the function according to this value
(see Section 4.1.5).
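As a hypothetical sketch (in Python rather than MATLAB, and with placeholder trainType values that are not taken from ProbSpec.m), the selection step amounts to a simple dispatch on trainType:

```python
# Hypothetical illustration of the trainType dispatch done in MainLoop.m.
# The key strings below are placeholders, not the actual values that
# ProbSpec.m recognizes.
def select_update(train_type):
    updates = {
        "delta": "standard generalized delta rule",
        "dawson": "value-unit rule with the epsilon term",
    }
    try:
        return updates[train_type]
    except KeyError:
        raise ValueError(f"unknown trainType: {train_type!r}")

# Example: picking the value-unit training rule
rule = select_update("dawson")
```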
There is, however, WeightUpdateGlob.m, which contains the global variable
definitions introduced and commonly used in the training functions. Here is
a description of these global variables:
- actOutsL1
- Actual outputs of layer 1 (hidden layer),
i.e. results of the activation function given the net input of the incoming
connections.
- actOutsL2
- Actual outputs of layer 2 (output layer),
i.e. results of the activation function given the net input of the incoming
connections.
- benefL2
- Difference between the actual and desired
outputs, i.e. the immediate error term.
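To make the relationships among these globals concrete, here is an illustrative NumPy sketch; the layer sizes, the sigmoid activation, and the sign convention of benefL2 are assumptions for illustration, not taken from the MATLAB sources:

```python
import numpy as np

# Illustrative only: sizes, sigmoid activation, and the sign of benefL2
# are assumptions, not the actual MATLAB implementation.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
inputs = rng.random((4, 3))       # 4 training patterns, 3 input units
wtsL1 = rng.random((3, 5))        # input -> hidden weights
wtsL2 = rng.random((5, 2))        # hidden -> output weights
desired = rng.random((4, 2))      # desired (target) outputs

netInputsL1 = inputs @ wtsL1      # net inputs of layer 1
actOutsL1 = sigmoid(netInputsL1)  # actual outputs of layer 1 (hidden)
netInputsL2 = actOutsL1 @ wtsL2   # net inputs of layer 2
actOutsL2 = sigmoid(netInputsL2)  # actual outputs of layer 2 (output)
benefL2 = actOutsL2 - desired     # immediate error term
```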
Although not declared global, some variables are commonly used across the
training functions. Here are descriptions of the important ones:
- netInputsL1 , netInputsL2
- Net
inputs of layer 1 (hidden layer) and layer 2 (output layer), respectively, i.e.
the sum of all inputs multiplied by their associated weights. This value
is passed to the activation function of the unit to produce the actual output
(see actOutsL1, actOutsL2 above).
- derivsL1 , derivsL2
- Numerical
derivatives of the activation function of the respective layer at the current
net input values.
- dawsonDelta
- The generalized delta rule part of
the marginal weight change for the output layer. It must be multiplied by the
rate parameter (see rate in Section 4.1.1) and by the unit inputs before
being added to the weights.
- dawsonEpsilon
- Additional term in the marginal
weight change, used only with value units on the output layer, introduced
in [DawsonSchopflocher92]. Used in the same way as dawsonDelta, described above.
- standardDeltaL2
- Sum of dawsonDelta
and dawsonEpsilon, described above. This quantity has the same properties
as the terms it is composed of, and is therefore the raw marginal change
on the weight and bias parameters.
- standardDeltaL1
- Same as standardDeltaL2
described above, but for the first (hidden) layer. The error term therefore
has to be back-propagated, by taking the error term of the output layer
(standardDeltaL2) and multiplying it by the weights connecting the output
layer to the hidden layer.
- deltaBiasL1 , deltaBiasL2
- Marginal
change on the bias parameters of the respective layer of the network.
Calculated by multiplying the above-mentioned raw marginal change
(standardDeltaL1, standardDeltaL2) by the rate parameter to get the
marginal change on the parameters of the network.
- deltaWtsL1 , deltaWtsL2
- Marginal
change on the weight parameters of the respective layer of the network.
Calculated by multiplying the above-mentioned dawsonDelta and dawsonEpsilon
values by the rate parameter and the inputs of the units.
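The quantities above can be tied together in a short NumPy sketch. This is an assumed sigmoid-unit formulation for illustration, not the actual MATLAB implementation; for value units, dawsonEpsilon would be nonzero, following [DawsonSchopflocher92]:

```python
import numpy as np

# Illustrative only: sigmoid activations, batch accumulation, and the
# gradient-descent sign convention are assumptions; the variable names
# mirror those described in the text.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
rate = 0.1                                  # learning-rate parameter
inputs = rng.random((4, 3))                 # 4 patterns, 3 input units
wtsL1 = rng.random((3, 5))                  # input -> hidden weights
wtsL2 = rng.random((5, 2))                  # hidden -> output weights
desired = rng.random((4, 2))

# Forward pass
netInputsL1 = inputs @ wtsL1
actOutsL1 = sigmoid(netInputsL1)
netInputsL2 = actOutsL1 @ wtsL2
actOutsL2 = sigmoid(netInputsL2)
benefL2 = actOutsL2 - desired               # immediate error term

# Derivatives of the sigmoid at the current net inputs
derivsL2 = actOutsL2 * (1.0 - actOutsL2)
derivsL1 = actOutsL1 * (1.0 - actOutsL1)

# Generalized delta rule term; minus sign makes the update descend the error
dawsonDelta = -benefL2 * derivsL2
dawsonEpsilon = np.zeros_like(dawsonDelta)  # nonzero only for value units
standardDeltaL2 = dawsonDelta + dawsonEpsilon

# Propagate the output-layer error back through the L2 weights
standardDeltaL1 = (standardDeltaL2 @ wtsL2.T) * derivsL1

# Marginal changes: multiply by the rate and the unit inputs (weights),
# or by the rate alone with a constant bias input of 1 (biases)
deltaWtsL2 = rate * actOutsL1.T @ standardDeltaL2
deltaWtsL1 = rate * inputs.T @ standardDeltaL1
deltaBiasL2 = rate * standardDeltaL2.sum(axis=0)
deltaBiasL1 = rate * standardDeltaL1.sum(axis=0)
```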
The following subsections describe the related training functions and also
the activation functions.
Cengiz Gunay
2000-06-25