Studies on Cardinality of Solutions for Multilayer Nets and a Scaling Method in Hardware Implementations
Hiromu Gotanda*, Hiroshi Shiratsuchi*, Katsuhiro Inoue** and Kousuke Kumamaru**
*Faculty of Engineering, Kinki University in Kyushu
**Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology
This paper shows that multilayer nets of identical structure admit the same cardinality of admissible solutions for learning tasks whose input patterns are related by an affine transform, even if their sigmoid functions differ in polarity and range. This result can be applied to a scaling problem that arises when building nets in analog hardware. When the input patterns and sigmoid functions are multiplied by a scaling factor k, separation and generalization are preserved if the weights are set to 1/k times their original values while the bias values are kept intact. With the initial weights and biases set in this way, the convergence behavior of back propagation (BP) learning in the scaled environment becomes equivalent to that in the original environment, provided that the learning coefficients are multiplied by 1/k^2 for bias updates and by 1/k^4 for weight updates. For ordinary BP learning, in which both weights and biases are initialized by uniform random numbers of identical distribution and the same learning coefficient is used for both weight and bias updates, simulation shows that the initial values resulting in good convergence become fewer as k increases.
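The scaling rule stated above can be checked numerically with a minimal sketch. The following is not the authors' simulation. It assumes a logistic sigmoid whose output is multiplied by k, a squared-error loss, targets that are scaled by k along with the inputs, batch updates, and a small 2-3-1 net on XOR-style patterns. The names `bp_step` and `compare` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def bp_step(X, T, W1, b1, W2, b2, k, eta_w, eta_b):
    """One batch BP step on 0.5 * sum((Y - T)**2) with sigmoids scaled by k."""
    # Forward pass: every unit outputs k * sigmoid(net input).
    S1 = sigmoid(X @ W1 + b1)
    H = k * S1
    S2 = sigmoid(H @ W2 + b2)
    Y = k * S2
    # Backward pass: d(k * sigmoid(u))/du = k * s * (1 - s).
    D2 = (Y - T) * k * S2 * (1.0 - S2)
    D1 = (D2 @ W2.T) * k * S1 * (1.0 - S1)
    # Separate learning coefficients for weights and biases.
    W2n = W2 - eta_w * (H.T @ D2)
    b2n = b2 - eta_b * D2.sum(axis=0)
    W1n = W1 - eta_w * (X.T @ D1)
    b1n = b1 - eta_b * D1.sum(axis=0)
    return W1n, b1n, W2n, b2n

def compare(k=4.0, steps=50, eta=0.5, seed=0):
    """Train original and scaled nets side by side; return the largest
    deviation from the relation W_scaled = W / k, b_scaled = b."""
    rng = np.random.default_rng(seed)
    X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    T = np.array([[0.0], [1.0], [1.0], [0.0]])
    W1 = rng.uniform(-1.0, 1.0, (2, 3))
    b1 = rng.uniform(-1.0, 1.0, 3)
    W2 = rng.uniform(-1.0, 1.0, (3, 1))
    b2 = rng.uniform(-1.0, 1.0, 1)
    orig = (W1, b1, W2, b2)
    # Scaled environment: weights divided by k, biases kept intact.
    scaled = (W1 / k, b1.copy(), W2 / k, b2.copy())
    for _ in range(steps):
        orig = bp_step(X, T, *orig, 1.0, eta, eta)
        # Learning coefficients: 1/k^4 for weights, 1/k^2 for biases.
        scaled = bp_step(k * X, k * T, *scaled, k, eta / k**4, eta / k**2)
    W1o, b1o, W2o, b2o = orig
    W1s, b1s, W2s, b2s = scaled
    return max(
        np.abs(k * W1s - W1o).max(), np.abs(b1s - b1o).max(),
        np.abs(k * W2s - W2o).max(), np.abs(b2s - b2o).max(),
    )

if __name__ == "__main__":
    print("max deviation:", compare(k=4.0))
```

Under these assumptions the bias gradients in the scaled environment are k^2 times the original ones and the weight gradients are k^3 times, so the stated coefficients keep the scaled weights at 1/k of the original weights and the biases identical after every update, up to floating-point rounding.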
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.