Reinforcement Signal Propagation Algorithm for Logic Circuit
Chyon Hae Kim*, Tetsuya Ogata**, and Shigeki Sugano*
*Department of Mechanical Engineering, Waseda University, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
**Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501, Japan
This paper proposes SONE, a group of network elements that self-organizes its network topology, aiming at online, real-time learning and adaptation in robots. SONE consists of node elements and link elements, and it develops the network topology by repeatedly generating and eliminating these elements based on reinforcement signals that are propagated and stored among them. The technique succeeded in simulations in which a mobile robot avoided obstacles, which suggests that it is feasible for online learning.
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
Copyright© 2008 by Fuji Technology Press Ltd. and Japan Society of Mechanical Engineers. All rights reserved.