JRM Vol.20 No.5 pp. 757-774
doi: 10.20965/jrm.2008.p0757


Reinforcement Signal Propagation Algorithm for Logic Circuit

Chyon Hae Kim*, Tetsuya Ogata**, and Shigeki Sugano*

*Department of Mechanical Engineering, Waseda University, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555, Japan

**Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501, Japan

Received: February 16, 2008
Accepted: August 19, 2008
Published: October 20, 2008
Keywords: topology, self-organization, neural network, reinforcement learning, robot

This paper proposes SONE, a group of network elements that self-organizes its network topology, aiming at online, real-time learning and adaptation in robots. SONE consists of node elements and link elements. It develops the network topology by repeatedly generating and eliminating these elements, based on reinforcement signals that are propagated and stored among the elements. The technique proved successful in simulations in which a mobile robot avoided obstacles, which suggests that it is feasible for online learning.
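The generate-and-eliminate cycle described in the abstract can be illustrated with a minimal sketch. This is not the SONE algorithm itself, whose rules are not given on this page. It only assumes that each link stores an accumulated reinforcement value, that a signal is passed backward along the links just used and attenuated per hop, and that links with poor stored values are removed while new links are tried at random. The names `Link`, `Network`, `credit`, `decay`, and `threshold` are hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Link:
    src: int
    dst: int
    credit: float = 0.0  # stored reinforcement (hypothetical field)


@dataclass
class Network:
    n_nodes: int
    links: list = field(default_factory=list)

    def propagate(self, used_links, reward, decay=0.5):
        """Pass a reinforcement signal backward along the links just used.

        The signal is attenuated by `decay` at each hop (an assumption).
        """
        signal = reward
        for link in reversed(used_links):
            link.credit += signal
            signal *= decay

    def eliminate(self, threshold):
        """Remove links whose stored reinforcement is at or below threshold."""
        self.links = [l for l in self.links if l.credit > threshold]

    def generate(self, rng):
        """Try one new random link between two distinct nodes."""
        src, dst = rng.sample(range(self.n_nodes), 2)
        if not any(l.src == src and l.dst == dst for l in self.links):
            self.links.append(Link(src, dst))


# Usage: a two-link path receives a penalty, e.g. after a collision.
a, b = Link(0, 1), Link(1, 2)
net = Network(n_nodes=3, links=[a, b])
net.propagate([a, b], reward=-2.0)   # b stores -2.0, a stores -1.0
net.eliminate(threshold=-1.5)        # b is removed, a survives
net.generate(random.Random(0))       # may add one new candidate link
```

The link closest to the penalized outcome takes the largest share of the signal, so it is the first to be eliminated, while earlier links on the path are only weakened.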

Cite this article as:
Chyon Hae Kim, Tetsuya Ogata, and Shigeki Sugano, “Reinforcement Signal Propagation Algorithm for Logic Circuit,” J. Robot. Mechatron., Vol.20, No.5, pp. 757-774, 2008.

