Paper:
Preservation and Application of Acquired Knowledge Using Instance-Based Reinforcement Learning for Multi-Robot Systems
Junki Sakanoue, Toshiyuki Yasuda, and Kazuhiro Ohkura
Graduate School of Engineering, Hiroshima University, 1-4-1 Kagamiyama, Higashi-Hiroshima, Hiroshima 739-8527, Japan
We have been developing a reinforcement learning technique called BRL as an approach to autonomous specialization, a new concept in cooperative multi-robot systems. BRL has a mechanism for autonomously segmenting the continuous state and action spaces. However, as with other machine learning approaches, overfitting is occasionally observed after successful learning. This paper proposes a technique for making sophisticated use of the messy knowledge acquired through BRL, which is expected to yield better robustness against environmental changes. We investigate the proposed technique through computer simulations of a cooperative carrying task.
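For readers unfamiliar with the instance-based idea, the following is a minimal Python sketch of how a continuous state space can be segmented autonomously: instances are stored as they are encountered, new states are matched to the nearest stored prototype, and a new segment is opened when no prototype is close enough. The class name, the Euclidean distance threshold, and the value-update rule are all illustrative assumptions; this is not BRL itself, whose segmentation mechanism is more elaborate.

```python
import numpy as np

# Hypothetical sketch (not the paper's algorithm): instance-based
# segmentation of a continuous state space. Each stored instance is the
# prototype of one segment; a new segment is created whenever no stored
# prototype lies within `threshold` of the observed state. A plain
# Euclidean nearest-neighbor rule is substituted here for brevity.

class InstanceBasedSegmenter:
    def __init__(self, threshold=0.5):
        self.threshold = threshold  # max distance for a state to match a prototype
        self.prototypes = []        # one prototype state per segment
        self.values = []            # one value estimate per segment

    def segment_of(self, state):
        """Return the index of the segment covering `state`,
        creating a new segment if no prototype is close enough."""
        state = np.asarray(state, dtype=float)
        if self.prototypes:
            dists = [np.linalg.norm(state - p) for p in self.prototypes]
            best = int(np.argmin(dists))
            if dists[best] <= self.threshold:
                return best  # reuse the existing segment
        self.prototypes.append(state)  # autonomously open a new segment
        self.values.append(0.0)
        return len(self.prototypes) - 1

    def update(self, state, reward, alpha=0.1):
        """Move the matched segment's value estimate toward the reward."""
        i = self.segment_of(state)
        self.values[i] += alpha * (reward - self.values[i])

# Usage: two nearby states share a segment, a distant one opens another.
seg = InstanceBasedSegmenter(threshold=0.5)
print(seg.segment_of([0.0, 0.0]))  # -> 0 (new segment created)
print(seg.segment_of([0.1, 0.1]))  # -> 0 (matched existing prototype)
print(seg.segment_of([2.0, 2.0]))  # -> 1 (new segment created)
```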
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.