Improving the Robustness of Instance-Based Reinforcement Learning Robots by Metalearning
Toshiyuki Yasuda, Kousuke Araki, and Kazuhiro Ohkura
Graduate School of Engineering, Hiroshima University, 1-4-1, Kagamiyama, Higashi-Hiroshima, Hiroshima 739-8527, Japan
Learning autonomous robots have been widely discussed in recent years. Reinforcement learning (RL) is a popular method in this domain. However, its performance is quite sensitive to the segmentation of state and action spaces. To overcome this problem, we developed a new technique, Bayesian-discrimination-function-based RL (BRL). BRL has proven to be more effective than other standard RL algorithms in dealing with multi-robot system (MRS) problems. However, as in most learning systems, overfitting occasionally occurs in BRL. This paper introduces an extended BRL for improving the robustness of MRSs. Metalearning based on the information entropy of fired rules is adopted to adaptively modify the learning parameters. Computer simulations are conducted to verify the effectiveness of the proposed method.
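As a rough illustration of the quantity the metalearning relies on, the sketch below computes the Shannon entropy of the rules fired over a window of time steps. It is only a minimal sketch under stated assumptions: the abstract does not give the exact entropy definition or say how the entropy drives the parameter update, so the function names `firing_entropy` and `normalized_entropy`, the base-2 logarithm, and the normalization by the number of distinct fired rules are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def firing_entropy(fired_rule_ids):
    """Shannon entropy (in bits) of the empirical distribution of fired rules.

    `fired_rule_ids` is a sequence of rule identifiers, one per time step.
    This representation is an assumption, not taken from the paper.
    """
    counts = Counter(fired_rule_ids)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    probabilities = [c / total for c in counts.values()]
    return sum(-p * math.log2(p) for p in probabilities)


def normalized_entropy(fired_rule_ids):
    """Entropy scaled to [0, 1] by the maximum possible for the rules seen.

    The normalization is a hypothetical convenience for comparing windows
    that contain different numbers of distinct rules.
    """
    distinct = len(set(fired_rule_ids))
    if distinct <= 1:
        return 0.0
    return firing_entropy(fired_rule_ids) / math.log2(distinct)
```

Under this reading, a low value means a few rules dominate the robot's recent behavior, while a high value means firing is spread across many rules. A metalearning scheme could use such a signal to adjust learning parameters, but the specific adjustment rule is not described in the abstract and is not sketched here.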
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.