Path Searching of Robot Manipulator Using Reinforcement Learning-Reduction of Searched Configuration Space Using SOM and Multistage Learning-

Seiji Aoyagi; Kenji Hiraoka

doi:10.20965/jrm.2010.p0532

JRM Vol.22 No.4 pp. 532-541 (2010)

Paper:

Seiji Aoyagi and Kenji Hiraoka

Department of Mechanical Engineering, Faculty of Engineering Science, Kansai University, 3-3-35 Yamate-cho, Suita, Osaka 564-8680, Japan

Received:

December 22, 2009

Accepted:

April 16, 2010

Published:

August 20, 2010

Keywords:

robot manipulator, path search, reinforcement learning, SOM, DP matching

Abstract

Reinforcement learning can be applied to a robot manipulator that must search for a path suited to an unknown environment. Searching for an optimal path in configuration space (C-space), i.e., joint angle space, however, requires a long convergence time and large memory resources. We propose two ways to overcome this problem. One is to restructure C-space using Self-Organizing Maps (SOM). The other is to carry out reinforcement learning in multiple stages: stage 1 searches for a path in C-space without considering obstacles, and stage 2 then searches for a path that does consider them, restricted to the neighborhood of the stage 1 path, which reduces both the searched space and the convergence time. We further propose reducing the searched space by adjusting the stage 2 path to the stage 1 path through dynamic programming (DP) matching.
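The multistage idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy 12 x 12 grid standing in for a two-joint C-space, tabular Q-learning with purely random exploration and a learning rate of 1 (adequate only because the toy grid is deterministic), a corridor defined by Chebyshev distance to the stage 1 path, and a hypothetical one-cell obstacle placed on that path. The names `q_learn`, `corridor`, and `radius` are illustrative, and neither SOM nor DP matching is included.

```python
import random

ACTIONS = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # unit steps in the two joint indices


def q_learn(states, start, goal, episodes=3000, steps=60, gamma=0.9, seed=0):
    """Tabular Q-learning restricted to `states` (a set of allowed grid cells).

    A move that would leave `states` keeps the agent in place.
    Reward is 1 on reaching `goal` and 0 otherwise.
    Returns the greedy path from `start` after learning.
    """
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in states for a in range(len(ACTIONS))}
    starts = sorted(states - {goal})

    def step(s, a):
        nxt = (s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1])
        return nxt if nxt in states else s

    for _ in range(episodes):
        s = rng.choice(starts)                 # random start state
        for _ in range(steps):
            a = rng.randrange(len(ACTIONS))    # random exploration (assumption)
            nxt = step(s, a)
            if nxt == goal:
                q[(s, a)] = 1.0                # terminal reward
                break
            # Learning rate 1: valid shortcut for a deterministic toy grid.
            q[(s, a)] = gamma * max(q[(nxt, b)] for b in range(len(ACTIONS)))
            s = nxt

    path = [start]
    while path[-1] != goal and len(path) < len(states):
        s = path[-1]
        a = max(range(len(ACTIONS)), key=lambda b: q[(s, b)])
        path.append(step(s, a))
    return path


N = 12
full = {(i, j) for i in range(N) for j in range(N)}
start, goal = (1, 1), (10, 10)

# Stage 1: search the whole C-space without considering obstacles.
path1 = q_learn(full, start, goal)

# Hypothetical obstacle: block the middle cell of path 1 so that path 1 collides.
obstacles = {path1[len(path1) // 2]}

# Stage 2: search only cells within `radius` of path 1, now excluding obstacles.
radius = 1
corridor = {c for c in full
            if any(max(abs(c[0] - p[0]), abs(c[1] - p[1])) <= radius for p in path1)}
path2 = q_learn(corridor - obstacles, start, goal)

print("states searched: stage 1 =", len(full), ", stage 2 =", len(corridor - obstacles))
print("path lengths: stage 1 =", len(path1), ", stage 2 =", len(path2))
```

In this sketch the stage 2 Q-table covers only the corridor cells, which is where the memory and convergence savings described in the abstract would come from. A wider `radius` trades a larger searched space for a better chance of finding a detour around larger obstacles.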

Cite this article as:

S. Aoyagi and K. Hiraoka, “Path Searching of Robot Manipulator Using Reinforcement Learning-Reduction of Searched Configuration Space Using SOM and Multistage Learning-,” J. Robot. Mechatron., Vol.22 No.4, pp. 532-541, 2010.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.