JRM Vol.20 No.3 pp. 350-357
doi: 10.20965/jrm.2008.p0350


Hardware and Numerical Experiments of Autonomous Robust Skill Generation Using Reinforcement Learning

Kei Senda, Takayuki Kondo, Yoshimitsu Iwasaki, Shinji Fujii,
Naofumi Fujiwara, and Naoki Suganuma

Graduate School of Natural Science and Technology, Kanazawa University, Kakuma-machi, Kanazawa, Ishikawa 920-1192, Japan

Received: September 29, 2007
Accepted: February 5, 2008
Published: June 20, 2008

Keywords: skill generation, reinforcement learning, space robot, robustness, autonomy

It is difficult for robots to perform tasks that involve contact with the environment because of errors between the controller's models and the real environment. To address this problem, we propose having a robot autonomously acquire skills that are both proficient and robust against model error. Numerical simulations and hardware experiments using an autonomous space robot demonstrate the feasibility of our proposal in a real environment.
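The abstract describes acquiring skills by reinforcement learning that remain reliable under model error. As a minimal illustration of that idea (not the paper's actual task or algorithm), the toy sketch below uses tabular Q-learning on a hypothetical 1-D "insertion" task: a large step advances faster on the nominal model but occasionally slips back because of unmodeled disturbance, while a small step is slow but reliable. The state space, reward values, and slip probability are all invented for illustration.

```python
import random

# Toy 1-D task (hypothetical, for illustration only): states 0..N, goal = N.
# Action 0: small step, always advances by 1 (robust).
# Action 1: large step, advances by 2 but slips back to state 0 with
# probability P_ERR, standing in for model error in the real environment.
N = 6
P_ERR = 0.4
ALPHA, GAMMA, EPS = 0.2, 0.95, 0.1

def step(s, a, rng):
    if a == 0:
        s2 = min(s + 1, N)
    else:
        s2 = 0 if rng.random() < P_ERR else min(s + 2, N)
    r = 1.0 if s2 == N else -0.01  # small cost per move, reward at the goal
    return s2, r, s2 == N

def train(episodes=3000, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N + 1)]
    for _ in range(episodes):
        s = 0
        for _ in range(50):
            # epsilon-greedy exploration
            if rng.random() < EPS:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda x: Q[s][x])
            s2, r, done = step(s, a, rng)
            # standard Q-learning update
            Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) * (not done) - Q[s][a])
            s = s2
            if done:
                break
    return Q

Q = train()
policy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(N)]
```

Because the learner experiences the slips directly, the value of the "fast" action is discounted by its failure probability, so the greedy policy it settles on tends to favor the reliable action near the goal; this is the sense in which experience-driven learning can yield robustness that a nominal-model planner would miss.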

Cite this article as:
K. Senda, T. Kondo, Y. Iwasaki, S. Fujii, N. Fujiwara, and N. Suganuma, “Hardware and Numerical Experiments of Autonomous Robust Skill Generation Using Reinforcement Learning,” J. Robot. Mechatron., Vol.20, No.3, pp. 350-357, 2008.
References:
  [1] K. Senda, “An Approach to Autonomous Space Robots,” Systems, Control and Information, Vol.45, No.10, pp. 593-599, 2001 (in Japanese).
  [2] R. S. Sutton and A. G. Barto, “Reinforcement Learning: An Introduction,” MIT Press, Cambridge, MA, 1998.
  [3] D. P. Bertsekas and J. N. Tsitsiklis, “Neuro-Dynamic Programming,” Athena Scientific, 1996.
  [4] S. Fujii, K. Senda, and S. Mano, “Acceleration of Reinforcement Learning by Estimating State Transition Probability Model,” Trans. Society of Instrument and Control Engineers, Vol.42, No.1, pp. 47-53, 2006 (in Japanese).
  [5] K. Senda, Y. Murotsu, A. Mitsuya, H. Adachi, S. Ito, J. Shitakubo, and T. Matsumoto, “Hardware Experiments of a Truss Assembly by an Autonomous Space Learning Robot,” AIAA J. Spacecraft and Rockets, Vol.39, No.2, pp. 267-273, 2002.
  [6] K. Senda, Y. Murotsu, A. Mitsuya, H. Adachi, S. Ito, and J. Shitakubo, “Hardware Experiments of Autonomous Space Robot,” J. of Robotics and Mechatronics, Vol.12, No.4, pp. 343-350, 2000.
  [7] S. B. Skaar and C. F. Ruoff (Eds.), “Teleoperation and Robotics in Space,” AIAA, Washington, DC, 1995.
  [8] M. Asada, “Issues in Applying Robot Learning and Evolutionary Methods to Real Environments,” J. Society of Instrument and Control Engineers, Vol.38, No.10, pp. 650-653, 1999 (in Japanese).
  [9] F. Miyazaki and S. Arimoto, “Sensory Feedback for Robot Manipulators,” J. of Robotic Systems, Vol.2, No.1, pp. 53-71, 1985.
  [10] D. E. Whitney, “Quasi-Static Assembly of Compliantly Supported Rigid Parts,” Trans. ASME J. Dynamic Systems, Measurement, and Control, Vol.104, pp. 65-77, 1982.
  [11] D. Sato and M. Uchiyama, “Peg-in-Hole Task by a Robot,” J. of the Japan Society of Mechanical Engineers, Vol.110, No.1066, pp. 678-679, 2007 (in Japanese).
  [12] N. Yamanobe, Y. Maeda, T. Arai, A. Watanabe, T. Kato, T. Sato, and K. Hatanaka, “Design of Force Control Parameters Considering Cycle Time,” J. of the Robotics Society of Japan, Vol.24, No.4, pp. 554-562, 2006 (in Japanese).
  [13] T. Fukuda, W. Srituravanich, T. Ueyama, and Y. Hasegawa, “A Study on Skill Acquisition based on Environment Information (Task Path Planning for Assembly Task Considering Uncertainty),” Trans. of the Japan Society of Mechanical Engineers, Series C, Vol.66, No.645, pp. 1597-1604, 2000 (in Japanese).
  [14] T. Başar and P. Bernhard, “H∞-Optimal Control and Related Minimax Design Problems,” Birkhäuser, Boston, 1995.
  [15] J. Morimoto and K. Doya, “Robust Reinforcement Learning,” Neural Computation, Vol.17, No.2, pp. 335-359, 2005.
