Learning of Whole Arm Manipulation with Constraint of Contact Mode Maintaining

Nobuyuki Kawarai; Yuichi Kobayashi

doi:10.20965/jrm.2010.p0542

JRM Vol.22 No.4 pp. 542-550 (2010)

Paper:

Tokyo University of Agriculture and Technology, 2-14-16 Naka-cho, Koganei, Tokyo 184-8588, Japan

Received: December 21, 2009

Accepted: April 21, 2010

Published: August 20, 2010

Keywords: reinforcement learning, manipulation, contact mode estimation

Abstract

This paper proposes a method for learning whole arm manipulation with a two-link manipulator. The method combines a controller obtained by actor-critic reinforcement learning with a classifier realized by a Support Vector Machine (SVM). The classifier learns the boundary between slip and stick modes in torque space. Using the classification result, the robot learns to move the object toward a desired position while keeping the desired contact mode. The control input (torque) is first specified by the actor. The SVM classifier then judges whether that torque can maintain the desired slip or stick mode and, if not, modifies the torque so that the desired mode is maintained. Simulation verified that the proposed learning method accelerates and decelerates the object while keeping the desired mode, i.e., while avoiding undesired slipping of the object.
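The torque-correction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a hand-set linear decision function stands in for the trained SVM, and the correction rule (Euclidean projection of the torque onto the classifier's margin) is an assumption, since the abstract does not say how the torque is modified. The names `W`, `B`, `decision`, `correct_torque`, and `margin` are all hypothetical.

```python
# Hypothetical linear decision function standing in for a trained SVM:
# f(tau) = w . tau + b, where f(tau) >= 0 is read as "desired contact
# mode is maintained". The weights below are made up for illustration.
W = (1.0, -0.5)
B = 0.2


def decision(tau, w=W, b=B):
    """Signed classifier score of a 2-D joint torque (tau1, tau2)."""
    return w[0] * tau[0] + w[1] * tau[1] + b


def correct_torque(tau, margin=0.05, w=W, b=B):
    """Return tau unchanged if its score is at least `margin`;
    otherwise return the closest torque (Euclidean distance) whose
    score equals `margin`. This projection rule is an assumption."""
    score = decision(tau, w, b)
    if score >= margin:
        return tau
    norm_sq = w[0] ** 2 + w[1] ** 2
    step = (margin - score) / norm_sq  # distance to move along w
    return (tau[0] + step * w[0], tau[1] + step * w[1])


if __name__ == "__main__":
    actor_torque = (-1.0, 1.0)  # torque proposed by the actor (made up)
    safe_torque = correct_torque(actor_torque)
    print("actor score:", decision(actor_torque))
    print("corrected torque:", safe_torque)
    print("corrected score:", decision(safe_torque))
```

With a nonlinear SVM kernel the projection has no closed form, so a correction of this kind would instead need a search in torque space around the actor's output.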

Cite this article as:

N. Kawarai and Y. Kobayashi, “Learning of Whole Arm Manipulation with Constraint of Contact Mode Maintaining,” J. Robot. Mechatron., Vol.22 No.4, pp. 542-550, 2010.

References

[1] M. J. Cherif and K. K. Gupta, “Planning quasi-static fingertip manipulations for reconfiguring objects,” IEEE Trans. on Robotics and Automation, Vol.15, No.5, pp. 837-848, 1999.
[2] M. Yashima, Y. Shiina, and H. Yamaguchi, “Randomized Manipulation Planning for A Multi-Fingered Hand by Switching Contact Modes,” Proc. of IEEE Int. Conf. on Robotics and Automation, 2003.
[3] E. Yoshida, P. Blazevic, V. Hugel, K. Yokoi, and K. Harada, “Pivoting a Large Object: Whole-body Manipulation by a Humanoid Robot,” Applied Bionics and Biomechanics, Vol.3, No.3, pp. 227-235, 2006.
[4] J. Nakanishi, J. Morimoto, G. Endo, G. Cheng, S. Schaal, and M. Kawato, “Learning from demonstration and adaptation of biped locomotion,” Robotics and Autonomous Systems, Vol.47, No.2-3, pp. 79-91, 2004.
[5] R. Sutton and A. Barto, “Reinforcement Learning,” MIT Press, 1998.
[6] H. Kimura, T. Yamashita, and S. Kobayashi, “Reinforcement Learning of Walking Behavior for a Four-Legged Robot,” Proc. of IEEE Conf. on Decision and Control, pp. 411-416, 2001.
[7] J. Morimoto and K. Doya, “Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning,” Robotics and Autonomous Systems, Vol.36, No.1, pp. 37-51, 2001.
[8] H. Miyamoto, J. Morimoto, K. Doya, and M. Kawato, “Reinforcement learning with via-point representation,” Neural Networks, Vol.17, No.3, pp. 299-305, 2004.
[9] Y. Kobayashi, M. Shibata, S. Hosoe, and Y. Uno, “Learning of object manipulation with stick/slip mode switching,” Proc. of Int. Conf. on Intelligent Robots and Systems, pp. 373-379, 2008.
[10] T. Odashima, et al., “A Soft Human-Interactive Robot RI-MAN,” Video Proc. of IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, 2006.
[11] S. Nakaoka, S. Hattori, F. Kanehiro, S. Kajita, and H. Hirukawa, “Constraint-based Dynamics Simulator for Humanoid Robots with Shock Absorbing Mechanisms,” The 2007 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, 2007.
[12] A. Schaft and H. Schumacher, “An Introduction to Hybrid Dynamical Systems,” Springer, 2000.
[13] V. N. Vapnik, “The Nature of Statistical Learning Theory,” Springer, 1995.
[14] O. L. Mangasarian and D. R. Musicant, “Lagrangian Support Vector Machines,” J. of Machine Learning Research, Vol.1, pp. 161-177, 2001.
[15] S. Bhatnagar, R. Sutton, M. Ghavamzadeh, and M. Lee, “Natural Actor-Critic Algorithms,” Automatica, Vol.45, No.11, pp. 2471-2482, 2009.
[16] D. W. Scott and S. R. Sain, “Multi-Dimensional Density Estimation,” Handbook of Statistics – Vol 23: Data Mining and Computational Statistics, 2004.
[17] T. Schlegl, M. Buss, and G. Schmidt, “Hybrid Control of Multi-fingered Dexterous Robotic Hands,” S. Engell, G. Frehse, E. Schnieder (Eds.): Modelling, Analysis and Design of Hybrid Systems, LNCIS, Vol.279, pp. 437-465, 2002.
[18] Y. Yin, S. Hosoe, and Z. Luo, “A Mixed Logic Dynamical Modeling Formulation and Optimal Control of Intelligent Robots,” Optimization and Engineering, Vol.8, pp. 321-340, 2007.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
