Adaptive Reinforcement Learning and Its Application to Robot Compliance Learning

Boo-Ho Yang; Haruhiko Asada

doi:10.20965/jrm.1995.p0250


JRM Vol.7 No.3 pp. 250-262

(1995)


Paper:


Adaptive Reinforcement Learning and Its Application to Robot Compliance Learning

Boo-Ho Yang and Haruhiko Asada

Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave. Cambridge, MA 02139, U.S.A.

Received:

January 25, 1995

Accepted:

February 15, 1995

Published:

June 20, 1995

Keywords:

Reinforcement learning, Learning control, Adaptive control, Compliance control

Abstract

A new learning algorithm for connectionist networks that solves a class of optimal control problems is presented. The algorithm, called the Adaptive Reinforcement Learning Algorithm, employs a second network to model the immediate reinforcement provided by the task environment and adaptively identifies it through repeated experience. Output perturbation and correlation techniques are used to translate mere critic signals into useful learning signals for the connectionist controller. Compared with the direct approaches of reinforcement learning, this algorithm shows faster and guaranteed improvement in the control performance. Robustness against inaccuracy of the model is also discussed. It is demonstrated by simulation that the adaptive reinforcement learning method is efficient and useful in learning a compliance control law in a class of robotic assembly tasks. A simple box palletizing task is used as an example, where a robot is required to move a rectangular part to the corner of a box. In the simulation, the robot is initially provided with only a predetermined velocity command to follow the nominal trajectory. At each attempt, the box is randomly located and the part is randomly oriented within the grasp of the end-effector. Therefore, compliant motion control is necessary to guide the part to the corner of the box while avoiding excessive reaction forces caused by collisions with a wall. After repeated failures in performing the task, the robot successfully learns force feedback gains that modify its nominal motion. Our results show that the new learning method can be used to learn a compliance control law effectively.
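The general idea of output perturbation and correlation, combined with an adaptively identified model of the immediate reinforcement, can be illustrated with a small sketch. This is not the authors' algorithm or their simulation: the scalar force-to-velocity gain, the quadratic `reinforcement` function, its `ideal_gain` of -0.8, the feature set, and all learning rates are hypothetical choices made only for illustration. Here the learned model serves as a baseline against which the perturbed outcome is compared.

```python
import random


def reinforcement(force, command, ideal_gain=-0.8):
    """Hypothetical task environment: the immediate reinforcement is highest
    when the velocity correction equals ideal_gain * force."""
    return -(command - ideal_gain * force) ** 2


def features(force, command):
    """Quadratic features for the (assumed) linear-in-parameters
    reinforcement model, standing in for the second network."""
    return [1.0, force * force, force * command, command * command]


def predict(weights, force, command):
    return sum(w * x for w, x in zip(weights, features(force, command)))


def train(steps=5000, sigma=0.1, lr_gain=0.02, lr_model=0.05, seed=0):
    rng = random.Random(seed)
    gain = 0.0            # force feedback gain of the controller
    weights = [0.0] * 4   # parameters of the reinforcement model
    for _ in range(steps):
        force = rng.uniform(-1.0, 1.0)   # sensed reaction force
        noise = rng.gauss(0.0, sigma)    # output perturbation
        nominal = gain * force
        command = nominal + noise
        r = reinforcement(force, command)

        # Model prediction at the unperturbed output acts as a baseline.
        baseline = predict(weights, force, nominal)

        # Correlate the reinforcement surprise with the perturbation to
        # obtain a learning signal for the controller gain.
        gain += lr_gain * (r - baseline) * noise * force / sigma ** 2

        # Adaptively identify the reinforcement model (LMS update).
        x = features(force, command)
        error = r - predict(weights, force, command)
        weights = [w + lr_model * error * xi for w, xi in zip(weights, x)]
    return gain, weights


if __name__ == "__main__":
    learned_gain, model_weights = train()
    print("learned gain:", learned_gain)
    print("model weights:", model_weights)
```

In this toy setting the perturbation is independent of the baseline, so errors in the reinforcement model add variance to the gain update without biasing its expected direction. This loosely mirrors the abstract's remark on robustness to model inaccuracy, though the paper's own analysis is not reproduced here.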

Cite this article as:

B. Yang and H. Asada, “Adaptive Reinforcement Learning and Its Application to Robot Compliance Learning,” J. Robot. Mechatron., Vol.7 No.3, pp. 250-262, 1995.


This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.