Adaptive Modeling of Physical Systems Based on Affine Transform and its Application for Machine Learning

Shingo Nakamura; Shuji Hashimoto

doi:10.20965/jrm.2008.p0750

JRM Vol.20 No.5 pp. 750-756 (2008)

Paper:

Department of Applied Physics, Waseda University, Tokyo, Japan

Received: February 16, 2008

Accepted: July 24, 2008

Published: October 20, 2008

Keywords: Simulator Building, Reinforcement Learning, Neural Network, Swing-Up Pendulum, Affine Transform

Abstract

We describe the adaptive modeling of a physical system using an affine transform and its application to machine learning. We previously proposed a method for applying machine learning to physical hardware, in which a simulator is built from the actual input/output of the hardware and then used to optimize a controller. This method reduces stress on the hardware because the controller is optimized in software, through the simulator. Moreover, it requires neither specific physical information about the hardware nor a formulation of the hardware kinematics. When the hardware changes, however, the optimization that builds the simulator must be redone, which is clearly inefficient. We therefore considered reusing previous optimization results when reoptimizing for new hardware. In a physical system, the general appearance of the phase space does not vary much as long as the system structure remains the same. We applied an affine transform to the phase space of the physical system to remodel the simulator for the new hardware characteristics that result from parameter changes. We then used the remodeled simulator in machine learning to reoptimize the controller. In experiments on the swing-up pendulum problem, we compared the proposed method with the original one and found that the proposed method accelerates reoptimization.
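
The abstract does not give the formulation, so the following is only a minimal sketch of one possible reading of "applying an affine transform to remodel a simulator", not the authors' method. It assumes that the new hardware's next state is approximately an affine function of the old simulator's prediction, and fits that map by ordinary least squares. All names (`fit_affine`, `remodel`, `old_sim`) and the toy pendulum step are hypothetical and chosen for illustration.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares fit of A, b such that dst[i] ~= A @ src[i] + b."""
    ones = np.ones((src.shape[0], 1))
    X = np.hstack([src, ones])                  # homogeneous coordinates
    W, *_ = np.linalg.lstsq(X, dst, rcond=None)
    A = W[:-1].T                                # linear part
    b = W[-1]                                   # offset
    return A, b

def remodel(sim, A, b):
    """Wrap an existing simulator so its prediction is affinely corrected."""
    def new_sim(state, action):
        return A @ sim(state, action) + b
    return new_sim

def old_sim(state, action, dt=0.01):
    """Hypothetical stand-in for a previously built simulator:
    one Euler step of a frictionless pendulum (angle, angular velocity)."""
    theta, omega = state
    return np.array([theta + dt * omega,
                     omega + dt * (-9.8 * np.sin(theta) + action)])

# Synthetic "new hardware" data: an assumed affine distortion of the old model.
rng = np.random.default_rng(0)
states = rng.uniform(-3.0, 3.0, size=(200, 2))
actions = rng.uniform(-1.0, 1.0, size=200)
old_out = np.array([old_sim(s, a) for s, a in zip(states, actions)])

A_true = np.array([[1.0, 0.0], [0.05, 0.9]])    # assumed, for the demo only
b_true = np.array([0.0, 0.02])
new_out = old_out @ A_true.T + b_true

# Fit the transform from input/output pairs and build the remodeled simulator.
A, b = fit_affine(old_out, new_out)
new_sim = remodel(old_sim, A, b)
```

In this sketch only the parameters of the affine map are estimated from the new hardware data, so the old simulator is reused rather than rebuilt. Whether the paper applies the transform to the simulator output, to its input, or to both is not stated in the abstract.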

Cite this article as:

S. Nakamura and S. Hashimoto, “Adaptive Modeling of Physical Systems Based on Affine Transform and its Application for Machine Learning,” J. Robot. Mechatron., Vol.20 No.5, pp. 750-756, 2008.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
