An Adaptation System in Unknown Environments Using a Mixture Probability Model and Clustering Distributions

Uthai Phommasak; Daisuke Kitakoshi; Hiroyuki Shioya

doi:10.20965/jaciii.2012.p0733


JACIII Vol.16 No.6 pp. 733-740

(2012)

Paper:


Uthai Phommasak^*, Daisuke Kitakoshi^**, and Hiroyuki Shioya^*

^*Division of Information and Electronics, Graduate School of Engineering, Muroran Institute of Technology, 27-1 Mizumoto, Muroran, Hokkaido 050-8585, Japan

^**Department of Information Engineering, Tokyo National College of Technology, 1220-2 Kunugida-machi, Hachioji-shi, Tokyo 193-0997, Japan

Received:

February 20, 2012

Accepted:

June 21, 2012

Published:

September 20, 2012

Keywords:

reinforcement learning, profit-sharing, mixture probability, Hellinger distance, clustering method

Abstract

An agent system that uses reinforcement learning (RL) must adapt to dynamic environments. A mixture model of Bayesian networks has been introduced into the learning system so that it can adapt quickly to such environments, but this increases the computational cost of training the system’s parameters. Reducing that cost is necessary when processing resources are limited. In this paper, we introduce a mixture probability into RL to allow an agent to adjust to environmental changes. We also introduce a new clustering method that selects fewer elements of the mixture probability, reducing the computational complexity while maintaining the system’s performance. Computer simulations are presented to investigate the effectiveness of the proposed method.
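
The abstract does not give formulas, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that each previously learned policy is summarized as a discrete probability distribution over actions, that similarity is measured with the Hellinger distance (listed in the keywords), and that near-duplicate distributions are pruned before mixing. The greedy selection rule, the threshold value, the uniform weights, and all function names are hypothetical choices made for illustration.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))


def select_representatives(dists, threshold):
    """Hypothetical greedy rule: keep a distribution only if it lies farther
    than `threshold` from every representative kept so far."""
    reps = []
    for d in dists:
        if all(hellinger(d, r) > threshold for r in reps):
            reps.append(d)
    return reps


def mixture(dists, weights):
    """Weighted mixture of discrete distributions; weights should sum to 1."""
    n = len(dists[0])
    return [sum(w * d[i] for w, d in zip(weights, dists)) for i in range(n)]


# Three made-up action distributions; the first two are nearly identical.
policies = [[0.7, 0.2, 0.1], [0.68, 0.22, 0.1], [0.1, 0.1, 0.8]]
reps = select_representatives(policies, threshold=0.1)

# Uniform mixing weights are an assumption; the paper may weight elements differently.
weights = [1.0 / len(reps)] * len(reps)
mixed = mixture(reps, weights)
```

In this toy case the second distribution is discarded as redundant, so the mixture is built from two elements instead of three, which is the kind of reduction in mixture size the abstract describes.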

Cite this article as:

U. Phommasak, D. Kitakoshi, and H. Shioya, “An Adaptation System in Unknown Environments Using a Mixture Probability Model and Clustering Distributions,” J. Adv. Comput. Intell. Intell. Inform., Vol.16 No.6, pp. 733-740, 2012.

References

[1] R. S. Sutton and A. G. Barto, “Reinforcement Learning: An Introduction,” MIT Press, Cambridge, MA, 1998.
[2] T. Croonenborghs, J. Ramon, H. Blockeel, and M. Bruynooghe, “Model-assisted approaches for relational reinforcement learning: some challenges for the SRL community,” Proc. of the ICML-2006 Workshop on Open Problems in Statistical Relational Learning, Pittsburgh, PA, 2006.
[3] F. Fernández and M. Veloso, “Probabilistic policy reuse in a reinforcement learning agent,” Proc. of the fifth Int. Joint Conf. on Autonomous Agents and Multi-Agent Systems, pp. 720-727, 2006.
[4] R. Fung and K. Chang, “Weighting and integrating evidence for stochastic simulation in Bayesian networks,” Uncertainty in Artificial Intelligence, Vol.5, pp. 209-219, 1990.
[5] C. Huang and A. Darwiche, “Inference in belief networks: a procedural guide,” Int. J. of Approximate Reasoning, Vol.15, No.3, pp. 225-263, 1996.
[6] J. Pearl, “Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference,” Morgan Kaufmann Pub. Inc., San Francisco, CA, 1988.
[7] D. Kitakoshi, H. Shioya, and R. Nakano, “Adaptation of the Online Policy-Improving System by using a Mixture Model of Bayesian Networks to Dynamic Environments,” Electrics Information and Communication Engineers, Vol.104, No.249, pp. 15-20, 2004.
[8] D. Kitakoshi, H. Shioya, and R. Nakano, “Empirical analysis of an on-line adaptive system using a mixture of Bayesian networks,” Information Sciences, Vol.180, No.15, pp. 2856-2874, 2010.
[9] F. Tanaka and M. Yamamura, “An approach to lifelong reinforcement learning through multiple environments,” Proc. of the 6th European Workshop on Learning Robots, EWLR-6, pp. 93-99, 1997.
[10] T. Minato and M. Asada, “Environmental change adaptation for mobile robot navigation,” Proc. of IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS’98), pp. 1859-1864, 1998.
[11] E. Hellinger, “Neue Begründung der Theorie quadratischer Formen von unendlichvielen Veränderlichen,” J. of Reine Angewandte Mathematics, Vol.136, pp. 210-271, 1909.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
