JACIII Vol.16 No.6 pp. 733-740
doi: 10.20965/jaciii.2012.p0733


An Adaptation System in Unknown Environments Using a Mixture Probability Model and Clustering Distributions

Uthai Phommasak*, Daisuke Kitakoshi**, and Hiroyuki Shioya*

*Division of Information and Electronics, Graduate School of Engineering, Muroran Institute of Technology, 27-1 Mizumoto, Muroran, Hokkaido 050-8585, Japan

**Department of Information Engineering, Tokyo National College of Technology, 1220-2 Kunugida-machi, Hachioji-shi, Tokyo 193-0997, Japan

Received: February 20, 2012
Accepted: June 21, 2012
Published: September 20, 2012
Keywords: reinforcement learning, profit-sharing, mixture probability, Hellinger distance, clustering method

An agent system that uses reinforcement learning (RL) must adapt to dynamic environments. A mixture model of Bayesian networks has been introduced into the learning system so that it can adapt quickly to such environments. However, this increases the computational complexity of training the system’s parameters, so reducing that complexity is necessary when processing resources are limited. In this paper, we introduce a mixture probability into RL to allow an agent to adjust to environmental changes. We also introduce a new clustering method that selects fewer elements of the mixture probability, reducing the computational complexity while maintaining the system’s performance. Computer simulations are presented to investigate the effectiveness of the proposed method.
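As an illustration only, the idea of selecting fewer mixture elements can be sketched in Python for discrete distributions. The Hellinger distance is the standard definition; the greedy threshold rule in `select_representatives`, the `threshold` parameter, and the weighted average in `mixture` are assumptions made for this sketch and are not taken from the paper, whose actual clustering method is not described in the abstract.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s / 2.0)


def select_representatives(dists, threshold):
    """Greedy selection (hypothetical rule, not the paper's algorithm).

    A distribution is kept only if it lies farther than `threshold`
    from every distribution that has already been kept.
    """
    kept = []
    for i, d in enumerate(dists):
        if all(hellinger(d, dists[j]) > threshold for j in kept):
            kept.append(i)
    return kept


def mixture(dists, weights):
    """Weighted average of distributions; weights are normalized to sum to 1."""
    total = sum(weights)
    n = len(dists[0])
    return [sum(w * d[k] for w, d in zip(weights, dists)) / total for k in range(n)]


if __name__ == "__main__":
    # Three hypothetical action distributions; the first two are nearly identical.
    dists = [[0.5, 0.5], [0.51, 0.49], [0.9, 0.1]]
    kept = select_representatives(dists, threshold=0.1)
    reduced = [dists[i] for i in kept]
    print(kept, mixture(reduced, [1.0] * len(reduced)))
```

In this sketch, near-duplicate distributions are dropped before mixing, so the mixture is computed over fewer elements; how close a reduced mixture stays to the full one depends on the chosen threshold.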

Cite this article as:
Uthai Phommasak, Daisuke Kitakoshi, and Hiroyuki Shioya, “An Adaptation System in Unknown Environments Using a Mixture Probability Model and Clustering Distributions,” J. Adv. Comput. Intell. Intell. Inform., Vol.16, No.6, pp. 733-740, 2012.


Last updated on Mar. 05, 2021