Analyzing Strength-Based Classifier System from Reinforcement Learning Perspective

Atsushi Wada; Keiki Takadama

doi:10.20965/jaciii.2009.p0631

JACIII Vol.13 No.6 pp. 631-639

(2009)

Paper:

Atsushi Wada^* and Keiki Takadama^**,***

^*National Institute of Information and Communications Technology, 2-2-2 Hikaridai, Seikacho, Sorakugun, Kyoto, Japan

^**The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo, Japan

^***PRESTO, Japan Science and Technology Agency (JST), 4-1-8 Honcho Kawaguchi, Saitama 332-0012, Japan

Received:

April 30, 2009

Accepted:

July 31, 2009

Published:

November 20, 2009

Keywords:

learning classifier systems, strength-based, ZCS, reinforcement learning

Abstract

Learning Classifier Systems (LCSs) are rule-based adaptive systems that combine Reinforcement Learning (RL) with rule-discovery mechanisms for effective and practical on-line learning. To establish a common theoretical basis on which LCSs and RL algorithms can share each field's findings, we performed a detailed analysis comparing the learning processes of the two approaches. Building on our previous work, which derived an equivalence between the Zeroth-level Classifier System (ZCS) and Q-learning with Function Approximation (FA), this paper extends the analysis to the effect of actually imposing the conditions required for this equivalence. Comparative experiments reveal two implications: (1) ZCS's original parameter, the deduction rate, helps stabilize action selection, but (2) from the Reinforcement Learning perspective, this process inhibits accurate value estimation over the entire state-action space, which limits the performance of ZCS on problems that require accurate value estimates.

Cite this article as:

A. Wada and K. Takadama, “Analyzing Strength-Based Classifier System from Reinforcement Learning Perspective,” J. Adv. Comput. Intell. Intell. Inform., Vol.13 No.6, pp. 631-639, 2009.

References

[1] J. H. Holland, “Adaptation in Natural and Artificial Systems,” The University of Michigan Press, Michigan, 1975.
[2] J. H. Holland, “Adaptation,” Progress in Theoretical Biology IV, pp. 263-293, 1976.
[3] S. W. Wilson, “Classifier Fitness Based on Accuracy,” Evolutionary Computation, Vol.3, No.2, pp. 149-175, 1995.
[4] M. V. Butz, D. E. Goldberg, and P. L. Lanzi, “Bounding Learning Time in XCS,” In Proc. of the Genetic and Evolutionary Computation Conf. (GECCO-2004), 2004.
[5] M. V. Butz and M. Pelikan, “Analyzing the evolutionary pressures in XCS,” In Proc. of the Genetic and Evolutionary Computation Conf. (GECCO-2001), pp. 935-942, 2001.
[6] A. Wada, K. Takadama, K. Shimohara, and O. Katai, “Comparison between Q-Learning and ZCS Learning Classifier System: From aspect of function approximation,” In The 8th Conf. on Intelligent Autonomous Systems, pp. 422-429, 2004.
[7] S. W. Wilson, “ZCS: A Zeroth Level Classifier System,” Evolutionary Computation, Vol.2, No.1, pp. 1-18, 1994.
[8] C. J. C. H. Watkins, “Learning from Delayed Rewards,” Ph.D. thesis, Cambridge University, 1989.
[9] R. Sutton and A. Barto, “Reinforcement Learning: An Introduction,” MIT Press, Cambridge, MA., 1998.
[10] R. S. Sutton, “Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding,” In D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo (Eds.), Advances in Neural Information Processing Systems, Vol.8, pp. 1038-1044, The MIT Press, 1996.
[11] D. E. Goldberg, “Genetic Algorithms in Search, Optimization, and Machine Learning,” Addison-Wesley, MA., 1989.
[12] S. W. Wilson, “Get Real! XCS with Continuous-Valued Inputs,” Lecture Notes in Computer Science, Vol.1813, pp. 209-222, 2000.
[13] D. Cliff and S. Ross, “Adding temporary memory to ZCS,” Adaptive Behavior, Vol.3, No.2, pp. 101-150, 1994.
[14] T. Kovacs, “Strength or Accuracy? Fitness Calculation in Learning Classifier Systems,” Vol.1813, Springer-Verlag, March 2000.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
