Analyzing Strength-Based Classifier System from Reinforcement Learning Perspective
Atsushi Wada* and Keiki Takadama**,***
*National Institute of Information and Communications Technology, 2-2-2 Hikaridai, Seikacho, Sorakugun, Kyoto, Japan
**The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo, Japan
***PRESTO, Japan Science and Technology Agency (JST), 4-1-8 Honcho Kawaguchi, Saitama 332-0012, Japan
Learning Classifier Systems (LCSs) are rule-based adaptive systems that combine Reinforcement Learning (RL) and rule-discovery mechanisms for effective and practical on-line learning. To establish a common theoretical basis on which LCSs and RL algorithms can share each field’s findings, we performed a detailed analysis comparing the learning processes of the two approaches. Building on our previous work, which derived an equivalence between the Zeroth-level Classifier System (ZCS) and Q-learning with Function Approximation (FA), this paper extends the analysis to the effect of actually applying the conditions required for this equivalence. Comparative experiments revealed two implications: (1) the deduction rate, a parameter original to ZCS, helps stabilize action selection, but (2) from the RL perspective, the same mechanism inhibits accurate value estimation over the entire state-action space, thus limiting the performance of ZCS on problems that require accurate value estimation.