Is Gradient Descent Update Consistent with Accuracy-Based Learning Classifier System?
Atsushi Wada* and Keiki Takadama**,***
*National Institute of Information and Communications Technology, 2-2-2 Hikaridai, Seikacho, Sorakugun, Kyoto, Japan
**The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo, Japan
***PRESTO, Japan Science and Technology Agency (JST), 4-1-8 Honcho Kawaguchi, Saitama 332-0012, Japan
Learning Classifier Systems (LCSs) are rule-based adaptive systems that have both Reinforcement Learning (RL) and rule-discovery mechanisms for effective and practical online learning. An analysis of the reinforcement process of XCS, one of the currently mainstream LCSs, is performed from the aspect of RL. Upon comparing XCS’s update method with gradient-descent-based parameter update in RL, differences are found in the following elements: (1) residual term, (2) gradient term, and (3) payoff definition. All possible combinations of the variants in each element are implemented and tested on multi-step benchmark problems. This revealed that few specific combinations work effectively with XCS’s accuracy-based rule-discovery process, while pure gradient-descent-based update showed the worst performance.
-  R. Sutton and A. Barto, “An introduction to reinforcement learning,” MIT Press, Cambridge, MA., 1998.
-  J. H. Holland, “Adaptation in Natural and Artificial Systems,” The University of Michigan Press, Michigan, 1975.
-  J. H. Holland, “Adaptation,” Progress in Theoretical Biology, IV, pp. 263-293, 1976.
-  M. V. Butz, T. Kovacs, P. L. Lanzi, and S. W. Wilson, “Toward a Theory of Generalization and Learning in XCS,” IEEE Trans. on Evolutionary Computation, Vol.8, No.1, pp. 28-46, 2004.
-  M. V. Butz and M. Pelikan, “Analyzing the evolutionary pressures in XCS,” In L. Spector and E. D. Goodman (Eds.), Proc. of the Genetic and Evolutionary Computation Conf. (GECCO-2001), pp. 935-942, San Francisco, CA, 2001, Morgan, Kaufmann.
-  T. Kovacs, “Evolving Optimal Populations with XCS Classifier Systems,” Technical Report CSRP-96-17, University of Birmingham, School of Computer Science, October 1996.
-  A. Wada, K. Takadama, K. Shimohara, and O. Katai, “Comparing Learning Classifier System and Reinforcement Learning with Function Approximation,” IEEJ Trans. on Electronics, Information and Systems, Vol.124, No.10, pp. 2034-2039, 2004.
-  S. W. Wilson, “ZCS: A zeroth level classifier system,” Evolutionary Computation, Vol.2, No.1, pp. 1-18, 1994.
-  J. C. H. Watkins, “Learning from delayed rewards,” Ph.D. thesis, Cambridge University, 1989.
-  J. C. H. Watkins and P. Dayan, “Technical note: Q-learning,” Machine Learning, Vol.8, pp. 279-292, 1992.
-  S. W. Wilson, “Classifier fitness based on accuracy,” Evolutionary Computation, Vol.3, No.2, pp. 149-175, 1995.
-  T. Kovacs, “Two Views of Classifier Systems,” In Fourth Int. Workshop on Learning Classifier Systems - IWLCS-2001, pp. 367-371, San Francisco, California, USA, 7 2001.
-  P. L. Lanzi, “Learning classifier systems from a reinforcement learning perspective,” Soft Computing, Vol.6, pp. 162-170, 2002.
-  M. V. Butz, D. E. Goldberg, and P. L. Lanzi, “Gradient Descent Methods in Learning Classifier Systems: Improving XCS Performance in Multistep Problems,” IEEE Trans. on Evolutionary Computation, Vol.9, No.5, pp. 452-473, 2005.
-  S. W. Wilson, “Generalization in the XCS Classifier System,” In J. R. Koza (Ed.), Genetic Programming 1998: Proc. of the Third Annual Conf., pp. 665-674, San Francisco, CA, 1998, Morgan Kaufmann.
-  T. Kovacs, “Deletion Schemes for Classifier Systems,” In Proc. of the Genetic and Evolutionary Computation Conf., Vol.1, pp. 329-336, Orlando, Florida, USA, 13-17 1999, Morgan Kaufmann.
-  P. L. Lanzi, “Extending the Representation of Classifier Conditions Part I: From Binary to Messy Coding,” In Proc. of the Genetic and Evolutionary Computation Conference (GECCO 99), pp. 337-344, Morgan Kaufmann, July 1999.
-  P. L. Lanzi and A. Perrucci, “Extending the Representation of Classifier Conditions Part II: From Messy Coding to S-Expressions,” In Proc. of the Genetic and Evolutionary Computation Conf. (GECCO 99), pp. 345-352. Morgan Kaufmann, July 1999.
-  S. W. Wilson, “Get real! XCS with continuous-valued inputs,” In Learning classifier systems, from foundations to applications, pp. 209-222, Springer-Verlag, London, UK, 2000.
-  S. W. Wilson, “Mining Oblique Data with XCS,” Lecture Notes in Computer Science, 1996, pp. 158-176, 2000.
-  M. V. Butz and S. W. Wilson, “An algorithmic Description of XCS,” In Advances in Learning Classifier Systems, Forth Int. Workshop, IWLCS 2001, pp. 253-272, Springer-Verlag, Berlin, Germany, 2002.