Paper:

# Learning Classifier System Based on Mean of Reward

## Takato Tatsumi, Hiroyuki Sato, and Keiki Takadama

The University of Electro-Communications

1-5-1 Chofugaoka, Chofu-shi, Tokyo 182-8585, Japan

This paper focuses on the generalization of classifiers in noisy problems and aims to construct a learning classifier system (LCS) that can acquire the optimal classifier subset by dynamically determining the classifier generalization criteria. It introduces XCS-MR, an accuracy-based LCS (XCS) that uses the mean of the reward and can correctly identify classifiers as either accurate or inaccurate in noisy problems, and investigates its effectiveness on several noisy problems. XCS and an XCS based on the variance of reward (XCS-VR), as the conventional LCSs, were applied along with XCS-MR to noisy 11-multiplexer problems in which the reward value varies according to a Gaussian, Cauchy, or lognormal distribution. The results revealed the following: (1) XCS-VR and XCS-MR could select the correct action for every type of reward distribution; (2) XCS-MR could appropriately generalize the classifiers with the smallest amount of data; and (3) XCS-MR could acquire the optimal classifier subset in every trial for every type of reward distribution.
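To make the benchmark concrete, the following is a minimal sketch of a noisy 11-multiplexer reward function. The abstract does not give the payoff values or the noise parameters, so the base rewards (1000 for a correct action, 0 otherwise), the `scale` parameter, the additive form of the noise, and the function names `multiplexer11` and `noisy_reward` are all illustrative assumptions rather than the authors' experimental setup.

```python
import math
import random


def multiplexer11(bits):
    """11-multiplexer: 3 address bits select one of the 8 data bits."""
    if len(bits) != 11:
        raise ValueError("expected 11 input bits")
    address = bits[0] * 4 + bits[1] * 2 + bits[2]
    return bits[3 + address]


def noisy_reward(bits, action, rng, noise="gaussian", scale=100.0):
    """Base reward plus additive noise (all numeric values are assumed)."""
    # Assumed payoff: 1000 for the correct action, 0 for the wrong one.
    base = 1000.0 if action == multiplexer11(bits) else 0.0
    if noise == "gaussian":
        return base + rng.gauss(0.0, scale)
    if noise == "cauchy":
        # Cauchy sample via the inverse CDF of a uniform draw.
        u = rng.random()
        return base + scale * math.tan(math.pi * (u - 0.5))
    if noise == "lognormal":
        # Lognormal noise is strictly positive, so it shifts rewards upward.
        return base + scale * rng.lognormvariate(0.0, 1.0)
    raise ValueError(f"unknown noise type: {noise}")


if __name__ == "__main__":
    rng = random.Random(0)
    state = [rng.randint(0, 1) for _ in range(11)]
    for kind in ("gaussian", "cauchy", "lognormal"):
        print(kind, noisy_reward(state, multiplexer11(state), rng, kind))
```

With heavy-tailed noise such as the Cauchy case, single reward samples can lie far from the base value, which illustrates why the criterion for judging a classifier accurate or inaccurate matters in these problems.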

*J. Adv. Comput. Intell. Intell. Inform.*, Vol.21, No.5, pp. 895-906, 2017.
