Classification Rule Extraction Based on Relevant, Irredundant Attributes and Rule Enlargement
George Lashkia*, Laurence Anthony**, and Hiroyasu Koshimizu*
*School of Information Science and Technology, Chukyo University, 101 Tokodate, Kaizu-cho, Toyota 470-0393, Japan
**School of Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
This paper focuses on the induction of classification rules from examples. Conventional algorithms fail to discover effective knowledge when the database contains irrelevant information. We present a new rule extraction method, RGT, which tackles this problem by using only relevant and irredundant attributes. Rule simplicity is also a major concern. To create simple rules, we estimate the purity of patterns and propose a rule enlargement approach consisting of rule merging and rule expanding procedures. We describe the methodology of the RGT algorithm, discuss its properties, and compare it with conventional methods.