Rough Sets Based Rule Generation from Data with Categorical and Numerical Values
Hiroshi Sakai*, Kazuhiro Koba*, and Michinori Nakata**
*Department of Mathematics and Computer Aided Science, Faculty of Engineering, Kyushu Institute of Technology
Tobata, Kitakyushu 804-8550, Japan
**Faculty of Management and Information Science, Josai International University
Gumyo, Togane, Chiba 283-8555, Japan
Rough set theory has mainly been applied to data with categorical values. To handle data with numerical values within this theory, the familiar concept of ‘wildcards’ was employed, and a new framework of rough sets based rule generation was proposed. Two characters, @ and #, were introduced into this framework, and numerical patterns were defined for numerical values. The concepts of ‘coarse’ and ‘fine’ rules were explicitly defined according to these numerical patterns. This paper enhances the previous framework and describes the implementation of a utility program. The program is applied to data in the UCI Machine Learning Repository, and some useful rules are obtained.
 Z. Pawlak, “Rough Sets: Theoretical Aspects of Reasoning about Data,” Kluwer Academic Publishers, Dordrecht, 1991.
 Z. Pawlak, “Some Issues on Rough Sets,” Transactions on Rough Sets, Springer-Verlag, Vol.1, pp. 1-58, 2004.
 J. Komorowski, Z. Pawlak, L. Polkowski, and A. Skowron, “Rough Sets: a tutorial,” Rough Fuzzy Hybridization, Springer, pp. 3-98, 1999.
 A. Nakamura, S. Tsumoto, H. Tanaka, and S. Kobayashi, “Rough Set Theory and Its Applications,” Journal of Japanese Society for AI, Vol.11, No.2, pp. 209-215, 1996.
 L. Polkowski and A. Skowron (Eds.), “Rough Sets in Knowledge Discovery 1,” Studies in Fuzziness and Soft Computing, Vol.18, Physica-Verlag, 1998.
 L. Polkowski and A. Skowron (Eds.), “Rough Sets in Knowledge Discovery 2,” Studies in Fuzziness and Soft Computing, Vol.19, Physica-Verlag, 1998.
 “Rough Set Software,” Bulletin of Int'l. Rough Set Society, Vol.2, pp. 15-46, 1998.
 H. Sakai and A. Okuma, “Basic Algorithms and Tools for Rough Non-deterministic Information Analysis,” Transactions on Rough Sets, Springer-Verlag, Vol.1, pp. 209-231, 2004.
 H. Sakai, “Effective Procedures for Handling Possible Equivalence Relations in Non-deterministic Information Systems,” Fundamenta Informaticae, Vol.48, pp. 343-362, 2001.
 H. Sakai and M. Nakata, “Discernibility Functions and Minimal Rules in Non-deterministic Information Systems,” Lecture Notes in AI (RSFDGrC 2005), Springer-Verlag, Vol.3641, pp. 254-264, 2005.
 H. Sakai, T. Murai, and M. Nakata, “On a Tool for Rough Non-deterministic Information Analysis and Its Perspective for Handling Numerical Data,” Lecture Notes in AI (MDAI 05), Springer-Verlag, Vol.3558, pp. 203-214, 2005.
 T. Murai, G. Resconi, M. Nakata, and Y. Sato, “Operations of Zooming In and Out on Possible Worlds for Semantic Fields,” E. Damiani et al. (Eds.), Knowledge-Based Intelligent Information Engineering Systems and Allied Technologies, IOS Press, pp. 1083-1087, 2002.
 T. Murai, G. Resconi, M. Nakata, and Y. Sato, “Granular Reasoning Using Zooming In & Out,” Lecture Notes in Computer Science (RSFDGrC 2003), Springer-Verlag, Vol.2639, pp. 421-424, 2003.
 Y. Yao, C. Liau, and N. Zhong, “Granular Computing Based on Rough Sets, Quotient Space Theory, and Belief Functions,” Lecture Notes in AI (ISMIS 2003), Springer-Verlag, Vol.2871, pp. 152-159, 2003.
 A. Skowron and C. Rauszer, “The Discernibility Matrices and Functions in Information Systems,” In Intelligent Decision Support – Handbook of Advances and Applications of the Rough Set Theory, Kluwer Academic Publishers, pp. 331-362, 1992.
 R. Agrawal and R. Srikant, “Fast Algorithms for Mining Association Rules,” Proc. 20th Very Large Data Base, pp. 487-499, 1994.
 R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A. Verkamo, “Fast Discovery of Association Rules,” Advances in Knowledge Discovery and Data Mining, pp. 307-328, 1996.
 UCI Machine Learning Repository: http://www.ics.uci.edu/~mlearn/MLRepository.html.
 M. Chmielewski and J. Grzymala-Busse, “Global Discretization of Continuous Attributes as Preprocessing for Machine Learning,” Int'l. Journal of Approximate Reasoning, Vol.15, pp. 319-331, 1996.
 J. Grzymala-Busse and J. Stefanowski, “Three Discretization Methods for Rule Induction,” Int'l. Journal of Intelligent Systems, Vol.16, pp. 29-38, 2001.