
# Rough Set Model in Incomplete Decision Systems

## Thinh Cao^{*}, Koichi Yamada^{*}, Muneyuki Unehara^{*}, Izumi Suzuki^{*}, and Do Van Nguyen^{**,***}

^{*}Information Science and Control Engineering, Nagaoka University of Technology

Nagaoka, Japan

^{**}Institute of Information Technology, MIST

Hanoi, Vietnam

^{***}HMI Lab, UET, Vietnam National University

Hanoi, Vietnam

This paper introduces a rough set model for analyzing an information system in which some condition and decision data are missing. Many studies have focused on missing condition data, but very few have accounted for missing decision data. Common approaches tend to remove objects with missing decision data, because such objects are regarded as worthless from the perspective of decision-making. However, we show that this removal may lead to information loss, and our method therefore retains these objects. We observe that a scenario involving missing decision data resembles semi-supervised learning, because some objects have complete decision data whereas others do not. This leads us to the idea of estimating potential candidates for the missing decision data from the available data. The candidates are determined by two quantitative indicators: local decision probability and universal decision probability. They allow us to define set approximations and reducts. We also compare the reducts and rules induced from two information systems: one in which objects with missing decision data are removed and one in which they are retained. We show that the knowledge induced from the former can also be induced from the latter using our approach. Thus, our method offers a more general way to handle missing decision data and prevents information loss.
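The abstract names the two indicators but does not define them, so the following Python sketch is only an illustration under assumptions of our own, not the model proposed in the paper. It assumes that missing values are marked with `*`, that objects are compared with a tolerance relation in the style of Kryszkiewicz, that the local decision probability is the distribution of known decisions among the objects tolerant to the one under consideration, and that the universal decision probability is the distribution of known decisions over the whole table. The selection rule (keep a decision value whose local probability is at least its universal probability) and all function names are hypothetical.

```python
from collections import Counter

MISSING = "*"  # assumed marker for a missing condition or decision value


def tolerant(x, y):
    """Tolerance check: two condition vectors agree wherever both are known."""
    return all(a == b or MISSING in (a, b) for a, b in zip(x, y))


def decision_distribution(decisions):
    """Relative frequency of each known decision value (missing ones ignored)."""
    known = [d for d in decisions if d != MISSING]
    counts = Counter(known)
    return {d: c / len(known) for d, c in counts.items()} if known else {}


def candidate_decisions(table, i):
    """Hypothetical rule for the candidates of object i's missing decision.

    Returns (candidates, local, universal), where `local` is computed from
    the objects tolerant to object i and `universal` from the whole table.
    """
    conds, _ = table[i]
    neighbours = [
        d for j, (c, d) in enumerate(table) if j != i and tolerant(conds, c)
    ]
    local = decision_distribution(neighbours)
    universal = decision_distribution([d for _, d in table])
    # Assumed selection rule: keep values that are at least as likely
    # locally as they are in the table overall.
    candidates = {d for d, p in local.items() if p >= universal[d]}
    return candidates, local, universal


# Toy decision table: (condition values, decision value).
table = [
    (("high", "yes"), "flu"),
    (("high", "*"), "flu"),
    (("low", "no"), "healthy"),
    (("low", "yes"), "healthy"),
    (("high", "yes"), "*"),  # decision is missing
]

candidates, local, universal = candidate_decisions(table, 4)
print(candidates, local, universal)
```

In this toy table the last object is tolerant only to the two objects decided as `flu`, so `flu` is its single candidate. Retaining the object with this candidate, rather than deleting it, is the kind of behavior the abstract argues for.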
