A k-Anonymous Rule Clustering Approach for Data Publishing
Motoyuki Ohki and Masahiro Inuiguchi
1-3 Machikaneyama, Toyonaka, Osaka 560-8531, Japan
Classification rules should be open for public inspection to ensure fairness.
Such rules are typically induced from a dataset. If an induced classification rule is supported by only a small number of objects in the dataset, publishing it can lead to identification of the objects supporting the rule, because those objects are distinctive. Once the objects are identified, further information about them may be retrieved. This identifiability is undesirable from the standpoint of data privacy.
In this paper, to avoid such privacy breaches, we propose rule clustering for achieving k-anonymity of all induced rules, i.e., every induced rule is supported by at least k objects in the dataset. The proposed approach merges similar rules until k-anonymity is satisfied while aiming to maintain classification accuracy. Two numerical experiments were conducted to evaluate both the accuracy of the classifier built from the rules obtained by the proposed method and the ratio of decision classes revealed from leaked information about objects. The experimental results show the usefulness of the proposed method.
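To make the idea of merging rules until each is supported by at least k objects concrete, the following is a minimal illustrative sketch, not the algorithm of this paper. The rule representation (a dictionary of admissible value sets per attribute), the Jaccard-style `similarity`, the greedy merge order, the definition of support as objects matching both condition and decision class, and all function and variable names are assumptions introduced only for illustration.

```python
def matches(rule, obj):
    """True if obj satisfies every condition of the rule."""
    return all(obj[a] in vals for a, vals in rule["cond"].items())

def support(rule, data):
    """Assumed definition: objects matching the condition AND the decision class."""
    return sum(1 for o in data if matches(rule, o) and o["class"] == rule["cls"])

def similarity(r1, r2):
    """Jaccard similarity over (attribute, value) pairs (an assumed measure)."""
    p1 = {(a, v) for a, vals in r1["cond"].items() for v in vals}
    p2 = {(a, v) for a, vals in r2["cond"].items() for v in vals}
    union = p1 | p2
    return len(p1 & p2) / len(union) if union else 1.0

def merge(r1, r2):
    """Generalize two same-class rules: keep shared attributes, union their values.
    The merged rule covers every object covered by either original rule."""
    shared = r1["cond"].keys() & r2["cond"].keys()
    cond = {a: r1["cond"][a] | r2["cond"][a] for a in shared}
    return {"cond": cond, "cls": r1["cls"]}

def k_anonymize(rules, data, k):
    """Greedily merge under-supported rules with their most similar same-class rule."""
    rules = list(rules)
    while True:
        weak = [r for r in rules if support(r, data) < k]
        if not weak:
            return rules
        r = min(weak, key=lambda x: support(x, data))
        partners = [q for q in rules if q is not r and q["cls"] == r["cls"]]
        if not partners:
            # No same-class rule to merge with: drop the rule rather than publish it.
            rules = [q for q in rules if q is not r]
            continue
        best = max(partners, key=lambda q: similarity(r, q))
        rules = [q for q in rules if q is not r and q is not best]
        rules.append(merge(r, best))

# Toy data, invented for illustration only.
data = [
    {"age": "young", "job": "clerk", "class": "yes"},
    {"age": "young", "job": "engineer", "class": "yes"},
    {"age": "old", "job": "clerk", "class": "no"},
    {"age": "old", "job": "engineer", "class": "no"},
    {"age": "young", "job": "clerk", "class": "yes"},
]
rules = [
    {"cond": {"age": {"young"}, "job": {"engineer"}}, "cls": "yes"},  # support 1
    {"cond": {"age": {"young"}, "job": {"clerk"}}, "cls": "yes"},     # support 2
    {"cond": {"age": {"old"}}, "cls": "no"},                          # support 2
]
result = k_anonymize(rules, data, k=2)
```

In this toy run the rule supported by a single object is merged with its most similar same-class rule, so every published rule is supported by at least two objects. The merged rule is more general than either original, which is why a merging scheme of this kind has to be checked against classification accuracy.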