Rough Set-Based Clustering Utilizing Probabilistic Memberships

Seiki Ubukata; Hiroki Kato; Akira Notsu; Katsuhiro Honda

doi:10.20965/jaciii.2018.p0956

JACIII Vol.22 No.6 pp. 956-964

(2018)

Paper:

Osaka Prefecture University
1-1 Gakuen-cho, Nakaku, Sakai, Osaka 599-8531, Japan

Received: February 17, 2018

Accepted: July 24, 2018

Published: October 20, 2018

Keywords: clustering, rough set theory, rough C-means, rough set C-means, rough membership C-means

Abstract

Representing the positive, possible, and boundary regions of clusters, rough set-based C-means clustering methods, such as generalized rough C-means (GRCM) and rough set C-means (RSCM), are promising for analyzing vague cluster shapes and realizing reliable classification. In this study, we consider rough set-based clustering approaches that utilize probabilistic memberships as variants of GRCM and RSCM, including π generalized rough C-means (πGRCM), π rough set C-means (πRSCM), and rough membership C-means (RMCM). πGRCM and πRSCM assign equal probabilities of cluster belonging according to Laplace’s principle of indifference, whereas RMCM assigns the probabilities according to rough memberships, which represent conditional probabilities based on the object’s neighborhood derived from a binary relation. In addition, we discuss the theoretical validity of our RMCM approach and compare it with the other methods considered in this study. Furthermore, we conducted numerical experiments to evaluate the classification performance of the abovementioned methods. Based on our experimental results, the methods were found to be effective.
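The two ways of assigning probabilistic memberships described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It only assumes the standard rough membership definition, |R(x) ∩ X| / |R(x)|, where R(x) is the neighborhood of object x and X is a cluster, and it reads the principle of indifference as an equal split over the clusters an object may belong to. The function names and the toy data are hypothetical.

```python
from typing import Dict, Hashable, Set


def indifference_memberships(candidate_clusters: Set[int]) -> Dict[int, float]:
    """Principle of indifference: split probability equally over the
    clusters the object possibly belongs to (hypothetical helper)."""
    if not candidate_clusters:
        return {}
    p = 1.0 / len(candidate_clusters)
    return {c: p for c in candidate_clusters}


def rough_membership(neighborhood: Set[Hashable], cluster: Set[Hashable]) -> float:
    """Rough membership |R(x) & X| / |R(x)|: the fraction of the object's
    neighborhood R(x) that falls inside cluster X."""
    if not neighborhood:
        raise ValueError("neighborhood must be non-empty")
    return len(neighborhood & cluster) / len(neighborhood)


# Toy data (assumed for illustration): the neighborhood of object x
# and two tentative clusters.
neighborhood_of_x = {"a", "b", "c", "d"}
cluster_1 = {"a", "b", "c", "e"}
cluster_2 = {"d", "f"}

mu_1 = rough_membership(neighborhood_of_x, cluster_1)  # 3 of 4 neighbors -> 0.75
mu_2 = rough_membership(neighborhood_of_x, cluster_2)  # 1 of 4 neighbors -> 0.25
pi = indifference_memberships({1, 2})                  # {1: 0.5, 2: 0.5}
```

In this toy case the indifference rule gives both clusters probability 0.5, while the rough membership weights cluster 1 more heavily because three of the four neighbors of x lie in it.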

Comparison of the classification performance of rough set-based clustering methods utilizing probabilistic memberships, i.e., π generalized rough C-means (πGRCM), π rough set C-means (πRSCM), and rough membership C-means (RMCM), on the Breast Cancer Wisconsin dataset. Our method (RMCM) achieved the best performance.

Cite this article as:

S. Ubukata, H. Kato, A. Notsu, and K. Honda, “Rough Set-Based Clustering Utilizing Probabilistic Memberships,” J. Adv. Comput. Intell. Intell. Inform., Vol.22 No.6, pp. 956-964, 2018.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
