Paper:
Fuzzified Even-Sized Clustering Based on Optimization
Kei Kitajima*, Yasunori Endo**, and Yukihiro Hamasuna***
*Department of Risk Engineering, Graduate School of Systems and Information Engineering, University of Tsukuba
1-1-1 Tennodai, Tsukuba, Ibaraki 305-8573, Japan
**Faculty of Engineering, Information and Systems, University of Tsukuba
1-1-1 Tennodai, Tsukuba, Ibaraki 305-8573, Japan
***Department of Informatics, School of Science and Engineering, Kindai University
3-4-1 Kowakae, Higashiosaka, Osaka 577-8502, Japan
Clustering is a method of data analysis that does not use supervised data. Even-sized clustering based on optimization (ECBO) is a clustering algorithm that focuses on cluster size and imposes the constraint that all clusters must be the same size. However, this constraint makes ECBO inconvenient to apply in cases where a certain margin in cluster size is allowed. It is believed that this issue can be overcome by applying a fuzzy clustering method, because fuzzy clustering can represent the membership of data to clusters more flexibly. In this paper, we propose a new even-sized clustering algorithm based on fuzzy clustering and verify its effectiveness through numerical examples.
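To illustrate the flexibility of fuzzy memberships mentioned above, the following minimal sketch computes classical fuzzy c-means memberships and sums them per cluster to obtain a "fuzzy" cluster size. It is not the algorithm proposed in this paper: the function names, the toy data, and the fuzzifier value m = 2 are assumptions chosen for illustration only.

```python
def fcm_memberships(points, centers, m=2.0):
    """Classical fuzzy c-means membership update for fixed centers.

    points, centers: sequences of equal-length numeric tuples.
    m: fuzzifier (> 1); m = 2 is an assumed, commonly used default.
    Returns one row per point; each row sums to 1.
    """
    memberships = []
    for x in points:
        # Squared Euclidean distance from x to every center.
        d2 = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in centers]
        if any(d == 0.0 for d in d2):
            # x coincides with one or more centers: share membership among them.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        # u_i is proportional to d2_i ** (-1 / (m - 1)).
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
        s = sum(inv)
        memberships.append([w / s for w in inv])
    return memberships


def fuzzy_cluster_sizes(memberships):
    """Sum memberships per cluster; a real-valued analogue of cluster size."""
    n_clusters = len(memberships[0])
    return [sum(row[i] for row in memberships) for i in range(n_clusters)]


# Hypothetical toy data: four 1-D points and two fixed centers.
points = [(0.0,), (1.0,), (4.0,), (5.0,)]
centers = [(0.5,), (4.5,)]
u = fcm_memberships(points, centers)
sizes = fuzzy_cluster_sizes(u)
print(sizes)
```

Because each point's memberships sum to 1, the fuzzy sizes always sum to the number of points, but an individual cluster's size is a real number rather than an integer count. This is the sense in which a fuzzy formulation can tolerate a margin in cluster size where a hard even-size constraint cannot.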
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.