
JACIII Vol.22 No.4 pp. 524-536
doi: 10.20965/jaciii.2018.p0524
(2018)

Paper:

# Fuzzy Clustering Methods for Categorical Multivariate Data Based on q-Divergence

## Tadafumi Kondo and Yuchi Kanzawa

Shibaura Institute of Technology
3-7-5 Toyosu, Koto, Tokyo 135-8548, Japan

Received:
December 15, 2017
Accepted:
February 8, 2018
Published:
July 20, 2018
Keywords:
fuzzy clustering, categorical multivariate data, KL-divergence, q-divergence
Abstract

This paper presents two fuzzy clustering algorithms for categorical multivariate data based on q-divergence. First, this study shows that a conventional method for vectorial data can be explained as regularizing another conventional method using q-divergence. Second, two fuzzy clustering algorithms for categorical multivariate data are derived. It is known that q-divergence generalizes Kullback-Leibler (KL)-divergence, and that two conventional fuzzy clustering methods for categorical multivariate data adopt KL-divergence; the proposed algorithms are derived from two optimization problems built by extending the KL-divergence in these conventional methods to q-divergence. Numerical experiments using real datasets show that the proposed methods outperform the conventional methods in terms of clustering accuracy.
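The relationship between KL-divergence and q-divergence mentioned in the abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes a Tsallis-type form, D_q(p||r) = (Σ p_i^q r_i^(1-q) - 1) / (q - 1), which may differ from the exact definition the authors use, and the function names and sample distributions are hypothetical. Under that assumption, the value approaches KL-divergence as q approaches 1.

```python
import math


def kl_divergence(p, r):
    """KL(p || r) for discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / ri) for pi, ri in zip(p, r))


def q_divergence(p, r, q):
    """Tsallis-type q-divergence (assumed form, not necessarily the paper's).

    Falls back to KL-divergence at q == 1, where the formula is undefined
    but its limit equals KL(p || r).
    """
    if q == 1.0:
        return kl_divergence(p, r)
    total = sum(pi ** q * ri ** (1.0 - q) for pi, ri in zip(p, r))
    return (total - 1.0) / (q - 1.0)


# Hypothetical example distributions over three categories.
p = [0.7, 0.2, 0.1]
r = [0.5, 0.3, 0.2]

# Print the q-divergence for values of q moving toward 1, then KL for comparison.
for q in (2.0, 1.1, 1.001):
    print(f"q = {q}: {q_divergence(p, r, q):.6f}")
print(f"KL:      {kl_divergence(p, r):.6f}")
```

Running the loop prints values that move toward the KL figure as q decreases toward 1, which is the sense in which q-divergence is said to generalize KL-divergence.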

Tadafumi Kondo and Yuchi Kanzawa, “Fuzzy Clustering Methods for Categorical Multivariate Data Based on q-Divergence,” J. Adv. Comput. Intell. Intell. Inform., Vol.22, No.4, pp. 524-536, 2018.
References
1. J. B. MacQueen, “Some Methods for Classification and Analysis of Multivariate Observations,” Proc. 5th Berkeley Symp. on Math. Stat. and Prob., pp. 281-297, 1967.
2. J. Bezdek, “Pattern Recognition with Fuzzy Objective Function Algorithms,” Plenum Press, 1981.
3. S. Miyamoto and M. Mukaidono, “Fuzzy c-Means as a Regularization and Maximum Entropy Approach,” Proc. 7th Int. Fuzzy Systems Association World Congress (IFSA’97), Vol.2, pp. 86-92, 1997.
4. S. Miyamoto and N. Kurosawa, “Controlling Cluster Volume Sizes in Fuzzy c-means Clustering,” Proc. SCIS&ISIS2004, pp. 1-4, 2004.
5. H. Ichihashi, K. Honda, and N. Tani, “Gaussian Mixture PDF Approximation and Fuzzy c-means Clustering with Entropy Regularization,” Proc. 4th Asian Fuzzy System Symp., pp. 217-221, 2000.
6. S. Miyamoto, H. Ichihashi, and K. Honda, “Algorithms for Fuzzy Clustering,” Springer, 2008.
7. L. Rigouste, O. Cappé, and F. Yvon, “Inference and evaluation of the multinomial mixture model for text clustering,” Information Processing and Management, Vol.43, No.5, pp. 1260-1280, 2007.
8. K. Honda, S. Oshio, and A. Notsu, “FCM-type Fuzzy Co-Clustering by K-L Information Regularization,” Proc. FUZZ-IEEE2014, pp. 2505-2510, 2014.
9. K. Honda, S. Oshio, and A. Notsu, “Fuzzy Co-Clustering Induced by Multinomial Mixture Models,” J. Adv. Comput. Intell. Intell. Inform., Vol.19, No.6, pp. 717-726, 2015.
10. H. Chernoff, “A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on a Sum of Observations,” Ann. Math. Statist., Vol.23, pp. 493-507, 1952.
11. Lise’s Inquisitive Students, Machine Learning Research Group @UMD, http://www.cs.umd.edu/sen/lbc-proj/LBC.html [accessed November 1, 2017]
12. V. Tunali, PRETO Data Mining Research, http://www.dataminingresearch.com/ [accessed November 1, 2017]
13. G. Karypis, Karypis Lab, CLUTO – Software for Clustering High-Dimensional Datasets, http://glaros.dtc.umn.edu/ [accessed November 1, 2017]
14. Text REtrieval Conf. (TREC), http://trec.nist.gov [accessed November 1, 2017]
15. TREC: Text REtrieval Conference Relevance Judgments, http://trec.nist.gov/data/qrels_eng/index.html [accessed November 1, 2017]
16. D. Boley et al., “Partitioning-based clustering for web document categorization,” Decision Support Systems, Vol.27, No.3, pp. 329-341, 1999.
17. D. D. Lewis, “Reuters-21578 text categorization test collection distribution 1.0,” http://www.daviddlewis.com/resources/testcollections/reuters21578/ [accessed November 1, 2017]
18. L. Hubert and P. Arabie, “Comparing Partitions,” J. of Classification, Vol.2, pp. 193-218, 1985.
19. D. Arthur and S. Vassilvitskii, “k-means++: the advantages of careful seeding,” Proc. the 8th Annual ACM-SIAM Symp. on Discrete Algorithms, pp. 1027-1035, 2007.
20. C.-H. Oh, K. Honda, and H. Ichihashi, “Fuzzy Clustering for Categorical Multivariate Data,” Proc. Joint 9th IFSA World Congress and 2nd NAFIPS Int. Conf., pp. 2154-2159, 2001.
21. K. Kummamuru, A. Dhawale, and R. Krishnapuram, “Fuzzy Co-clustering of Documents and Keywords,” Proc. IEEE Int. Conf. on Fuzzy Sys., Vol.2, pp. 772-777, 2003.
22. Y. Kanzawa, “Fuzzy Co-Clustering Algorithms Based on Fuzzy Relational Clustering and TIBA Imputation,” J. Adv. Comput. Intell. Intell. Inform., Vol.18, No.2, pp. 182-189, 2014.
23. Y. Kanzawa, “Bezdek-Type Fuzzified Co-Clustering Algorithm,” J. Adv. Comput. Intell. Intell. Inform., Vol.19, No.6, pp. 852-860, 2015.
24. I. S. Dhillon and D. S. Modha, “Concept Decompositions for Large Sparse Text Data Using Clustering,” Machine Learning, Vol.42, pp. 143-175, 2001.
25. S. Miyamoto and K. Mizutani, “Fuzzy Multiset Model and Methods of Nonlinear Document Clustering for Information Retrieval,” LNCS, Vol.3131, pp. 273-283, 2004.
26. K. Mizutani, R. Inokuchi, and S. Miyamoto, “Algorithms of Nonlinear Document Clustering based on Fuzzy Set Model,” Int. J. of Intel. Sys., Vol.23, No.2, pp. 176-198, 2008.
27. Y. Kanzawa, “On Kernelization for a Maximizing Model of Bezdek-like Spherical Fuzzy c-means Clustering,” LNCS, Vol.8825, pp. 108-121, 2014.
28. Y. Kanzawa, “A Maximizing Model of Bezdek-like Spherical Fuzzy c-Means Clustering,” J. Adv. Comput. Intell. Intell. Inform., Vol.19, No.5, pp. 662-669, 2015.
29. Y. Kanzawa, “A Maximizing Model of Spherical Bezdek-type Fuzzy Multi-medoids Clustering,” J. Adv. Comput. Intell. Intell. Inform., Vol.19, No.6, pp. 738-746, 2015.
30. A. Cichocki and S. Amari, “Families of Alpha- Beta- and Gamma- Divergences: Flexible and Robust Measures of Similarities,” Entropy, Vol.12, No.6, pp. 1532-1568, 2010.
31. A. Cichocki, S. Cruces, and S. Amari, “Generalized Alpha-Beta Divergences and Their Application to Robust Nonnegative Matrix Factorization,” Entropy, Vol.13, No.1, pp. 134-170, 2011.
32. R. Krishnapuram and J. M. Keller, “A Possibilistic Approach to Clustering,” IEEE Trans. Fuzzy Syst., Vol.1, pp. 98-110, 1993.
33. Y. Kanzawa, “On Possibilistic Clustering Methods Based on Shannon/Tsallis-Entropy for Spherical Data and Categorical Multivariate Data,” Lecture Notes in Artificial Intelligence, Vol.9321, pp. 115-128, 2015.

