On Agglomerative Hierarchical Clustering Using Clusterwise Tolerance Based Pairwise Constraints
Yukihiro Hamasuna*, Yasunori Endo**, and Sadaaki Miyamoto**
*Department of Informatics, School of Science and Engineering, Kinki University, 3-4-1 Kowakae, Higashi-Osaka, Osaka 577-8502, Japan
**Department of Risk Engineering, Faculty of Systems and Information Engineering, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8573, Japan
This paper presents a semi-supervised agglomerative hierarchical clustering algorithm using clusterwise tolerance based pairwise constraints. In semi-supervised clustering, pairwise constraints, namely must-link and cannot-link, are frequently used to improve clustering properties. With this in mind, we propose an alternative approach, called clusterwise tolerance based pairwise constraints, for handling must-link and cannot-link constraints in L2-space. We then propose a semi-supervised agglomerative hierarchical clustering algorithm based on this approach and show the effectiveness of the proposed method through numerical examples.
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.