Hard c-Means Using Quadratic Penalty-Vector Regularization for Uncertain Data
Yasunori Endo*, Arisa Taniguchi**, and Yukihiro Hamasuna***
*Faculty of Engineering, Information and Systems, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8573, Japan
**Graduate School of Systems and Information Engineering, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8573, Japan
***Department of Informatics, Kinki University, 3-4-1 Kowakae, Higashiosaka, Osaka 577-8502, Japan
Clustering is an unsupervised classification technique for data analysis. To apply a clustering method, each datum in real space is generally transformed into a point in a pattern space. Data often cannot be represented by a single point, however, because of uncertainty such as measurement error margins and missing values. In this paper, we introduce quadratic penalty-vector regularization to handle such uncertain data with hard c-means (HCM), one of the most typical clustering algorithms. We first propose a new clustering algorithm called hard c-means using quadratic penalty-vector regularization for uncertain data (HCMP). Second, we propose sequential extraction hard c-means using quadratic penalty-vector regularization (SHCMP) to handle datasets whose number of clusters is unknown. Finally, we verify the effectiveness of the proposed algorithms through numerical examples.
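The idea of attaching a penalty vector to each uncertain datum can be illustrated with a minimal sketch. The abstract does not state the objective function, so the form used below is an assumption: each datum x_k may shift by a penalty vector d_k, and the shift is charged a quadratic cost w_k ||d_k||^2. The function name `hcmp`, the per-datum weights `weights`, and the fixed iteration count are hypothetical choices made for illustration, and the update rules are simply those obtained by minimizing the assumed objective one block of variables at a time.

```python
def hcmp(points, centers, weights, n_iter=50):
    """Sketch of hard c-means with quadratically penalized shift vectors.

    Assumed objective (not taken from the abstract):
        J = sum_k ||x_k + d_k - v_c(k)||^2 + sum_k w_k * ||d_k||^2
    where c(k) is the cluster assigned to datum k.
    """
    dim = len(points[0])
    centers = [list(v) for v in centers]
    shifts = [[0.0] * dim for _ in points]   # penalty vectors d_k, start at zero
    labels = [0] * len(points)

    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    for _ in range(n_iter):
        # Shifted data x_k + d_k under the current penalty vectors.
        moved = [[x + d for x, d in zip(p, s)] for p, s in zip(points, shifts)]

        # Assignment step: each shifted datum joins its nearest center.
        labels = [min(range(len(centers)), key=lambda i: sqdist(m, centers[i]))
                  for m in moved]

        # Center step: mean of the shifted data in each non-empty cluster.
        for i in range(len(centers)):
            members = [m for m, lab in zip(moved, labels) if lab == i]
            if members:
                centers[i] = [sum(col) / len(members) for col in zip(*members)]

        # Penalty-vector step: setting dJ/dd_k = 0 under the assumed
        # objective gives d_k = -(x_k - v_c(k)) / (1 + w_k).
        shifts = [[-(x - v) / (1.0 + w) for x, v in zip(p, centers[lab])]
                  for p, lab, w in zip(points, labels, weights)]

    return labels, centers, shifts


data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers, shifts = hcmp(data, [(0.0, 0.0), (10.0, 10.0)], [1.0] * 4)
print(labels, centers)
```

Under this assumed formulation, a large weight w_k forces the penalty vector toward zero, so the procedure approaches ordinary hard c-means, while a small weight lets an uncertain datum move most of the way toward its cluster center.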