# Learning Similarity Matrix from Constraints of Relational Neighbors

## Masayuki Okabe^{*} and Seiji Yamada^{**}

^{*}Information and Media Center, Toyohashi University of Technology, 1-1 Tenpaku, Toyohashi, Aichi 441-8580, Japan

^{**}National Institute of Informatics, the Graduate University for Advanced Studies (SOKENDAI), 2-1-2 Hitotsubashi, Chiyoda, Tokyo 101-8430, Japan

This paper describes a method of learning a similarity matrix from pairwise constraints, intended for situations such as interactive clustering, where little user feedback can be expected. Because only a small number of pairwise constraints is available, our method exploits additional constraints induced by the affinity relationship between the constrained data and their neighbors. The similarity matrix is learned by solving an optimization problem formalized as a semidefinite program, in which the induced constraints play a complementary role. Experimental results on several clustering tasks confirm the effectiveness of the proposed method and show that it is a promising approach.
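The core idea of augmenting scarce pairwise constraints with neighbor-induced ones can be illustrated with a small sketch. The paper's exact induction rule and SDP formulation are not given in this excerpt, so the code below uses a hypothetical rule: each must-link pair (i, j) is extended by linking i to the k nearest neighbors of j, and j to the k nearest neighbors of i.

```python
import math

def neighbors(points, i, k):
    """Return indices of the k nearest points to point i (Euclidean distance)."""
    dists = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points)) if j != i
    )
    return [j for _, j in dists[:k]]

def induce_constraints(points, must_link, k=2):
    """Expand each must-link pair (i, j) with pairs linking i to the
    neighbors of j and vice versa (hypothetical induction rule, not the
    paper's exact formulation)."""
    induced = set()
    for i, j in must_link:
        for n in neighbors(points, j, k):
            induced.add(tuple(sorted((i, n))))
        for n in neighbors(points, i, k):
            induced.add(tuple(sorted((j, n))))
    # Keep only constraints not already supplied by the user.
    induced -= {tuple(sorted(p)) for p in must_link}
    return induced

# Two well-separated pairs of points; one user-supplied must-link across them.
points = [(0, 0), (0, 1), (5, 0), (5, 1)]
print(induce_constraints(points, [(0, 2)], k=1))  # → {(0, 3), (1, 2)}
```

In the method described above, such induced pairs would enter the semidefinite program only as complementary constraints, so that noisy neighbor relations cannot override the user's explicit feedback.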

*J. Adv. Comput. Intell. Intell. Inform.*, Vol.14, No.4, pp. 402-407, 2010.


This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.