Label Propagation for Text Classification Using Latent Topics

Akiko Eriguchi; Ichiro Kobayashi

doi:10.20965/jaciii.2014.p0818


JACIII Vol.18 No.5, pp. 818-822 (2014)



Akiko Eriguchi and Ichiro Kobayashi

Advanced Sciences, Graduate School of Humanities and Sciences, Ochanomizu University, 2-1-1 Otsuka, Bunkyo-ku, Tokyo 112-8610, Japan

Received: December 15, 2013

Accepted: June 1, 2014

Published: September 20, 2014

Keywords: graph-based semi-supervised learning, label propagation, text classification, latent Dirichlet allocation

Abstract

This paper aims to improve the accuracy of multiclass text classification using graph-based semi-supervised learning (GBSSL). In GBSSL, it is essential to construct a graph that properly expresses the relations among nodes. We propose a method that constructs a similarity graph from both surface information and latent information to express the similarity between nodes. Experiments on the Reuters-21578 corpus confirm that the proposed method improves the accuracy of GBSSL on a multiclass text classification task.
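The abstract does not state how the two kinds of information are combined or which propagation rule is used, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that surface information means word-count vectors, that latent information means per-document topic distributions (hard-coded toy values here, where in practice they might come from a latent Dirichlet allocation model), and that the two cosine similarities are blended with a hypothetical mixing weight `alpha`. The propagation step is a generic iterative scheme with clamped labeled nodes, not necessarily the authors' algorithm.

```python
import numpy as np


def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)  # guard against all-zero rows
    return Xn @ Xn.T


def combined_graph(word_counts, topic_dists, alpha=0.5):
    """Blend surface and latent similarity into one weight matrix.

    alpha is a hypothetical mixing weight (an assumption of this sketch).
    """
    W = (alpha * cosine_similarity_matrix(word_counts)
         + (1.0 - alpha) * cosine_similarity_matrix(topic_dists))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def propagate_labels(W, labels, num_classes, max_iter=1000, tol=1e-6):
    """Iterative label propagation; labels uses -1 for unlabeled nodes."""
    n = W.shape[0]
    P = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)  # row-normalize
    labeled = labels >= 0
    Y = np.zeros((n, num_classes))
    Y[labeled, labels[labeled]] = 1.0  # one-hot seeds for labeled nodes
    F = Y.copy()
    for _ in range(max_iter):
        F_new = P @ F                # each node averages its neighbors' scores
        F_new[labeled] = Y[labeled]  # clamp the known labels
        converged = np.abs(F_new - F).max() < tol
        F = F_new
        if converged:
            break
    return F.argmax(axis=1)


# Toy data (illustrative only): six documents, two classes, one seed per class.
word_counts = np.array([
    [3, 1, 0, 0, 0, 0],
    [2, 2, 1, 0, 0, 0],
    [1, 3, 0, 0, 0, 1],
    [0, 0, 0, 3, 1, 0],
    [0, 0, 1, 2, 2, 0],
    [0, 0, 0, 1, 3, 1],
], dtype=float)
topic_dists = np.array([
    [0.9, 0.1], [0.8, 0.2], [0.7, 0.3],
    [0.1, 0.9], [0.2, 0.8], [0.3, 0.7],
])
labels = np.array([0, -1, -1, 1, -1, -1])

W = combined_graph(word_counts, topic_dists, alpha=0.5)
predicted = propagate_labels(W, labels, num_classes=2)
print(predicted)
```

Setting `alpha=1.0` in this sketch uses surface similarity alone and `alpha=0.0` uses latent similarity alone, which makes it easy to compare the blended graph against either single source on the same seeds.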

Cite this article as:

A. Eriguchi and I. Kobayashi, “Label Propagation for Text Classification Using Latent Topics,” J. Adv. Comput. Intell. Intell. Inform., Vol.18 No.5, pp. 818-822, 2014.



This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
