JACIII Vol.18 No.5 pp. 818-822
doi: 10.20965/jaciii.2014.p0818


Label Propagation for Text Classification Using Latent Topics

Akiko Eriguchi and Ichiro Kobayashi

Advanced Sciences, Graduate School of Humanities and Sciences, Ochanomizu University, 2-1-1 Otsuka, Bunkyo-ku, Tokyo 112-8610, Japan

Received: December 15, 2013
Accepted: June 1, 2014
Online released: September 20, 2014
Published: September 20, 2014
Keywords: graph-based semi-supervised learning, label propagation, text classification, latent Dirichlet allocation

This paper aims to improve the accuracy of multiclass text classification using Graph-Based Semi-Supervised Learning (GBSSL). In GBSSL, it is essential to construct a graph that properly expresses the relations among nodes. We propose a method that constructs a similarity graph by using both surface information and latent information to express the similarity between nodes. In experiments on the Reuters-21578 corpus, we confirmed that the proposed method improves the accuracy of GBSSL on a multiclass text classification task.
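The general idea can be illustrated with a minimal sketch. The abstract does not state how surface and latent information are combined or which propagation rule is used, so everything specific here is an assumption: surface similarity is taken as cosine similarity of word-count vectors, latent similarity as cosine similarity of per-document topic distributions (assumed to be given, for example from latent Dirichlet allocation), the two are blended linearly with a hypothetical weight `mix`, and labels are propagated with the closed form of the local and global consistency method of Zhou et al. The function names `cosine_similarity`, `build_graph`, and `propagate_labels` and the toy data are illustrative only.

```python
import numpy as np

def cosine_similarity(M):
    """Pairwise cosine similarity between the rows of M."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    unit = M / np.clip(norms, 1e-12, None)
    return unit @ unit.T

def build_graph(word_counts, topic_dists, mix=0.5):
    """Blend surface and latent similarity into one weight matrix.

    Assumption: a linear blend controlled by `mix`; the paper's actual
    combination rule is not given in the abstract.
    """
    W = mix * cosine_similarity(word_counts) \
        + (1.0 - mix) * cosine_similarity(topic_dists)
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def propagate_labels(W, labels, n_classes, alpha=0.9):
    """Closed-form label spreading: F = (I - alpha * S)^(-1) Y.

    `labels` holds a class index per node, or -1 for unlabeled nodes.
    S is the symmetrically normalized weight matrix D^(-1/2) W D^(-1/2).
    """
    n = W.shape[0]
    Y = np.zeros((n, n_classes))
    labeled = labels >= 0
    Y[labeled, labels[labeled]] = 1.0  # one-hot rows for labeled nodes

    d_inv_sqrt = 1.0 / np.sqrt(np.clip(W.sum(axis=1), 1e-12, None))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    return F.argmax(axis=1)

# Toy data (made up for illustration): documents 0-1 and 2-3 form two groups.
word_counts = np.array([[3, 1, 0, 0],
                        [2, 2, 0, 0],
                        [0, 0, 1, 3],
                        [0, 0, 2, 2]], dtype=float)
topic_dists = np.array([[0.9, 0.1],
                        [0.8, 0.2],
                        [0.1, 0.9],
                        [0.2, 0.8]])
labels = np.array([0, -1, 1, -1])  # documents 1 and 3 are unlabeled

W = build_graph(word_counts, topic_dists, mix=0.5)
predicted = propagate_labels(W, labels, n_classes=2)
print(predicted)
```

In this toy setup each unlabeled document is most strongly connected to the labeled document in its own group, so it should inherit that document's class. Setting `mix` to 1.0 or 0.0 reduces the graph to surface-only or latent-only similarity, which is one simple way to compare the two sources of information.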


