# Hierarchical Semi-Supervised Factorization for Learning the Semantics

## Bin Shen^{*} and Olzhas Makhambetov^{**}

^{*}Computer Science Department, Purdue University, West Lafayette, IN 47907, USA

^{**}Computer Science Laboratory, Nazarbayev University Research and Innovation System, 53, Kabanbay batyr ave., Astana, Kazakhstan

Most semi-supervised learning methods extend existing supervised or unsupervised techniques by incorporating additional information from unlabeled or labeled data. Unlabeled instances help in learning statistical models that fully describe the global properties of the data, whereas labeled instances make the learned knowledge more human-interpretable. In this paper we present a novel way of extending conventional non-negative matrix factorization (NMF) and probabilistic latent semantic analysis (pLSA) to semi-supervised versions by incorporating label information for learning semantics. The proposed algorithm consists of two steps: first, acquiring prior bases representing some classes from the labeled data; second, using them to guide the learning of final bases that are semantically interpretable.
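The two-step idea can be illustrated with a minimal NumPy sketch for the NMF case. The abstract does not specify how the prior bases guide the second step, so the code assumes one plausible mechanism: a quadratic penalty `lam * ||W - W_prior||_F^2` added to the usual Frobenius-norm NMF objective, optimized with Lee-Seung-style multiplicative updates. The function names `nmf` and `guided_bases`, the parameters `bases_per_class` and `lam`, and the penalty itself are illustrative assumptions, not the authors' formulation.

```python
import numpy as np


def nmf(X, k, W_init=None, W_prior=None, lam=0.0, n_iter=200, seed=0, eps=1e-9):
    """Multiplicative-update NMF for X (features x samples), X ~= W @ H.

    Minimizes ||X - W H||_F^2 + lam * ||W - W_prior||_F^2.
    The penalty term is an assumption made for this sketch.
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape
    W = rng.random((d, k)) + eps if W_init is None else W_init.copy()
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Standard update for the coefficients H.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        # Update for the bases W; the lam terms pull W toward W_prior.
        num = X @ H.T
        den = W @ (H @ H.T)
        if W_prior is not None:
            num = num + lam * W_prior
            den = den + lam * W
        W *= num / (den + eps)
    return W, H


def guided_bases(X_labeled, y, X_all, bases_per_class=1, lam=1.0):
    """Step 1: learn prior bases per class. Step 2: guided NMF on all data."""
    y = np.asarray(y)
    # Step 1: factorize the labeled samples of each class separately.
    priors = [nmf(X_labeled[:, y == c], bases_per_class)[0] for c in np.unique(y)]
    W_prior = np.hstack(priors)
    # Step 2: factorize labeled + unlabeled data, starting from and
    # regularized toward the prior bases (small offset avoids stuck zeros).
    W, H = nmf(X_all, W_prior.shape[1], W_init=W_prior + 1e-9,
               W_prior=W_prior, lam=lam)
    return W, H, W_prior
```

In this sketch each column of the returned `W` stays associated with the class whose labeled samples produced the corresponding prior column, which is one way the learned bases could remain semantically interpretable. Setting `lam=0.0` reduces the second step to plain NMF initialized at the prior bases.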

*J. Adv. Comput. Intell. Intell. Inform.*, Vol.18, No.3, pp. 366-374, 2014.
