# Applying Naive Bayes Classifier to Document Clustering

## Jie Ji and Qiangfu Zhao

System Intelligence Lab., The University of Aizu, Tsuruga, Ikki-machi, Aizu-wakamatsu, Fukushima 965-8580, Japan

Keywords: *k*-means, comparative advantage

*J. Adv. Comput. Intell. Intell. Inform.*, Vol.14 No.6, pp. 624-630, 2010.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.