Paper:
Kernel Canonical Discriminant Analysis Based on Variable Selection
Seiichi Ikeda and Yoshiharu Sato
Graduate School of Information Science and Technology, Hokkaido University
Kita 9, Nishi 14, Kita-ku, Sapporo 060-0814, Japan
We have shown that the models for support vector regression and classification are essentially linear in a reproducing kernel Hilbert space (RKHS). To overcome the overfitting problem, a regularization term is added to the optimization process, but choosing the coefficient of the regularization term is difficult. We introduce the concept of variable selection to the linear model in the RKHS, where each kernel function is treated as a variable transformation once its value is given by an observation. We show that kernel canonical discriminant functions for multiclass problems can be treated within the variable selection framework, which enables us to reduce the number of kernel functions in the discriminant function. That is, the discriminant function is obtained as a linear combination of a sufficiently small number of kernel functions, so we can expect reasonable prediction performance. We compare the performance of variable selection in canonical discriminant functions with that of support vector machines.
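The idea of treating kernel functions as selectable variables can be illustrated with a minimal sketch. It is not the authors' procedure: the Gaussian kernel, the greedy forward search, the use of Wilks' lambda as the selection criterion, the fixed number of selected columns, and the small ridge `eps` added for numerical stability are all assumptions made for illustration, as are all function names and the synthetic data.

```python
import numpy as np

def gaussian_kernel(X, Z, gamma=1.0):
    # k(x, z) = exp(-gamma * ||x - z||^2) for every pair of rows in X and Z
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def scatter_matrices(F, y):
    # Within-class scatter W and total scatter T of the columns of F
    centered = F - F.mean(axis=0)
    T = centered.T @ centered
    W = np.zeros_like(T)
    for c in np.unique(y):
        Fc = F[y == c] - F[y == c].mean(axis=0)
        W += Fc.T @ Fc
    return W, T

def wilks_lambda(F, y, eps=1e-8):
    # det(W) / det(T); smaller values indicate better class separation
    W, T = scatter_matrices(F, y)
    ridge = eps * np.eye(F.shape[1])  # assumption: tiny ridge to keep determinants finite
    _, log_det_w = np.linalg.slogdet(W + ridge)
    _, log_det_t = np.linalg.slogdet(T + ridge)
    return float(np.exp(log_det_w - log_det_t))

def forward_select(K, y, n_select):
    # Greedy search: each column K[:, j] = k(., x_j) is one candidate variable
    selected, remaining = [], list(range(K.shape[1]))
    for _ in range(n_select):
        best = min(remaining, key=lambda j: wilks_lambda(K[:, selected + [j]], y))
        selected.append(best)
        remaining.remove(best)
    return selected

def canonical_directions(F, y, eps=1e-8):
    # Leading eigenvectors of W^{-1} B, where B = T - W is the between-class scatter
    W, T = scatter_matrices(F, y)
    B = T - W
    vals, vecs = np.linalg.eig(np.linalg.solve(W + eps * np.eye(F.shape[1]), B))
    order = np.argsort(-vals.real)
    n_dirs = min(len(np.unique(y)) - 1, F.shape[1])
    return vecs.real[:, order[:n_dirs]]

# Synthetic three-class data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=m, scale=0.5, size=(20, 2)) for m in (0.0, 2.0, 4.0)])
y = np.repeat([0, 1, 2], 20)

K = gaussian_kernel(X, X, gamma=0.5)       # 60 candidate kernel functions
selected = forward_select(K, y, n_select=5)
A = canonical_directions(K[:, selected], y)
scores = K[:, selected] @ A                 # canonical discriminant scores
lam = wilks_lambda(K[:, selected], y)
```

Here the discriminant scores are linear combinations of only five kernel functions out of sixty candidates. In practice the stopping point would be chosen by a test or an information criterion rather than fixed in advance.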
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.