# L_{1}-Norm Least Squares Support Vector Regression via the Alternating Direction Method of Multipliers

## Ya-Fen Ye^{*,**}, Chao Ying^{***}, Yue-Xiang Jiang^{*}, and Chun-Na Li^{**}

^{*}College of Economics, Zhejiang University

Hangzhou 310027, China

^{**}Zhijiang College, Zhejiang University of Technology

Hangzhou 310024, China

^{***}Rainbow City Primary School

Hangzhou 310013, China

Keywords: L_{1}-norm, least squares, feature selection, ADMM

In this study, we focus on the feature selection problem in regression and propose a new variant of L_{1}-norm support vector regression (L_{1}-SVR), called L_{1}-norm least squares support vector regression (L_{1}-LSSVR). L_{1}-LSSVR is solved with the alternating direction method of multipliers (ADMM), a method from the augmented Lagrangian family. Because the L_{1}-norm yields a sparse solution, L_{1}-LSSVR performs feature selection effectively. Furthermore, ADMM decomposes L_{1}-LSSVR into a sequence of simpler subproblems, which speeds up training. The experimental results demonstrate that L_{1}-LSSVR is as effective as L_{1}-SVR, LSSVR, and SVR in both feature selection and regression, and is much faster to train than L_{1}-SVR and SVR.
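To illustrate how ADMM can split an L_{1}-regularized least squares problem into simpler steps, the following is a minimal sketch and not the authors' implementation. It assumes a linear model without a bias term and the objective 0.5·||Xw − y||² + lam·||w||_{1}, solved with the standard splitting w = z from Boyd et al. The function name `admm_l1_least_squares` and the parameters `lam`, `rho`, and `n_iter` are hypothetical choices for this example; the exact L_{1}-LSSVR formulation (bias, kernel, penalty parameter) is not given in this abstract.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Elementwise soft-thresholding: the proximal operator of kappa * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)


def admm_l1_least_squares(X, y, lam=1.0, rho=1.0, n_iter=200):
    """Sketch of ADMM for: min_w 0.5 * ||X w - y||^2 + lam * ||w||_1.

    Assumed splitting (not taken from the paper): minimize f(w) + g(z)
    subject to w = z, with f the squared loss and g the L1 penalty.
    """
    n_features = X.shape[1]
    # The matrix of the w-update is the same in every iteration.
    A = X.T @ X + rho * np.eye(n_features)
    Xty = X.T @ y
    z = np.zeros(n_features)
    u = np.zeros(n_features)  # scaled dual variable
    for _ in range(n_iter):
        # w-update: a ridge-like linear system (smooth subproblem).
        w = np.linalg.solve(A, Xty + rho * (z - u))
        # z-update: soft-thresholding, which produces exact zeros.
        z = soft_threshold(w + u, lam / rho)
        # Dual update: accumulate the constraint violation w - z.
        u = u + w - z
    return z  # sparse weights; zero entries correspond to discarded features
```

Calling `admm_l1_least_squares(X, y, lam=5.0)` on data generated from a few informative features should return a weight vector whose remaining entries are exactly zero, which is the sense in which a sparse solution performs feature selection. Each iteration only needs one linear solve and one elementwise threshold, which is the kind of decomposition into simpler subproblems the abstract refers to.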
