
LEARNING RATES OF KERNEL-BASED ROBUST CLASSIFICATION

  • Shuhua WANG,
  • Baohuai SHENG
  • 1. School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, 333403, China;
    2. Department of Finance, Zhejiang Yuexiu University, Shaoxing, 312030, China;
    3. Department of Applied Statistics, Shaoxing University, Shaoxing, 312000, China

Received date: 2020-12-24

Revised date: 2021-04-20

Online published: 2022-06-24

Supported by

This work was supported by the NSF (61877039), the NSFC/RGC Joint Research Scheme (12061160462 and N CityU 102/20) of China, the NSF (LY19F020013) of Zhejiang Province, the Special Project for Scientific and Technological Cooperation (20212BDH80021) of Jiangxi Province, and the Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ211334).

Abstract

This paper considers a robust kernel regularized classification algorithm with a non-convex loss function, proposed to alleviate the performance deterioration caused by outliers. A comparison relationship between the excess misclassification error and the excess generalization error is established; from this, together with convex analysis theory, a learning rate is derived. The results show that the performance of the classifier is affected by outliers, and that the extent of the impact can be controlled by choosing the homotopy parameters appropriately.
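To make the setting concrete, below is a minimal sketch of kernel regularized classification with a bounded non-convex loss. It assumes a Gaussian kernel and uses a truncated hinge loss as a stand-in for the paper's robust loss; the parameter theta plays the role of the homotopy parameter mentioned in the abstract, and all names (truncated_hinge, fit, lam) are illustrative rather than the authors' implementation.

    import numpy as np
    from scipy.optimize import minimize

    def gaussian_kernel(X, Z, sigma=1.0):
        # Gram matrix K[i, j] = exp(-||X_i - Z_j||^2 / (2*sigma^2)).
        d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    def truncated_hinge(margin, theta=1.0):
        # Hinge loss capped at 1 + theta: a point with margin below
        # -theta contributes only a bounded penalty, so outliers
        # cannot dominate the empirical risk.  theta -> infinity
        # recovers the usual (convex) hinge loss.
        return np.minimum(np.maximum(0.0, 1.0 - margin), 1.0 + theta)

    def fit(X, y, lam=0.1, sigma=1.0, theta=1.0):
        # By the representer theorem the RKHS minimizer has the form
        # f(x) = sum_i alpha_i K(x_i, x); optimize over alpha directly.
        K = gaussian_kernel(X, X, sigma)

        def objective(alpha):
            margins = y * (K @ alpha)
            return truncated_hinge(margins, theta).mean() + lam * alpha @ K @ alpha

        # The truncated loss is non-convex, so a local solver started
        # at zero is only a heuristic for the regularized minimizer
        # that the learning-rate analysis concerns.
        return minimize(objective, np.zeros(len(y)), method="L-BFGS-B").x

    def predict(alpha, X_train, X_new, sigma=1.0):
        return np.sign(gaussian_kernel(X_new, X_train, sigma) @ alpha)

Lowering theta caps the penalty any single outlier can contribute to the empirical risk, which illustrates the mechanism behind the abstract's claim that the extent of the outliers' impact can be controlled through the homotopy parameters.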

Cite this article

Shuhua WANG, Baohuai SHENG. LEARNING RATES OF KERNEL-BASED ROBUST CLASSIFICATION[J]. Acta Mathematica Scientia, Series B, 2022, 42(3): 1173-1190. DOI: 10.1007/s10473-022-0321-7
