Stochastic Gradient Descent for Kernel-Based Maximum Correntropy Criterion

Maximum correntropy criterion (MCC) has become an important method in the machine learning and signal processing communities since it was successfully applied to various non-Gaussian noise scenarios. In contrast to the classical least squares (LS) method, which takes only the second-order moments of a model into consideration and leads to a convex optimization problem, MCC captures the higher-order information of the model, which plays a crucial role in robust learning but usually requires solving a non-convex optimization problem. While theoretical research on convex optimization has made significant achievements, the theoretical understanding of non-convex optimization is still far from mature. Motivated by the popularity of stochastic gradient descent (SGD) for solving non-convex problems, this paper considers SGD applied to the kernel version of MCC, which has been shown to be robust to outliers and non-Gaussian data in nonlinear models. Since the existing theoretical results for SGD applied to kernel MCC are not well established, we present a rigorous analysis of its convergence behavior and provide explicit convergence rates under standard conditions. Our work fills the gap between the optimization process and convergence during the iterations: the iterates need to converge to the global minimizer, while the obtained estimator cannot be guaranteed to be globally optimal in the learning process.

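The following minimal sketch (not the authors' code) illustrates the kind of algorithm the abstract refers to: online SGD in a reproducing kernel Hilbert space under the correntropy-induced loss, where each update down-weights large errors through the factor exp(-e^2/(2*sigma^2)). The Gaussian kernel bandwidth h, correntropy scale sigma, and the step-size schedule eta0 * t^(-theta) are illustrative assumptions, not values taken from the paper.

import numpy as np

def rbf(x, z, h=1.0):
    # Gaussian (RBF) kernel K(x, z) inducing the RKHS hypothesis space.
    return np.exp(-np.sum((np.asarray(x) - np.asarray(z)) ** 2) / (2.0 * h ** 2))

def kernel_mcc_sgd(X, y, sigma=1.0, h=1.0, eta0=0.5, theta=0.5):
    # One pass of SGD over (X, y) under the correntropy-induced loss
    # l_sigma(e) = sigma^2 * (1 - exp(-e^2 / (2 * sigma^2))), with e = y - f(x).
    # The iterate f_t is stored as f_t(x) = sum_i coefs[i] * K(centers[i], x).
    centers, coefs = [], []
    for t, (x_t, y_t) in enumerate(zip(X, y), start=1):
        f_xt = sum(a * rbf(c, x_t, h) for c, a in zip(centers, coefs))  # f_{t-1}(x_t)
        e_t = y_t - f_xt
        # The gradient of the loss w.r.t. f is -exp(-e^2/(2 sigma^2)) * e * K(x_t, .),
        # so the SGD step adds eta_t * exp(-e^2/(2 sigma^2)) * e * K(x_t, .) to f.
        eta_t = eta0 * t ** (-theta)                      # polynomially decaying step size
        weight = np.exp(-e_t ** 2 / (2.0 * sigma ** 2))   # down-weights outliers
        centers.append(x_t)
        coefs.append(eta_t * weight * e_t)
    return centers, coefs

def predict(centers, coefs, x, h=1.0):
    # Evaluate the learned estimator at a new point x.
    return sum(a * rbf(c, x, h) for c, a in zip(centers, coefs))

# Illustrative usage on synthetic data with heavy-tailed noise:
# X = np.random.uniform(-1, 1, size=(200, 1))
# y = np.sin(3 * X[:, 0]) + 0.1 * np.random.standard_t(df=2, size=200)
# centers, coefs = kernel_mcc_sgd(X, y)
# y_hat = predict(centers, coefs, np.array([0.3]))
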
Bibliographic Details
Main Authors: Tiankai Li, Baobin Wang, Chaoquan Peng, Hong Yin
Format: Article
Language: English
Published: MDPI AG, 2024-12-01
Series: Entropy
Subjects: stochastic gradient descent; maximum correntropy criterion; non-Gaussian; convergence rate
Online Access: https://www.mdpi.com/1099-4300/26/12/1104
author Tiankai Li
Baobin Wang
Chaoquan Peng
Hong Yin
collection DOAJ
format Article
id doaj-art-f2e7e94d7da04bdf86643083ba7f3b27
institution DOAJ
issn 1099-4300
language English
publishDate 2024-12-01
publisher MDPI AG
record_format Article
series Entropy
doi 10.3390/e26121104
citation Entropy, Vol. 26, Iss. 12, Art. 1104 (2024-12-01)
affiliation Tiankai Li, Baobin Wang, Chaoquan Peng: School of Mathematics and Statistics, South-Central MinZu University, Wuhan 430074, China
affiliation Hong Yin: School of Mathematics, Renmin University of China, Beijing 100872, China
title Stochastic Gradient Descent for Kernel-Based Maximum Correntropy Criterion
topic stochastic gradient descent
maximum correntropy criterion
non-Gaussian
convergence rate
url https://www.mdpi.com/1099-4300/26/12/1104