A variable metric proximal stochastic gradient method: An application to classification problems
Due to the continued success of machine learning and deep learning in particular, supervised classification problems are ubiquitous in numerous scientific fields. Training these models typically involves the minimization of the empirical risk over large data sets along with a possibly non-differenti...
| Main Authors: | Pasquale Cascarano, Giorgia Franchini, Erich Kobler, Federica Porta, Andrea Sebastiani |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2024-01-01 |
| Series: | EURO Journal on Computational Optimization |
| Subjects: | Variable metric; Stochastic optimization; Classification problem; Deep learning |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2192440624000054 |
| _version_ | 1850111102968397824 |
|---|---|
| author | Pasquale Cascarano; Giorgia Franchini; Erich Kobler; Federica Porta; Andrea Sebastiani |
| collection | DOAJ |
| description | Due to the continued success of machine learning, and deep learning in particular, supervised classification problems are ubiquitous in numerous scientific fields. Training these models typically involves minimizing the empirical risk over large data sets together with a possibly non-differentiable regularization term. In this paper, we introduce a stochastic gradient method for the considered classification problem. To control the variance of the objective's gradients, we use an automatic sample size selection along with a variable metric to precondition the stochastic gradient directions. Further, we use a non-monotone line search to automate step size selection. Convergence results are provided for both convex and non-convex objective functions. Extensive numerical experiments verify that the suggested approach performs on par with state-of-the-art methods for training both statistical models for binary classification and artificial neural networks for multi-class image classification. The code is publicly available at https://github.com/koblererich/lisavm. |
| format | Article |
| id | doaj-art-505bebb3d6cb41b8b73795f758987d7d |
| institution | OA Journals |
| issn | 2192-4406 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | Elsevier |
| record_format | Article |
| series | EURO Journal on Computational Optimization |
| spelling | doaj-art-505bebb3d6cb41b8b73795f758987d7d; 2025-08-20T02:37:41Z; eng; Elsevier; EURO Journal on Computational Optimization; ISSN 2192-4406; 2024-01-01; vol. 12, art. 100088; DOI 10.1016/j.ejco.2024.100088. A variable metric proximal stochastic gradient method: An application to classification problems. Authors: Pasquale Cascarano (Department of the Arts, University of Bologna, Bologna, Italy); Giorgia Franchini (Department of Physics, Informatics and Mathematics, University of Modena and Reggio Emilia, Modena, Italy; corresponding author); Erich Kobler (Department of Neuroradiology, University Hospital Bonn, Bonn, Germany); Federica Porta (Department of Physics, Informatics and Mathematics, University of Modena and Reggio Emilia, Modena, Italy); Andrea Sebastiani (Department of Mathematics, University of Bologna, Bologna, Italy; Department of Physics, Informatics and Mathematics, University of Modena and Reggio Emilia, Modena, Italy). Code: https://github.com/koblererich/lisavm. Online: http://www.sciencedirect.com/science/article/pii/S2192440624000054. Keywords: Variable metric; Stochastic optimization; Classification problem; Deep learning |
| title | A variable metric proximal stochastic gradient method: An application to classification problems |
| topic | Variable metric; Stochastic optimization; Classification problem; Deep learning |
| url | http://www.sciencedirect.com/science/article/pii/S2192440624000054 |
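
The description in this record names a proximal stochastic gradient step preconditioned by a variable metric. As a rough illustration of that single ingredient, the sketch below applies such a step to L1-regularized logistic regression in plain Python. It is not the authors' implementation (that is in the repository linked in the record). Every function name, the RMS-style diagonal metric, its clipping bounds, the fixed step size, and the fixed batch size are assumptions made for this example; the automatic sample size selection and the non-monotone line search from the abstract are deliberately left out.

```python
import math
import random


def soft_threshold(v, t):
    """Scalar proximal operator of t * |.| (soft-thresholding)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def objective(w, data, lam):
    """Mean logistic loss log(1 + exp(-y * <w, x>)) plus lam * ||w||_1."""
    loss = 0.0
    for x, y in data:
        m = y * dot(w, x)
        loss += max(-m, 0.0) + math.log1p(math.exp(-abs(m)))
    return loss / len(data) + lam * sum(abs(wj) for wj in w)


def batch_gradient(w, batch):
    """Mini-batch gradient of the mean logistic loss (smooth part only)."""
    g = [0.0] * len(w)
    for x, y in batch:
        coeff = -y * sigmoid(-y * dot(w, x))
        for j, xj in enumerate(x):
            g[j] += coeff * xj / len(batch)
    return g


def prox_step(w, g, step, metric, lam):
    """Scaled proximal step with a diagonal metric D = diag(metric).

    Forward step: z = w - step * D^{-1} g.
    Backward step: prox of lam * ||.||_1 in the norm induced by D / step,
    which for a diagonal D is soft-thresholding with threshold step * lam / d_j.
    """
    return [soft_threshold(wj - step * gj / dj, step * lam / dj)
            for wj, gj, dj in zip(w, g, metric)]


def train(data, lam=0.01, step=0.1, batch_size=8, epochs=50, seed=0):
    """Run the sketch with a fixed step and batch size (both assumptions)."""
    rng = random.Random(seed)
    samples = list(data)
    w = [0.0] * len(samples[0][0])
    second_moment = [1.0] * len(w)
    for _ in range(epochs):
        rng.shuffle(samples)
        for start in range(0, len(samples), batch_size):
            g = batch_gradient(w, samples[start:start + batch_size])
            # Assumed metric: running RMS of gradients, clipped to [0.5, 2.0]
            # so that the scaling stays bounded away from zero and infinity.
            second_moment = [0.9 * s + 0.1 * gj * gj
                             for s, gj in zip(second_moment, g)]
            metric = [min(max(math.sqrt(s), 0.5), 2.0) for s in second_moment]
            w = prox_step(w, g, step, metric, lam)
    return w


def make_toy_data(n=40, seed=1):
    """Two noisy clusters around (1, 1) and (-1, -1) with labels +1 and -1."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        y = 1 if i % 2 == 0 else -1
        data.append(([y + rng.gauss(0, 0.5), y + rng.gauss(0, 0.5)], y))
    return data
```

Calling `train(make_toy_data())` returns a weight vector for the toy problem. Keeping the metric diagonal is a convenience of this sketch: it makes the scaled proximal operator separable, so it reduces to coordinate-wise soft-thresholding with a per-coordinate threshold.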