A variable metric proximal stochastic gradient method: An application to classification problems

Due to the continued success of machine learning and deep learning in particular, supervised classification problems are ubiquitous in numerous scientific fields. Training these models typically involves the minimization of the empirical risk over large data sets along with a possibly non-differenti...

Bibliographic Details
Main Authors: Pasquale Cascarano, Giorgia Franchini, Erich Kobler, Federica Porta, Andrea Sebastiani
Format: Article
Language: English
Published: Elsevier 2024-01-01
Series: EURO Journal on Computational Optimization
Subjects: Variable metric; Stochastic optimization; Classification problem; Deep learning
Online Access: http://www.sciencedirect.com/science/article/pii/S2192440624000054
author Pasquale Cascarano
Giorgia Franchini
Erich Kobler
Federica Porta
Andrea Sebastiani
collection DOAJ
description Due to the continued success of machine learning and deep learning in particular, supervised classification problems are ubiquitous in numerous scientific fields. Training these models typically involves the minimization of the empirical risk over large data sets along with a possibly non-differentiable regularization. In this paper, we introduce a stochastic gradient method for the considered classification problem. To control the variance of the objective's gradients, we use an automatic sample size selection along with a variable metric to precondition the stochastic gradient directions. Further, we utilize a non-monotone line search to automatize step size selection. Convergence results are provided for both convex and non-convex objective functions. Extensive numerical experiments verify that the suggested approach performs on par with state-of-the-art methods for training both statistical models for binary classification and artificial neural networks for multi-class image classification. The code is publicly available at https://github.com/koblererich/lisavm.
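The description above combines a proximal step with a variable metric that preconditions the stochastic gradient. As a rough illustration only, the following Python sketch shows one generic variable-metric proximal stochastic gradient step for an l1-regularized logistic loss. It is not the authors' implementation (their code is at the repository named above): the function names, the restriction to a diagonal metric `d`, the fixed step size `alpha`, and the choice of an l1 regularizer with weight `lam` are all assumptions made for this example, and the automatic sample size selection and non-monotone line search are left out.

```python
import numpy as np


def soft_threshold(y, tau):
    """Componentwise soft-thresholding; tau may be a scalar or an array."""
    return np.sign(y) * np.maximum(np.abs(y) - tau, 0.0)


def logistic_grad(w, X, y):
    """Gradient of the mean logistic loss log(1 + exp(-y * X@w)), y in {-1, +1}."""
    margins = y * (X @ w)
    coeff = -y / (1.0 + np.exp(margins))
    return X.T @ coeff / X.shape[0]


def vm_prox_sgd_step(w, X_batch, y_batch, d, alpha, lam):
    """One proximal stochastic gradient step in a diagonal metric (assumed form).

    d     : positive diagonal of the metric (hypothetical preconditioner)
    alpha : step size, fixed here instead of chosen by a line search
    lam   : weight of the l1 regularizer (assumed for illustration)
    """
    g = logistic_grad(w, X_batch, y_batch)
    # Forward step: gradient scaled by the inverse of the diagonal metric.
    forward = w - alpha * g / d
    # Backward step: the prox of lam*||.||_1 in the metric induced by d
    # reduces to soft-thresholding with per-coordinate threshold alpha*lam/d.
    return soft_threshold(forward, alpha * lam / d)
```

With `d` set to a vector of ones, the step reduces to the standard proximal stochastic gradient update; a non-uniform `d` rescales both the gradient and the thresholds coordinate by coordinate.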
format Article
id doaj-art-505bebb3d6cb41b8b73795f758987d7d
institution OA Journals
issn 2192-4406
language English
publishDate 2024-01-01
publisher Elsevier
record_format Article
series EURO Journal on Computational Optimization
spelling doaj-art-505bebb3d6cb41b8b73795f758987d7d
2025-08-20T02:37:41Z
eng
Elsevier
EURO Journal on Computational Optimization
2192-4406
2024-01-01
12
100088
10.1016/j.ejco.2024.100088
A variable metric proximal stochastic gradient method: An application to classification problems
Pasquale Cascarano (Department of the Arts, University of Bologna, Bologna, Italy)
Giorgia Franchini (Department of Physics, Informatics and Mathematics, University of Modena and Reggio Emilia, Modena, Italy; Corresponding author)
Erich Kobler (Department of Neuroradiology, University Hospital Bonn, Bonn, Germany)
Federica Porta (Department of Physics, Informatics and Mathematics, University of Modena and Reggio Emilia, Modena, Italy)
Andrea Sebastiani (Department of Mathematics, University of Bologna, Bologna, Italy; Department of Physics, Informatics and Mathematics, University of Modena and Reggio Emilia, Modena, Italy)
http://www.sciencedirect.com/science/article/pii/S2192440624000054
Variable metric
Stochastic optimization
Classification problem
Deep learning
title A variable metric proximal stochastic gradient method: An application to classification problems
topic Variable metric
Stochastic optimization
Classification problem
Deep learning
url http://www.sciencedirect.com/science/article/pii/S2192440624000054