Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems

Recently, many stochastic Alternating Direction Methods of Multipliers (ADMMs) have been proposed to solve large-scale machine learning problems. However, for large-scale saddle-point problems, the state-of-the-art (SOTA) stochastic ADMMs still have high per-iteration costs. On the other hand, the s...

Full description

Saved in:
Bibliographic Details
Main Authors: Weixin An, Yuanyuan Liu, Fanhua Shang, Hongying Liu
Format: Article
Language:English
Published: MDPI AG 2025-05-01
Series:Mathematics
Subjects:
Online Access:https://www.mdpi.com/2227-7390/13/10/1687
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1849711720014020608
author Weixin An
Yuanyuan Liu
Fanhua Shang
Hongying Liu
author_facet Weixin An
Yuanyuan Liu
Fanhua Shang
Hongying Liu
author_sort Weixin An
collection DOAJ
description Recently, many stochastic Alternating Direction Methods of Multipliers (ADMMs) have been proposed to solve large-scale machine learning problems. However, for large-scale saddle-point problems, the state-of-the-art (SOTA) stochastic ADMMs still have high per-iteration costs. On the other hand, the stochastic primal–dual hybrid gradient (SPDHG) has a low per-iteration cost but only a suboptimal convergence rate of <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi mathvariant="sans-serif">𝒪</mi><mo stretchy="false">(</mo><mn>1</mn><mo>/</mo><msqrt><mi>S</mi></msqrt><mo stretchy="false">)</mo></mrow></semantics></math></inline-formula>. Thus, there still remains a gap in the convergence rates between SPDHG and SOTA ADMMs. Motivated by the two matters, we propose (accelerated) stochastic variance reduced primal–dual hybrid gradient ((A)SVR-PDHG) methods. We design a linear extrapolation step to improve the convergence rate and a new adaptive epoch length strategy to remove the extra boundedness assumption. Our algorithms have a simpler structure and lower per-iteration complexity than SOTA ADMMs. As a by-product, we present the asynchronous parallel variants of our algorithms. In theory, we rigorously prove that our methods converge linearly for strongly convex problems and improve the convergence rate to <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi mathvariant="sans-serif">𝒪</mi><mo stretchy="false">(</mo><mn>1</mn><mo>/</mo><msup><mi>S</mi><mn>2</mn></msup><mo stretchy="false">)</mo></mrow></semantics></math></inline-formula> for non-strongly convex problems as opposed to the existing <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi mathvariant="sans-serif">𝒪</mi><mo stretchy="false">(</mo><mn>1</mn><mo>/</mo><mi>S</mi><mo stretchy="false">)</mo></mrow></semantics></math></inline-formula> convergence rate. Compared with SOTA algorithms, various experimental results demonstrate that ASVR-PDHG can achieve an average speedup of <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mn>2</mn><mo>×</mo><mo>∼</mo><mn>5</mn><mo>×</mo></mrow></semantics></math></inline-formula>.
format Article
id doaj-art-c9b4dee943574e8f95ffc5625123d3e5
institution DOAJ
issn 2227-7390
language English
publishDate 2025-05-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj-art-c9b4dee943574e8f95ffc5625123d3e52025-08-20T03:14:32ZengMDPI AGMathematics2227-73902025-05-011310168710.3390/math13101687Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point ProblemsWeixin An0Yuanyuan Liu1Fanhua Shang2Hongying Liu3The Key Laboratory of Intelligent Perception and Image Understanding of the Ministry of Education, School of Artificial Intelligence, Xidian University, Xi’an 710126, ChinaThe Key Laboratory of Intelligent Perception and Image Understanding of the Ministry of Education, School of Artificial Intelligence, Xidian University, Xi’an 710126, ChinaThe College of Intelligence and Computing, Tianjin University, Tianjin 300072, ChinaMedical College, Tianjin University, Tianjin 300072, ChinaRecently, many stochastic Alternating Direction Methods of Multipliers (ADMMs) have been proposed to solve large-scale machine learning problems. However, for large-scale saddle-point problems, the state-of-the-art (SOTA) stochastic ADMMs still have high per-iteration costs. On the other hand, the stochastic primal–dual hybrid gradient (SPDHG) has a low per-iteration cost but only a suboptimal convergence rate of <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi mathvariant="sans-serif">𝒪</mi><mo stretchy="false">(</mo><mn>1</mn><mo>/</mo><msqrt><mi>S</mi></msqrt><mo stretchy="false">)</mo></mrow></semantics></math></inline-formula>. Thus, there still remains a gap in the convergence rates between SPDHG and SOTA ADMMs. Motivated by the two matters, we propose (accelerated) stochastic variance reduced primal–dual hybrid gradient ((A)SVR-PDHG) methods. We design a linear extrapolation step to improve the convergence rate and a new adaptive epoch length strategy to remove the extra boundedness assumption. Our algorithms have a simpler structure and lower per-iteration complexity than SOTA ADMMs. As a by-product, we present the asynchronous parallel variants of our algorithms. In theory, we rigorously prove that our methods converge linearly for strongly convex problems and improve the convergence rate to <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi mathvariant="sans-serif">𝒪</mi><mo stretchy="false">(</mo><mn>1</mn><mo>/</mo><msup><mi>S</mi><mn>2</mn></msup><mo stretchy="false">)</mo></mrow></semantics></math></inline-formula> for non-strongly convex problems as opposed to the existing <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi mathvariant="sans-serif">𝒪</mi><mo stretchy="false">(</mo><mn>1</mn><mo>/</mo><mi>S</mi><mo stretchy="false">)</mo></mrow></semantics></math></inline-formula> convergence rate. Compared with SOTA algorithms, various experimental results demonstrate that ASVR-PDHG can achieve an average speedup of <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mn>2</mn><mo>×</mo><mo>∼</mo><mn>5</mn><mo>×</mo></mrow></semantics></math></inline-formula>.https://www.mdpi.com/2227-7390/13/10/1687saddle-point problemstochastic optimizationvariance reductionasynchronous parallelism
spellingShingle Weixin An
Yuanyuan Liu
Fanhua Shang
Hongying Liu
Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems
Mathematics
saddle-point problem
stochastic optimization
variance reduction
asynchronous parallelism
title Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems
title_full Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems
title_fullStr Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems
title_full_unstemmed Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems
title_short Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems
title_sort stochastic variance reduced primal dual hybrid gradient methods for saddle point problems
topic saddle-point problem
stochastic optimization
variance reduction
asynchronous parallelism
url https://www.mdpi.com/2227-7390/13/10/1687
work_keys_str_mv AT weixinan stochasticvariancereducedprimaldualhybridgradientmethodsforsaddlepointproblems
AT yuanyuanliu stochasticvariancereducedprimaldualhybridgradientmethodsforsaddlepointproblems
AT fanhuashang stochasticvariancereducedprimaldualhybridgradientmethodsforsaddlepointproblems
AT hongyingliu stochasticvariancereducedprimaldualhybridgradientmethodsforsaddlepointproblems