Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems
Recently, many stochastic Alternating Direction Methods of Multipliers (ADMMs) have been proposed to solve large-scale machine learning problems. However, for large-scale saddle-point problems, the state-of-the-art (SOTA) stochastic ADMMs still have high per-iteration costs. On the other hand, the s...
| Main Authors: | Weixin An, Yuanyuan Liu, Fanhua Shang, Hongying Liu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Mathematics |
| Subjects: | saddle-point problem; stochastic optimization; variance reduction; asynchronous parallelism |
| Online Access: | https://www.mdpi.com/2227-7390/13/10/1687 |
| _version_ | 1849711720014020608 |
|---|---|
| author | Weixin An; Yuanyuan Liu; Fanhua Shang; Hongying Liu |
| author_facet | Weixin An; Yuanyuan Liu; Fanhua Shang; Hongying Liu |
| author_sort | Weixin An |
| collection | DOAJ |
| description | Recently, many stochastic Alternating Direction Methods of Multipliers (ADMMs) have been proposed to solve large-scale machine learning problems. However, for large-scale saddle-point problems, the state-of-the-art (SOTA) stochastic ADMMs still have high per-iteration costs. On the other hand, the stochastic primal–dual hybrid gradient (SPDHG) method has a low per-iteration cost but only a suboptimal convergence rate of $\mathcal{O}(1/\sqrt{S})$. Thus, a gap remains between the convergence rates of SPDHG and SOTA ADMMs. Motivated by these two issues, we propose (accelerated) stochastic variance reduced primal–dual hybrid gradient ((A)SVR-PDHG) methods. We design a linear extrapolation step to improve the convergence rate and a new adaptive epoch length strategy to remove the extra boundedness assumption. Our algorithms have a simpler structure and lower per-iteration complexity than SOTA ADMMs. As a by-product, we present asynchronous parallel variants of our algorithms. In theory, we rigorously prove that our methods converge linearly for strongly convex problems and improve the convergence rate to $\mathcal{O}(1/S^{2})$ for non-strongly convex problems, as opposed to the existing $\mathcal{O}(1/S)$ convergence rate. Compared with SOTA algorithms, various experimental results demonstrate that ASVR-PDHG can achieve an average speedup of $2\times$ to $5\times$. |
| format | Article |
| id | doaj-art-c9b4dee943574e8f95ffc5625123d3e5 |
| institution | DOAJ |
| issn | 2227-7390 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Mathematics |
| spelling | doaj-art-c9b4dee943574e8f95ffc5625123d3e5; 2025-08-20T03:14:32Z; eng; MDPI AG; Mathematics; ISSN 2227-7390; 2025-05-01; vol. 13, no. 10, art. 1687; DOI 10.3390/math13101687; Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems; Weixin An (0), Yuanyuan Liu (1), Fanhua Shang (2), Hongying Liu (3); (0) The Key Laboratory of Intelligent Perception and Image Understanding of the Ministry of Education, School of Artificial Intelligence, Xidian University, Xi’an 710126, China; (1) The Key Laboratory of Intelligent Perception and Image Understanding of the Ministry of Education, School of Artificial Intelligence, Xidian University, Xi’an 710126, China; (2) The College of Intelligence and Computing, Tianjin University, Tianjin 300072, China; (3) Medical College, Tianjin University, Tianjin 300072, China; https://www.mdpi.com/2227-7390/13/10/1687; saddle-point problem; stochastic optimization; variance reduction; asynchronous parallelism |
| spellingShingle | Weixin An; Yuanyuan Liu; Fanhua Shang; Hongying Liu; Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems; Mathematics; saddle-point problem; stochastic optimization; variance reduction; asynchronous parallelism |
| title | Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems |
| title_full | Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems |
| title_fullStr | Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems |
| title_full_unstemmed | Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems |
| title_short | Stochastic Variance Reduced Primal–Dual Hybrid Gradient Methods for Saddle-Point Problems |
| title_sort | stochastic variance reduced primal dual hybrid gradient methods for saddle point problems |
| topic | saddle-point problem; stochastic optimization; variance reduction; asynchronous parallelism |
| url | https://www.mdpi.com/2227-7390/13/10/1687 |
| work_keys_str_mv | AT weixinan stochasticvariancereducedprimaldualhybridgradientmethodsforsaddlepointproblems AT yuanyuanliu stochasticvariancereducedprimaldualhybridgradientmethodsforsaddlepointproblems AT fanhuashang stochasticvariancereducedprimaldualhybridgradientmethodsforsaddlepointproblems AT hongyingliu stochasticvariancereducedprimaldualhybridgradientmethodsforsaddlepointproblems |