Curvature-Adaptive Learning Rate Optimizer: Theoretical Insights and Empirical Evaluation on Neural Network Training
Optimizing neural networks often encounters challenges such as saddle points, plateaus, and ill-conditioned curvature, limiting the effectiveness of standard optimizers like Adam, Nadam, and RMSProp. To address these limitations, we propose the Curvature-Adaptive Learning Rate (CALR) optimizer, a novel method that leverages local curvature estimates to dynamically adjust learning rates. CALR, along with its variants incorporating gradient clipping and cosine annealing schedules, offers enhanced robustness and faster convergence across diverse optimization tasks. Theoretical analysis confirms CALR’s convergence properties, while empirical evaluations on benchmark functions—Rosenbrock, Himmelblau, and Saddle Point—highlight its efficiency in complex optimization landscapes. Furthermore, CALR demonstrates superior performance on neural network training tasks using MNIST and CIFAR-10 datasets, achieving faster convergence, lower loss, and better generalization compared to traditional optimizers. These results establish CALR as a promising optimization strategy for challenging neural network training problems.
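The abstract does not spell out CALR's actual update rule, so the NumPy sketch below is only a hypothetical illustration of the general idea it describes: estimate local curvature and shrink the step size where curvature is high, with optional gradient clipping and a cosine-annealing factor as in the variants the abstract mentions. The curvature estimate here (a finite-difference Hessian-vector product along the gradient direction) and all names and constants (`calr_like_step`, `fd_eps`, `base_lr`) are assumptions for illustration, not the paper's method. The Rosenbrock function is used as the test problem because the abstract names it as one of the benchmarks.

```python
import numpy as np

def rosenbrock(x):
    """Rosenbrock benchmark, one of the test functions named in the abstract."""
    return (1.0 - x[0]) ** 2 + 100.0 * (x[1] - x[0] ** 2) ** 2

def rosenbrock_grad(x):
    """Analytic gradient of the Rosenbrock function."""
    return np.array([
        -2.0 * (1.0 - x[0]) - 400.0 * x[0] * (x[1] - x[0] ** 2),
        200.0 * (x[1] - x[0] ** 2),
    ])

def calr_like_step(x, grad_fn, t, t_max, base_lr=0.05, fd_eps=1e-5, clip_norm=1.0):
    """One curvature-adaptive step (hypothetical sketch, not the paper's CALR rule)."""
    g = grad_fn(x)

    # Directional curvature via a finite-difference Hessian-vector product:
    # H d ~= (grad(x + eps * d) - grad(x)) / eps, with d the unit gradient direction.
    d = g / (np.linalg.norm(g) + 1e-12)
    hvp = (grad_fn(x + fd_eps * d) - g) / fd_eps
    curvature = abs(d @ hvp)  # second-derivative estimate along the step direction

    # Shrink the step where curvature is high, keep it larger on flat plateaus,
    # then damp it with a cosine-annealing factor (one variant the abstract names).
    lr = base_lr / (1.0 + curvature)
    lr *= 0.5 * (1.0 + np.cos(np.pi * t / t_max))

    # Gradient clipping, the other variant mentioned in the abstract.
    g_norm = np.linalg.norm(g)
    if g_norm > clip_norm:
        g = g * (clip_norm / g_norm)

    return x - lr * g

# Tiny demo: run the step on Rosenbrock from a standard starting point.
x = np.array([-1.2, 1.0])
steps = 2000
for t in range(steps):
    x = calr_like_step(x, rosenbrock_grad, t, steps)
print("final point:", x, "final loss:", rosenbrock(x))
```

The inverse scaling `base_lr / (1 + curvature)` is one simple way to realize "curvature-adaptive" behavior: near sharp valleys (high curvature) the step shrinks to avoid overshooting, while on plateaus and near saddle points (low directional curvature) the step stays close to `base_lr`, which matches the failure modes the abstract says CALR targets.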
| Main Author: | Kehelwala Dewage Gayan Maduranga (Tennessee Technological University) |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | LibraryPress@UF, 2025-05-01 |
| Series: | Proceedings of the International Florida Artificial Intelligence Research Society Conference, vol. 38, no. 1 |
| ISSN: | 2334-0754, 2334-0762 |
| DOI: | 10.32473/flairs.38.1.138986 |
| Online Access: | https://journals.flvc.org/FLAIRS/article/view/138986 |