Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models

Determining the ideal architecture for deep learning models, such as the number of layers and neurons, is a difficult and resource-intensive process that frequently relies on manual tuning or computationally costly optimization approaches. While Particle Swarm Optimization (PSO) and Large Language Models (LLMs) have been individually applied in optimization and deep learning, their combined use for enhancing convergence in numerical optimization tasks remains underexplored.

Bibliographic Details
Main Authors: Saad Hameed, Basheer Qolomany, Samir Brahim Belhaouari, Mohamed Abdallah, Junaid Qadir, Ala Al-Fuqaha
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Open Journal of the Computer Society
Subjects:
Online Access:https://ieeexplore.ieee.org/document/10976715/
author Saad Hameed
Basheer Qolomany
Samir Brahim Belhaouari
Mohamed Abdallah
Junaid Qadir
Ala Al-Fuqaha
author_facet Saad Hameed
Basheer Qolomany
Samir Brahim Belhaouari
Mohamed Abdallah
Junaid Qadir
Ala Al-Fuqaha
author_sort Saad Hameed
collection DOAJ
description Determining the ideal architecture for deep learning models, such as the number of layers and neurons, is a difficult and resource-intensive process that frequently relies on manual tuning or computationally costly optimization approaches. While Particle Swarm Optimization (PSO) and Large Language Models (LLMs) have been individually applied in optimization and deep learning, their combined use for enhancing convergence in numerical optimization tasks remains underexplored. Our work addresses this gap by integrating LLMs into PSO to reduce model evaluations and improve convergence for deep learning hyperparameter tuning. The proposed LLM-enhanced PSO method addresses the difficulties of efficiency and convergence by using LLMs (specifically ChatGPT-3.5 and Llama3) to improve PSO performance, allowing target objectives to be reached faster. Our method speeds up search-space exploration by replacing underperforming particle positions with the best suggestions offered by LLMs. Comprehensive experiments across three scenarios—(1) optimizing the Rastrigin function, (2) using Long Short-Term Memory (LSTM) networks for time series regression, and (3) using Convolutional Neural Networks (CNNs) for material classification—show that the method significantly improves convergence rates and lowers computational costs. Depending on the application, computational complexity is lowered by 20% to 60% compared to traditional PSO methods. Llama3 achieved a 20% to 40% reduction in model calls for regression tasks, whereas ChatGPT-3.5 reduced model calls by 60% for both regression and classification tasks, all while preserving accuracy and error rates. This methodology offers an efficient and effective solution for optimizing deep learning models, yielding substantial computational savings across a wide range of applications.
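The abstract describes replacing underperforming particles' positions with LLM suggestions during the PSO loop, benchmarked on the Rastrigin function. The following is a rough illustrative sketch, not the authors' implementation: a minimal PSO on Rastrigin with a pluggable `suggest` hook standing in for the LLM call. The hook signature, the replacement schedule (every 10 iterations), and all parameter values are assumptions.

```python
import math
import random

def rastrigin(x):
    """Rastrigin benchmark: global minimum 0 at x = (0, ..., 0)."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)

def pso(fitness, dim=2, n_particles=10, iters=100, bound=5.12, suggest=None, seed=0):
    """Minimal PSO minimizer. If `suggest` is given, every 10 iterations the
    worst particle's position is replaced by its suggestion (the role the
    paper assigns to the LLM)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # per-particle best positions
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]         # global best
    w, c1, c2 = 0.7, 1.5, 1.5                        # inertia, cognitive, social weights
    for t in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
        if suggest is not None and t % 10 == 9:
            # Replace the currently worst position with an external suggestion.
            worst = max(range(n_particles), key=lambda i: fitness(pos[i]))
            pos[worst] = suggest(gbest, pbest)       # stand-in for the LLM call
    return gbest, gbest_f
```

In the paper's setting, `suggest` would prompt an LLM with the swarm's best positions and parse a proposed position from its reply; here any callable works, e.g. `suggest=lambda g, p: [xi + random.gauss(0, 0.1) for xi in g]` to perturb the global best.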
format Article
id doaj-art-a17cbfe57b2540318ef06814b6d13a2c
institution OA Journals
issn 2644-1268
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Open Journal of the Computer Society
spelling doaj-art-a17cbfe57b2540318ef06814b6d13a2c (indexed 2025-08-20T01:50:29Z)
language: eng
publisher: IEEE
journal: IEEE Open Journal of the Computer Society (ISSN 2644-1268), 2025-01-01, vol. 6, pp. 574-585
doi: 10.1109/OJCS.2025.3564493 (IEEE document 10976715)
title: Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models
authors:
Saad Hameed (https://orcid.org/0000-0003-4175-9489), Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
Basheer Qolomany (https://orcid.org/0000-0002-3270-7225), Department of Medicine, College of Medicine, Howard University, Washington, D.C., USA
Samir Brahim Belhaouari (https://orcid.org/0000-0003-2336-0490), Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
Mohamed Abdallah (https://orcid.org/0000-0002-3261-7588), Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
Junaid Qadir (https://orcid.org/0000-0001-9466-2475), Department of Computer Science and Engineering, Qatar University, Doha, Qatar
Ala Al-Fuqaha (https://orcid.org/0000-0002-0903-1204), Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
abstract: Determining the ideal architecture for deep learning models, such as the number of layers and neurons, is a difficult and resource-intensive process that frequently relies on manual tuning or computationally costly optimization approaches. While Particle Swarm Optimization (PSO) and Large Language Models (LLMs) have been individually applied in optimization and deep learning, their combined use for enhancing convergence in numerical optimization tasks remains underexplored. Our work addresses this gap by integrating LLMs into PSO to reduce model evaluations and improve convergence for deep learning hyperparameter tuning. The proposed LLM-enhanced PSO method addresses the difficulties of efficiency and convergence by using LLMs (specifically ChatGPT-3.5 and Llama3) to improve PSO performance, allowing target objectives to be reached faster. Our method speeds up search-space exploration by replacing underperforming particle positions with the best suggestions offered by LLMs. Comprehensive experiments across three scenarios—(1) optimizing the Rastrigin function, (2) using Long Short-Term Memory (LSTM) networks for time series regression, and (3) using Convolutional Neural Networks (CNNs) for material classification—show that the method significantly improves convergence rates and lowers computational costs. Depending on the application, computational complexity is lowered by 20% to 60% compared to traditional PSO methods. Llama3 achieved a 20% to 40% reduction in model calls for regression tasks, whereas ChatGPT-3.5 reduced model calls by 60% for both regression and classification tasks, all while preserving accuracy and error rates. This methodology offers an efficient and effective solution for optimizing deep learning models, yielding substantial computational savings across a wide range of applications.
url: https://ieeexplore.ieee.org/document/10976715/
keywords: Deep learning optimization; PSO; LLM; machine learning; hyper-parameter optimization
spellingShingle Saad Hameed
Basheer Qolomany
Samir Brahim Belhaouari
Mohamed Abdallah
Junaid Qadir
Ala Al-Fuqaha
Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models
IEEE Open Journal of the Computer Society
Deep learning optimization
PSO
LLM
machine learning
hyper-parameter optimization
title Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models
title_full Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models
title_fullStr Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models
title_full_unstemmed Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models
title_short Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models
title_sort large language model enhanced particle swarm optimization for hyperparameter tuning for deep learning models
topic Deep learning optimization
PSO
LLM
machine learning
hyper-parameter optimization
url https://ieeexplore.ieee.org/document/10976715/
work_keys_str_mv AT saadhameed largelanguagemodelenhancedparticleswarmoptimizationforhyperparametertuningfordeeplearningmodels
AT basheerqolomany largelanguagemodelenhancedparticleswarmoptimizationforhyperparametertuningfordeeplearningmodels
AT samirbrahimbelhaouari largelanguagemodelenhancedparticleswarmoptimizationforhyperparametertuningfordeeplearningmodels
AT mohamedabdallah largelanguagemodelenhancedparticleswarmoptimizationforhyperparametertuningfordeeplearningmodels
AT junaidqadir largelanguagemodelenhancedparticleswarmoptimizationforhyperparametertuningfordeeplearningmodels
AT alaalfuqaha largelanguagemodelenhancedparticleswarmoptimizationforhyperparametertuningfordeeplearningmodels