Evaluating DL Model Scaling Trade-Offs During Inference via an Empirical Benchmark Analysis
With generative Artificial Intelligence (AI) capturing public attention, the appetite of the technology sector for larger and more complex Deep Learning (DL) models is continuously growing. Traditionally, the focus in DL model development has been on scaling the neural network’s foundational structure to increase computational complexity and enhance the representational expressiveness of the model. However, with recent advancements in edge computing and 5G networks, DL models are now aggressively being deployed across the cloud–edge–IoT continuum to realize in situ intelligent IoT services. This paradigm shift places new demands on AI practitioners, for whom a focus on inference costs, including latency, computational overhead, and energy efficiency, is long overdue. This work presents a benchmarking framework designed to assess DL model scaling across three key performance axes during model inference: classification accuracy, computational overhead, and latency. The framework’s utility is demonstrated through an empirical study involving various model structures and variants, as well as publicly available datasets for three popular DL use cases covering natural language understanding, object detection, and regression analysis.
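The scaling trade-off the abstract describes can be probed with a few lines of timing code. The sketch below is purely illustrative and is not the paper's framework: it assumes PyTorch and torchvision are available, and uses two off-the-shelf ResNet variants as stand-ins for a model family scaled in depth, reporting mean CPU inference latency alongside parameter count as a rough proxy for computational overhead.

```python
# Illustrative sketch only (not the paper's framework): time two scaling
# variants of one architecture family and report latency vs. model size.
import time

import torch
from torchvision import models


def benchmark(model, input_shape=(1, 3, 224, 224), warmup=10, runs=50):
    """Return mean per-inference latency in ms and the parameter count."""
    model.eval()
    x = torch.randn(*input_shape)
    with torch.no_grad():
        for _ in range(warmup):  # warm-up iterations stabilize timings
            model(x)
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        end = time.perf_counter()
    latency_ms = (end - start) / runs * 1000
    params = sum(p.numel() for p in model.parameters())
    return latency_ms, params


# resnet18 vs. resnet50 stand in for "small" vs. "scaled-up" model variants.
for name, ctor in [("resnet18", models.resnet18), ("resnet50", models.resnet50)]:
    latency_ms, params = benchmark(ctor(weights=None))
    print(f"{name}: {latency_ms:.1f} ms/inference, {params / 1e6:.1f}M parameters")
```

With a probe like this, the accuracy axis the paper adds would come from evaluating each variant on a held-out dataset, making the accuracy-latency-overhead trade-off directly comparable across variants.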
| Main Authors: | Demetris Trihinas, Panagiotis Michael, Moysis Symeonides |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-12-01 |
| Series: | Future Internet |
| Subjects: | deep learning; artificial intelligence; cloud computing; benchmarking |
| Online Access: | https://www.mdpi.com/1999-5903/16/12/468 |
| author | Demetris Trihinas; Panagiotis Michael; Moysis Symeonides |
|---|---|
| collection | DOAJ |
| format | Article |
| id | doaj-art-4633c52bac2a49e1813d206afb73cf74 |
| institution | DOAJ |
| issn | 1999-5903 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Future Internet |
| citation | Future Internet, vol. 16, no. 12, art. 468 (2024-12-01) |
| doi | 10.3390/fi16120468 |
| affiliations | Demetris Trihinas and Panagiotis Michael: Department of Computer Science, School of Sciences and Engineering, University of Nicosia, Nicosia CY-2417, Cyprus; Moysis Symeonides: Department of Computer Science, University of Cyprus, Nicosia CY-2109, Cyprus |
| title | Evaluating DL Model Scaling Trade-Offs During Inference via an Empirical Benchmark Analysis |
| topic | deep learning; artificial intelligence; cloud computing; benchmarking |
| url | https://www.mdpi.com/1999-5903/16/12/468 |