Efficient Deep Learning Job Allocation in Cloud Systems by Predicting Resource Consumptions including GPU and CPU

One objective of GPU scheduling in cloud systems is to minimize the completion times of given deep learning models. This matters in cloud environments because deep learning workloads take a long time to finish, and misallocating them can greatly increase job completion time. The difficulty of GPU scheduling stems from diverse parameters, including model architectures and GPU types. Some model architectures are CPU-intensive rather than GPU-intensive, which creates different hardware requirements when training different models. Previous GPU scheduling research used a small set of parameters that did not include CPU parameters, making it difficult to reduce the job completion time (JCT). This paper introduces an improved GPU scheduling approach that reduces JCT by predicting execution time and several resource consumption parameters, including GPU Utilization (%), GPU Memory Utilization (%), GPU Memory, and CPU Utilization (%). Experimental results show that the proposed model improves JCT by up to 40.9% under GPU Allocation based on Computing Efficiency compared to Driple.

Bibliographic Details
Main Authors: Abuda Chad Ferrino, Tae Young Choe
Affiliation: Department of Computer AI Convergence Engineering, Kumoh National Institute of Technology, 61 Daehak-ro, Gumi-si, Gyeongsangbuk-do, 39177, Republic of Korea (both authors)
Format: Article
Language: English
Published: University North, 2025-01-01
Series: Tehnički Glasnik, Vol. 19, No. 3 (2025), pp. 461-472
ISSN: 1846-6168; 1848-5588
DOI: 10.31803/tg-20240112104444
Subjects: cloud computing; convolutional neural network; deep learning; GPU job scheduling; performance estimation
Online Access: https://hrcak.srce.hr/file/480464
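The abstract above outlines a predict-then-allocate scheme: estimate each job's execution time and resource consumption per GPU type, then place the job where its predicted completion time is lowest. The following Python sketch illustrates that general idea only; the learned performance predictor is replaced by a hypothetical lookup table, all numbers are invented, and the greedy placement rule is an assumption for illustration, not the paper's GPU Allocation based on Computing Efficiency algorithm.

# Sketch only: a toy predict-then-allocate scheduler. PREDICTIONS is a
# hypothetical lookup table standing in for a learned performance model.
from dataclasses import dataclass

@dataclass
class Prediction:
    exec_time_s: float       # predicted training time on this GPU type (s)
    gpu_util_pct: float      # predicted GPU Utilization (%)
    gpu_mem_util_pct: float  # predicted GPU Memory Utilization (%)
    gpu_mem_gb: float        # predicted GPU memory footprint (GB)
    cpu_util_pct: float      # predicted CPU Utilization (%)

@dataclass
class Gpu:
    name: str
    mem_gb: float
    free_at_s: float = 0.0   # earliest time this GPU becomes free (queue wait)

# Hypothetical per-(model, GPU type) predictions; invented numbers.
PREDICTIONS = {
    ("resnet50", "V100"): Prediction(3600, 90, 60, 10, 35),
    ("resnet50", "T4"):   Prediction(7200, 95, 80, 12, 30),
    ("bert",     "V100"): Prediction(5400, 85, 75, 14, 20),
    ("bert",     "T4"):   Prediction(9000, 92, 90, 15, 18),
}

def completion_time(job, gpu):
    """Predicted finish time of `job` on `gpu`: queue wait plus run time.
    Returns None when the predicted memory footprint does not fit."""
    pred = PREDICTIONS[(job, gpu.name)]
    if pred.gpu_mem_gb > gpu.mem_gb:
        return None
    return gpu.free_at_s + pred.exec_time_s

def schedule(jobs, gpus):
    """Greedily place each job on the GPU minimizing its predicted JCT,
    then extend that GPU's queue so later jobs see the wait."""
    placement = {}
    for job in jobs:
        candidates = []
        for g in gpus:
            t = completion_time(job, g)
            if t is not None:
                candidates.append((t, g))
        finish, best = min(candidates, key=lambda c: c[0])
        best.free_at_s = finish
        placement[job] = best.name
    return placement

if __name__ == "__main__":
    cluster = [Gpu("V100", mem_gb=16), Gpu("T4", mem_gb=16)]
    print(schedule(["resnet50", "bert"], cluster))
    # e.g. {'resnet50': 'V100', 'bert': 'V100'}

Keeping a per-GPU earliest-free time lets the sketch account for queue wait as well as run time, which is what distinguishes minimizing completion time from simply picking the fastest GPU type; the predicted CPU utilization field is where a CPU-aware scheduler, as the abstract suggests, would additionally penalize placements that starve CPU-intensive models.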