Multitask Learning-Based Pipeline-Parallel Computation Offloading Architecture for Deep Face Analysis

Deep Neural Networks (DNNs) have been widely adopted in advanced artificial intelligence applications because their accuracy is competitive with that of the human brain. Nevertheless, the superior accuracy of a DNN is achieved at the expense of intensive computation and storage complexity, requiring cust...

Bibliographic Details
Main Authors:	Faris S. Alghareb, Balqees Talal Hasan
Format:	Article
Language:	English
Published:	MDPI AG 2025-01-01
Series:	Computers
Subjects:	computation offloading; deep learning; CNN; edge computing; face recognition; parallelism
Online Access:	https://www.mdpi.com/2073-431X/14/1/29

author	Faris S. Alghareb; Balqees Talal Hasan
collection	DOAJ
description	Deep Neural Networks (DNNs) have been widely adopted in advanced artificial intelligence applications because their accuracy is competitive with that of the human brain. Nevertheless, the superior accuracy of a DNN is achieved at the expense of intensive computation and storage complexity, requiring custom expandable hardware such as graphics processing units (GPUs). Leveraging the synergy of parallelism and edge computing can significantly improve the performance of CPU-based hardware platforms. Therefore, this manuscript explores several levels of parallelism along with edge computation offloading to develop a hardware platform that improves the efficiency of deep learning computing architectures. Furthermore, the multitask learning (MTL) approach is employed to construct a parallel multi-task classification network. The tasks include face detection and recognition, age estimation, gender recognition, smile detection, and hair color and style classification. Additionally, both pipeline and parallel processing techniques are used to expedite complex computations, boosting the overall performance of the presented deep face analysis architecture. A computation offloading approach is used to assign computation-intensive tasks to the edge server, whereas lightweight computations are assigned to edge devices, i.e., a Raspberry Pi 4. To train the proposed deep face analysis network, two custom datasets (HDDB and FRAED) were created for head detection and face-age recognition. Extensive experimental results demonstrate the efficacy of the proposed pipeline-parallel architecture in terms of execution time. It requires 8.2 s to provide detailed face detection and analysis for one individual and 23.59 s for an inference containing 10 individuals. Moreover, a speedup of 62.48% is achieved compared to the sequential edge computing architecture. Meanwhile, a speedup of 25.96% is achieved when the proposed pipeline-parallel architecture is implemented only on the edge server, compared to the sequential server implementation. In terms of classification performance, the proposed classification modules achieve an accuracy of 88.55% for hair color and style classification and a reported accuracy of 100% for face recognition and age estimation. To summarize, the proposed approach can help reduce the required execution time and memory capacity by processing all facial tasks simultaneously on a single deep neural network rather than building a CNN model for each task. Therefore, the presented pipeline-parallel architecture can be a cost-effective framework for real-time computer vision applications implemented on resource-limited devices.
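The pipeline-parallel idea in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the stub functions, the label strings, and the names `TASK_HEADS`, `detect_faces`, `analyze_face`, and `run_pipeline` are all hypothetical, and no offloading between an edge server and a Raspberry Pi is modeled. It only shows a detection stage that feeds a queue while the per-face tasks named in the abstract run concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue
from threading import Thread

# Hypothetical stand-ins for the task heads named in the abstract.
# Real heads would be CNN branches; these return placeholder strings.
TASK_HEADS = {
    "identity": lambda face: f"id:{face}",
    "age": lambda face: f"age:{face}",
    "gender": lambda face: f"gender:{face}",
    "smile": lambda face: f"smile:{face}",
    "hair": lambda face: f"hair:{face}",
}

_DONE = object()  # sentinel that marks the end of the face stream


def detect_faces(frame):
    """Stage 1 (stub): treat each frame as an iterable of face crops."""
    return list(frame)


def analyze_face(face, pool):
    """Stage 2: submit every task head for one face, then collect results."""
    futures = {name: pool.submit(head, face) for name, head in TASK_HEADS.items()}
    return {name: future.result() for name, future in futures.items()}


def run_pipeline(frames):
    """Overlap detection (producer thread) with analysis (consumer loop)."""
    faces = Queue()
    results = []

    def producer():
        for frame in frames:
            for face in detect_faces(frame):
                faces.put(face)
        faces.put(_DONE)

    with ThreadPoolExecutor(max_workers=len(TASK_HEADS)) as pool:
        worker = Thread(target=producer)
        worker.start()
        while True:
            face = faces.get()
            if face is _DONE:
                break
            results.append(analyze_face(face, pool))
        worker.join()
    return results


if __name__ == "__main__":
    for row in run_pipeline([["a", "b"], ["c"]]):
        print(row)
```

Because a single consumer reads the queue, results come back in detection order. In a real deployment the heads would share one backbone network, as the abstract describes, rather than run as independent functions.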
format	Article
id	doaj-art-0ac6e3124be34e2da3c4472d2934dce4
institution	DOAJ
issn	2073-431X
language	English
publishDate	2025-01-01
publisher	MDPI AG
record_format	Article
series	Computers
spelling	doaj-art-0ac6e3124be34e2da3c4472d2934dce4 | 2025-08-20T02:40:43Z | eng | MDPI AG | Computers | 2073-431X | 2025-01-01 | 14(1), 29 | 10.3390/computers14010029 | Multitask Learning-Based Pipeline-Parallel Computation Offloading Architecture for Deep Face Analysis | Faris S. Alghareb (Department of Computer and Informatics Engineering, College of Electronics, Ninevah University, Mosul 41002, Iraq) | Balqees Talal Hasan (Department of Computer Networks and the Internet, College of Information Technology, Ninevah University, Mosul 41002, Iraq) | https://www.mdpi.com/2073-431X/14/1/29 | computation offloading; deep learning; CNN; edge computing; face recognition; parallelism
title	Multitask Learning-Based Pipeline-Parallel Computation Offloading Architecture for Deep Face Analysis
topic	computation offloading; deep learning; CNN; edge computing; face recognition; parallelism
url	https://www.mdpi.com/2073-431X/14/1/29

Similar Items