Multitask Learning-Based Pipeline-Parallel Computation Offloading Architecture for Deep Face Analysis

Deep Neural Networks (DNNs) have been widely adopted in several advanced artificial intelligence applications because their accuracy is competitive with that of the human brain. Nevertheless, the superior accuracy of a DNN is achieved at the expense of intensive computation and storage complexity, requiring cust...

Bibliographic Details
Main Authors: Faris S. Alghareb, Balqees Talal Hasan
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Computers
Subjects: computation offloading; deep learning; CNN; edge computing; face recognition; parallelism
Online Access: https://www.mdpi.com/2073-431X/14/1/29
author Faris S. Alghareb
Balqees Talal Hasan
collection DOAJ
description Deep Neural Networks (DNNs) have been widely adopted in several advanced artificial intelligence applications because their accuracy is competitive with that of the human brain. Nevertheless, the superior accuracy of a DNN is achieved at the expense of intensive computation and storage complexity, requiring custom expandable hardware, i.e., graphics processing units (GPUs). Interestingly, combining parallelism with edge computing can significantly improve the performance of CPU-based hardware platforms. Therefore, this manuscript explores several levels of parallelism together with edge computation offloading to develop a hardware platform that improves the efficacy of deep learning computing architectures. Furthermore, the multitask learning (MTL) approach is employed to construct a parallel multi-task classification network. The tasks are face detection and recognition, age estimation, gender recognition, smile detection, and hair color and style classification. Additionally, both pipeline and parallel processing techniques are used to expedite complicated computations, boosting the overall performance of the presented deep face analysis architecture. A computation offloading approach is used to assign computation-intensive tasks to the edge server, whereas lightweight computations are offloaded to edge devices, i.e., Raspberry Pi 4. To train the proposed deep face analysis network, two custom datasets (HDDB and FRAED) were created for head detection and face-age recognition. Extensive experimental results demonstrate the efficacy of the proposed pipeline-parallel architecture in terms of execution time. It requires 8.2 s to provide detailed face detection and analysis for one individual and 23.59 s for an inference involving 10 individuals. Moreover, a speedup of 62.48% is achieved compared to the sequential edge computing architecture. Meanwhile, a 25.96% speedup is realized when the proposed pipeline-parallel architecture is implemented only on the edge server, compared to the server's sequential implementation. In terms of classification performance, the proposed classification modules achieve an accuracy of 88.55% for hair color and style classification and 100% for face recognition and age estimation. To summarize, the proposed approach can help reduce the required execution time and memory capacity by processing all facial tasks simultaneously on a single deep neural network rather than building a separate CNN model for each task. Therefore, the presented pipeline-parallel architecture can be a cost-effective framework for real-time computer vision applications implemented on resource-limited devices.
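The pipeline-plus-parallel idea summarized in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`detect_faces`, `analyze_face`, `run_pipeline`), the three task heads, and the two-faces-per-frame placeholder are all hypothetical stand-ins, and threads are used only to show the structure (a detection stage feeding an analysis stage that runs several task heads concurrently).

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue
from threading import Thread

# Hypothetical task heads: stand-ins for the classifiers named in the abstract.
TASKS = {
    "gender": lambda face: f"gender({face})",
    "smile": lambda face: f"smile({face})",
    "hair": lambda face: f"hair({face})",
}


def detect_faces(frame):
    """Stage 1 placeholder: pretend every frame contains two faces."""
    return [f"{frame}-face{i}" for i in range(2)]


def analyze_face(face, pool):
    """Stage 2: submit every task head for one face, then collect the results."""
    futures = {name: pool.submit(fn, face) for name, fn in TASKS.items()}
    return {name: fut.result() for name, fut in futures.items()}


def run_pipeline(frames):
    """Run detection and analysis as two overlapping pipeline stages."""
    handoff = Queue()  # FIFO buffer between the two stages
    results = []

    def stage1():
        for frame in frames:
            for face in detect_faces(frame):
                handoff.put(face)
        handoff.put(None)  # sentinel: no more faces

    def stage2():
        with ThreadPoolExecutor(max_workers=len(TASKS)) as pool:
            while (face := handoff.get()) is not None:
                results.append((face, analyze_face(face, pool)))

    workers = [Thread(target=stage1), Thread(target=stage2)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    for face, labels in run_pipeline(["frame0", "frame1"]):
        print(face, labels)
```

In this sketch the second stage starts analyzing the first face while the first stage is still detecting faces in later frames, which is the overlap that pipelining relies on. Where each stage would run (edge device or edge server) is left out, since the record does not specify that mapping.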
format Article
id doaj-art-0ac6e3124be34e2da3c4472d2934dce4
institution Kabale University
issn 2073-431X
language English
publishDate 2025-01-01
publisher MDPI AG
record_format Article
series Computers
spelling doaj-art-0ac6e3124be34e2da3c4472d2934dce4 | 2025-01-24T13:27:55Z | eng | MDPI AG | Computers | 2073-431X | 2025-01-01 | vol. 14, no. 1, art. 29 | 10.3390/computers14010029 | Multitask Learning-Based Pipeline-Parallel Computation Offloading Architecture for Deep Face Analysis | Faris S. Alghareb (Department of Computer and Informatics Engineering, College of Electronics, Ninevah University, Mosul 41002, Iraq) | Balqees Talal Hasan (Department of Computer Networks and the Internet, College of Information Technology, Ninevah University, Mosul 41002, Iraq) | https://www.mdpi.com/2073-431X/14/1/29 | computation offloading; deep learning; CNN; edge computing; face recognition; parallelism
title Multitask Learning-Based Pipeline-Parallel Computation Offloading Architecture for Deep Face Analysis
topic computation offloading
deep learning
CNN
edge computing
face recognition
parallelism
url https://www.mdpi.com/2073-431X/14/1/29