A Comprehensive Methodological Survey of Human Activity Recognition Across Diverse Data Modalities
Human Activity Recognition (HAR) systems aim to understand human behavior and assign a label to each action, attracting significant attention in computer vision due to their wide range of applications. HAR can leverage various data modalities, such as RGB images and video, skeleton, depth, infrared,...
| Main Authors: | Jungpil Shin, Najmul Hassan, Abu Saleh Musa Miah, Satoshi Nishimura |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Sensors |
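| Subjects: | human activity recognition (HAR); diverse modality; deep learning (DL); machine learning (ML); vision and sensor based HAR; classification |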
| Online Access: | https://www.mdpi.com/1424-8220/25/13/4028 |
| _version_ | 1849319869472833536 |
|---|---|
| author | Jungpil Shin; Najmul Hassan; Abu Saleh Musa Miah; Satoshi Nishimura |
| author_sort | Jungpil Shin |
| collection | DOAJ |
| description | Human Activity Recognition (HAR) systems aim to understand human behavior and assign a label to each action, attracting significant attention in computer vision due to their wide range of applications. HAR can leverage various data modalities, such as RGB images and video, skeleton, depth, infrared, point cloud, event stream, audio, acceleration, and radar signals. Each modality provides unique and complementary information suited to different application scenarios. Consequently, numerous studies have investigated diverse approaches for HAR using these modalities. This survey includes only peer-reviewed research papers published in English to ensure linguistic consistency and academic integrity. This paper presents a comprehensive survey of the latest advancements in HAR from 2014 to 2025, focusing on Machine Learning (ML) and Deep Learning (DL) approaches categorized by input data modalities. We review both single-modality and multi-modality techniques, highlighting fusion-based and co-learning frameworks. Additionally, we cover advancements in hand-crafted action features, methods for recognizing human–object interactions, and activity detection. Our survey includes a detailed dataset description for each modality, as well as a summary of the latest HAR systems, accompanied by a mathematical derivation for evaluating the deep learning model for each modality, and it also provides comparative results on benchmark datasets. Finally, we provide insightful observations and propose effective future research directions in HAR. |
| format | Article |
| id | doaj-art-8c4f0f218e5d4d80978db03b27d98a70 |
| institution | Kabale University |
| issn | 1424-8220 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Sensors |
| spelling | doaj-art-8c4f0f218e5d4d80978db03b27d98a70; 2025-08-20T03:50:17Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2025-06-01; vol. 25, no. 13, art. 4028; DOI 10.3390/s25134028; A Comprehensive Methodological Survey of Human Activity Recognition Across Diverse Data Modalities; Jungpil Shin, Najmul Hassan, Abu Saleh Musa Miah, Satoshi Nishimura (all: School of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu 965-8580, Japan); https://www.mdpi.com/1424-8220/25/13/4028; keywords: human activity recognition (HAR), diverse modality, deep learning (DL), machine learning (ML), vision and sensor based HAR, classification |
| title | A Comprehensive Methodological Survey of Human Activity Recognition Across Diverse Data Modalities |
| topic | human activity recognition (HAR); diverse modality; deep learning (DL); machine learning (ML); vision and sensor based HAR; classification |
| url | https://www.mdpi.com/1424-8220/25/13/4028 |