Enhancing the Prediction of Episodes of Aggression in Patients with Dementia Using Audio-Based Detection: A Multimodal Late Fusion Approach with a Meta-Classifier

Bibliographic Details
Main Authors: Ioannis Galanakis, Rigas Filippos Soldatos, Nikitas Karanikolas, Athanasios Voulodimos, Ioannis Voyiatzis, Maria Samarakou
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/10/5351
Description
Summary: This study extends our previous work on predicting aggressive outbursts in dementia patients by integrating audio-based violence detection with our earlier visual detection of aggressive body movements. By combining audio and visual information, we aim to strengthen the model’s capabilities and make it more suitable for real-world applications. The work uses an audio dataset of segments that capture vocal expressions in aggressive and non-aggressive scenarios. Noise filtering and feature extraction were applied to the audio files, using Mel-frequency cepstral coefficients (MFCCs), frequency filtering, and speech prosody, to obtain clean information from the audio features. We then apply a late fusion rule that merges the predictions of the two models in a trained meta-classifier, in order to assess how much the integrated audio improves the model. The broader aim is a more precise, multimodal approach to detecting and predicting aggressive outbursts in patients with dementia. Analysis of the correlations in our multimodal approach suggests that the accuracy of the early detection models improves, providing a novel proof of concept that advances the understanding of aggression prediction in clinical settings and supports more effective intervention by caregivers.
ISSN: 2076-3417
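
The late fusion step described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: this record does not state which meta-classifier the paper uses, so a small logistic regression is assumed here, and the per-episode probabilities from the visual and audio models are synthetic values generated only for illustration. All names (`train_meta_classifier`, `fuse`, `make_scores`) are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_meta_classifier(scores, labels, lr=0.5, epochs=2000):
    """Fit a logistic regression on (p_visual, p_audio) pairs by batch gradient descent.

    Assumption: the meta-classifier is logistic regression; the paper may use another model.
    """
    w = [0.0, 0.0]
    b = 0.0
    n = len(scores)
    for _ in range(epochs):
        gw = [0.0, 0.0]
        gb = 0.0
        for (pv, pa), y in zip(scores, labels):
            err = sigmoid(w[0] * pv + w[1] * pa + b) - y
            gw[0] += err * pv
            gw[1] += err * pa
            gb += err
        w[0] -= lr * gw[0] / n
        w[1] -= lr * gw[1] / n
        b -= lr * gb / n
    return w, b

def fuse(w, b, p_visual, p_audio):
    """Return the fused probability of an aggressive episode."""
    return sigmoid(w[0] * p_visual + w[1] * p_audio + b)

def make_scores(rng, mean, count):
    """Draw synthetic base-model probabilities clipped to [0, 1] (illustrative data only)."""
    clip = lambda x: min(1.0, max(0.0, x))
    return [(clip(rng.gauss(mean, 0.15)), clip(rng.gauss(mean, 0.15))) for _ in range(count)]

rng = random.Random(0)
# Synthetic stand-ins for the outputs of the visual and audio models.
scores = make_scores(rng, 0.7, 100) + make_scores(rng, 0.3, 100)
labels = [1] * 100 + [0] * 100

w, b = train_meta_classifier(scores, labels)
print("fused probability, both models confident:", fuse(w, b, 0.9, 0.9))
print("fused probability, both models doubtful:", fuse(w, b, 0.1, 0.1))
```

In late fusion, each modality keeps its own model and only their output probabilities are combined, so the audio and visual pipelines can be trained and replaced independently. In practice the meta-classifier would be fitted on held-out predictions from the base models rather than on synthetic scores.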