A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System

This study investigates the ability of well-known deep learning models, such as ResNet and EfficientNet, to perform audio-based infant cry detection. By comparing the performance of different machine learning algorithms, this study seeks to determine the most effective approach for the detection of...

Bibliographic Details
Main Authors: Denisa Maria Herlea, Bogdan Iancu, Eugen-Richard Ardelean
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Informatics
Subjects: deep learning; convolutional neural network; classification; infant crying; ResNet; EfficientNet
Online Access: https://www.mdpi.com/2227-9709/12/2/50
author Denisa Maria Herlea
Bogdan Iancu
Eugen-Richard Ardelean
collection DOAJ
description This study investigates the ability of well-known deep learning models, such as ResNet and EfficientNet, to perform audio-based infant cry detection. By comparing the performance of different machine learning algorithms, this study seeks to determine the most effective approach for the detection of infant crying, enhancing the functionality of baby monitoring systems and contributing to a more advanced understanding of audio-based deep learning applications. Understanding and accurately detecting a baby’s cries is crucial for ensuring their safety and well-being, a concern shared by new and expecting parents worldwide. Despite advancements in child health, as noted by UNICEF’s 2022 report of the lowest ever recorded child mortality rate, there is still room for technological improvement. This paper presents a comprehensive evaluation of deep learning models for infant cry detection, analyzing the performance of various architectures on spectrogram and MFCC feature representations. A key focus is the comparison between pretrained and non-pretrained models, assessing their ability to generalize across diverse audio environments. Through extensive experimentation, ResNet50 and DenseNet trained on spectrograms emerged as the most effective architectures, significantly outperforming other models in classification accuracy. Additionally, the study investigates the impact of feature extraction techniques, dataset augmentation, and model fine-tuning, providing deeper insights into the role of representation learning in audio classification. The findings contribute to the growing field of audio-based deep learning applications, offering a detailed comparative study of model architectures, feature representations, and training strategies for infant cry detection.
format Article
id doaj-art-56b2fd401fb245e99ea9cd4fb4a1adcd
institution OA Journals
issn 2227-9709
language English
publishDate 2025-05-01
publisher MDPI AG
record_format Article
series Informatics
spelling doaj-art-56b2fd401fb245e99ea9cd4fb4a1adcd
2025-08-20T02:20:58Z
eng
MDPI AG
Informatics
2227-9709
2025-05-01
Volume 12, Issue 2, Article 50
DOI: 10.3390/informatics12020050
A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System
Denisa Maria Herlea (Computer Science Department, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania)
Bogdan Iancu (Computer Science Department, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania)
Eugen-Richard Ardelean (Computer Science Department, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania)
https://www.mdpi.com/2227-9709/12/2/50
deep learning; convolutional neural network; classification; infant crying; ResNet; EfficientNet
title A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System
topic deep learning
convolutional neural network
classification
infant crying
ResNet
EfficientNet
url https://www.mdpi.com/2227-9709/12/2/50