A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System

This study investigates the ability of well-known deep learning models, such as ResNet and EfficientNet, to perform audio-based infant cry detection. By comparing the performance of different machine learning algorithms, this study seeks to determine the most effective approach for the detection of...

Bibliographic Details
Main Authors:	Denisa Maria Herlea, Bogdan Iancu, Eugen-Richard Ardelean
Format:	Article
Language:	English
Published:	MDPI AG 2025-05-01
Series:	Informatics
Subjects:	deep learning; convolutional neural network; classification; infant crying; ResNet; EfficientNet
Online Access:	https://www.mdpi.com/2227-9709/12/2/50

_version_	1850168371033669632
author	Denisa Maria Herlea; Bogdan Iancu; Eugen-Richard Ardelean
author_facet	Denisa Maria Herlea; Bogdan Iancu; Eugen-Richard Ardelean
author_sort	Denisa Maria Herlea
collection	DOAJ
description	This study investigates the ability of well-known deep learning models, such as ResNet and EfficientNet, to perform audio-based infant cry detection. By comparing the performance of different machine learning algorithms, this study seeks to determine the most effective approach for the detection of infant crying, enhancing the functionality of baby monitoring systems and contributing to a more advanced understanding of audio-based deep learning applications. Understanding and accurately detecting a baby’s cries is crucial for ensuring their safety and well-being, a concern shared by new and expecting parents worldwide. Despite advancements in child health, as noted by UNICEF’s 2022 report of the lowest ever recorded child mortality rate, there is still room for technological improvement. This paper presents a comprehensive evaluation of deep learning models for infant cry detection, analyzing the performance of various architectures on spectrogram and MFCC feature representations. A key focus is the comparison between pretrained and non-pretrained models, assessing their ability to generalize across diverse audio environments. Through extensive experimentation, ResNet50 and DenseNet trained on spectrograms emerged as the most effective architectures, significantly outperforming other models in classification accuracy. Additionally, the study investigates the impact of feature extraction techniques, dataset augmentation, and model fine-tuning, providing deeper insights into the role of representation learning in audio classification. The findings contribute to the growing field of audio-based deep learning applications, offering a detailed comparative study of model architectures, feature representations, and training strategies for infant cry detection.
format	Article
id	doaj-art-56b2fd401fb245e99ea9cd4fb4a1adcd
institution	OA Journals
issn	2227-9709
language	English
publishDate	2025-05-01
publisher	MDPI AG
record_format	Article
series	Informatics
spelling	doaj-art-56b2fd401fb245e99ea9cd4fb4a1adcd; 2025-08-20T02:20:58Z; eng; MDPI AG; Informatics; 2227-9709; 2025-05-01; 12; 2; 50; 10.3390/informatics12020050; A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System; Denisa Maria Herlea (Computer Science Department, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania); Bogdan Iancu (Computer Science Department, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania); Eugen-Richard Ardelean (Computer Science Department, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania); https://www.mdpi.com/2227-9709/12/2/50; deep learning; convolutional neural network; classification; infant crying; ResNet; EfficientNet
spellingShingle	Denisa Maria Herlea Bogdan Iancu Eugen-Richard Ardelean A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System Informatics deep learning convolutional neural network classification infant crying ResNet EfficientNet
title	A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System
title_full	A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System
title_fullStr	A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System
title_full_unstemmed	A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System
title_short	A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System
title_sort	study of deep learning models for audio classification of infant crying in a baby monitoring system
topic	deep learning; convolutional neural network; classification; infant crying; ResNet; EfficientNet
url	https://www.mdpi.com/2227-9709/12/2/50

Similar Items