A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System
This study investigates the ability of well-known deep learning models, such as ResNet and EfficientNet, to perform audio-based infant cry detection. By comparing the performance of different machine learning algorithms, this study seeks to determine the most effective approach for the detection of...
| Main Authors: | Denisa Maria Herlea, Bogdan Iancu, Eugen-Richard Ardelean |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Informatics |
| Subjects: | deep learning; convolutional neural network; classification; infant crying; ResNet; EfficientNet |
| Online Access: | https://www.mdpi.com/2227-9709/12/2/50 |
| _version_ | 1850168371033669632 |
|---|---|
| author | Denisa Maria Herlea; Bogdan Iancu; Eugen-Richard Ardelean |
| author_sort | Denisa Maria Herlea |
| collection | DOAJ |
| description | This study investigates the ability of well-known deep learning models, such as ResNet and EfficientNet, to perform audio-based infant cry detection. By comparing the performance of different machine learning algorithms, this study seeks to determine the most effective approach for the detection of infant crying, enhancing the functionality of baby monitoring systems and contributing to a more advanced understanding of audio-based deep learning applications. Understanding and accurately detecting a baby’s cries is crucial for ensuring their safety and well-being, a concern shared by new and expecting parents worldwide. Despite advancements in child health, as noted by UNICEF’s 2022 report of the lowest ever recorded child mortality rate, there is still room for technological improvement. This paper presents a comprehensive evaluation of deep learning models for infant cry detection, analyzing the performance of various architectures on spectrogram and MFCC feature representations. A key focus is the comparison between pretrained and non-pretrained models, assessing their ability to generalize across diverse audio environments. Through extensive experimentation, ResNet50 and DenseNet trained on spectrograms emerged as the most effective architectures, significantly outperforming other models in classification accuracy. Additionally, the study investigates the impact of feature extraction techniques, dataset augmentation, and model fine-tuning, providing deeper insights into the role of representation learning in audio classification. The findings contribute to the growing field of audio-based deep learning applications, offering a detailed comparative study of model architectures, feature representations, and training strategies for infant cry detection. |
| format | Article |
| id | doaj-art-56b2fd401fb245e99ea9cd4fb4a1adcd |
| institution | OA Journals |
| issn | 2227-9709 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Informatics |
| spelling | doaj-art-56b2fd401fb245e99ea9cd4fb4a1adcd; 2025-08-20T02:20:58Z; eng; MDPI AG; Informatics; ISSN 2227-9709; 2025-05-01; vol. 12, no. 2, art. 50; DOI 10.3390/informatics12020050; A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System; Denisa Maria Herlea, Bogdan Iancu, Eugen-Richard Ardelean (all: Computer Science Department, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania); https://www.mdpi.com/2227-9709/12/2/50; deep learning; convolutional neural network; classification; infant crying; ResNet; EfficientNet |
| title | A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System |
| topic | deep learning; convolutional neural network; classification; infant crying; ResNet; EfficientNet |
| url | https://www.mdpi.com/2227-9709/12/2/50 |
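
The description above names two feature representations, spectrograms and MFCCs, without showing how they are computed. As a minimal sketch, and not the authors' pipeline, the following pure-Python code derives both from a mono signal. The frame length, hop size, sample rate, filter count, and all function names are assumptions chosen for illustration; a practical system would likely use an FFT-based audio library rather than the naive DFT shown here.

```python
import cmath
import math


def power_spectrogram(signal, frame_len=64, hop=32):
    """Short-time power spectrum: one row per frame, bins 0..frame_len // 2."""
    # Periodic Hann window to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT for clarity; an FFT would be used in practice.
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(chunk))
            row.append(abs(acc) ** 2)
        frames.append(row)
    return frames


def log_spectrogram(power_frames, eps=1e-10):
    """Log-power spectrogram, the image-like input a CNN would consume."""
    return [[math.log(p + eps) for p in row] for row in power_frames]


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters spaced evenly on the mel scale (assumed layout)."""
    def to_mel(hz):
        return 2595.0 * math.log10(1.0 + hz / 700.0)

    def to_hz(mel):
        return 700.0 * (10 ** (mel / 2595.0) - 1.0)

    nyquist = sample_rate / 2
    # n_filters + 2 edge points, expressed as fractional bin positions.
    edges = [to_hz(to_mel(nyquist) * i / (n_filters + 1)) / nyquist * (n_bins - 1)
             for i in range(n_filters + 2)]
    bank = []
    for j in range(n_filters):
        lo, mid, hi = edges[j], edges[j + 1], edges[j + 2]
        row = []
        for b in range(n_bins):
            if lo < b <= mid:
                row.append((b - lo) / (mid - lo))
            elif mid < b < hi:
                row.append((hi - b) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(power_frames, sample_rate, n_filters=8, n_coeffs=5, eps=1e-10):
    """MFCCs: log mel-band energies followed by an unnormalised DCT-II."""
    bank = mel_filterbank(n_filters, len(power_frames[0]), sample_rate)
    out = []
    for row in power_frames:
        log_mel = [math.log(sum(w * p for w, p in zip(filt, row)) + eps)
                   for filt in bank]
        out.append([sum(v * math.cos(math.pi * k * (i + 0.5) / n_filters)
                        for i, v in enumerate(log_mel))
                    for k in range(n_coeffs)])
    return out


# Usage: a 1 kHz test tone sampled at 8 kHz stands in for a recording.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(256)]
power = power_spectrogram(tone)
spectrogram = log_spectrogram(power)
coefficients = mfcc(power, sample_rate)
```

In this sketch the log-spectrogram rows can be stacked into a 2-D array and fed to an image-style architecture such as ResNet50, while the MFCC rows give a much more compact per-frame representation of the same audio.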