Denoising of Heart Sounds Using Lightweight FCNs and Spectrograms With and Without Context
| Main Authors: | , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/10981720/ |
| Summary: | Cardiac auscultation using a digital stethoscope is an important method for diagnosis of cardiovascular diseases (CVDs). However, heart sound recordings are often contaminated with adventitious noise, especially in crowded, noisy settings such as resource-constrained hospitals. This noise can confound accurate diagnosis of heart pathologies. We propose a method for denoising heart sounds using fully convolutional networks (FCNs) based on the Spleeter U-Net architecture. We first generate a spectrogram of the heart sound recording and then use FCNs to semantically segment this into noise and signal components. We present an adaptation of the full Spleeter design, and also a lighter version operating on smaller spectrograms. This is aimed at reducing latency in a future real-time implementation of this scheme. We investigate whether providing this latter network with context improves the performance. We evaluate the denoising performance by artificially contaminating clean heart sounds with real-world noise (additive white Gaussian noise (AWGN), ambient hospital noise, lung sounds, and speech). Our best model was the lighter model with context, which we call the denoiser with context (DWC). We tested all models with different contamination types at different signal-to-noise ratios (SNRs), and found that the DWC gave an overall average improvement of 10.322 dB, with average increases ranging from 6.151 dB to 14.479 dB. We also implement the denoising inference on an edge device to show the feasibility of running this scheme on an embedded system. This work is a step towards a real-time deep learning-based denoiser for use with a digital stethoscope. |
| ISSN: | 2169-3536 |
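
The evaluation described in the summary (contaminating clean heart sounds with noise at a chosen SNR, then reporting the improvement in dB after denoising) can be illustrated with a minimal sketch. This is not the authors' code: the function names `mix_at_snr`, `snr_db`, and `snr_improvement_db` are hypothetical, and the summary does not state which SNR definition the paper uses, so a plain power-ratio definition on time-domain samples is assumed here.

```python
import math


def power(samples):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)


def mix_at_snr(clean, noise, target_snr_db):
    """Add `noise` to `clean`, scaled so the mixture has the requested SNR.

    Assumption: SNR is the ratio of clean-signal power to added-noise power.
    """
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[: len(clean)]
    # Solve power(clean) / (scale^2 * power(noise)) = 10^(target / 10) for scale.
    scale = math.sqrt(power(clean) / (power(noise) * 10 ** (target_snr_db / 10)))
    return [c + scale * n for c, n in zip(clean, noise)]


def snr_db(clean, estimate):
    """SNR of `estimate` in dB, treating (estimate - clean) as the noise."""
    residual = [e - c for c, e in zip(clean, estimate)]
    return 10 * math.log10(power(clean) / power(residual))


def snr_improvement_db(clean, noisy, denoised):
    """Gain in SNR (dB) from the noisy input to the denoised output."""
    return snr_db(clean, denoised) - snr_db(clean, noisy)
```

Under this definition, a denoiser that halves the amplitude of the residual noise yields an improvement of 20·log10(2), roughly 6 dB, which gives a sense of scale for the 6 to 14 dB range quoted in the summary.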