A progressive attention-based cross-modal fusion network for cardiovascular disease detection using synchronized electrocardiogram and phonocardiogram signals

Synchronized electrocardiogram (ECG) and phonocardiogram (PCG) signals provide complementary diagnostic insights crucial for improving the accuracy of cardiovascular disease (CVD) detection. However, existing deep learning methods often utilize single-modal data or employ simplistic early or late fu...

Bibliographic Details
Main Authors: Wei Peng Li, Joon Huang Chuah, Guo Jeng Tan, Chengyu Liu, Hua-Nong Ting
Format: Article
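Language: English
Published: PeerJ Inc. 2025-07-01
Series: PeerJ Computer Science
Subjects: Electrocardiogram (ECG); Phonocardiogram (PCG); Multi-modality; Spatial attention; Channel attention
Online Access: https://peerj.com/articles/cs-3038.pdf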
author Wei Peng Li
Joon Huang Chuah
Guo Jeng Tan
Chengyu Liu
Hua-Nong Ting
collection DOAJ
description Synchronized electrocardiogram (ECG) and phonocardiogram (PCG) signals provide complementary diagnostic insights crucial for improving the accuracy of cardiovascular disease (CVD) detection. However, existing deep learning methods often utilize single-modal data or employ simplistic early or late fusion strategies, which inadequately capture the complex, hierarchical interdependencies between these modalities, thereby limiting detection performance. This study introduces PACFNet, a novel progressive attention-based cross-modal feature fusion network, for end-to-end CVD detection. PACFNet features a three-branch architecture: two modality-specific encoders for ECG and PCG, and a progressive selective attention-based cross-modal fusion encoder. A key innovation is its four-layer progressive fusion mechanism, which integrates multi-modal information from low-level morphological details to high-level semantic representations. This is achieved by selective attention-based cross-modal fusion (SACMF) modules at each progressive level, employing cascaded spatial and channel attention to dynamically emphasize salient feature contributions across modalities, thus significantly enhancing feature learning. Signals are pre-processed using a beat-to-beat segmentation approach to analyze individual cardiac cycles. Experimental validation on the public PhysioNet 2016 dataset demonstrates PACFNet’s state-of-the-art performance, with an accuracy of 97.7%, sensitivity of 98%, specificity of 97.3%, and an F1-score of 99.7%. Notably, PACFNet not only excels in multi-modal settings but also maintains robust diagnostic capabilities even with missing modalities, underscoring its practical effectiveness and reliability. The source code is publicly available on Zenodo (https://zenodo.org/records/15450169).
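The progressive fusion idea in the description can be illustrated with a small, parameter-free sketch. This is not the authors' PACFNet code (that is on Zenodo); it is a toy under stated assumptions: features are plain lists shaped channels x time, attention weights come from a sigmoid of simple means rather than learned layers, spatial attention is applied before channel attention, and `downsample` stands in for an encoder stage. All function names here are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention(feat):
    """Scale each time step by a sigmoid of its mean across channels."""
    n_ch, n_t = len(feat), len(feat[0])
    weights = [sigmoid(sum(feat[c][t] for c in range(n_ch)) / n_ch)
               for t in range(n_t)]
    return [[feat[c][t] * weights[t] for t in range(n_t)] for c in range(n_ch)]


def channel_attention(feat):
    """Scale each channel by a sigmoid of its mean over time."""
    out = []
    for ch in feat:
        w = sigmoid(sum(ch) / len(ch))
        out.append([v * w for v in ch])
    return out


def sacmf_like_fusion(ecg_feat, pcg_feat, prev_fused=None):
    """Stack ECG, PCG and (if present) the previous fused features along the
    channel axis, then apply spatial and channel attention in cascade.
    The ordering is an assumption, not taken from the article."""
    stacked = ecg_feat + pcg_feat
    if prev_fused is not None:
        stacked = stacked + prev_fused
    return channel_attention(spatial_attention(stacked))


def downsample(feat):
    """Hypothetical stand-in for an encoder stage: average adjacent samples."""
    return [[(ch[i] + ch[i + 1]) / 2 for i in range(0, len(ch) - 1, 2)]
            for ch in feat]


# Toy single-channel inputs for one segmented beat (16 samples each).
ecg = [[0.1 * t for t in range(16)]]
pcg = [[math.sin(t) for t in range(16)]]

fused = None
shapes = []
for level in range(4):  # four progressive fusion levels, as in the description
    fused = sacmf_like_fusion(ecg, pcg, fused)
    shapes.append((len(fused), len(fused[0])))
    ecg, pcg, fused = downsample(ecg), downsample(pcg), downsample(fused)

print(shapes)
```

Each level fuses the two modality streams with the output of the previous level, so low-level detail is carried forward while the time axis shrinks. A real implementation would replace the means with learned convolutional or fully connected attention layers.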
format Article
id doaj-art-408bbdc5cb8946698cfa21014ed22a7e
institution DOAJ
issn 2376-5992
language English
publishDate 2025-07-01
publisher PeerJ Inc.
record_format Article
series PeerJ Computer Science
spelling doaj-art-408bbdc5cb8946698cfa21014ed22a7e 2025-08-20T02:48:39Z eng PeerJ Inc. PeerJ Computer Science 2376-5992 2025-07-01 11 e3038 10.7717/peerj-cs.3038
Wei Peng Li, Department of Electrical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur, Kuala Lumpur, Malaysia
Joon Huang Chuah, Department of Electrical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur, Kuala Lumpur, Malaysia
Guo Jeng Tan, Department of Medicine, Faculty of Medicine, Universiti Malaya, Kuala Lumpur, Kuala Lumpur, Malaysia
Chengyu Liu, School of Instrument Science and Engineering, Southeast University, Nanjing, JiangSu, China
Hua-Nong Ting, Department of Biomedical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur, Kuala Lumpur, Malaysia
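The description field reports accuracy, sensitivity, specificity and F1-score. As a reading aid, the sketch below shows how these are conventionally derived from binary confusion-matrix counts. The counts are hypothetical and are not results from the article. Under this standard definition, F1 for a given recall R cannot exceed 2R / (1 + R), so the listed F1-score may be worth checking against the article itself or its stated averaging scheme.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def max_f1_for_recall(recall):
    """Upper bound on F1 for a fixed recall (reached when precision is 1)."""
    return 2 * recall / (1 + recall)


# Hypothetical counts chosen only to illustrate the arithmetic.
m = binary_metrics(tp=98, fp=3, tn=97, fn=2)
for name, value in m.items():
    print(f"{name}: {value:.4f}")

print(f"F1 upper bound at recall 0.98: {max_f1_for_recall(0.98):.4f}")
```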
title A progressive attention-based cross-modal fusion network for cardiovascular disease detection using synchronized electrocardiogram and phonocardiogram signals
topic Electrocardiogram (ECG)
Phonocardiogram (PCG)
Multi-modality
Spatial attention
Channel attention
url https://peerj.com/articles/cs-3038.pdf