Time-Distributed Vision Transformer Stacked With Transformer for Heart Failure Detection Based on Echocardiography Video
Heart failure is considered by many to be the leading cause of death worldwide. Despite its high mortality, heart failure remains clinically underdiagnosed, especially in remote areas with a shortage of cardiologists. Existing studies have employed artificial intelligence to help with heart...
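| Main Authors: | Mgs M. Luthfi Ramadhan, Adyatma W. A. Nugraha Yudha, Muhammad Febrian Rachmadi, Kevin Moses Hanky Jr Tandayu, Lies Dina Liastuti, Wisnu Jatmiko |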
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2024-01-01 |
| Series: | IEEE Access |
| Subjects: | Deep learning; pattern recognition; heart failure; echocardiography; computer vision |
| Online Access: | https://ieeexplore.ieee.org/document/10776969/ |
| _version_ | 1850066017247559680 |
|---|---|
| author | Mgs M. Luthfi Ramadhan; Adyatma W. A. Nugraha Yudha; Muhammad Febrian Rachmadi; Kevin Moses Hanky Jr Tandayu; Lies Dina Liastuti; Wisnu Jatmiko |
| author_facet | Mgs M. Luthfi Ramadhan; Adyatma W. A. Nugraha Yudha; Muhammad Febrian Rachmadi; Kevin Moses Hanky Jr Tandayu; Lies Dina Liastuti; Wisnu Jatmiko |
| author_sort | Mgs M. Luthfi Ramadhan |
| collection | DOAJ |
| description | Heart failure is considered by many to be the leading cause of death worldwide. Despite its high mortality, heart failure remains clinically underdiagnosed, especially in remote areas with a shortage of cardiologists. Existing studies have used artificial intelligence to support heart failure screening and diagnosis based on echocardiography videos. Most of them use a convolutional neural network, which captures only the local context of an image and is therefore hindered from learning its global context. Moreover, their frame sampling algorithms sample only a run of consecutive frames, so it is questionable whether the dynamics of the left ventricle during a cardiac cycle are included. This study proposes a novel deep learning model consisting of a time-distributed vision transformer stacked with a transformer. The time-distributed vision transformer learns spatial features and feeds them to the transformer, which learns temporal features and then makes the final prediction. We also propose a frame sampling algorithm that squeezes the video by sampling frames at a regular interval. The sampled video therefore still contains sequential information up to the end of the original video, with only the in-between frames removed at that interval, so the dynamics of the left ventricle are preserved. Our proposed method achieved F1 scores of 95.81%, 96.19%, and 93.43% for the apical four-chamber, apical two-chamber, and parasternal long-axis views, respectively. The overall trustworthiness of the model, quantified using NetTrustScore, was 0.9712, 0.9767, and 0.9527 for the apical four-chamber, apical two-chamber, and parasternal long-axis views, respectively. |
| format | Article |
| id | doaj-art-041fa5fdc6a5430ba018f13bc68e2c8b |
| institution | DOAJ |
| issn | 2169-3536 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-041fa5fdc6a5430ba018f13bc68e2c8b; 2025-08-20T02:48:51Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12, pp. 182438-182454; DOI 10.1109/ACCESS.2024.3510774; document 10776969; Time-Distributed Vision Transformer Stacked With Transformer for Heart Failure Detection Based on Echocardiography Video; Mgs M. Luthfi Ramadhan (https://orcid.org/0000-0001-8571-8924), Adyatma W. A. Nugraha Yudha (https://orcid.org/0009-0006-0163-076X), Muhammad Febrian Rachmadi (https://orcid.org/0000-0003-1672-9149), Kevin Moses Hanky Jr Tandayu (https://orcid.org/0009-0002-1303-881X), Lies Dina Liastuti (https://orcid.org/0000-0002-0489-3665), Wisnu Jatmiko (https://orcid.org/0000-0002-0530-7955); affiliations as listed in the record: Faculty of Computer Science, University of Indonesia, Depok City, Indonesia (listed four times); Department of Cardiology and Vascular Medicine, Faculty of Medicine, Universitas Indonesia, Depok City, Indonesia (listed twice); https://ieeexplore.ieee.org/document/10776969/; Deep learning, pattern recognition, heart failure, echocardiography, computer vision |
| spellingShingle | Mgs M. Luthfi Ramadhan; Adyatma W. A. Nugraha Yudha; Muhammad Febrian Rachmadi; Kevin Moses Hanky Jr Tandayu; Lies Dina Liastuti; Wisnu Jatmiko; Time-Distributed Vision Transformer Stacked With Transformer for Heart Failure Detection Based on Echocardiography Video; IEEE Access; Deep learning; pattern recognition; heart failure; echocardiography; computer vision |
| title | Time-Distributed Vision Transformer Stacked With Transformer for Heart Failure Detection Based on Echocardiography Video |
| title_sort | time distributed vision transformer stacked with transformer for heart failure detection based on echocardiography video |
| topic | Deep learning; pattern recognition; heart failure; echocardiography; computer vision |
| url | https://ieeexplore.ieee.org/document/10776969/ |
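
The description above mentions a frame sampling step that keeps frames at a regular interval so the sampled clip still reaches the end of the video. The record does not give the paper's exact procedure, so the following is only a minimal sketch of one plausible reading: the function name `sample_frame_indices`, its parameters, and the choice of a fixed integer stride are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def sample_frame_indices(num_frames: int, num_samples: int) -> np.ndarray:
    """Return indices of frames taken at a fixed stride across the video.

    Hypothetical helper: the stride rule below is an assumption, not the
    procedure from the paper.
    """
    if num_frames <= 0 or num_samples <= 0:
        raise ValueError("num_frames and num_samples must be positive")
    # A video shorter than the requested length is kept whole.
    if num_frames <= num_samples:
        return np.arange(num_frames)
    # Fixed stride so the kept frames span most of the video,
    # rather than only its first `num_samples` consecutive frames.
    interval = num_frames // num_samples
    return np.arange(0, num_frames, interval)[:num_samples]


# Example: a 100-frame clip reduced to 16 frames with a stride of 6.
indices = sample_frame_indices(100, 16)
print(indices)
```

By contrast, taking 16 consecutive frames from the same clip would cover only frames 0 to 15, which is the limitation the description attributes to earlier sampling schemes.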