Time-Distributed Vision Transformer Stacked With Transformer for Heart Failure Detection Based on Echocardiography Video

Heart failure is considered by many to be the number one cause of death globally. Despite its high mortality, heart failure remains clinically underdiagnosed, especially in remote areas with a shortage of cardiologists. Existing studies have employed artificial intelligence to help with heart...

Bibliographic Details
Main Authors:	Mgs M. Luthfi Ramadhan, Adyatma W. A. Nugraha Yudha, Muhammad Febrian Rachmadi, Kevin Moses Hanky Jr Tandayu, Lies Dina Liastuti, Wisnu Jatmiko
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Access
Subjects:	Deep learning, pattern recognition, heart failure, echocardiography, computer vision
Online Access:	https://ieeexplore.ieee.org/document/10776969/

_version_	1850066017247559680
author	Mgs M. Luthfi Ramadhan, Adyatma W. A. Nugraha Yudha, Muhammad Febrian Rachmadi, Kevin Moses Hanky Jr Tandayu, Lies Dina Liastuti, Wisnu Jatmiko
author_sort	Mgs M. Luthfi Ramadhan
collection	DOAJ
description	Heart failure is considered by many to be the number one cause of death globally. Despite its high mortality, heart failure remains clinically underdiagnosed, especially in remote areas with a shortage of cardiologists. Existing studies have employed artificial intelligence to support heart failure screening and diagnosis based on echocardiography videos. Most of these studies use a convolutional neural network, which captures only the local context of an image and is therefore hindered from learning its global context. Moreover, their frame sampling algorithms sample only certain consecutive frames, which makes it questionable whether the dynamics of the left ventricle during a cardiac cycle are included. This study proposes a novel deep learning model consisting of a time-distributed vision transformer stacked with a transformer. The time-distributed vision transformer learns spatial features and feeds the result to the transformer, which learns temporal features and makes the final prediction. We also propose a frame sampling algorithm that squeezes the video by sampling frames at a certain interval. Consequently, the sampled video retains sequential information up to the end of the video, with in-between frames removed at that interval, so the dynamics of the left ventricle are preserved. Our proposed method achieved F1 scores of 95.81%, 96.19%, and 93.43% for the apical four-chamber, apical two-chamber, and parasternal long-axis views, respectively. The overall trustworthiness of our model, quantified using NetTrustScore, was 0.9712, 0.9767, and 0.9527 for the apical four-chamber, apical two-chamber, and parasternal long-axis views, respectively.
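The interval-based frame sampling described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact algorithm, so everything here is an assumption: the function names `sample_frame_indices` and `consecutive_frame_indices`, the parameters `num_frames` and `num_samples`, and the choice of an evenly spaced stride that always ends on the last frame are hypothetical and may differ from the authors' method.

```python
def sample_frame_indices(num_frames: int, num_samples: int) -> list[int]:
    """Pick `num_samples` frame indices spread over the whole video.

    Hypothetical sketch: frames are taken at a regular interval from the
    first frame to the last, so the sampled clip still spans the full
    recording instead of one short consecutive window.
    """
    if num_frames <= 0 or num_samples <= 0:
        raise ValueError("num_frames and num_samples must be positive")
    if num_samples >= num_frames:
        # Video is already short enough: keep every frame.
        return list(range(num_frames))
    if num_samples == 1:
        return [0]
    # Integer arithmetic keeps the indices strictly increasing and
    # guarantees that the final index is the last frame of the video.
    return [(i * (num_frames - 1)) // (num_samples - 1) for i in range(num_samples)]


def consecutive_frame_indices(start: int, num_samples: int) -> list[int]:
    """Baseline for comparison: a run of consecutive frames from `start`."""
    return list(range(start, start + num_samples))


if __name__ == "__main__":
    # A 10-frame video squeezed to 4 frames covers the start and the end.
    print(sample_frame_indices(10, 4))
    # The consecutive baseline covers only the first 4 frames.
    print(consecutive_frame_indices(0, 4))
```

Under these assumptions, the interval-based indices reach the end of the video, whereas the consecutive baseline may stop before a full cardiac cycle has been seen.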
format	Article
id	doaj-art-041fa5fdc6a5430ba018f13bc68e2c8b
institution	DOAJ
issn	2169-3536
language	English
publishDate	2024-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	Record ID: doaj-art-041fa5fdc6a5430ba018f13bc68e2c8b
	Timestamp: 2025-08-20T02:48:51Z
	Language: eng
	Publisher: IEEE
	Journal: IEEE Access
	ISSN: 2169-3536
	Publication date: 2024-01-01
	Volume: 12
	Pages: 182438-182454
	DOI: 10.1109/ACCESS.2024.3510774
	Article number: 10776969
	Title: Time-Distributed Vision Transformer Stacked With Transformer for Heart Failure Detection Based on Echocardiography Video
	Author 0: Mgs M. Luthfi Ramadhan, https://orcid.org/0000-0001-8571-8924, Faculty of Computer Science, University of Indonesia, Depok City, Indonesia
	Author 1: Adyatma W. A. Nugraha Yudha, https://orcid.org/0009-0006-0163-076X, Faculty of Computer Science, University of Indonesia, Depok City, Indonesia
	Author 2: Muhammad Febrian Rachmadi, https://orcid.org/0000-0003-1672-9149, Faculty of Computer Science, University of Indonesia, Depok City, Indonesia
	Author 3: Kevin Moses Hanky Jr Tandayu, https://orcid.org/0009-0002-1303-881X, Department of Cardiology and Vascular Medicine, Faculty of Medicine, Universitas Indonesia, Depok City, Indonesia
	Author 4: Lies Dina Liastuti, https://orcid.org/0000-0002-0489-3665, Department of Cardiology and Vascular Medicine, Faculty of Medicine, Universitas Indonesia, Depok City, Indonesia
	Author 5: Wisnu Jatmiko, https://orcid.org/0000-0002-0530-7955, Faculty of Computer Science, University of Indonesia, Depok City, Indonesia
	URL: https://ieeexplore.ieee.org/document/10776969/
	Keywords: Deep learning, pattern recognition, heart failure, echocardiography, computer vision
title	Time-Distributed Vision Transformer Stacked With Transformer for Heart Failure Detection Based on Echocardiography Video
topic	Deep learning, pattern recognition, heart failure, echocardiography, computer vision
url	https://ieeexplore.ieee.org/document/10776969/

Similar Items