Prediction of COVID-19 cases by multifactor driven long short-term memory (LSTM) model

Bibliographic Details
Main Authors: Yanwen Shao, Tsz Kin Wan, Kei Hang Katie Chan
Format: Article
Language: English
Published: Nature Portfolio 2025-02-01
Series: Scientific Reports
Subjects: COVID-19; Disease prediction; Pandemic prevention policy; LSTM; Machine learning
Online Access: https://doi.org/10.1038/s41598-025-86698-1
collection DOAJ
description Abstract Since December 2019, COVID-19 has spread globally, causing millions of deaths and huge economic losses. To investigate the impact of different factors and predict future trends, this study collects data for 15 countries, comprising 44 features over about 900 days, which can be classified into four groups: pandemic information, country characteristics, climate, and prevention policies. Through feature selection, we identified the factors in each group that have a stronger impact on the increase in new cases. We then used the long time-span data to predict future new COVID-19 cases by training a long short-term memory (LSTM) model, a support vector regressor (SVR), and a temporal convolutional network (TCN); of these, the LSTM performed best and generalized well. Measured by explained variance score (EVS), predictions were most accurate for Germany (0.864), Italy (0.860), and the United States (0.766). Overall, the results may inform predictions of new COVID-19 cases in more countries and regions and offer recommendations to help governments adopt more effective prevention policies.
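For readers unfamiliar with the metric quoted in the abstract, the explained variance score is commonly defined as EVS = 1 - Var(y - ŷ) / Var(y), where y is the observed series and ŷ the prediction. The sketch below is a minimal illustration of that common definition and is not taken from the paper; the function name and the sample case counts are hypothetical.

```python
from statistics import pvariance


def explained_variance_score(y_true, y_pred):
    """Return 1 - Var(y_true - y_pred) / Var(y_true), using population variance."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    var_true = pvariance(y_true)
    if var_true == 0:
        raise ValueError("EVS is undefined when y_true is constant")
    return 1.0 - pvariance(residuals) / var_true


# Hypothetical daily new-case counts (made up for illustration only).
actual = [100.0, 120.0, 150.0, 130.0, 110.0]
# A forecast that is always 10 cases too high: the residuals are constant.
predicted = [110.0, 130.0, 160.0, 140.0, 120.0]

print(explained_variance_score(actual, predicted))
```

Because the residuals in this example are constant, their variance is zero and the score is 1.0 even though every forecast is off by 10 cases. EVS measures how well the shape of the series is captured and does not penalize a constant bias, so it is usually read alongside an error metric such as MAE or RMSE.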
id doaj-art-75a1dc88e65d4b29840cbeef08f912e2
institution OA Journals
issn 2045-2322
affiliation Yanwen Shao, Department of Biomedical Sciences, City University of Hong Kong
Tsz Kin Wan, Department of Electrical Engineering, City University of Hong Kong
Kei Hang Katie Chan, Department of Biomedical Sciences, City University of Hong Kong
citation Scientific Reports, vol. 15, no. 1, pp. 1-15 (2025-02-01)
topic COVID-19
Disease prediction
Pandemic prevention policy
LSTM
Machine learning