Forecasting framework for dominant SARS-CoV-2 strains before clade replacement using phylogeny-informed genetic distances

Bibliographic Details
Main Authors: Kyuyoung Lee, Atanas V. Demirev, Sangyi Lee, Seunghye Cho, Hyunbeen Kim, Junhyung Cho, Jeong-Sun Yang, Kyung-Chang Kim, Joo-Yeon Lee, Woojin Shin, Soyoung Lee, Sejik Park, Philippe Lemey, Man-Seong Park, Jin Il Kim
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-06-01
Series: Frontiers in Microbiology
Subjects:
Online Access: https://www.frontiersin.org/articles/10.3389/fmicb.2025.1619546/full
collection DOAJ
description Introduction: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of the global coronavirus disease 2019 (COVID-19) pandemic and continues to drive successive waves of infection through the emergence of novel variants. Consequently, accurately predicting the next clade roots through global surveillance is crucial for effective prevention, control, and timely updates of vaccine antigens. This study evaluated the evolutionary dynamics of SARS-CoV-2 using phylogeny-informed genetic distances based on 394 complete genomes and spike (S) gene sequences. Furthermore, we introduced a forecasting framework to estimate the potential of emerging variants to cause clade replacement by analyzing non-synonymous and synonymous genetic distances from clade roots, which reflect global herd immune pressure. Methods: Non-synonymous and synonymous genetic distances from both the Wuhan strain and clade root strains were assessed to predict whether a clade would become dominant or extinct within the 3 months before clade replacement. Results: Across five observed clade replacements up to January 2024, we captured quantifiable heterogeneity in the non-synonymous and synonymous genetic distances of the S gene from clade roots between dominant and extinct variants, as measured by the extent of novelty, whether through gradual or drastic change. Discussion: Our framework demonstrated high predictability for identifying the next clade root before replacement in both training and test datasets (area under the receiver operating characteristic curve [AUROC] > 0.90) by incorporating differential weighting of non-synonymous and synonymous genetic distances. Additionally, the framework using spike gene data alone achieved accuracy similar to that of the framework using the complete genome. Overall, our approach establishes quantifiable molecular criteria for identifying potential updates to the SARS-CoV-2 vaccine, contributing to proactive pandemic preparedness.
id doaj-art-050dea4769ac4dbc9eba915da4494b97
institution OA Journals
issn 1664-302X
spelling doaj-art-050dea4769ac4dbc9eba915da4494b97; 2025-08-20T02:09:24Z; eng; Frontiers Media S.A.; Frontiers in Microbiology; 1664-302X; 2025-06-01; 16; 10.3389/fmicb.2025.1619546; 1619546
Affiliations:
Department of Microbiology, Institute for Viral Diseases, Korea University College of Medicine, Seoul, Republic of Korea: Kyuyoung Lee, Atanas V. Demirev, Sangyi Lee, Seunghye Cho, Hyunbeen Kim, Woojin Shin, Soyoung Lee, Sejik Park, Man-Seong Park, Jin Il Kim
Division of Emerging Viral Diseases and Vector Research, Center for Infectious Diseases Research, National Institute of Infectious Diseases, Korea National Institute of Health, Osong, Republic of Korea: Junhyung Cho, Jeong-Sun Yang, Kyung-Chang Kim
Center for Infectious Diseases Research, National Institute of Infectious Diseases, Korea National Institute of Health, Osong, Republic of Korea: Joo-Yeon Lee
Department of Microbiology, Immunology, and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium: Philippe Lemey
Vaccine Innovation Center, Korea University College of Medicine, Seoul, Republic of Korea: Man-Seong Park, Jin Il Kim
Biosafety Center, Korea University College of Medicine, Seoul, Republic of Korea: Man-Seong Park, Jin Il Kim
topic SARS-CoV-2
evolution
clade replacement
forecasting framework
spike gene
dominance