Forecasting framework for dominant SARS-CoV-2 strains before clade replacement using phylogeny-informed genetic distances
Introduction: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of the global coronavirus disease 2019 (COVID-19) pandemic and continues to drive successive waves of infection through the emergence of novel variants. Consequently, accurately predicting the next clade...
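| Main Authors: | Kyuyoung Lee, Atanas V. Demirev, Sangyi Lee, Seunghye Cho, Hyunbeen Kim, Junhyung Cho, Jeong-Sun Yang, Kyung-Chang Kim, Joo-Yeon Lee, Woojin Shin, Soyoung Lee, Sejik Park, Philippe Lemey, Man-Seong Park, Jin Il Kim |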
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-06-01 |
| Series: | Frontiers in Microbiology |
| Subjects: | SARS-CoV-2; evolution; clade replacement; forecasting framework; spike gene; dominance |
| Online Access: | https://www.frontiersin.org/articles/10.3389/fmicb.2025.1619546/full |
| _version_ | 1850212199900905472 |
|---|---|
| author | Kyuyoung Lee; Atanas V. Demirev; Sangyi Lee; Seunghye Cho; Hyunbeen Kim; Junhyung Cho; Jeong-Sun Yang; Kyung-Chang Kim; Joo-Yeon Lee; Woojin Shin; Soyoung Lee; Sejik Park; Philippe Lemey; Man-Seong Park; Jin Il Kim |
| author_sort | Kyuyoung Lee |
| collection | DOAJ |
| description | Introduction: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of the global coronavirus disease 2019 (COVID-19) pandemic and continues to drive successive waves of infection through the emergence of novel variants. Consequently, accurately predicting the next clade roots through global surveillance is crucial for effective prevention, control, and timely updates of vaccine antigens. This study evaluated the evolutionary dynamics of SARS-CoV-2 using phylogeny-informed genetic distances based on 394 complete genomes and spike (S) gene sequences. Furthermore, we introduced a forecasting framework to estimate the potential of emerging variants to lead to clade replacement by analyzing non-synonymous and synonymous genetic distances from clade roots, which reflect global herd immune pressure. Methods: Non-synonymous and synonymous genetic distances from both Wuhan and clade root strains were assessed to predict whether a clade would become dominant or extinct within 3 months before the clade replacement. Results: Through five observed clade replacements up to January 2024, we captured the quantifiable heterogeneity in non-synonymous and synonymous genetic distances of the S gene from clade roots between dominant and extinct variants, as measured by the extent of novelty, whether through gradual or drastic change. Discussion: Our framework demonstrated high predictability for identifying the next clade root before replacement in both training and test datasets (area under the receiver operating characteristic curve [AUROC] > 0.90) by incorporating differential weighting of non-synonymous and synonymous genetic distances. Additionally, the framework using only spike gene data demonstrated accuracy similar to that of the framework using the complete genome. Overall, our approach establishes quantifiable molecular criteria for identifying potential updates to the SARS-CoV-2 vaccine, contributing to proactive pandemic preparedness. |
| format | Article |
| id | doaj-art-050dea4769ac4dbc9eba915da4494b97 |
| institution | OA Journals |
| issn | 1664-302X |
| language | English |
| publishDate | 2025-06-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Microbiology |
| spelling | doaj-art-050dea4769ac4dbc9eba915da4494b97; 2025-08-20T02:09:24Z; eng; Frontiers Media S.A.; Frontiers in Microbiology; 1664-302X; 2025-06-01; vol. 16; 10.3389/fmicb.2025.1619546; 1619546; Forecasting framework for dominant SARS-CoV-2 strains before clade replacement using phylogeny-informed genetic distances; Kyuyoung Lee, Atanas V. Demirev, Sangyi Lee, Seunghye Cho, Hyunbeen Kim (Department of Microbiology, Institute for Viral Diseases, Korea University College of Medicine, Seoul, Republic of Korea); Junhyung Cho, Jeong-Sun Yang, Kyung-Chang Kim (Division of Emerging Viral Diseases and Vector Research, Center for Infectious Diseases Research, National Institute of Infectious Diseases, Korea National Institute of Health, Osong, Republic of Korea); Joo-Yeon Lee (Center for Infectious Diseases Research, National Institute of Infectious Diseases, Korea National Institute of Health, Osong, Republic of Korea); Woojin Shin, Soyoung Lee, Sejik Park (Department of Microbiology, Institute for Viral Diseases, Korea University College of Medicine, Seoul, Republic of Korea); Philippe Lemey (Department of Microbiology, Immunology, and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium); Man-Seong Park, Jin Il Kim (Department of Microbiology, Institute for Viral Diseases; Vaccine Innovation Center; Biosafety Center, Korea University College of Medicine, Seoul, Republic of Korea) |
| spellingShingle | Kyuyoung Lee; Atanas V. Demirev; Sangyi Lee; Seunghye Cho; Hyunbeen Kim; Junhyung Cho; Jeong-Sun Yang; Kyung-Chang Kim; Joo-Yeon Lee; Woojin Shin; Soyoung Lee; Sejik Park; Philippe Lemey; Man-Seong Park; Jin Il Kim; Forecasting framework for dominant SARS-CoV-2 strains before clade replacement using phylogeny-informed genetic distances; Frontiers in Microbiology; SARS-CoV-2; evolution; clade replacement; forecasting framework; spike gene; dominance |
| title | Forecasting framework for dominant SARS-CoV-2 strains before clade replacement using phylogeny-informed genetic distances |
| title_sort | forecasting framework for dominant sars cov 2 strains before clade replacement using phylogeny informed genetic distances |
| topic | SARS-CoV-2; evolution; clade replacement; forecasting framework; spike gene; dominance |
| url | https://www.frontiersin.org/articles/10.3389/fmicb.2025.1619546/full |