Coronavirus research topics, tracking twenty years of research

Abstract: Research publications aimed at understanding the various aspects of Coronaviruses, particularly COVID-19, have significantly shaped our knowledge base. Although the urgency of monitoring COVID-19 in real time has decreased, the continual monthly influx of new research articles underscores the...

Bibliographic Details
Main Authors: Amir Aryani, Jingbo Wang, Luis Salvador-Carulla, Jihoon Woo, Cathy P. W. Cheung, Zhuochen Wu, Hui Yin, Junhua Xiao, Elisabeth A. Lambert, Jason Howitt, Jean M. Davidson, Serene Yoong, John B. Dixon, Rachel E. Climie, Jose A. Salinas-Perez, Nasser Bagheri, Celine Santiago, Joanne Williams, Nilmini Wickramasinghe, Leo Ng, Clara C. Zwack, Gavin W. Lambert
Format: Article
Language: English
Published: Nature Portfolio 2025-06-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-025-04992-z
collection DOAJ
description Abstract: Research publications aimed at understanding the various aspects of Coronaviruses, particularly COVID-19, have significantly shaped our knowledge base. Although the urgency of monitoring COVID-19 in real time has decreased, the continual monthly influx of new research articles underscores the importance of systematic review and analysis to deepen our understanding of the pandemic’s broad impact. To explore research trends and innovations in this space, we developed a pipeline using natural language processing techniques. This pipeline systematically catalogues and synthesises the vast array of research articles, producing a dataset of more than eight hundred thousand articles published from July 2002 to May 2024. This paper describes the content of the dataset and provides the information needed to make it accessible and reusable for future research. Our approach aggregates and organises global research related to Coronaviruses into thematic clusters such as vaccine development, public health strategies, infection mechanisms, mental health issues, and economic consequences. Health experts also reviewed and revised the dataset.
id doaj-art-5b346bb03044481094dd796b145e27bf
institution DOAJ
issn 2052-4463
spelling doaj-art-5b346bb03044481094dd796b145e27bf
Indexed: 2025-08-20T03:20:59Z
Language: eng
Publisher: Nature Portfolio
Journal: Scientific Data (ISSN 2052-4463)
Published: 2025-06-01, volume 12, issue 1, pages 1-17
DOI: 10.1038/s41597-025-04992-z
Author affiliations:
Amir Aryani, Swinburne University of Technology
Jingbo Wang, National Computational Infrastructure, The Australian National University
Luis Salvador-Carulla, University of Canberra
Jihoon Woo, Swinburne University of Technology
Cathy P. W. Cheung, Australian National University (ANU)
Zhuochen Wu, National Computational Infrastructure, The Australian National University
Hui Yin, Swinburne University of Technology
Junhua Xiao, School of Allied Health, La Trobe University
Elisabeth A. Lambert, Swinburne University of Technology
Jason Howitt, Swinburne University of Technology
Jean M. Davidson, California Polytechnic State University
Serene Yoong, School of Health and Social Development, Deakin University
John B. Dixon, Swinburne University of Technology
Rachel E. Climie, University of Tasmania
Jose A. Salinas-Perez, Universidad Loyola Andalucía
Nasser Bagheri, University of Canberra
Celine Santiago, Victor Chang Cardiac Research Institute
Joanne Williams, Swinburne University of Technology
Nilmini Wickramasinghe, Swinburne University of Technology
Leo Ng, Swinburne University of Technology
Clara C. Zwack, Swinburne University of Technology
Gavin W. Lambert, Swinburne University of Technology
title Coronavirus research topics, tracking twenty years of research