Leveraging viral genome sequences and machine learning models for identification of potentially selective antiviral agents

Bibliographic Details
Main Authors: Tuan Xu, Miao Xu, Qi Zhang, Catherine Z. Chen, Wei Zheng, Ruili Huang
Format: Article
Language: English
Published: Nature Portfolio 2025-06-01
Series: Communications Chemistry
Online Access: https://doi.org/10.1038/s42004-025-01583-2
Description
Summary: Viral genome sequencing provides valuable information for antiviral development, yet its integration with machine learning for virtual screening remains underexplored. To bridge this gap, viral genome sequences were combined with structural data of approved and investigational antivirals to identify virus-selective agents. In parallel, quantitative structure-activity relationship (QSAR) models were built to predict pan-antivirals. Robust models were generated, with the area under the receiver operating characteristic curve (AUC-ROC) >0.72 for virus-selective and >0.79 for pan-antiviral predictions. These models were applied to virtually screen ~360K compounds for anti-SARS-CoV-2 activity. The 346 compounds identified by the models were tested using two in vitro assays, yielding hit rates of 9.4% (24/256) in the pseudotyped particle (PP) entry assay and 37% (47/128) in the RNA-dependent RNA polymerase (RdRp) assay. The top compounds showed potencies around 1 µM. This study provides a framework for virtual screening of virus-selective and pan-antivirals against emerging pathogens.
ISSN: 2399-3669
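
The two metrics quoted in the summary can be illustrated with a short sketch. The hit-rate counts (24/256 and 47/128) come from the summary itself; the function names `hit_rate` and `auc_roc` and the toy scores and labels are hypothetical, chosen only for illustration. The AUC-ROC function uses the generic pairwise-ranking definition of the metric and is not the authors' evaluation code.

```python
def hit_rate(hits, tested):
    """Fraction of tested compounds confirmed active in an assay."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    return hits / tested


def auc_roc(scores, labels):
    """AUC-ROC as the probability that a randomly chosen active (label 1)
    is scored above a randomly chosen inactive (label 0); ties count half."""
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive")
    wins = sum(
        1.0 if a > i else 0.5 if a == i else 0.0
        for a in actives
        for i in inactives
    )
    return wins / (len(actives) * len(inactives))


# Counts reported in the summary.
print(f"PP entry assay hit rate: {hit_rate(24, 256):.1%}")  # 9.4%
print(f"RdRp assay hit rate: {hit_rate(47, 128):.0%}")      # 37%

# Hypothetical toy predictions, not data from the study.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
print(f"Toy AUC-ROC: {auc_roc(toy_scores, toy_labels):.2f}")  # 0.75
```

An AUC-ROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from inactives, which gives context for the reported thresholds of >0.72 and >0.79.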