ORF1ab codon frequency model predicts host-pathogen relationship in orthocoronavirinae
Predicting phenotypic properties of a virus directly from its sequence data is an attractive goal for viral epidemiology. Here, we focus narrowly on the Orthocoronavirinae clade and demonstrate models that are powerfully predictive for a human-pathogen phenotype with 76.74% average precision and 85....
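The feature representation named in the abstract, codon frequencies of a coding sequence, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `codon_frequencies`, the handling of ambiguous bases, and the choice to normalize over all 64 codons are assumptions made for illustration only. The sketch also assumes the input is already a single in-frame coding sequence; ORF1ab contains a programmed ribosomal frameshift, which a real workflow would need to resolve before counting.

```python
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, so every sequence yields the same feature layout.
CODONS = ["".join(bases) for bases in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_frequencies(cds: str) -> dict[str, float]:
    """Return the relative frequency of each of the 64 codons in an in-frame CDS.

    Illustrative assumptions (not taken from the paper):
    - RNA input is accepted and converted to the DNA alphabet (U -> T).
    - Codons containing ambiguous bases (e.g. N) are skipped.
    - A trailing partial codon is ignored.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_SET)
    total = sum(counts.values())
    if total == 0:
        # No countable codons: return an all-zero vector rather than dividing by zero.
        return {c: 0.0 for c in CODONS}
    return {c: counts[c] / total for c in CODONS}


if __name__ == "__main__":
    # Toy sequence, not real viral data.
    freqs = codon_frequencies("ATGGCTGCTTAA")
    nonzero = {codon: f for codon, f in freqs.items() if f > 0}
    print(nonzero)
```

A vector like this, computed per genome, is the kind of input a classifier could be trained on; which classifier and which evaluation split the authors used is not stated in this record.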
| Main Authors: | Phillip E. Davis, Joseph A. Russell |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-03-01 |
| Series: | Frontiers in Bioinformatics |
| Subjects: | machine learning; feature selection; genotype-to-phenotype; viruses; bioinformatics |
| Online Access: | https://www.frontiersin.org/articles/10.3389/fbinf.2025.1562668/full |

| _version_ | 1849774165057339392 |
|---|---|
| author | Phillip E. Davis; Joseph A. Russell |
| collection | DOAJ |
| description | Predicting phenotypic properties of a virus directly from its sequence data is an attractive goal for viral epidemiology. Here, we focus narrowly on the Orthocoronavirinae clade and demonstrate models that are powerfully predictive for a human-pathogen phenotype with 76.74% average precision and 85.96% average recall on the withheld test set groups, using only Orf1ab codon frequencies. We show alternative examples for other viral coding sequences and feature representations that do not perform well and discuss what distinguishes the models that are performant. These models point to a small subset of features, specifically 5 codons, that are critical to the success of the models. We discuss and contextualize how this observation may fit within a larger model for the role of translation in virus-host agreement. |
| format | Article |
| id | doaj-art-5bf7b077e655408e9db884c77337b1d7 |
| institution | DOAJ |
| issn | 2673-7647 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Bioinformatics |
| volume | 5 |
| doi | 10.3389/fbinf.2025.1562668 |
| title | ORF1ab codon frequency model predicts host-pathogen relationship in orthocoronavirinae |
| topic | machine learning; feature selection; genotype-to-phenotype; viruses; bioinformatics |
| url | https://www.frontiersin.org/articles/10.3389/fbinf.2025.1562668/full |