Towards a standard benchmark for phenotype-driven variant and gene prioritisation algorithms: PhEval - Phenotypic inference Evaluation framework
Abstract Background: Computational approaches to support rare disease diagnosis are challenging to build, requiring the integration of complex data types such as ontologies, gene-to-phenotype associations, and cross-species data into variant and gene prioritisation algorithms (VGPAs). However, the p...
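| Main Authors: | Yasemin Bridges, Vinicius de Souza, Katherina G. Cortes, Melissa Haendel, Nomi L. Harris, Daniel R. Korn, Nikolaos M. Marinakis, Nicolas Matentzoglu, James A. McLaughlin, Christopher J. Mungall, Aaron Odell, David Osumi-Sutherland, Peter N. Robinson, Damian Smedley, Julius O. B. Jacobsen |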
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-03-01 |
| Series: | BMC Bioinformatics |
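| Subjects: | Variant prioritisation; Phenopackets; Benchmarking Framework; Phenotype-driven analysis; Bioinformatics; Rare disease diagnosis |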
| Online Access: | https://doi.org/10.1186/s12859-025-06105-4 |
| _version_ | 1849389987568549888 |
|---|---|
| author | Yasemin Bridges; Vinicius de Souza; Katherina G. Cortes; Melissa Haendel; Nomi L. Harris; Daniel R. Korn; Nikolaos M. Marinakis; Nicolas Matentzoglu; James A. McLaughlin; Christopher J. Mungall; Aaron Odell; David Osumi-Sutherland; Peter N. Robinson; Damian Smedley; Julius O. B. Jacobsen |
| author_sort | Yasemin Bridges |
| collection | DOAJ |
| description | Abstract Background: Computational approaches to support rare disease diagnosis are challenging to build, requiring the integration of complex data types such as ontologies, gene-to-phenotype associations, and cross-species data into variant and gene prioritisation algorithms (VGPAs). However, the performance of VGPAs has been difficult to measure and is affected by many factors, for example, ontology structure, annotation completeness or changes to the underlying algorithm. Assertions of the capabilities of VGPAs are often not reproducible, in part because there is no standardised, empirical framework and no openly available patient data with which to assess the efficacy of VGPAs, ultimately hindering the development of effective prioritisation tools. Results: In this paper, we present our benchmarking tool, PhEval, which aims to provide a standardised and empirical framework to evaluate phenotype-driven VGPAs. The inclusion of standardised test corpora and test corpus generation tools in the PhEval suite of tools allows open benchmarking and comparison of methods on standardised data sets. Conclusions: PhEval and the standardised test corpora solve the issues of patient data availability and experimental tooling configuration when benchmarking and comparing rare disease VGPAs. By providing standardised data on patient cohorts from real-world case reports and controlling the configuration of evaluated VGPAs, PhEval enables transparent, portable, comparable and reproducible benchmarking of VGPAs. As these tools are often a key component of many rare disease diagnostic pipelines, a thorough and standardised method of assessment is essential for improving patient diagnosis and care. |
| format | Article |
| id | doaj-art-bb1e55d7dbdf4cbd8c7b908b67733505 |
| institution | Kabale University |
| issn | 1471-2105 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | BMC |
| record_format | Article |
| series | BMC Bioinformatics |
| spelling | Record: doaj-art-bb1e55d7dbdf4cbd8c7b908b67733505; timestamp: 2025-08-20T03:41:47Z; language: eng; publisher: BMC; journal: BMC Bioinformatics; ISSN: 1471-2105; published: 2025-03-01; vol. 26, no. 1, pp. 1-18; DOI: 10.1186/s12859-025-06105-4. Title: Towards a standard benchmark for phenotype-driven variant and gene prioritisation algorithms: PhEval - Phenotypic inference Evaluation framework. Authors and affiliations: Yasemin Bridges (William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London); Vinicius de Souza (European Bioinformatics Institute (EMBL-EBI)); Katherina G. Cortes (School of Public Health, University of Colorado Anschutz Medical Campus); Melissa Haendel (Department of Genetics, University of North Carolina, Chapel Hill); Nomi L. Harris (Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory); Daniel R. Korn (Department of Genetics, University of North Carolina, Chapel Hill); Nikolaos M. Marinakis (Laboratory of Medical Genetics, National and Kapodistrian University of Athens); Nicolas Matentzoglu (Semanticly); James A. McLaughlin (Samples, Phenotypes, and Ontologies (SPOT), European Bioinformatics Institute (EMBL-EBI)); Christopher J. Mungall (Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory); Aaron Odell (Department of Genetics, University of North Carolina, Chapel Hill); David Osumi-Sutherland (Wellcome Trust Sanger Institute); Peter N. Robinson (Berlin Institute of Health, Charité – Universitätsmedizin Berlin); Damian Smedley (William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London); Julius O. B. Jacobsen (William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London). URL: https://doi.org/10.1186/s12859-025-06105-4. Keywords: Variant prioritisation; Phenopackets; Benchmarking Framework; Phenotype-driven analysis; Bioinformatics; Rare disease diagnosis |
| title | Towards a standard benchmark for phenotype-driven variant and gene prioritisation algorithms: PhEval - Phenotypic inference Evaluation framework |
| topic | Variant prioritisation; Phenopackets; Benchmarking Framework; Phenotype-driven analysis; Bioinformatics; Rare disease diagnosis |
| url | https://doi.org/10.1186/s12859-025-06105-4 |