GeneSetCluster 2.0: a comprehensive toolset for summarizing and integrating gene-sets analysis
Abstract Background Gene-Set Analysis (GSA) is commonly used to analyze high-throughput experiments. However, GSA cannot readily disentangle clusters or pathways due to redundancies in upstream knowledge bases, which hinders comprehensive exploration and interpretation of biological findings. To add...
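| Main Authors: | Asier Ortega-Legarreta, Alberto Maillo, Daniel Mouzo, Ana Rosa López-Pérez, Lara Kular, Majid Pahlevan Kakhki, Maja Jagodic, Jesper Tegner, Vincenzo Lagani, Ewoud Ewing, David Gomez-Cabrero |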
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-08-01 |
| Series: | BMC Bioinformatics |
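| Subjects: | Gene-set analysis; Gene-set enrichment analysis; Functional annotation; Seriation-based clustering; Web application; Data-mining |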
| Online Access: | https://doi.org/10.1186/s12859-025-06249-3 |
| _version_ | 1849225820701196288 |
|---|---|
| author | Asier Ortega-Legarreta; Alberto Maillo; Daniel Mouzo; Ana Rosa López-Pérez; Lara Kular; Majid Pahlevan Kakhki; Maja Jagodic; Jesper Tegner; Vincenzo Lagani; Ewoud Ewing; David Gomez-Cabrero |
| author_sort | Asier Ortega-Legarreta |
| collection | DOAJ |
| description | Abstract. Background: Gene-Set Analysis (GSA) is commonly used to analyze high-throughput experiments. However, GSA cannot readily disentangle clusters or pathways due to redundancies in upstream knowledge bases, which hinders comprehensive exploration and interpretation of biological findings. To address this challenge, we developed GeneSetCluster, an R package designed to summarize and integrate GSA results. Over time, we and users as well identified limitations in the original version, such as difficulties in managing redundancies across multiple gene-sets, large computational times, and its lack of accessibility for users without programming expertise. Results: We present GeneSetCluster 2.0, a comprehensive upgrade that delivers methodological, computational, interpretative, and user-experience enhancements. Methodologically, GeneSetCluster 2.0 introduces a novel approach to address duplicated gene-sets and implements a seriation-based clustering algorithm that reorders results, aiding pattern identification. Computationally, the package is optimized for parallel processing, significantly reducing execution time. GeneSetCluster 2.0 enhances cluster annotations by associating clusters with relevant tissues and biological processes to improve biological interpretation, particularly for human and mouse data. To broaden accessibility, we have developed a user-friendly web application enabling non-programmers to use it. This version also ensures seamless integration between the R package, catering to users with programming expertise, and the web application for broader audiences. We evaluated the updates in a single-cell RNA public dataset. Conclusion: GeneSetCluster 2.0 offers substantial improvements over its predecessor. Furthermore, by bridging the gap between bioinformaticians and clinicians in multidisciplinary teams, GeneSetCluster 2.0 facilitates collaborative research. The R package and web application, along with detailed installation and usage guides, are available on GitHub (https://github.com/TranslationalBioinformaticsUnit/GeneSetCluster2.0), and the web application can be accessed at https://translationalbio.shinyapps.io/genesetcluster/. |
| format | Article |
| id | doaj-art-1bca2362ac8e4128b847ea3099270155 |
| institution | Kabale University |
| issn | 1471-2105 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | BMC |
| record_format | Article |
| series | BMC Bioinformatics |
| spelling | doaj-art-1bca2362ac8e4128b847ea3099270155; 2025-08-24T11:54:36Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2025-08-01; 261117; 10.1186/s12859-025-06249-3; GeneSetCluster 2.0: a comprehensive toolset for summarizing and integrating gene-sets analysis; Asier Ortega-Legarreta (Translational Bioinformatics Unit, Navarrabiomed, Hospital Universitario de Navarra (HUN), Universidad Pública de Navarra (UPNA), IdiSNA); Alberto Maillo (Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology); Daniel Mouzo (Translational Bioinformatics Unit, Navarrabiomed, Hospital Universitario de Navarra (HUN), Universidad Pública de Navarra (UPNA), IdiSNA); Ana Rosa López-Pérez (Translational Bioinformatics Unit, Navarrabiomed, Hospital Universitario de Navarra (HUN), Universidad Pública de Navarra (UPNA), IdiSNA); Lara Kular (Department of Clinical Neuroscience, Karolinska Institutet, and Center for Molecular Medicine, Karolinska University Hospital); Majid Pahlevan Kakhki (Department of Clinical Neuroscience, Karolinska Institutet, and Center for Molecular Medicine, Karolinska University Hospital); Maja Jagodic (Department of Clinical Neuroscience, Karolinska Institutet, and Center for Molecular Medicine, Karolinska University Hospital); Jesper Tegner (Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology); Vincenzo Lagani (Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology); Ewoud Ewing (Department of Clinical Neuroscience, Karolinska Institutet, and Center for Molecular Medicine, Karolinska University Hospital); David Gomez-Cabrero (Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology); https://doi.org/10.1186/s12859-025-06249-3; Gene-set analysis; Gene-set enrichment analysis; Functional annotation; Seriation-based clustering; Web application; Data-mining |
| title | GeneSetCluster 2.0: a comprehensive toolset for summarizing and integrating gene-sets analysis |
| topic | Gene-set analysis; Gene-set enrichment analysis; Functional annotation; Seriation-based clustering; Web application; Data-mining |
| url | https://doi.org/10.1186/s12859-025-06249-3 |