FANTASIA leverages language models to decode the functional dark proteome across the animal tree of life
Abstract Protein functional annotation is crucial in biology, but many protein-coding genes remain uncharacterized, especially in non-model organisms. FANTASIA (Functional ANnoTAtion based on embedding space SImilArity) integrates protein language models for large-scale functional annotation. Applie...
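| Main Authors: | Gemma I. Martínez-Redondo, Francisco M. Perez-Canales, Belén Carbonetto, José M. Fernández, Israel Barrios-Núñez, Marçal Vázquez-Valls, Ildefonso Cases, Ana M. Rojas, Rosa Fernández |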
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-08-01 |
| Series: | Communications Biology |
| Online Access: | https://doi.org/10.1038/s42003-025-08651-2 |
| _version_ | 1849343570871320576 |
|---|---|
| author | Gemma I. Martínez-Redondo; Francisco M. Perez-Canales; Belén Carbonetto; José M. Fernández; Israel Barrios-Núñez; Marçal Vázquez-Valls; Ildefonso Cases; Ana M. Rojas; Rosa Fernández |
| author_sort | Gemma I. Martínez-Redondo |
| collection | DOAJ |
| description | Abstract: Protein functional annotation is crucial in biology, but many protein-coding genes remain uncharacterized, especially in non-model organisms. FANTASIA (Functional ANnoTAtion based on embedding space SImilArity) integrates protein language models for large-scale functional annotation. Applied to ~1000 animal proteomes, FANTASIA predicts functions for virtually all proteins, including up to 50% that remained unannotated by traditional homology-based methods. This enables the discovery of novel gene functions, enhancing our understanding of molecular evolution and organismal biology. FANTASIA holds particular promise for functional discovery in non-model taxa, offering advantages over homology-based tools in sensitivity and generalizability. FANTASIA is available on GitHub at https://github.com/CBBIO/FANTASIA . |
| format | Article |
| id | doaj-art-291f74d607be4a1a9eaedefec6d5b585 |
| institution | Kabale University |
| issn | 2399-3642 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Communications Biology |
| spelling | doaj-art-291f74d607be4a1a9eaedefec6d5b585; 2025-08-20T03:42:56Z; eng; Nature Portfolio; Communications Biology; 2399-3642; 2025-08-01; 8118; 10.1038/s42003-025-08651-2; FANTASIA leverages language models to decode the functional dark proteome across the animal tree of life; Gemma I. Martínez-Redondo [0], Francisco M. Perez-Canales [1], Belén Carbonetto [2], José M. Fernández [3], Israel Barrios-Núñez [4], Marçal Vázquez-Valls [5], Ildefonso Cases [6], Ana M. Rojas [7], Rosa Fernández [8]; [0] Metazoa Phylogenomics and Genome Evolution Lab, Institute of Evolutionary Biology (CSIC-UPF); [1] Computational Biology and Bioinformatics, Andalusian Center for Developmental Biology (CABD-CSIC); [2] Metazoa Phylogenomics and Genome Evolution Lab, Institute of Evolutionary Biology (CSIC-UPF); [3] Barcelona Supercomputing Center, Plaça d’Eusebi Güell; [4] Computational Biology and Bioinformatics, Andalusian Center for Developmental Biology (CABD-CSIC); [5] Metazoa Phylogenomics and Genome Evolution Lab, Institute of Evolutionary Biology (CSIC-UPF); [6] Computational Biology and Bioinformatics, Andalusian Center for Developmental Biology (CABD-CSIC); [7] Computational Biology and Bioinformatics, Andalusian Center for Developmental Biology (CABD-CSIC); [8] Metazoa Phylogenomics and Genome Evolution Lab, Institute of Evolutionary Biology (CSIC-UPF); https://doi.org/10.1038/s42003-025-08651-2 |
| title | FANTASIA leverages language models to decode the functional dark proteome across the animal tree of life |
| url | https://doi.org/10.1038/s42003-025-08651-2 |