Nocardia genomes are a large reservoir of diverse gene content, biosynthetic gene clusters, and species-specific genes

ABSTRACT The Nocardia genus represents a largely untapped source of valuable secondary metabolites, yet its biosynthetic potential, gene content, and evolutionary history remain underexplored. By analyzing 263 genomes across 88 species, we found that Nocardia varies greatly in genome size and gene c...

Bibliographic Details
Main Authors: Kiran Kumar Eripogu, Chun-Ping Yu, An-I Tsai, Jinn-Jy Lin, Hsiao-Ching Lin, Wen-Hsiung Li
Format: Article
Language: English
Published: American Society for Microbiology 2025-06-01
Series: mBio
Subjects: Nocardia; genome mining; open pangenome; biosynthetic pathways
Online Access: https://journals.asm.org/doi/10.1128/mbio.00947-25
_version_ 1849331368322924544
author Kiran Kumar Eripogu
Chun-Ping Yu
An-I Tsai
Jinn-Jy Lin
Hsiao-Ching Lin
Wen-Hsiung Li
author_sort Kiran Kumar Eripogu
collection DOAJ
description ABSTRACT The Nocardia genus represents a largely untapped source of valuable secondary metabolites, yet its biosynthetic potential, gene content, and evolutionary history remain underexplored. By analyzing 263 genomes across 88 species, we found that Nocardia varies greatly in genome size and gene content. It exhibits an open pangenome with a small core genome (<900 genes) and high genomic fluidity (0.76), indicating high gene turnover. A large proportion (75%) of its genes are species-specific, indicating high genomic plasticity. Average nucleotide identity (ANI) analysis confirmed taxonomic relationships, with most species showing ANI values of 80–85%. N. globerula showed an ANI of ~84% with Rhodococcus erythropolis, supporting its reclassification under Rhodococcus. The biosynthetic capabilities of the Nocardia genus are striking: its genomes contain >8,000 biosynthetic gene clusters (BGCs), dominated by type 1 polyketide synthases, terpenes, and non-ribosomal peptide synthetases, establishing Nocardia as the Actinomycetota genus with the largest biosynthetic repertoire. Around 35% of BGCs remain uncharacterized, suggesting Nocardia’s high potential for novel natural product discoveries. Our study is the first to identify a prodigiosin BGC in Nocardia. Network analysis revealed complex evolutionary connections between Nocardia’s gene cluster families (GCFs) and MIBiG reference BGCs, suggesting evolutionary changes, including gene gains and losses, that may have influenced the genus’s BGC diversity and composition. Synteny analysis uncovered conserved and unique gene arrangements across Nocardia and related genera, mostly with core genes conserved in Actinomycetota. Our study addressed unmet clinical and biotechnological challenges while revealing evolutionary mechanisms that shape microbial diversity and adaptability.
IMPORTANCE Understanding the genomic diversity and biosynthetic potential of microorganisms is instrumental for addressing issues in microbial evolution, natural product discovery, and host-microbe interactions. Nocardia, a bacterial genus known for its opportunistic pathogenicity, represents an underexplored group of immense genomic diversity and biosynthetic capabilities. This study employed genome mining to reveal the open pangenome of Nocardia and identified an extensive repertoire of BGCs, including novel clusters with the potential to produce therapeutically significant compounds such as prodigiosin-related compounds. By integrating genome mining, phylogenetics, and synteny analysis, this study provides insights into how genomic plasticity, species-specific genes, and evolutionary changes such as gene gains and losses contribute to Nocardia’s biosynthetic diversity and evolution. These findings contribute to advancing microbial genomics, evolution, and biotechnology by uncovering the potential of Nocardia to address challenges in infectious diseases and natural product discovery. This study exemplifies how genome mining can illuminate the ecological and clinical significance of microbial diversity.
format Article
id doaj-art-94c7bf7340af41c8ac68b1abddc0a92e
institution Kabale University
issn 2150-7511
language English
publishDate 2025-06-01
publisher American Society for Microbiology
record_format Article
series mBio
spelling doaj-art-94c7bf7340af41c8ac68b1abddc0a92e 2025-08-20T03:46:38Z eng American Society for Microbiology mBio 2150-7511 2025-06-01 16 6 10.1128/mbio.00947-25
Nocardia genomes are a large reservoir of diverse gene content, biosynthetic gene clusters, and species-specific genes
Kiran Kumar Eripogu (0), Chun-Ping Yu (1), An-I Tsai (2), Jinn-Jy Lin (3), Hsiao-Ching Lin (4), Wen-Hsiung Li (5)
(0) Biodiversity Program, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan
(1) Biodiversity Research Center, Academia Sinica (BRCAS), Taipei City, Taiwan
(2) Biodiversity Research Center, Academia Sinica (BRCAS), Taipei City, Taiwan
(3) National Center for High-performance Computing, National Applied Research Laboratories, Hsinchu, Taiwan
(4) Institute of Biological Chemistry, Academia Sinica, Taipei City, Taiwan
(5) Biodiversity Program, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan
https://journals.asm.org/doi/10.1128/mbio.00947-25
Nocardia; genome mining; open pangenome; biosynthetic pathways
title Nocardia genomes are a large reservoir of diverse gene content, biosynthetic gene clusters, and species-specific genes
topic Nocardia
genome mining
open pangenome
biosynthetic pathways
url https://journals.asm.org/doi/10.1128/mbio.00947-25