Highly accurate prophage island detection with PIDE

Bibliographic Details
Main Authors: Hongyan Gao, Bowen Li, Zihan Guo, Lei Zheng, Junnan Chen, Guanxiang Liang
Format: Article
Language: English
Published: BMC 2025-08-01
Series: Genome Biology
Online Access: https://doi.org/10.1186/s13059-025-03733-0
Description
Summary: As important mobile elements in prokaryotes, prophages shape the genomic context of their hosts and regulate the structure of bacterial populations. However, it is challenging to precisely identify prophages through computational methods. Here, we introduce PIDE for identifying prophages from bacterial genomes or metagenome-assembled genomes. PIDE integrates a pre-trained protein language model and a gene density clustering algorithm to distinguish prophages. Benchmarking with induced prophage sequencing datasets demonstrates that PIDE pinpoints prophages with precise boundaries. Applying PIDE to 4,744 human gut representative genomes reveals 24,467 prophages with widespread functional capacity. PIDE is available at https://github.com/chyghy/PIDE, with model training code at https://zenodo.org/records/16457629.
ISSN: 1474-760X