Prediction of transcript isoforms and identification of tissue-specific genes in cucumber

Bibliographic Details
Main Authors: Wenjiao Wang, Chengcheng Shen, Xinqiang Wen, Anqi Li, Qi Gao, Zhaoying Xu, Yuping Wei, Yushun Li, Dailu Guan, Bin Liu
Format: Article
Language: English
Published: BMC 2025-01-01
Series: BMC Genomics
Online Access: https://doi.org/10.1186/s12864-025-11212-w
Description
Summary: Background: Identification of global transcriptional events is crucial for genome annotation, as accurate annotation enhances the efficiency and comparability of genomic information across species. However, the annotation of transcripts in the cucumber genome remains to be improved, and many transcriptional events have not been well studied. Results: We collected 1,904 high-quality public cucumber transcriptome samples from the National Center for Biotechnology Information (NCBI) to identify and annotate transcript isoforms in the cucumber genome. Over 44.26 billion Q30 clean reads were mapped to the cucumber genome with an average mapping rate of 92.75%. Transcriptome assembly identified 151,453 transcripts spanning 20,442 loci. Among these, 12.7% of transcripts exactly matched annotated genes in the cucumber reference genome. More than 80% of the transcripts were classified as novel isoforms. Approximately 96.6% of these isoforms originated from known gene loci, while around 3.3% were derived from novel gene loci. Coding potential prediction identified 4,543 long non-coding RNAs (lncRNAs) across 3,376 loci. Building on these results, we identified tissue-specific transcripts in 10 tissues. Among these, 1,655 annotated genes and 4,214 predicted transcripts were considered tissue-specific. The root exhibited the highest number of tissue-specific transcripts, followed by the shoot apex. Subsequent selective pressure analysis revealed that tissue-specific regions experienced stronger directional selection than non-specific regions. Conclusions: By analyzing thousands of published transcriptome datasets, we identified abundant transcriptional events and tissue-specific transcripts in cucumber. This study adds value to the public data and offers insights for further exploration of a more comprehensive tissue regulatory network in cucumber.
ISSN: 1471-2164
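The summary reports tissue-specific transcripts across 10 tissues but does not say how specificity was scored. As an illustration only, the sketch below uses the tau index, a widely used tissue-specificity measure; it is an assumption here, not necessarily the authors' method. The function name `tau_index` and the expression values are hypothetical, and inputs are assumed to be non-negative expression levels (for example TPM) with one value per tissue.

```python
from typing import Sequence


def tau_index(expression: Sequence[float]) -> float:
    """Tau tissue-specificity index for one transcript.

    Returns 0.0 for uniform expression across tissues and 1.0 when
    the transcript is expressed in a single tissue only.
    Assumes non-negative expression values, one per tissue.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("need expression values for at least two tissues")
    peak = max(expression)
    if peak <= 0:
        # Not expressed anywhere: treat as non-specific.
        return 0.0
    # Sum of (1 - normalized expression) over tissues, scaled by n - 1.
    return sum(1 - x / peak for x in expression) / (n - 1)


# Hypothetical values for 10 tissues (not data from the study).
uniform = [5.0] * 10            # same level in every tissue
root_only = [0.0] * 9 + [8.0]   # expressed in one tissue only
```

With these inputs, `tau_index(uniform)` is 0.0 and `tau_index(root_only)` is 1.0; in practice a cutoff on tau (chosen by the analyst) would separate tissue-specific from broadly expressed transcripts.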