Evaluating Genome Assemblies for Optimized Completeness and Accuracy of Reference Gene Sequences in Wheat, Rye, and Triticale

Bibliographic Details
Main Authors: Mingke Yan, Guodong Yang, Dongming Yang, Xin Zhang, Quanzhen Wang, Jinghui Gao, Chugang Mei
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: Plants
Online Access: https://www.mdpi.com/2223-7747/14/7/1140
Description
Summary: Dozens of genome assemblies for Triticeae crops have been published in recent years, significantly advancing gene-related research in wheat, rye, and triticale. However, this progress has also made it harder to select universally efficient and applicable reference genomes for genotypes with distant or ambiguous phylogenetic relationships. In this study, we assessed the completeness and accuracy of genome assemblies for wheat, rye, and triticale using comparative Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis and transcript mapping approaches. BUSCO analysis revealed that the proportion of complete genes correlated positively with RNA-seq read mappability, while the frequency of internal stop codons served as a significant negative indicator of assembly accuracy and RNA-seq data mappability in wheat. Through integrated analysis of alignment rate, covered length, and total depth from RNA-seq data, we identified the assemblies of SY Mattis, Lo7, and SY Mattis plus Lo7 as the most robust references for gene-related studies in wheat, rye, and triticale, respectively. Furthermore, we recommend that the D genome sequence be incorporated into reference assemblies used in bioinformatic analyses of triticale, because introgression, translocation, and substitution of the D genome into the triticale genome frequently occur during triticale breeding. The frequency of internal stop codons could help in evaluating the correctness of assemblies published in the future, and the other findings are expected to support gene-related research in wheat, rye, triticale, and other closely related species.
ISSN: 2223-7747
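
Illustrative note: the summary describes an integrated analysis of alignment rate, covered length, and total depth, but this record does not say how the three metrics were combined. The Python sketch below shows one possible approach, a simple mean-rank aggregation, purely as an assumption and not as the authors' method. The function names, the assembly labels, and all numeric values are hypothetical placeholders, not results from the article.

```python
from statistics import mean


def rank_descending(values):
    """Return 1-based ranks (1 = largest value); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over all entries tied with the current value.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks


def rank_assemblies(metrics):
    """Order assemblies from best to worst by mean rank across three metrics.

    metrics maps an assembly name to a tuple of
    (alignment_rate, covered_length, total_depth); higher is treated as
    better for every metric, which is an assumption of this sketch.
    """
    names = list(metrics)
    per_metric = [
        rank_descending([metrics[name][j] for name in names]) for j in range(3)
    ]
    score = {
        name: mean(ranks[i] for ranks in per_metric)
        for i, name in enumerate(names)
    }
    return sorted(names, key=lambda name: score[name])


# Hypothetical placeholder values, NOT data from the article.
example = {
    "assembly_A": (0.95, 120_000_000, 3.1e9),
    "assembly_B": (0.91, 118_000_000, 2.9e9),
    "assembly_C": (0.93, 121_000_000, 3.0e9),
}

if __name__ == "__main__":
    print(rank_assemblies(example))
```

Rank aggregation is used here only because it avoids choosing weights or units for metrics on very different scales; a weighted or normalized score would be an equally plausible alternative.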