Optimizing cost-effective gene expression phenotyping approaches in cattle using 3′ mRNA sequencing

Bibliographic Details
Main Authors: Ruwaa I. Mohamed, Taylor B. Ault-Seay, Sonia J. Moisá, Jonathan E. Beever, Agustín G. Ríus, Troy N. Rowan
Format: Article
Language: English
Published: BMC 2025-04-01
Series: BMC Genomics
Subjects:
Online Access:https://doi.org/10.1186/s12864-025-11571-4
Description
Summary:
Background: Genetic and genomic selection programs require large numbers of phenotypes observed for animals in shared environments. Phenotypes like meat quality, methane emission, and disease susceptibility are critically important to livestock production but are difficult and expensive to measure directly at scale. Our work draws on the “Central Dogma” of molecular genetics to use molecular intermediates as cheaply measured proxies of organism-level phenotypes. The rapidly declining cost of next-generation sequencing presents opportunities for population-level molecular phenotyping. Although the cost of whole-transcriptome sequencing has declined recently, its required sequencing depth still makes it an expensive choice for wide-scale molecular phenotyping. We aim to optimize 3′ mRNA sequencing (3′ mRNA-Seq) approaches for collecting cost-effective proxy molecular phenotypes for cattle from easy-to-collect tissue samples (i.e., whole blood). We used matched 3′ mRNA-Seq samples from 15 Holstein male calves in a heat stress trial to identify (1) the better library preparation kit (Takara SMART-Seq v4 3′ DE or Lexogen QuantSeq) and (2) the optimal sequencing depth (0.5 to 20 million reads/sample) for capturing gene expression phenotypes most cost-effectively.
Results: Takara SMART-Seq v4 3′ DE outperformed Lexogen QuantSeq libraries across all metrics: number of quality reads, expressed genes, informative genes, differentially expressed genes, and 3′ biased intragenic variants. Serial downsampling analyses showed that as few as 8.0 million reads per sample could effectively capture most of the between-sample variation in gene expression, although progressively more reads did provide marginal increases in recall across metrics. These 3′ mRNA-Seq reads can also capture animal genotypes that could be used as the basis for downstream imputation. Variant calling in the 10 million read downsampled groups yielded an average of 109,700 SNPs and 11,367 INDELs, many of which segregate at moderate minor allele frequencies in the population.
Conclusion: This work demonstrates that 3′ mRNA-Seq with Takara SMART-Seq v4 3′ DE can provide a highly cost-effective (< 25 USD/sample) approach to quantifying molecular phenotypes (gene expression) while discovering sufficient variation for use in genotype imputation. Ongoing work is evaluating the accuracy of imputation and the ability of much larger datasets to predict individual animal phenotypes.
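The serial downsampling idea in the summary can be illustrated with a minimal sketch. It assumes reads have already been assigned to genes and works on a per-gene count table; the abstract does not say which tools or file types the authors downsampled, so the function names, the gene labels, and the toy depths below are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def downsample_counts(counts, target_reads, seed=0):
    """Subsample a {gene: read_count} table to target_reads reads.

    Reads are drawn without replacement, which mimics sequencing the same
    library to a lower depth. Illustrative only: expanding every read into
    a list is fine for toy data but too memory-hungry for real libraries.
    """
    total = sum(counts.values())
    if target_reads >= total:
        return dict(counts)  # nothing to remove
    rng = random.Random(seed)
    reads = [gene for gene, n in counts.items() for _ in range(n)]
    kept = Counter(rng.sample(reads, target_reads))
    return {gene: kept.get(gene, 0) for gene in counts}


def genes_detected(counts, min_reads=1):
    """Count genes with at least min_reads reads (a simple 'expressed' rule)."""
    return sum(1 for n in counts.values() if n >= min_reads)


# Hypothetical toy library; real samples hold millions of reads.
full = {"GENE_A": 5000, "GENE_B": 1200, "GENE_C": 300, "GENE_D": 40, "GENE_E": 3}

for depth in (100, 1000, 5000):
    sub = downsample_counts(full, depth, seed=42)
    print(depth, genes_detected(sub), sub)
```

Repeating this over a grid of depths and comparing each downsampled table with the full-depth one (for example, genes detected or between-sample correlation) gives the kind of depth-versus-recall curve the summary describes. Lowly expressed genes are the first to drop out as depth falls.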
ISSN:1471-2164