Integration of machine learning and genome‐wide association study to explore the genomic prediction accuracy of agronomic trait in oats (Avena sativa L.)

Bibliographic Details
Main Authors: Jinghan Peng, Xiong Lei, Tianqi Liu, Yi Xiong, Jiqiang Wu, Yanli Xiong, Minghong You, Junming Zhao, Jian Zhang, Xiao Ma
Format: Article
Language: English
Published: Wiley 2025-03-01
Series: The Plant Genome
Online Access: https://doi.org/10.1002/tpg2.20549
collection DOAJ
description Abstract: Machine learning (ML) has attracted considerable attention for its potential to improve the accuracy of genomic prediction (GP) in various economic crops using complete genomic information. Genome‐wide association studies (GWAS) are widely used to pinpoint trait‐related causal variant loci in genomes. However, integrating the two approaches for genomic prediction in crops requires further research. In this study, we integrated ML and GWAS to assess the efficiency of GP for seven key agronomic traits in 195 oat (Avena sativa) cultivars from major oat‐growing regions around the world. GWAS identified a total of 94 trait‐associated single nucleotide polymorphisms. GP was conducted using the classical genomic best linear unbiased prediction (GBLUP) model and six ML models. GBLUP performed poorly for all traits except flag leaf width, while no single ML model consistently provided the best prediction accuracy across all traits. GWAS‐derived markers gave better prediction accuracy than genome‐wide markers; plant height reached its highest prediction rate with 100 GWAS‐derived markers, whereas the remaining traits required more markers. These results help advance the use of GP in small oat breeding programs by optimizing the prediction rate and reducing the number of markers, confirming that high prediction rates can be achieved with smaller datasets.
id doaj-art-d6dbd18fc1f84a268a1fb7a0eb952f79
institution DOAJ
issn 1940-3372
record updated 2025-08-20T02:54:03Z
volume 18
issue 1
pages n/a
doi 10.1002/tpg2.20549
affiliations Jinghan Peng: College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, China
Xiong Lei: Sichuan Academy of Grassland Science, Chengdu, China
Tianqi Liu: College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, China
Yi Xiong: College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, China
Jiqiang Wu: College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, China
Yanli Xiong: College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, China
Minghong You: Sichuan Academy of Grassland Science, Chengdu, China
Junming Zhao: College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, China
Jian Zhang: Sichuan Provincial Research Center for Forestry and Grassland Development, Chengdu, China
Xiao Ma: College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, China