Deep learning and genomic best linear unbiased prediction integration: An approach to identify potential nonlinear genetic relationships between traits

ABSTRACT: Genomic prediction (GP) aims to predict the breeding values of multiple complex traits, usually assumed to be multivariate normally distributed by the largely used statistical methods, thus imposing linear genetic relationships between traits. Although these methods are valuable for GP, th...

Bibliographic Details
Main Authors: F. Shokor, P. Croiseau, H. Gangloff, R. Saintilan, T. Tribout, T. Mary-Huard, B.C.D. Cuyabano
Format: Article
Language: English
Published: Elsevier 2025-06-01
Series: Journal of Dairy Science
Subjects: machine learning, GBLUP, multitrait models, genetic evaluation, genetic relationship
Online Access: http://www.sciencedirect.com/science/article/pii/S0022030225002607
_version_ 1849328692999749632
author F. Shokor
P. Croiseau
H. Gangloff
R. Saintilan
T. Tribout
T. Mary-Huard
B.C.D. Cuyabano
author_sort F. Shokor
collection DOAJ
description ABSTRACT: Genomic prediction (GP) aims to predict the breeding values of multiple complex traits. The most widely used statistical methods assume that these traits follow a multivariate normal distribution, which imposes linear genetic relationships between traits. Although these methods are valuable for GP, they do not account for potential nonlinear genetic relationships between traits. For individual traits, this oversight may minimally affect prediction accuracy, but it can limit genetic progress when selection involves multiple traits. Deep learning (DL) offers a promising alternative for capturing nonlinear genetic relationships due to its ability to identify complex patterns without prior assumptions about the data structure. We proposed a novel hybrid model that combines DL and GBLUP (DLGBLUP). It takes the output of the traditional GBLUP and uses DL to enhance its predicted genetic values (PGV) by accounting for nonlinear genetic relationships between traits. We simulated data with linear and nonlinear genetic relationships between traits to verify whether DLGBLUP was able to identify nonlinearity when present and avoid inducing it when absent. We found that DLGBLUP consistently provided more accurate PGV for traits simulated with strong nonlinear genetic relationships, and that it accurately identified these relationships. Over 7 generations of selection, greater genetic progress was achieved with PGV that accounted for nonlinear relationships (DLGBLUP) than with GBLUP. When applied to a real dataset from the French Holstein dairy cattle population, DLGBLUP detected nonlinear genetic relationships between pairs of traits, such as conception rate and protein content, and SCC and fat yield, although no significant increase in prediction accuracy was observed.
The integration of DL into GP enabled the modeling of nonlinear genetic relationships between traits, a possibility not previously discussed, given the linear nature of GBLUP. The detection of nonlinear genetic relationships between traits in the French Holstein population when using DLGBLUP indicates the presence of such relationships in real breeding data, suggesting that it may be relevant to further explore nonlinear relationships. This possibility of nonlinear genetic relationships between traits offers a different perspective on multitrait evaluations, with potential to further improve selection strategies in commercial livestock breeding programs. This is particularly relevant when integrating new traits into multitrait evaluations or incorporating new subpopulations, which may introduce different forms of nonlinearity. Finally, it is shown that DL can be used as a complement to the statistical methods deployed in routine genetic evaluations, rather than as an alternative, by enhancing their performance.
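The two-stage idea described in the abstract (single-trait GBLUP first, then a nonlinear model relating traits) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the article: the simulated genotypes, the sample sizes, the heritability, the quadratic link between traits, and above all the second stage, where a polynomial least-squares fit stands in for the DL network. It only shows how a nonlinear relationship between two traits might surface when one trait is regressed on the GBLUP PGV of another.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, h2 = 200, 400, 0.5  # hypothetical: individuals, SNPs, heritability

# Simulated genotypes coded 0/1/2 (illustrative data, not from the article).
freq = rng.uniform(0.1, 0.9, size=m)
M = rng.binomial(2, freq, size=(n, m)).astype(float)


def vanraden_g(M):
    """Genomic relationship matrix (VanRaden method 1) from 0/1/2 genotypes."""
    p = M.mean(axis=0) / 2.0          # observed allele frequencies
    Z = M - 2.0 * p                   # centered genotypes
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))


def standardize(v):
    return (v - v.mean()) / v.std()


def gblup(G, y, h2):
    """Single-trait GBLUP with heritability assumed known:
    u_hat = G (G + lambda I)^-1 (y - mean), lambda = (1 - h2) / h2."""
    lam = (1.0 - h2) / h2
    return G @ np.linalg.solve(G + lam * np.eye(len(y)), y - y.mean())


G = vanraden_g(M)

# Trait 1 is additive in the markers; trait 2 is a quadratic function of
# trait 1's genetic value (an assumed form of nonlinearity).
beta = rng.normal(size=m)
g1 = standardize((M - M.mean(axis=0)) @ beta)
g2 = standardize(g1 ** 2)
e_sd = np.sqrt((1.0 - h2) / h2)       # residual SD giving heritability h2
y1 = g1 + rng.normal(scale=e_sd, size=n)
y2 = g2 + rng.normal(scale=e_sd, size=n)

# Stage 1: GBLUP PGV for trait 1 (in-sample, for simplicity).
pgv1 = gblup(G, y1, h2)


# Stage 2 stand-in: regress trait 2 phenotypes on polynomial terms of pgv1.
def sse(deg):
    coef = np.polyfit(pgv1, y2, deg)
    return float(np.sum((y2 - np.polyval(coef, pgv1)) ** 2))


sse_linear, sse_quadratic = sse(1), sse(2)
print(f"SSE linear: {sse_linear:.1f}  SSE quadratic: {sse_quadratic:.1f}")
```

Because the linear model is nested in the quadratic one, the quadratic fit can never have a larger in-sample error. Deciding whether the gain reflects real nonlinearity would need held-out data, and a DL second stage would need the same check.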
format Article
id doaj-art-a66208a6d6f44d029f976a761b6f4eaa
institution Kabale University
issn 0022-0302
language English
publishDate 2025-06-01
publisher Elsevier
record_format Article
series Journal of Dairy Science
spelling doaj-art-a66208a6d6f44d029f976a761b6f4eaa
2025-08-20T03:47:32Z
eng
Elsevier
Journal of Dairy Science
0022-0302
2025-06-01
Volume 108, Issue 6, Pages 6174-6189
10.3168/jds.2024-26057
Deep learning and genomic best linear unbiased prediction integration: An approach to identify potential nonlinear genetic relationships between traits
F. Shokor (0): Eliance, 75012 Paris, France; Université Paris-Saclay, INRAE, AgroParisTech, GABI, 78350 Jouy-en-Josas, France; Corresponding author
P. Croiseau (1): Université Paris-Saclay, INRAE, AgroParisTech, GABI, 78350 Jouy-en-Josas, France
H. Gangloff (2): Université Paris-Saclay, INRAE, AgroParisTech, UMR MIA Paris-Saclay, 91120 Palaiseau, France
R. Saintilan (3): Eliance, 75012 Paris, France; Université Paris-Saclay, INRAE, AgroParisTech, GABI, 78350 Jouy-en-Josas, France
T. Tribout (4): Université Paris-Saclay, INRAE, AgroParisTech, GABI, 78350 Jouy-en-Josas, France
T. Mary-Huard (5): Université Paris-Saclay, INRAE, AgroParisTech, UMR MIA Paris-Saclay, 91120 Palaiseau, France; Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE–Le Moulon, 91190 Gif-sur-Yvette, France
B.C.D. Cuyabano (6): Université Paris-Saclay, INRAE, AgroParisTech, GABI, 78350 Jouy-en-Josas, France
http://www.sciencedirect.com/science/article/pii/S0022030225002607
machine learning
GBLUP
multitrait models
genetic evaluation
genetic relationship
title Deep learning and genomic best linear unbiased prediction integration: An approach to identify potential nonlinear genetic relationships between traits
topic machine learning
GBLUP
multitrait models
genetic evaluation
genetic relationship
url http://www.sciencedirect.com/science/article/pii/S0022030225002607