Applying interpretable machine learning to assess intraspecific trait divergence under landscape‐scale population differentiation

Bibliographic Details
Main Authors: Sambadi Majumder, Chase M. Mason
Format: Article
Language: English
Published: Wiley 2025-05-01
Series: Applications in Plant Sciences
Online Access: https://doi.org/10.1002/aps3.70015
Description
Summary:
Premise: Here we demonstrate the application of interpretable machine learning methods to investigate intraspecific functional trait divergence using diverse genotypes of the wide‐ranging sunflower Helianthus annuus occupying populations across two contrasting ecoregions: the Great Plains versus the North American Deserts.
Methods: Recursive feature elimination was applied to functional trait data from the HeliantHOME database, followed by the application of the Boruta algorithm to detect the traits that are most predictive of ecoregion. Random forest and gradient boosting machine classifiers were then trained and validated, with results visualized using accumulated local effects plots.
Results: The most ecoregion‐predictive functional traits span categories of leaf economics, plant architecture, reproductive phenology, and floral and seed morphology. Relative to the Great Plains, genotypes from the North American Deserts exhibit shorter stature, fewer leaves, higher leaf nitrogen content, and longer average length of phyllaries.
Discussion: This approach readily identifies traits predictive of ecoregion origin, and thus the functional traits most likely to be responsible for contrasting ecological strategies across the landscape. This type of approach can be used to parse large plant trait datasets in a wide range of contexts, including explicitly testing the applicability of interspecific paradigms at intraspecific scales.
ISSN: 2168-0450
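
Illustrative sketch (not part of the catalog record or the article): the Methods summary mentions accumulated local effects (ALE) plots as the way classifier results were visualized. The block below is a minimal, standard-library-only sketch of how first-order ALE values can be computed for one feature. Everything in it is an assumption made for illustration: the function names `ale_1d` and `toy_desert_probability`, the two trait columns, the coefficients, and the synthetic data. The stand-in scoring function is hand-written and takes the place of a trained random forest or gradient boosting classifier; it is not the authors' model, data, or code.

```python
import math
import random
from bisect import bisect_left


def ale_1d(predict, rows, feature, n_bins=5):
    """First-order ALE of one feature for an arbitrary prediction function.

    predict: callable mapping a list of feature values to a number
    rows:    list of observations (each a list of feature values)
    feature: index of the column to analyse
    Returns (edges, ale), where ale[k] is the centred ALE value at edges[k].
    """
    values = sorted(r[feature] for r in rows)
    n = len(values)
    # Quantile-based bin edges, so each bin holds a similar number of rows.
    edges = sorted({values[round(k * (n - 1) / n_bins)] for k in range(n_bins + 1)})
    if len(edges) < 2:
        raise ValueError("feature needs at least two distinct values")

    sums = [0.0] * (len(edges) - 1)
    counts = [0] * (len(edges) - 1)
    for r in rows:
        # Bin k covers (edges[k], edges[k+1]]; the minimum value joins bin 0.
        k = max(bisect_left(edges, r[feature]) - 1, 0)
        lo, hi = list(r), list(r)
        lo[feature], hi[feature] = edges[k], edges[k + 1]
        # Local effect: prediction change across the bin, other traits fixed.
        sums[k] += predict(hi) - predict(lo)
        counts[k] += 1

    # Accumulate the mean local effect bin by bin.
    ale = [0.0]
    for s, c in zip(sums, counts):
        ale.append(ale[-1] + (s / c if c else 0.0))

    # Centre so the count-weighted average effect over the data is zero.
    mean = sum(c * (ale[k] + ale[k + 1]) / 2 for k, c in enumerate(counts)) / n
    return edges, [a - mean for a in ale]


def toy_desert_probability(row):
    """Hypothetical stand-in for a classifier's predicted probability.

    row = [plant_height_cm, leaf_nitrogen_pct]; coefficients are invented.
    """
    height, leaf_n = row
    return 1.0 / (1.0 + math.exp(-(-0.02 * (height - 175.0) + 0.8 * (leaf_n - 4.0))))


# Synthetic observations, for illustration only (not HeliantHOME data).
rng = random.Random(0)
rows = [[rng.uniform(50.0, 300.0), rng.uniform(2.0, 6.0)] for _ in range(200)]

edges, ale = ale_1d(toy_desert_probability, rows, feature=0)
for edge, effect in zip(edges, ale):
    print(f"height {edge:7.1f} cm -> ALE {effect:+.3f}")
```

Plotting `ale` against `edges` gives the ALE curve for that trait. Because local effects are averaged only over rows that actually fall in each bin, ALE tends to be less distorted by correlated traits than a partial dependence plot, which is a common reason for choosing it with trait data.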