Advanced machine learning for regional potato yield prediction: analysis of essential drivers

Bibliographic Details
Main Authors: Dania Tamayo-Vera, Morteza Mesbah, Yinsuo Zhang, Xiuquan Wang
Format: Article
Language: English
Published: Nature Portfolio 2025-03-01
Series: npj Sustainable Agriculture
Online Access: https://doi.org/10.1038/s44264-025-00052-6
Description
Summary: Localized yield prediction is critical for farmers and policymakers, supporting sustainability, food security, and climate change adaptation. This research evaluates machine learning models, including Random Forest and Gradient Boosting, for predicting crop yields. These models can be adapted for in-season yield forecasting, providing predictions as early as one month before harvest. The study applied the models to postal code-level yield data from 1982 to 2016 for Prince Edward Island (PEI), Canada, incorporating daily climate data, agroclimatic indices, soil parameters, and earth observation NDVI data. SHapley Additive exPlanations (SHAP) values identified temperature variables and NDVI as significant predictors. The study also highlighted the importance of rainfall and soil water retention for irrigation strategies. Random Forest achieved an RMSE of 0.011 t/ac, which is 0.6 t/ac lower than that of the best linear regression model. This precision translates to $81,600 CAD per farm annually in PEI, supporting economic and environmental benefits through improved planning and land management.
ISSN: 2731-9202