Predicting crop yield lows through the highs via binned deep imbalanced regression: A case study on vineyards

Bibliographic Details
Main Authors: Hamid Kamangir, Brent S. Sams, Nick Dokoozlian, Luis Sanchez, J. Mason Earles
Format: Article
Language: English
Published: Elsevier 2025-05-01
Series: International Journal of Applied Earth Observations and Geoinformation
Online Access: http://www.sciencedirect.com/science/article/pii/S1569843225001839
Description
Summary: Crop yield estimation is vital for agricultural management but often struggles with predicting extreme values that can significantly impact operations and markets. Traditional models face challenges with these extremes, leading to biased and inaccurate predictions. To address this challenge, our study introduces two innovative strategies. First, we propose a cost-sensitive loss function, ExtremeLoss, designed to better capture and represent less frequent yield values by giving greater importance to extreme cases during training. Second, we develop a conditional deep learning model that enhances feature representation by conditioning on a binned yield observation map. This approach encourages smoother and more coherent input feature maps across different segments of the yield value range by leveraging similarities within and across yield bins, ultimately improving the model’s ability to generalize and distinguish between subtle variations in yield. This approach creates “yield zone maps,” grouping yields into classes (e.g., low extreme, common, high extreme) to improve the identification of yield variability; these maps can be removed during inference. Our model was tested on a comprehensive grape yield dataset from 2016 to 2019, covering 2,200 hectares and 42 blocks of eight cultivars. We compared its performance against advanced techniques such as Focal-R loss, label distribution smoothing, dense weighting, and class-balanced methods under two validation scenarios: block-hold-out (BHO) and year-block-hold-out (YBHO). Our approach outperforms existing models in R-squared (R²), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE). Notably, it reduces MAE by 2.98 and 14.45 t/ha for low and high extremes in the BHO scenario and by 7.18 and 11.05 t/ha in the YBHO scenario. It also significantly decreases MAPE by 19.09% and 23.94% in the BHO scenario and by 33.76% and 19.61% in the YBHO scenario. Our model shows a marked improvement in capturing spatial variability and significantly advances spatio-temporal yield estimation, particularly for extreme values in complex agricultural settings like vineyards.
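
As a rough illustration of the two ideas the summary describes, the sketch below bins yields into three zones (low extreme, common, high extreme) and computes a squared-error loss that weights rarer zones more heavily. It is not the paper's ExtremeLoss or its conditional model, whose exact formulations are not given in this record. The function names, the 10th/90th percentile thresholds, and the inverse-frequency weighting are all assumptions chosen for illustration.

```python
import numpy as np


def yield_zones(y, low_q=0.1, high_q=0.9):
    """Bin yields into zones: 0 = low extreme, 1 = common, 2 = high extreme.

    The quantile thresholds are illustrative assumptions, not values
    taken from the paper.
    """
    y = np.asarray(y, dtype=float)
    lo, hi = np.quantile(y, [low_q, high_q])
    # digitize returns 0 for y < lo, 1 for lo <= y < hi, 2 for y >= hi
    return np.digitize(y, [lo, hi])


def zone_weighted_mse(y_true, y_pred, zones, n_zones=3):
    """Squared error weighted by inverse zone frequency (hypothetical stand-in
    for a cost-sensitive loss that emphasizes rare, extreme yields)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    counts = np.bincount(zones, minlength=n_zones).astype(float)
    # Rare zones get large weights; empty zones get weight 0.
    weights = np.where(counts > 0, counts.sum() / (n_zones * np.maximum(counts, 1.0)), 0.0)
    w = weights[zones]
    return float(np.sum(w * (y_true - y_pred) ** 2) / np.sum(w))


def mae_by_zone(y_true, y_pred, zones, n_zones=3):
    """Mean absolute error reported separately for each non-empty zone."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return {
        z: float(np.mean(np.abs(y_true[zones == z] - y_pred[zones == z])))
        for z in range(n_zones)
        if np.any(zones == z)
    }
```

With a predictor that always outputs the common yield, the weighted loss is noticeably larger than the plain mean squared error, because the few extreme samples dominate the weighted sum. That is the behavior a cost-sensitive loss of this kind is meant to encourage during training. Reporting MAE per zone, as in `mae_by_zone`, mirrors the way the summary quotes separate error reductions for low and high extremes.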
ISSN:1569-8432