Predicting crop yield lows through the highs via binned deep imbalanced regression: A case study on vineyards


Bibliographic Details
Main Authors: Hamid Kamangir, Brent S. Sams, Nick Dokoozlian, Luis Sanchez, J. Mason Earles
Format: Article
Language: English
Published: Elsevier 2025-05-01
Series: International Journal of Applied Earth Observations and Geoinformation
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S1569843225001839
collection DOAJ
description Crop yield estimation is vital for agricultural management, but models often struggle to predict extreme values that can significantly affect operations and markets. Traditional models handle these extremes poorly, producing biased and inaccurate predictions. To address this challenge, our study introduces two strategies. First, we propose a cost-sensitive loss function, ExtremeLoss, designed to better capture and represent less frequent yield values by giving greater importance to extreme cases during training. Second, we develop a conditional deep learning model that enhances feature representation by conditioning on a binned yield observation map. This conditioning encourages smoother and more coherent input feature maps across different segments of the yield value range by leveraging similarities within and across yield bins, ultimately improving the model's ability to generalize and to distinguish between subtle variations in yield. The binned maps serve as "yield zone maps," grouping yields into classes (e.g., low extreme, common, high extreme) to improve the identification of yield variability; they can be removed during inference. Our model was tested on a comprehensive grape yield dataset from 2016 to 2019, covering 2,200 hectares and 42 blocks of eight cultivars. We compared its performance against advanced techniques such as Focal-R loss, label distribution smoothing, dense weighting, and class-balanced methods under two validation scenarios: block-hold-out (BHO) and year-block-hold-out (YBHO). Our approach outperforms existing models in R-squared (R²), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE). Notably, it reduces MAE by 2.98 and 14.45 t/ha for low and high extremes, respectively, in the BHO scenario and by 7.18 and 11.05 t/ha in the YBHO scenario. It also decreases MAPE by 19.09% and 23.94% in the BHO scenario and by 33.76% and 19.61% in the YBHO scenario. Our model shows a marked improvement in capturing spatial variability and significantly advances spatio-temporal yield estimation, particularly for extreme values in complex agricultural settings like vineyards.
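The description names two ideas that a small example can make concrete: grouping yields into zones (low extreme, common, high extreme) and weighting errors so that rare extreme values count more during training. The sketch below is illustrative only and is not the authors' ExtremeLoss or their conditional model, neither of which is specified in this record. It assumes quantile cut-offs at 10% and 90% for the zones and a simple inverse-frequency weighting; the function names (`yield_zones`, `zone_weights`, `weighted_l1`, `per_zone_mae`) and the sample values are hypothetical.

```python
import numpy as np


def yield_zones(y, low_q=0.1, high_q=0.9):
    """Label each yield: 0 = low extreme, 1 = common, 2 = high extreme.

    The quantile cut-offs are an assumption made for this sketch.
    """
    lo, hi = np.quantile(y, [low_q, high_q])
    zones = np.ones(y.shape, dtype=int)
    zones[y < lo] = 0
    zones[y > hi] = 2
    return zones


def zone_weights(zones, n_zones=3):
    """Inverse-frequency weight per zone; empty zones get weight 0."""
    counts = np.bincount(zones.ravel(), minlength=n_zones).astype(float)
    present = counts > 0
    weights = np.zeros(n_zones)
    weights[present] = counts.sum() / (present.sum() * counts[present])
    return weights


def weighted_l1(y_true, y_pred, zones, weights):
    """Cost-sensitive absolute error: rare zones contribute more."""
    w = weights[zones]
    return float(np.sum(w * np.abs(y_true - y_pred)) / np.sum(w))


def per_zone_mae(y_true, y_pred, zones, n_zones=3):
    """Plain MAE reported separately for each non-empty zone."""
    return {
        z: float(np.mean(np.abs(y_true[zones == z] - y_pred[zones == z])))
        for z in range(n_zones)
        if np.any(zones == z)
    }


# Hypothetical yields in t/ha: one low extreme, one high extreme, eight common.
y_true = np.array([2.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 30.0])
# A predictor that collapses to the middle of the range misses both extremes.
y_pred = np.full_like(y_true, 13.5)

zones = yield_zones(y_true)
weights = zone_weights(zones)
plain_mae = float(np.mean(np.abs(y_true - y_pred)))
weighted = weighted_l1(y_true, y_pred, zones, weights)

print("zones:", zones.tolist())
print("plain MAE:", plain_mae)
print("zone-weighted MAE:", weighted)
print("per-zone MAE:", per_zone_mae(y_true, y_pred, zones))
```

With these sample values the zone-weighted error is larger than the plain MAE, because the two missed extremes are no longer averaged away by the eight well-predicted common values. Reporting MAE per zone, as `per_zone_mae` does, mirrors the way the description quotes separate figures for low and high extremes.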
id doaj-art-ead1f2a871f84fdf98a6e5471cc24be4
institution DOAJ
issn 1569-8432
spelling doaj-art-ead1f2a871f84fdf98a6e5471cc24be4
2025-08-20T02:58:25Z
eng
Elsevier
International Journal of Applied Earth Observations and Geoinformation
1569-8432
2025-05-01
Volume 139, Article 104536
DOI: 10.1016/j.jag.2025.104536
Predicting crop yield lows through the highs via binned deep imbalanced regression: A case study on vineyards
Hamid Kamangir: Department of Biological and Agricultural Engineering, University of California Davis, Davis, CA, USA; Corresponding author.
Brent S. Sams: Department of Winegrowing Research, E&J Gallo Winery, Modesto, CA, USA
Nick Dokoozlian: Department of Winegrowing Research, E&J Gallo Winery, Modesto, CA, USA
Luis Sanchez: Department of Winegrowing Research, E&J Gallo Winery, Modesto, CA, USA
J. Mason Earles: Department of Biological and Agricultural Engineering, University of California Davis, Davis, CA, USA; Department of Viticulture and Enology, University of California Davis, Davis, CA, USA
title Predicting crop yield lows through the highs via binned deep imbalanced regression: A case study on vineyards
topic Deep imbalanced regression
Cost sensitive loss function
Big data
Crop yield estimation
Spatio-temporal estimation