Leveraging Spectral Neighborhood Information for Corn Yield Prediction with Spatial-Lagged Machine Learning Modeling: Can Neighborhood Information Outperform Vegetation Indices?

Accurate and reliable crop yield prediction is essential for optimizing agricultural management, resource allocation, and decision-making, while also supporting farmers and stakeholders in adapting to climate change and increasing global demand. This study introduces an innovative approach to crop y...

Bibliographic Details
Main Authors: Efrain Noa-Yarasca, Javier M. Osorio Leyton, Chad B. Hajda, Kabindra Adhikari, Douglas R. Smith
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: AI
Subjects:
Online Access: https://www.mdpi.com/2673-2688/6/3/58
author Efrain Noa-Yarasca
Javier M. Osorio Leyton
Chad B. Hajda
Kabindra Adhikari
Douglas R. Smith
collection DOAJ
description Accurate and reliable crop yield prediction is essential for optimizing agricultural management, resource allocation, and decision-making, while also supporting farmers and stakeholders in adapting to climate change and increasing global demand. This study introduces an innovative approach to crop yield prediction by incorporating spatially lagged spectral data (SLSD) through the spatial-lagged machine learning (SLML) model, an enhanced version of the spatial lag X (SLX) model. The research aims to show that SLSD improves prediction compared to traditional vegetation index (VI)-based methods. Conducted on a 19-hectare cornfield at the ARS Grassland, Soil, and Water Research Laboratory during the 2023 growing season, this study used five-band multispectral image data and 8581 yield measurements ranging from 1.69 to 15.86 Mg/ha. Four predictor sets were evaluated: Set 1 (spectral bands), Set 2 (spectral bands + neighborhood data), Set 3 (spectral bands + VIs), and Set 4 (spectral bands + top VIs + neighborhood data). These were evaluated using the SLX model and four decision-tree-based SLML models (RF, XGB, ET, GBR), with performance assessed using R² and RMSE. Results showed that incorporating spatial neighborhood data (Set 2) outperformed VI-based approaches (Set 3), emphasizing the importance of spatial context. SLML models, particularly XGB, RF, and ET, performed best with 4–8 neighbors, while excessive neighbors slightly reduced accuracy. In Set 3, VIs improved predictions, but a smaller subset (10–15 indices) was sufficient for optimal yield prediction. Set 4 showed slight gains over Sets 2 and 3, with XGB and RF achieving the highest R² values. Key predictors included spatially lagged spectral bands (e.g., Green_lag, NIR_lag, RedEdge_lag) and VIs (e.g., CREI, GCI, NCPI, ARI, CCCI), highlighting the value of integrating neighborhood data for improved corn yield prediction. This study underscores the importance of spatial context in corn yield prediction and lays the foundation for future research across diverse agricultural settings, focusing on optimizing neighborhood size, integrating spatial and spectral data, and refining spatial dependencies through localized search algorithms.
format Article
id doaj-art-c63fbd7dc210411b9394bea8c95d50aa
institution OA Journals
issn 2673-2688
language English
publishDate 2025-03-01
publisher MDPI AG
record_format Article
series AI
spelling doaj-art-c63fbd7dc210411b9394bea8c95d50aa
2025-08-20T02:11:15Z
eng
MDPI AG
AI
2673-2688
2025-03-01
6
3
58
10.3390/ai6030058
Leveraging Spectral Neighborhood Information for Corn Yield Prediction with Spatial-Lagged Machine Learning Modeling: Can Neighborhood Information Outperform Vegetation Indices?
Efrain Noa-Yarasca, Texas A&M AgriLife Research, Blackland Research and Extension Center, Temple, TX 76502, USA
Javier M. Osorio Leyton, Texas A&M AgriLife Research, Blackland Research and Extension Center, Temple, TX 76502, USA
Chad B. Hajda, Grassland Soil and Water Research Laboratory, United States Department of Agriculture–Agriculture Research Service, Temple, TX 76502, USA
Kabindra Adhikari, Grassland Soil and Water Research Laboratory, United States Department of Agriculture–Agriculture Research Service, Temple, TX 76502, USA
Douglas R. Smith, Grassland Soil and Water Research Laboratory, United States Department of Agriculture–Agriculture Research Service, Temple, TX 76502, USA
title Leveraging Spectral Neighborhood Information for Corn Yield Prediction with Spatial-Lagged Machine Learning Modeling: Can Neighborhood Information Outperform Vegetation Indices?
topic corn yield prediction
spatial-lagged machine learning model
spectral neighborhood data
vegetation indices (VIs)
spatial autocorrelation
url https://www.mdpi.com/2673-2688/6/3/58