Leveraging Spectral Neighborhood Information for Corn Yield Prediction with Spatial-Lagged Machine Learning Modeling: Can Neighborhood Information Outperform Vegetation Indices?
Accurate and reliable crop yield prediction is essential for optimizing agricultural management, resource allocation, and decision-making, while also supporting farmers and stakeholders in adapting to climate change and increasing global demand. This study introduces an innovative approach to crop y...
| Main Authors: | Efrain Noa-Yarasca, Javier M. Osorio Leyton, Chad B. Hajda, Kabindra Adhikari, Douglas R. Smith |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | AI |
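| Subjects: | corn yield prediction; spatial-lagged machine learning model; spectral neighborhood data; vegetation indices (VIs); spatial autocorrelation |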
| Online Access: | https://www.mdpi.com/2673-2688/6/3/58 |
| _version_ | 1850204641832206336 |
|---|---|
| author | Efrain Noa-Yarasca; Javier M. Osorio Leyton; Chad B. Hajda; Kabindra Adhikari; Douglas R. Smith |
| author_sort | Efrain Noa-Yarasca |
| collection | DOAJ |
| description | Accurate and reliable crop yield prediction is essential for optimizing agricultural management, resource allocation, and decision-making, while also supporting farmers and stakeholders in adapting to climate change and increasing global demand. This study introduces an innovative approach to crop yield prediction by incorporating spatially lagged spectral data (SLSD) through the spatial-lagged machine learning (SLML) model, an enhanced version of the spatial lag X (SLX) model. The research aims to show that SLSD improves prediction compared to traditional vegetation index (VI)-based methods. Conducted on a 19-hectare cornfield at the ARS Grassland, Soil, and Water Research Laboratory during the 2023 growing season, this study used five-band multispectral image data and 8581 yield measurements ranging from 1.69 to 15.86 Mg/ha. Four predictor sets were evaluated: Set 1 (spectral bands), Set 2 (spectral bands + neighborhood data), Set 3 (spectral bands + VIs), and Set 4 (spectral bands + top VIs + neighborhood data). These were evaluated using the SLX model and four decision-tree-based SLML models (RF, XGB, ET, GBR), with performance assessed using R<sup>2</sup> and RMSE. Results showed that incorporating spatial neighborhood data (Set 2) outperformed VI-based approaches (Set 3), emphasizing the importance of spatial context. SLML models, particularly XGB, RF, and ET, performed best with 4–8 neighbors, while excessive neighbors slightly reduced accuracy. In Set 3, VIs improved predictions, but a smaller subset (10–15 indices) was sufficient for optimal yield prediction. Set 4 showed slight gains over Sets 2 and 3, with XGB and RF achieving the highest R<sup>2</sup> values. Key predictors included spatially lagged spectral bands (e.g., Green_lag, NIR_lag, RedEdge_lag) and VIs (e.g., CREI, GCI, NCPI, ARI, CCCI), highlighting the value of integrating neighborhood data for improved corn yield prediction. This study underscores the importance of spatial context in corn yield prediction and lays the foundation for future research across diverse agricultural settings, focusing on optimizing neighborhood size, integrating spatial and spectral data, and refining spatial dependencies through localized search algorithms. |
| format | Article |
| id | doaj-art-c63fbd7dc210411b9394bea8c95d50aa |
| institution | OA Journals |
| issn | 2673-2688 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | AI |
| spelling | doaj-art-c63fbd7dc210411b9394bea8c95d50aa; 2025-08-20T02:11:15Z; eng; MDPI AG; AI; ISSN 2673-2688; 2025-03-01; vol. 6, no. 3, art. 58; DOI 10.3390/ai6030058. Affiliations: Efrain Noa-Yarasca and Javier M. Osorio Leyton, Texas A&M AgriLife Research, Blackland Research and Extension Center, Temple, TX 76502, USA; Chad B. Hajda, Kabindra Adhikari, and Douglas R. Smith, Grassland Soil and Water Research Laboratory, United States Department of Agriculture–Agriculture Research Service, Temple, TX 76502, USA |
| title | Leveraging Spectral Neighborhood Information for Corn Yield Prediction with Spatial-Lagged Machine Learning Modeling: Can Neighborhood Information Outperform Vegetation Indices? |
| topic | corn yield prediction; spatial-lagged machine learning model; spectral neighborhood data; vegetation indices (VIs); spatial autocorrelation |
| url | https://www.mdpi.com/2673-2688/6/3/58 |