Global de-trending significantly improves the accuracy of XGBoost-based county-level maize and soybean yield prediction in the Midwestern United States
The application of machine learning in crop yield prediction has gained considerable traction, yet uncertainties persist regarding the impact of the yield trends on these predictions and the differences between the detrending methods. In our study, we utilized extreme gradient boosting (XGBoost) to...
| Main Authors: | Yuanchao Li, Hongwei Zeng, Miao Zhang, Bingfang Wu, Xingli Qin |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Taylor & Francis Group, 2024-12-01 |
| Series: | GIScience & Remote Sensing |
| Subjects: | Maize and soybean; Yield detrending; Yield prediction; XGBoost; SHAP; US Midwest |
| Online Access: | https://www.tandfonline.com/doi/10.1080/15481603.2024.2349341 |
| _version_ | 1850176554181591040 |
|---|---|
| author | Yuanchao Li; Hongwei Zeng; Miao Zhang; Bingfang Wu; Xingli Qin |
| author_sort | Yuanchao Li |
| collection | DOAJ |
| description | The application of machine learning to crop yield prediction has gained considerable traction, yet uncertainties persist regarding the impact of yield trends on these predictions and the differences between detrending methods. In our study, we used extreme gradient boosting (XGBoost) to examine the effects of no trend processing (NTP), input year as a feature (IYF), input average yield as a feature (IAYF), input linear yield as a feature (ILYF), and the global detrending method (GDT) on the yield prediction of maize and soybean in the Midwestern United States. Compared with NTP, incorporating the yield trend as a predictor in XGBoost significantly improved the accuracy and reduced the uncertainty of the yield prediction. Notably, GDT stood out, significantly reducing the average yield prediction error by 0.091 t/ha for soybean and 0.158 t/ha for maize relative to NTP, and improving the coefficient of determination (R²) by 20.6% and 19.6% for soybean and maize, respectively. Compared with IYF, IAYF, and ILYF, GDT improved R² by 3.8% to 12.7% for soybean and by 3.6% to 12.7% for maize. The SHapley Additive exPlanations (SHAP) framework showed that the enhanced vegetation index (EVI), particularly during the soybean podding and maize dough formation stages, played a crucial role in explaining interannual yield variability. These findings confirm the importance of GDT in crop yield prediction via machine learning and could facilitate future advancements in machine learning applications for yield forecasting. |
| format | Article |
| id | doaj-art-137732e4501445318efb1e4d2ff7dd1b |
| institution | OA Journals |
| issn | 1548-1603 1943-7226 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | Taylor & Francis Group |
| record_format | Article |
| series | GIScience & Remote Sensing |
| spelling | doaj-art-137732e4501445318efb1e4d2ff7dd1b; 2025-08-20T02:19:14Z; eng; Taylor & Francis Group; GIScience & Remote Sensing; ISSN 1548-1603, 1943-7226; 2024-12-01; vol. 61, no. 1; doi:10.1080/15481603.2024.2349341; Global de-trending significantly improves the accuracy of XGBoost-based county-level maize and soybean yield prediction in the Midwestern United States; Yuanchao Li, Hongwei Zeng, Miao Zhang, Bingfang Wu, Xingli Qin (all: State Key Laboratory of Remote Sensing Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China); https://www.tandfonline.com/doi/10.1080/15481603.2024.2349341; Maize and soybean; Yield detrending; Yield prediction; XGBoost; SHAP; US Midwest |
| title | Global de-trending significantly improves the accuracy of XGBoost-based county-level maize and soybean yield prediction in the Midwestern United States |
| topic | Maize and soybean; Yield detrending; Yield prediction; XGBoost; SHAP; US Midwest |
| url | https://www.tandfonline.com/doi/10.1080/15481603.2024.2349341 |
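
The abstract names global detrending (GDT) but this record does not define it. The following is a minimal sketch under a stated assumption: that "global" means fitting one linear yield-versus-year trend to all counties pooled together, subtracting it before model training, and adding it back to the model's predictions. The function names, the record layout `(county, year, yield)`, and the sample values are hypothetical illustrations and are not taken from the article; the XGBoost model that would be trained on the residuals is left out.

```python
from statistics import mean


def fit_linear_trend(years, yields):
    """Ordinary least-squares fit of yield = intercept + slope * year."""
    year_mean = mean(years)
    yield_mean = mean(yields)
    sxx = sum((y - year_mean) ** 2 for y in years)
    sxy = sum((y - year_mean) * (v - yield_mean) for y, v in zip(years, yields))
    slope = sxy / sxx
    intercept = yield_mean - slope * year_mean
    return intercept, slope


def trend_value(year, trend):
    """Evaluate the fitted trend line at a given year."""
    intercept, slope = trend
    return intercept + slope * year


def detrend_globally(records):
    """Assumed GDT: fit ONE trend over all counties pooled, return residuals."""
    years = [year for _, year, _ in records]
    yields = [value for _, _, value in records]
    trend = fit_linear_trend(years, yields)
    residuals = [
        (county, year, value - trend_value(year, trend))
        for county, year, value in records
    ]
    return residuals, trend


def retrend(year, predicted_residual, trend):
    """Convert a predicted residual back to an absolute yield."""
    return predicted_residual + trend_value(year, trend)


# Hypothetical county yields in t/ha, for illustration only.
records = [
    ("county_a", 2000, 9.0), ("county_a", 2001, 9.2), ("county_a", 2002, 9.1),
    ("county_b", 2000, 10.0), ("county_b", 2001, 10.4), ("county_b", 2002, 10.5),
]
residuals, trend = detrend_globally(records)
```

In this sketch a regression model would be trained on the third element of each `residuals` tuple, and `retrend` would then map its predictions back to yields. Pooling all counties gives a single slope and intercept, so between-county differences in mean yield stay in the residuals for the model to explain.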