Sorghum yield prediction based on remote sensing and machine learning in conflict affected South Sudan

Bibliographic Details
Main Authors: John Karongo, Joseph Ivivi Mwaniki, John Ndiritu, Victor Mokaya
Format: Article
Language: English
Published: Nature Portfolio 2025-02-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-89030-z
Description
Summary: Sorghum cultivation plays a pivotal role in addressing food insecurity in South Sudan, but persistent conflict continues to challenge the agricultural sector. Understanding the impact of conflict on sorghum yield prediction is therefore important for the country's food security. This research integrates several data sources, including sorghum yield from small-scale farmers over four agricultural seasons (2018–2021), climate data, remotely sensed data, and conflict occurrence probability, to predict sorghum yield in South Sudan. We use five machine learning (ML) techniques, namely Random Forest (RF), Decision Tree (DT), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM) and Artificial Neural Network (ANN), to predict 2021 end-of-season sorghum yield in the conflict-affected Upper Nile and Western Bahr El Gazal states. Correlation analysis revealed high variability in yield across the two states, with an average sorghum yield of 366.03 kg/ha (SD = 292.29 kg/ha), and a strong positive correlation (0.75, p < 0.001) between cultivated land size and sorghum yield. During the training phase, the DT, RF, XGBoost and ANN models showed high accuracy, each with R² > 70%. DT and XGBoost both had an accuracy close to 80% and lower prediction error. In predicting 2021 sorghum yield, the XGBoost, DT and RF models yielded the best combination of metrics, with good accuracy. Our results reveal that adding conflict occurrence probability data to the models, while complex, had minimal impact on yield predictions. Further analysis revealed that cultivated land size was the most significant predictor for all models. This paper demonstrates that, despite ongoing conflict, reasonably good end-of-season sorghum yield prediction, with relevant implications for food security planning, can be achieved with ML, but challenges remain in generalizing these results because of limited crop data and regional variability in South Sudan.
ISSN: 2045-2322
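
Illustrative sketch (not part of the record): the summary describes correlating cultivated land size with yield, training on earlier seasons, and scoring predictions for the 2021 season. The Python below is a minimal sketch of that evaluation pattern only. It assumes a simple least-squares line on land size as a stand-in for the RF, DT, XGBoost, SVM and ANN models the article actually uses, and every number in `TOY_RECORDS` is made up for illustration rather than taken from the study. All function names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Made-up toy records, NOT data from the article:
# (season, cultivated land in ha, sorghum yield in kg/ha)
TOY_RECORDS = [
    (2018, 0.4, 180.0), (2018, 1.0, 350.0), (2018, 1.6, 520.0),
    (2019, 0.6, 260.0), (2019, 1.2, 380.0), (2019, 2.0, 700.0),
    (2020, 0.5, 150.0), (2020, 1.4, 500.0), (2020, 1.8, 560.0),
    (2021, 0.7, 240.0), (2021, 1.1, 400.0), (2021, 1.9, 610.0),
]

def evaluate_holdout(records, test_season):
    """Fit on every season before test_season, then score on test_season."""
    train = [(land, yld) for season, land, yld in records if season < test_season]
    test = [(land, yld) for season, land, yld in records if season == test_season]
    train_x, train_y = zip(*train)
    test_x, test_y = zip(*test)
    a, b = fit_line(train_x, train_y)
    preds = [a + b * x for x in test_x]
    return {
        "pearson_r": pearson_r(train_x, train_y),
        "r2": r_squared(test_y, preds),
        "rmse": rmse(test_y, preds),
    }

if __name__ == "__main__":
    print(evaluate_holdout(TOY_RECORDS, 2021))
```

Holding out the final season, rather than shuffling all records together, mirrors the summary's setup of predicting 2021 yield from earlier seasons. Any of the five model families could replace `fit_line` without changing the scoring functions.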