Crop Classification and Yield Prediction Using Robust Machine Learning Models for Agricultural Sustainability

Bibliographic Details
Main Authors: Abid Badshah, Basem Yousef Alkazemi, Fakhrud Din, Kamal Z. Zamli, Muhammad Haris
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10735786/
collection DOAJ
description Agriculture is pivotal to a country's economy as a major source of food, employment, and raw materials. However, challenges such as diseases, soil degradation, and water scarcity persist. Technology adoption can address these issues, improving production and quality. Machine learning, a subset of Artificial Intelligence (AI), enables prediction, classification, and automation in agriculture. It can optimize irrigation, fertilization, and crop selection, aiding decision-making for food security and crop management. This study proposes two robust machine learning architectures, one for classification and one for regression, each based on a distinct dataset. First, we examine a crop recommendation dataset obtained from Kaggle, whose input attributes include soil pH, temperature, humidity, and nutrient levels. Using the classification techniques Extra Tree Classifier (ETC), Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), K-Nearest Neighbour (KNN), Gaussian Naive Bayes (GNB), and Support Vector Machine (SVM), we recommend one of twenty-two crops based on these inputs. Using K-fold cross-validation, Explainable AI (XAI), and feature engineering, we identify the best-performing model: Random Forest, which achieves an accuracy of 99.7% and is further evaluated with precision, recall, F1 score, and a confusion matrix. Second, we investigate wheat yield data for Pakistan obtained from the World Bank and the Food and Agriculture Organization (FAO), covering the years 1992-2013. Using Multivariate Imputation by Chained Equations (MICE) to address data limitations, we estimate wheat production for 2014-2024 and forecast the 2025 yield using machine learning regression models. Again using hyperparameter tuning with K-fold cross-validation, Support Vector Regressor (SVR) emerges as the top-performing model, with a reported accuracy of 99.9% as measured by the R2 score. Making machine learning decisions comprehensible through XAI approaches increases transparency and confidence in agricultural decision-making. Two widely used XAI approaches, Feature Importance and Local Interpretable Model-Agnostic Explanations (LIME), are used to interpret and explain the outcomes of the proposed models. The study can help increase agricultural productivity, minimize risks, enhance food security, and promote more environmentally friendly farming approaches.
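The K-fold cross-validation step mentioned in the description can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not use the Kaggle dataset: the data are synthetic, the two features and three crop labels are hypothetical stand-ins, and a small hand-written K-Nearest Neighbour classifier (one of the seven model families listed) replaces the full model comparison. All function names are illustrative assumptions.

```python
import random
from collections import Counter


def make_synthetic(n_per_class=30, seed=0):
    """Hypothetical stand-in data: (soil pH, humidity) pairs for three crop labels."""
    rng = random.Random(seed)
    centers = {"rice": (6.0, 80.0), "maize": (6.5, 60.0), "chickpea": (7.5, 20.0)}
    X, y = [], []
    for label, (ph, hum) in centers.items():
        for _ in range(n_per_class):
            X.append((rng.gauss(ph, 0.3), rng.gauss(hum, 3.0)))
            y.append(label)
    return X, y


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices once, then deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cross_val_accuracy(X, y, k_folds=5, k_neighbors=3):
    """Return one accuracy per fold; each fold is held out exactly once for testing."""
    scores = []
    for held_out in k_fold_indices(len(X), k_folds):
        test = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in test]
        train_y = [y[i] for i in range(len(X)) if i not in test]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k_neighbors) == y[i]
            for i in held_out
        )
        scores.append(correct / len(held_out))
    return scores


X, y = make_synthetic()
scores = cross_val_accuracy(X, y)
mean_accuracy = sum(scores) / len(scores)
print(f"fold accuracies: {scores}")
print(f"mean accuracy: {mean_accuracy:.3f}")
```

Averaging the per-fold accuracies gives a single figure that could be compared across candidate models. In practice a library such as scikit-learn would likely be used for this comparison; the sketch avoids external dependencies only to stay self-contained.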
id doaj-art-a6c81bfa96a4431bb68f0a22d26ad3fb
institution DOAJ
issn 2169-3536
volume 12
pages 162799-162813
doi 10.1109/ACCESS.2024.3486653
author_affiliation Abid Badshah (https://orcid.org/0009-0001-7962-9471): Department of Computer Science and IT, Faculty of Information Technology (IT), University of Malakand, Dir, Lower, Chakdara, Khyber Pakhtunkhwa, Pakistan
Basem Yousef Alkazemi (https://orcid.org/0000-0003-1260-6811): Department of Software Engineering, College of Computing, Umm Al-Qura University, Makkah, Saudi Arabia
Fakhrud Din (https://orcid.org/0000-0001-5025-3223): Department of Computer Science and IT, Faculty of Information Technology (IT), University of Malakand, Dir, Lower, Chakdara, Khyber Pakhtunkhwa, Pakistan
Kamal Z. Zamli (https://orcid.org/0000-0003-4626-0513): Faculty of Computing, Universiti Malaysia Pahang Al-Sultan Abdullah (UMPSA), Pekan, Kuantan, Pahang, Malaysia
Muhammad Haris: Department of Computer Science and Bioinformatics, Khushal Khan Khattak University, Karak, Pakistan
title Crop Classification and Yield Prediction Using Robust Machine Learning Models for Agricultural Sustainability
topic Agricultural planning
crop recommendation
crop yield forecasting
explainable AI
K-fold cross-validation
machine learning