AI-MedLeafX: a large-scale computer vision dataset for medicinal plant diagnosis

This study presents a large, meticulously curated and manually validated dataset aimed at classifying leaf quality into five critical categories: Healthy, Bacterial Spot, Shot Hole, Yellow, and Powdery Mildew. The dataset encompasses four distinct plant species—Cinnamomum Camphora (Camphor), Termina...

Bibliographic Details
Main Authors: Md. Fahim Ferdous, Faysal Bin Khaled Nissan, Nur Muhammad Nibir, Md. Hasan Imam Bijoy
Format: Article
Language: English
Published: Elsevier 2025-10-01
Series: Data in Brief
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S2352340925006699
collection DOAJ
description This study presents a large, curated, and manually validated dataset for classifying leaf quality into five categories: Healthy, Bacterial Spot, Shot Hole, Yellow, and Powdery Mildew. The dataset covers four plant species: Cinnamomum Camphora (Camphor), Terminalia Chebula (Haritaki), Moringa Oleifera (Sojina), and Azadirachta Indica (Neem). Each species is represented across three or four categories, depending on observed symptoms, for a total of thirteen classes. Data collection was conducted between November 1, 2024, and January 5, 2025, using four different mobile cameras to ensure diversity in image resolution, lighting, and environmental conditions. The original dataset comprised 10,858 high-resolution images, which were expanded to 65,148 through six data augmentation techniques: rotations (45°, 60°, and 90°), horizontal flipping, zooming, and brightness adjustment. All images were standardized to 512×512 pixels for uniformity and compatibility with machine learning and computer vision models. The dataset is intended as a resource for developing automated plant disease detection systems and for benchmarking deep learning architectures. By supporting more accurate and efficient leaf disease classification, it contributes to tree health monitoring, improved crop yield, and sustainable agricultural practices.
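The augmentation step described above can be illustrated with a minimal sketch using Pillow. It is not the authors' pipeline: the function name `augment`, the 80% center crop used for zooming, and the brightness factor of 1.3 are assumptions made for illustration, since the record gives no parameters for those two techniques. The record also does not say whether the 65,148 images include the originals; the sketch only notes that 6 × 10,858 = 65,148, which is consistent with one output per technique per original image.

```python
from PIL import Image, ImageEnhance, ImageOps

TARGET_SIZE = (512, 512)  # standardized size stated in the record


def augment(img):
    """Return six augmented variants of one leaf image (illustrative only)."""
    base = img.convert("RGB").resize(TARGET_SIZE)
    w, h = base.size

    # Zoom: crop the central 80% and scale back up (zoom amount is assumed).
    mw, mh = int(w * 0.1), int(h * 0.1)
    zoomed = base.crop((mw, mh, w - mw, h - mh)).resize(TARGET_SIZE)

    return {
        # rotate() keeps the canvas size unless expand=True is passed
        "rot45": base.rotate(45),
        "rot60": base.rotate(60),
        "rot90": base.rotate(90),
        "hflip": ImageOps.mirror(base),  # left-right flip
        "zoom": zoomed,
        # Brightness factor is assumed; 1.0 would leave the image unchanged.
        "bright": ImageEnhance.Brightness(base).enhance(1.3),
    }


if __name__ == "__main__":
    sample = Image.new("RGB", (640, 480), (40, 120, 40))  # stand-in image
    variants = augment(sample)
    print(sorted(variants))
    print(10_858 * len(variants))  # count implied by six variants per original
```

In practice each variant would be saved next to its class folder; the sketch returns them in a dictionary so the six techniques stay easy to inspect.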
id doaj-art-39afa3e6992240669d6bbaf94f5efbea
institution DOAJ
issn 2352-3409
doi 10.1016/j.dib.2025.111945
volume 62
article_number 111945
affiliation Department of Computer Science and Engineering, Daffodil International University, Dhaka 1216, Bangladesh (all four authors)
corresponding_author Md. Hasan Imam Bijoy
topic Medicinal leaf dataset
Medicinal leaf classification
Agriculture informatics
Machine learning
Computer vision
Agriculture