AI-MedLeafX: a large-scale computer vision dataset for medicinal plant diagnosis (Mendeley Data)
This study presents a large, meticulously curated and manually validated dataset aimed at classifying leaf quality into five critical categories: Healthy, Bacterial Spot, Shot Hole, Yellow, and Powdery Mildew. The dataset encompasses four distinct plant species—Cinnamomum Camphora (Camphor), Termina...
| Main Authors: | Md. Fahim Ferdous, Faysal Bin Khaled Nissan, Nur Muhammad Nibir, Md. Hasan Imam Bijoy |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-10-01 |
| Series: | Data in Brief |
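| Subjects: | Medicinal leaf dataset; Medicinal leaf classification; Agriculture informatics; Machine learning; Computer vision; Agriculture |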
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2352340925006699 |
| _version_ | 1849765103655714816 |
|---|---|
| author | Md. Fahim Ferdous; Faysal Bin Khaled Nissan; Nur Muhammad Nibir; Md. Hasan Imam Bijoy |
| author_sort | Md. Fahim Ferdous |
| collection | DOAJ |
| description | This study presents a large, curated, and manually validated dataset for classifying leaf quality into five categories: Healthy, Bacterial Spot, Shot Hole, Yellow, and Powdery Mildew. The dataset covers four plant species, Cinnamomum Camphora (Camphor), Terminalia Chebula (Haritaki), Moringa Oleifera (Sojina), and Azadirachta Indica (Neem), each represented across three or four disease categories depending on the observed symptoms, for a total of thirteen classes. Data were collected between November 1, 2024, and January 5, 2025, using four different mobile cameras to ensure diversity in image resolution, lighting, and environmental conditions. The original dataset comprised 10,858 high-resolution images, which were expanded to 65,148 by applying six data augmentation techniques, including rotations (45°, 60°, and 90°), horizontal flipping, zooming, and brightness adjustment. All images were standardized to 512×512 pixels for uniformity and compatibility with machine learning and computer vision models. The dataset is intended as a resource for developing automated plant disease detection systems and supports work in precision agriculture. It addresses the need for scalable, high-quality data in agricultural research and provides a foundation for benchmarking novel deep learning architectures. By enabling more accurate and efficient leaf disease classification, the dataset contributes to tree health monitoring, improved crop yield, and sustainable agricultural practices. |
| format | Article |
| id | doaj-art-39afa3e6992240669d6bbaf94f5efbea |
| institution | DOAJ |
| issn | 2352-3409 |
| language | English |
| publishDate | 2025-10-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Data in Brief |
| spelling | doaj-art-39afa3e6992240669d6bbaf94f5efbea; 2025-08-20T03:04:58Z; eng; Elsevier; Data in Brief; ISSN 2352-3409; 2025-10-01; vol. 62; article 111945; DOI 10.1016/j.dib.2025.111945. Authors: Md. Fahim Ferdous, Faysal Bin Khaled Nissan, Nur Muhammad Nibir, Md. Hasan Imam Bijoy (corresponding author). Affiliation (all authors): Department of Computer Science and Engineering, Daffodil International University, Dhaka 1216, Bangladesh. |
| title | AI-MedLeafX: a large-scale computer vision dataset for medicinal plant diagnosis (Mendeley Data) |
| topic | Medicinal leaf dataset; Medicinal leaf classification; Agriculture informatics; Machine learning; Computer vision; Agriculture |
| url | http://www.sciencedirect.com/science/article/pii/S2352340925006699 |
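
The description field above mentions horizontal flipping, 90° rotation, and brightness adjustment among the augmentation techniques, and reports 65,148 images derived from 10,858 originals. The following is a minimal, illustrative sketch of those three operations on a tiny grayscale grid, using only the Python standard library. It is not the authors' pipeline: the function names, the 1.5 brightness factor, and the list-of-lists image representation are assumptions made for illustration. The 45° and 60° rotations and zooming need interpolation and are left out. The last helper only checks the arithmetic under the assumption that each of the six techniques yields one image per original.

```python
# Illustrative sketch only; names and parameters are hypothetical,
# not taken from the AI-MedLeafX authors' code.
from typing import List

Image = List[List[int]]  # grayscale pixel rows, values 0..255


def flip_horizontal(img: Image) -> Image:
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]


def rotate_90_clockwise(img: Image) -> Image:
    """Rotate 90 degrees clockwise: reverse the row order, then transpose."""
    return [list(col) for col in zip(*img[::-1])]


def adjust_brightness(img: Image, factor: float) -> Image:
    """Scale every pixel by `factor`, clamping results to 0..255."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in img]


def augmented_count(n_originals: int, n_techniques: int) -> int:
    """Images produced if each technique yields one image per original (assumption)."""
    return n_originals * n_techniques


tiny = [[0, 100],
        [200, 255]]

flipped = flip_horizontal(tiny)
rotated = rotate_90_clockwise(tiny)
brighter = adjust_brightness(tiny, 1.5)  # 1.5 is an arbitrary example factor
total = augmented_count(10_858, 6)
```

Under that one-image-per-technique assumption, `total` equals the 65,148 figure reported in the description. Whether that figure also counts the original images is not stated in the record.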