Informed-Learning-Guided Visual Question Answering Model of Crop Disease
In contemporary agriculture, experts develop preventative and remedial strategies for various disease stages in diverse crops. Decision-making regarding the stages of disease occurrence exceeds the capabilities of single-image tasks, such as image classification and object detection. Consequently, r...
| Main Authors: | Yunpeng Zhao, Shansong Wang, Qingtian Zeng, Weijian Ni, Hua Duan, Nengfu Xie, Fengjin Xiao |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | American Association for the Advancement of Science (AAAS), 2024-01-01 |
| Series: | Plant Phenomics |
| Online Access: | https://spj.science.org/doi/10.34133/plantphenomics.0277 |
| _version_ | 1850115354746945536 |
|---|---|
| author | Yunpeng Zhao, Shansong Wang, Qingtian Zeng, Weijian Ni, Hua Duan, Nengfu Xie, Fengjin Xiao |
| collection | DOAJ |
| description | In contemporary agriculture, experts develop preventative and remedial strategies for various disease stages in diverse crops. Decision-making regarding the stages of disease occurrence exceeds the capabilities of single-image tasks, such as image classification and object detection. Consequently, research now focuses on training visual question answering (VQA) models. However, existing studies concentrate on identifying disease species rather than formulating questions that encompass multiple crucial attributes. Additionally, model performance is sensitive to the model structure and to dataset biases. To address these challenges, we construct the informed-learning-guided VQA model of crop disease (ILCD). ILCD improves model performance by integrating coattention, a multimodal fusion model (MUTAN), and a bias-balancing (BiBa) strategy. To facilitate the investigation of various visual attributes of crop diseases and the determination of disease occurrence stages, we construct a new VQA dataset called the Crop Disease Multi-attribute VQA with Prior Knowledge (CDwPK-VQA). This dataset contains comprehensive information on various visual attributes such as shape, size, status, and color. We expand the dataset by integrating prior knowledge into CDwPK-VQA to address performance challenges. In comparative experiments on the VQA-v2, VQA-CP v2, and CDwPK-VQA datasets, ILCD achieves accuracies of 68.90%, 49.75%, and 86.06%, respectively. Ablation experiments are conducted on CDwPK-VQA to evaluate the effectiveness of various modules, including coattention, MUTAN, and BiBa. These experiments demonstrate that ILCD exhibits the highest level of accuracy, performance, and value in the field of agriculture. The source code is available at https://github.com/SdustZYP/ILCD-master/tree/main. |
| format | Article |
| id | doaj-art-8fbf451e07d6461286eb5290be26ffbe |
| institution | OA Journals |
| issn | 2643-6515 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | American Association for the Advancement of Science (AAAS) |
| record_format | Article |
| series | Plant Phenomics |
| title | Informed-Learning-Guided Visual Question Answering Model of Crop Disease |
| url | https://spj.science.org/doi/10.34133/plantphenomics.0277 |