Informed-Learning-Guided Visual Question Answering Model of Crop Disease

Bibliographic Details
Main Authors: Yunpeng Zhao, Shansong Wang, Qingtian Zeng, Weijian Ni, Hua Duan, Nengfu Xie, Fengjin Xiao
Format: Article
Language: English
Published: American Association for the Advancement of Science (AAAS), 2024-01-01
Series: Plant Phenomics
Online Access: https://spj.science.org/doi/10.34133/plantphenomics.0277
author Yunpeng Zhao
Shansong Wang
Qingtian Zeng
Weijian Ni
Hua Duan
Nengfu Xie
Fengjin Xiao
collection DOAJ
description In contemporary agriculture, experts develop preventive and remedial strategies for the different disease stages of diverse crops. Deciding which stage a disease has reached exceeds the capabilities of single-image tasks such as image classification and object detection. Consequently, research now focuses on training visual question answering (VQA) models. However, existing studies concentrate on identifying disease species rather than formulating questions that cover multiple crucial attributes. Additionally, model performance is sensitive to model structure and dataset biases. To address these challenges, we construct the informed-learning-guided VQA model of crop disease (ILCD). ILCD improves model performance by integrating coattention, a multimodal fusion model (MUTAN), and a bias-balancing (BiBa) strategy. To facilitate the investigation of various visual attributes of crop diseases and the determination of disease occurrence stages, we construct a new VQA dataset called the Crop Disease Multi-attribute VQA with Prior Knowledge (CDwPK-VQA). This dataset contains comprehensive information on visual attributes such as shape, size, status, and color. We expand the dataset by integrating prior knowledge into CDwPK-VQA to address performance challenges. In comparative experiments on the VQA-v2, VQA-CP v2, and CDwPK-VQA datasets, ILCD achieves accuracies of 68.90%, 49.75%, and 86.06%, respectively. Ablation experiments on CDwPK-VQA evaluate the effectiveness of the individual modules, including coattention, MUTAN, and BiBa. These experiments demonstrate that ILCD exhibits the highest level of accuracy, performance, and value in the field of agriculture. The source code is available at https://github.com/SdustZYP/ILCD-master/tree/main.
format Article
id doaj-art-8fbf451e07d6461286eb5290be26ffbe
institution OA Journals
issn 2643-6515
language English
publishDate 2024-01-01
publisher American Association for the Advancement of Science (AAAS)
record_format Article
series Plant Phenomics
spelling doaj-art-8fbf451e07d6461286eb5290be26ffbe
2025-08-20T02:36:35Z
eng
American Association for the Advancement of Science (AAAS)
Plant Phenomics
2643-6515
2024-01-01
6
10.34133/plantphenomics.0277
Informed-Learning-Guided Visual Question Answering Model of Crop Disease
Yunpeng Zhao (College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China)
Shansong Wang (College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China)
Qingtian Zeng (College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China)
Weijian Ni (College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China)
Hua Duan (College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China)
Nengfu Xie (Agricultural Information Institute of CAAS, Beijing 100081, China)
Fengjin Xiao (National Climate Center, Beijing 100081, China)
https://spj.science.org/doi/10.34133/plantphenomics.0277
title Informed-Learning-Guided Visual Question Answering Model of Crop Disease
url https://spj.science.org/doi/10.34133/plantphenomics.0277