PlanText: Gradually Masked Guidance to Align Image Phenotypes with Trait Descriptions for Plant Disease Texts

Bibliographic Details
Main Authors: Kejun Zhao, Xingcai Wu, Yuanyuan Xiao, Sijun Jiang, Peijia Yu, Yazhou Wang, Qi Wang
Format: Article
Language: English
Published: American Association for the Advancement of Science (AAAS) 2024-01-01
Series: Plant Phenomics
Online Access: https://spj.science.org/doi/10.34133/plantphenomics.0272
collection DOAJ
description Plant diseases are a critical driver of the global food crisis. The integration of advanced artificial intelligence technologies can substantially enhance plant disease diagnostics. However, early and complex detection remains challenging for current methods. Employing multimodal technologies, akin to medical artificial intelligence diagnostics that combine diverse data types, may offer a more effective solution. Presently, plant disease research relies predominantly on single-modal data, which limits the scope for early and detailed diagnosis. Consequently, developing text modality generation techniques is essential for overcoming the limitations in plant disease recognition. To this end, we propose a method for aligning plant phenotypes with trait descriptions, which generates diagnostic text by progressively masking disease images. First, for training and validation, we annotate 5,728 disease phenotype images with expert diagnostic text and provide annotated text and trait labels for 210,000 disease images. Then, we propose a PhenoTrait text description model, which consists of global and heterogeneous feature encoders as well as switching-attention decoders, for accurate context-aware output. Next, to generate a more phenotypically appropriate description, we adopt 3 stages of embedding image features into semantic structures, which generate characterizations that preserve trait features. Finally, our experimental results show that our model outperforms several frontier models in multiple trait descriptions, including the larger models GPT-4 and GPT-4o. Our code and dataset are available at https://plantext.samlab.cn/.
id doaj-art-e33bf9a3f5cb4630bf64b0c98a708520
institution OA Journals
issn 2643-6515
spelling doaj-art-e33bf9a3f5cb4630bf64b0c98a708520
2025-08-20T02:23:11Z
eng
Plant Phenomics, 2643-6515, 2024-01-01, volume 6
10.34133/plantphenomics.0272
Kejun Zhao (0), Xingcai Wu (1), Yuanyuan Xiao (2), Sijun Jiang (3), Peijia Yu (4), Yazhou Wang (5), Qi Wang (6)
0-4, 6: State Key Laboratory of Public Big Data, School of Computer Science and Technology, Guizhou University, Guiyang 550025, China.
5: School of Information, Guizhou University of Finance and Economics, Guiyang 550025, China.