PlanText: Gradually Masked Guidance to Align Image Phenotypes with Trait Descriptions for Plant Disease Texts
Plant diseases are a critical driver of the global food crisis. The integration of advanced artificial intelligence technologies can substantially enhance plant disease diagnostics. However, current methods for early and complex detection remain challenging. Employing multimodal technologies, akin t...
| Main Authors: | Kejun Zhao, Xingcai Wu, Yuanyuan Xiao, Sijun Jiang, Peijia Yu, Yazhou Wang, Qi Wang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | American Association for the Advancement of Science (AAAS), 2024-01-01 |
| Series: | Plant Phenomics |
| Online Access: | https://spj.science.org/doi/10.34133/plantphenomics.0272 |
| _version_ | 1850160314020003840 |
|---|---|
| author | Kejun Zhao, Xingcai Wu, Yuanyuan Xiao, Sijun Jiang, Peijia Yu, Yazhou Wang, Qi Wang |
| author_sort | Kejun Zhao |
| collection | DOAJ |
| description | Plant diseases are a critical driver of the global food crisis. The integration of advanced artificial intelligence technologies can substantially enhance plant disease diagnostics. However, current methods for early and complex detection remain challenging. Employing multimodal technologies, akin to medical artificial intelligence diagnostics that combine diverse data types, may offer a more effective solution. Presently, the reliance on single-modal data predominates in plant disease research, which limits the scope for early and detailed diagnosis. Consequently, developing text modality generation techniques is essential for overcoming the limitations in plant disease recognition. To this end, we propose a method for aligning plant phenotypes with trait descriptions, which diagnoses text by progressively masking disease images. First, for training and validation, we annotate 5,728 disease phenotype images with expert diagnostic text and provide annotated text and trait labels for 210,000 disease images. Then, we propose a PhenoTrait text description model, which consists of global and heterogeneous feature encoders as well as switching-attention decoders, for accurate context-aware output. Next, to generate a more phenotypically appropriate description, we adopt 3 stages of embedding image features into semantic structures, which generate characterizations that preserve trait features. Finally, our experimental results show that our model outperforms several frontier models in multiple trait descriptions, including the larger models GPT-4 and GPT-4o. Our code and dataset are available at https://plantext.samlab.cn/. |
| format | Article |
| id | doaj-art-e33bf9a3f5cb4630bf64b0c98a708520 |
| institution | OA Journals |
| issn | 2643-6515 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | American Association for the Advancement of Science (AAAS) |
| record_format | Article |
| series | Plant Phenomics |
| spelling | doaj-art-e33bf9a3f5cb4630bf64b0c98a708520; 2025-08-20T02:23:11Z; eng; American Association for the Advancement of Science (AAAS); Plant Phenomics; 2643-6515; 2024-01-01; 6; 10.34133/plantphenomics.0272; PlanText: Gradually Masked Guidance to Align Image Phenotypes with Trait Descriptions for Plant Disease Texts. Authors: Kejun Zhao (0), Xingcai Wu (1), Yuanyuan Xiao (2), Sijun Jiang (3), Peijia Yu (4), Yazhou Wang (5), Qi Wang (6). Affiliations: (0, 1, 2, 3, 4, 6) State Key Laboratory of Public Big Data, School of Computer Science and Technology, Guizhou University, Guiyang 550025, China; (5) School of Information, Guizhou University of Finance and Economics, Guiyang 550025, China. https://spj.science.org/doi/10.34133/plantphenomics.0272 |
| title | PlanText: Gradually Masked Guidance to Align Image Phenotypes with Trait Descriptions for Plant Disease Texts |
| url | https://spj.science.org/doi/10.34133/plantphenomics.0272 |