Large-scale tobacco identification via a very-high-resolution unmanned aerial vehicle benchmark and a ConvFlow Transformer
| Main Authors: | , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier 2025-05-01 |
| Series: | International Journal of Applied Earth Observations and Geoinformation |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S1569843225001967 |
| Summary: | Remote sensing and artificial intelligence technology have propelled the development of precision and smart agriculture. Tobacco, a crucial economic crop, has rarely been studied, and its large-scale identification has consistently faced several challenges. Firstly, tobacco is often inter-cropped with other crops, such as corn. These crops have similar colors and textures, with only minor differences in planting spacing and arrangement, and these slight differences become even less observable in remote sensing imagery. Secondly, tobacco growth is a continuous, evolving process, so the plant's characteristics differ drastically across growth stages and seasons, which further complicates identification. Moreover, to the best of the authors' knowledge, no tobacco dataset is publicly accessible, which impedes the development of a well-performing deep learning (DL) model. Therefore, this paper constructs the Large-scale UAV remote SEnsing Tobacco dataset (LUSET), the world’s first tobacco dataset, with a total volume of 67 GB. Ten large-scale images in LUSET, with an average resolution of about 20,000 × 20,000 pixels, are accurately annotated and can be divided into 7,252 samples of 512 × 512 pixels (https://github.com/Monkeycrop/UAV-Tobacco-Dataset). Then, a dual-branch ConvFlow Transformer is proposed to address tobacco’s rich diversity and the high inter-class similarity among different crops. Within the ConvFlow Transformer, a novel Convolutional Feature-enhanced Multi-Head Self-attention (CF-MHSA) with a location-free design replaces the value matrix of standard attention with convolutional multi-scale features, which enables feature interaction and fusion between the convolutional and transformer branches. Fusing these refined features makes it easier to distinguish the texture characteristics of different crops and to represent their morphological features across growth cycles, which addresses the two major challenges in tobacco recognition. Extensive experiments on the UAV tobacco data demonstrate that the ConvFlow Transformer strategy can be easily implemented in mainstream Transformers and significantly improves their performance in tobacco identification at a small additional computational cost. |
|---|---|
| ISSN: | 1569-8432 |
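
The summary describes CF-MHSA as replacing the value matrix of standard attention with convolutional multi-scale features. The following is a minimal single-head sketch of that one idea, not the authors' implementation: the function name `conv_value_attention`, the weight matrices, the toy dimensions, and the random inputs are all assumptions made for illustration, and the multi-head, multi-scale, and dual-branch details of the paper are omitted. Queries and keys come from the transformer-branch tokens, values come from convolution-branch features, and no positional encoding is added.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_rows(m):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(x - mx) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def conv_value_attention(tokens, conv_feats, w_q, w_k, w_v):
    """Hypothetical single-head attention with values taken from conv features.

    tokens:     n x d matrix of transformer-branch tokens
    conv_feats: n x c matrix of convolution-branch features (one row per token)
    """
    q = matmul(tokens, w_q)        # queries from the transformer branch
    k = matmul(tokens, w_k)        # keys from the transformer branch
    v = matmul(conv_feats, w_v)    # values from the convolutional branch (assumption)
    k_t = [list(col) for col in zip(*k)]
    scale = math.sqrt(len(q[0]))
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    attn = softmax_rows(scores)    # n x n attention weights, rows sum to 1
    return matmul(attn, v), attn


def rand_matrix(rows, cols, rng):
    """Random Gaussian matrix used only as placeholder data."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n, d, c, d_k, d_v = 6, 8, 5, 4, 4  # toy sizes, not taken from the paper
tokens = rand_matrix(n, d, rng)
conv_feats = rand_matrix(n, c, rng)
out, attn = conv_value_attention(
    tokens,
    conv_feats,
    rand_matrix(d, d_k, rng),
    rand_matrix(d, d_k, rng),
    rand_matrix(c, d_v, rng),
)
print(len(out), len(out[0]))  # 6 4
```

Because the attention weights are still computed from the tokens alone, swapping the value source changes only what is aggregated, which is one plausible reading of why the summary says the strategy can be added to mainstream Transformers at a small computational cost.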