Optimized DINO model for accurate object detection of sesame seedlings and weeds
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-04-01 |
| Series: | Scientific Reports |
| Subjects: | |
| Online Access: | https://doi.org/10.1038/s41598-025-96826-6 |
| Summary: | The application of intelligent agricultural machinery is crucial in modern agricultural production. However, when the target and its surroundings are morphologically very similar, as when distinguishing sesame seedlings from weeds, the problem essentially becomes one of optimizing edge detection algorithms for similar targets. To address this issue in agricultural object detection, we developed a custom dataset containing 1,300 images of sesame seedlings and weeds. To overcome the high complexity and low detection accuracy of the original DINO model on this problem, the backbone network was replaced with MobileNet V3, the SENet attention mechanism and neck structure were optimized, and the H-Swish6 activation function was introduced to suit edge devices. Given the higher degree of lignification in the stems of sesame seedlings, these modifications improved the overall Average Precision (AP) of the model on the COCO dataset by 5.1% compared to the original DINO model. Specifically, $\text{AP}_{S}$ and $\text{AP}_{M}$ increased by 3.3% and 3.8%, respectively, while $\text{AP}_{50}$ and $\text{AP}_{75}$ increased by 2.3% and 3.2%. The model’s parameter count was reduced to 29M, inference time was lowered by 60%, and computational cost in FLOPs decreased by 43.72%. To verify the effectiveness of the improvements, the model was also evaluated on the custom dataset of sesame seedlings and weeds. On this dataset, the improved DINO model achieved a maximum AP of 81.8%, outperforming the YOLOv7 model by 5.6%, at 24 frames per second (FPS). Ablation experiments confirmed the effectiveness of the model improvements. However, existing studies have not addressed the issue of low detection accuracy in scenarios with similar targets in the agricultural domain. |
|---|---|
| ISSN: | 2045-2322 |
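
The summary names an H-Swish activation and an SENet attention mechanism without giving formulas. The following is a minimal, dependency-free sketch of those two building blocks as they are commonly defined for MobileNet V3. It assumes that "H-Swish6" refers to the standard hard-swish, x · ReLU6(x + 3) / 6, and that the squeeze-and-excitation gate uses a hard-sigmoid. The function names, weight layout, and toy values are hypothetical and are not taken from the article.

```python
def relu6(x: float) -> float:
    """Clamp x to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def hard_sigmoid(x: float) -> float:
    """Piecewise-linear approximation of the sigmoid: ReLU6(x + 3) / 6."""
    return relu6(x + 3.0) / 6.0


def hard_swish(x: float) -> float:
    """Standard hard-swish: x * ReLU6(x + 3) / 6.

    Assumption: this is what the summary calls 'H-Swish6'.
    """
    return x * hard_sigmoid(x)


def se_gate(channels, w_reduce, w_expand):
    """Squeeze-and-excitation over a list of 2D feature maps (one per channel).

    w_reduce: hidden x C weight rows, w_expand: C x hidden weight rows.
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling, one scalar per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in channels]
    # Excitation: bottleneck layer with ReLU, then a hard-sigmoid gate.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    gates = [hard_sigmoid(sum(w * h for w, h in zip(row, hidden)))
             for row in w_expand]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(channels, gates)]


# Toy usage with hypothetical weights: two 2x2 channels, one hidden unit.
feature_maps = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
rescaled = se_gate(feature_maps, w_reduce=[[1.0, 1.0]], w_expand=[[1.0], [-1.0]])
```

Both functions use only clamping, multiplication, and addition, which is the usual motivation for preferring them over the exponential-based swish and sigmoid on edge devices.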