Lightweight Transformer with Adaptive Rotational Convolutions for Aerial Object Detection
Oriented object detection in aerial imagery presents unique challenges due to the arbitrary orientations, diverse scales, and limited availability of labeled data. In response to these issues, we propose RASST—a lightweight Rotationally Aware Semi-Supervised Transformer framework designed to achieve...
| Main Authors: | Sabina Umirzakova, Shakhnoza Muksimova, Abrayeva Mahliyo Olimjon Qizi, Young Im Cho |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Applied Sciences |
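| Subjects: | lightweight vision transformer; oriented object detection; semi-supervised learning; rotational invariance; pseudo-labeling; adaptive rotational convolution |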
| Online Access: | https://www.mdpi.com/2076-3417/15/9/5212 |
| _version_ | 1850279237030772736 |
|---|---|
| author | Sabina Umirzakova; Shakhnoza Muksimova; Abrayeva Mahliyo Olimjon Qizi; Young Im Cho |
| author_facet | Sabina Umirzakova; Shakhnoza Muksimova; Abrayeva Mahliyo Olimjon Qizi; Young Im Cho |
| author_sort | Sabina Umirzakova |
| collection | DOAJ |
| description | Oriented object detection in aerial imagery presents unique challenges due to the arbitrary orientations, diverse scales, and limited availability of labeled data. In response to these issues, we propose RASST—a lightweight Rotationally Aware Semi-Supervised Transformer framework designed to achieve high-precision detection under fully and semi-supervised conditions. RASST integrates a hybrid Vision Transformer architecture augmented with rotationally aware patch embeddings, adaptive rotational convolutions, and a multi-scale feature fusion (MSFF) module that employs cross-scale attention to enhance detection across object sizes. To address the scarcity of labeled data, we introduce a novel Pseudo-Label Guided Learning (PGL) framework, which refines pseudo-labels through Rotation-Aware Adaptive Weighting (RAW) and Global Consistency (GC) losses, thereby improving generalization and robustness against noisy supervision. Despite its lightweight design, RASST achieves superior performance on the DOTA-v1.5 benchmark, outperforming existing state-of-the-art methods in supervised and semi-supervised settings. The proposed framework demonstrates high scalability, precise orientation sensitivity, and effective utilization of unlabeled data, establishing a new benchmark for efficient oriented object detection in remote sensing imagery. |
| format | Article |
| id | doaj-art-dac1f884bfbf406c8b830caa31ed3698 |
| institution | OA Journals |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-dac1f884bfbf406c8b830caa31ed3698; 2025-08-20T01:49:10Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-05-01; vol. 15, no. 9, art. 5212; 10.3390/app15095212; Lightweight Transformer with Adaptive Rotational Convolutions for Aerial Object Detection; Sabina Umirzakova (0); Shakhnoza Muksimova (1); Abrayeva Mahliyo Olimjon Qizi (2); Young Im Cho (3); (0) Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Republic of Korea; (1) Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Republic of Korea; (2) Department of “Information Systems and Technologies”, Tashkent State University of Economics, Tashkent 100066, Uzbekistan; (3) Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Republic of Korea; Oriented object detection in aerial imagery presents unique challenges due to the arbitrary orientations, diverse scales, and limited availability of labeled data. In response to these issues, we propose RASST, a lightweight Rotationally Aware Semi-Supervised Transformer framework designed to achieve high-precision detection under fully and semi-supervised conditions. RASST integrates a hybrid Vision Transformer architecture augmented with rotationally aware patch embeddings, adaptive rotational convolutions, and a multi-scale feature fusion (MSFF) module that employs cross-scale attention to enhance detection across object sizes. To address the scarcity of labeled data, we introduce a novel Pseudo-Label Guided Learning (PGL) framework, which refines pseudo-labels through Rotation-Aware Adaptive Weighting (RAW) and Global Consistency (GC) losses, thereby improving generalization and robustness against noisy supervision. Despite its lightweight design, RASST achieves superior performance on the DOTA-v1.5 benchmark, outperforming existing state-of-the-art methods in supervised and semi-supervised settings. The proposed framework demonstrates high scalability, precise orientation sensitivity, and effective utilization of unlabeled data, establishing a new benchmark for efficient oriented object detection in remote sensing imagery.; https://www.mdpi.com/2076-3417/15/9/5212; lightweight vision transformer; oriented object detection; semi-supervised learning; rotational invariance; pseudo-labeling; adaptive rotational convolution |
| spellingShingle | Sabina Umirzakova; Shakhnoza Muksimova; Abrayeva Mahliyo Olimjon Qizi; Young Im Cho; Lightweight Transformer with Adaptive Rotational Convolutions for Aerial Object Detection; Applied Sciences; lightweight vision transformer; oriented object detection; semi-supervised learning; rotational invariance; pseudo-labeling; adaptive rotational convolution |
| title | Lightweight Transformer with Adaptive Rotational Convolutions for Aerial Object Detection |
| title_full | Lightweight Transformer with Adaptive Rotational Convolutions for Aerial Object Detection |
| title_fullStr | Lightweight Transformer with Adaptive Rotational Convolutions for Aerial Object Detection |
| title_full_unstemmed | Lightweight Transformer with Adaptive Rotational Convolutions for Aerial Object Detection |
| title_short | Lightweight Transformer with Adaptive Rotational Convolutions for Aerial Object Detection |
| title_sort | lightweight transformer with adaptive rotational convolutions for aerial object detection |
| topic | lightweight vision transformer; oriented object detection; semi-supervised learning; rotational invariance; pseudo-labeling; adaptive rotational convolution |
| url | https://www.mdpi.com/2076-3417/15/9/5212 |
| work_keys_str_mv | AT sabinaumirzakova lightweighttransformerwithadaptiverotationalconvolutionsforaerialobjectdetection AT shakhnozamuksimova lightweighttransformerwithadaptiverotationalconvolutionsforaerialobjectdetection AT abrayevamahliyoolimjonqizi lightweighttransformerwithadaptiverotationalconvolutionsforaerialobjectdetection AT youngimcho lightweighttransformerwithadaptiverotationalconvolutionsforaerialobjectdetection |