SuperMOT: Decoupling Motion and Fusing Temporal Pyramid Features for UAV Multiobject Tracking
Multiobject tracking in uncrewed aerial vehicle (UAV) videos is a critical research topic with applications in various fields. However, traditional multiobject trackers demonstrate limited generalization in UAV videos due to challenges such as nonlinear object motion caused by UAV movement and the m...
| Main Authors: | Libo Ren, Wenxin Yin, Wenhui Diao, Kun Fu, Xian Sun |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE 2025-01-01 |
| Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
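| Subjects: | Multiobject tracking (MOT); super resolution (SR); temporal feature fusion; uncrewed aerial vehicle (UAV) video |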
| Online Access: | https://ieeexplore.ieee.org/document/10972048/ |
| _version_ | 1850130480275390464 |
|---|---|
| author | Libo Ren, Wenxin Yin, Wenhui Diao, Kun Fu, Xian Sun |
| collection | DOAJ |
| description | Multiobject tracking in uncrewed aerial vehicle (UAV) videos is a critical research topic with applications in various fields. However, traditional multiobject trackers demonstrate limited generalization in UAV videos due to challenges such as nonlinear object motion caused by UAV movement and the multiscale appearance of small objects caused by the oblique imaging angles of UAVs. Therefore, this article proposes a novel method called SuperMOT, specifically designed for multiobject tracking in UAV videos. SuperMOT introduces a pyramid deformable alignment (PDA) module, which aligns and fuses temporal semantic features across multiple levels to enhance object recognition. The input sizes commonly selected by mainstream UAV multiobject trackers to reduce computational load are observed to be suboptimal for capturing the semantic features of small objects. To address this limitation, a lightweight video super-resolution (VSR) module is integrated into the training pipeline to incorporate high-resolution information. Furthermore, to address complex motions in the UAV view, a motion decoupling (MD) module is developed to process object motion and platform motion separately. Experimental results on the VisDrone2019 and UAVDT datasets indicate the effectiveness and real-time capability of SuperMOT, which achieves state-of-the-art performance. |
| format | Article |
| id | doaj-art-c2260a3314ec4597bb6658125aa4bb58 |
| institution | OA Journals |
| issn | 1939-1404, 2151-1535 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| spelling | doaj-art-c2260a3314ec4597bb6658125aa4bb58; 2025-08-20T02:32:41Z; eng; IEEE; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; ISSN 1939-1404, 2151-1535; 2025-01-01; vol. 18, pp. 14188-14202; DOI 10.1109/JSTARS.2025.3563060; 10972048; SuperMOT: Decoupling Motion and Fusing Temporal Pyramid Features for UAV Multiobject Tracking; Libo Ren (https://orcid.org/0009-0006-0375-183X), Wenxin Yin (https://orcid.org/0000-0003-0157-7947), Wenhui Diao (https://orcid.org/0000-0002-3931-3974), Kun Fu (https://orcid.org/0000-0002-0450-6469), Xian Sun (https://orcid.org/0000-0002-0038-9816); all authors: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China; https://ieeexplore.ieee.org/document/10972048/; Multiobject tracking (MOT); super resolution (SR); temporal feature fusion; uncrewed aerial vehicle (UAV) video |
| title | SuperMOT: Decoupling Motion and Fusing Temporal Pyramid Features for UAV Multiobject Tracking |
| topic | Multiobject tracking (MOT); super resolution (SR); temporal feature fusion; uncrewed aerial vehicle (UAV) video |
| url | https://ieeexplore.ieee.org/document/10972048/ |