SuperMOT: Decoupling Motion and Fusing Temporal Pyramid Features for UAV Multiobject Tracking


Bibliographic Details
Main Authors: Libo Ren, Wenxin Yin, Wenhui Diao, Kun Fu, Xian Sun
Format: Article
Language: English
Published: IEEE 2025-01-01
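Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects: Multiobject tracking (MOT); super resolution (SR); temporal feature fusion; uncrewed aerial vehicle (UAV) video
Online Access: https://ieeexplore.ieee.org/document/10972048/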
author Libo Ren
Wenxin Yin
Wenhui Diao
Kun Fu
Xian Sun
author_sort Libo Ren
collection DOAJ
description Multiobject tracking in uncrewed aerial vehicle (UAV) videos is a critical research topic with applications in various fields. However, traditional multiobject trackers generalize poorly to UAV videos because of challenges such as nonlinear object motion caused by UAV movement and the multiscale appearance of small objects caused by the oblique imaging angles of UAVs. Therefore, this article proposes SuperMOT, a method designed specifically for multiobject tracking in UAV videos. SuperMOT introduces a pyramid deformable alignment (PDA) module, which aligns and fuses temporal semantic features across multiple levels to enhance object recognition. The input sizes that mainstream UAV multiobject trackers commonly select to reduce computational load are observed to be suboptimal for capturing the semantic features of small objects. To address this limitation, a lightweight video super-resolution (VSR) module is integrated into the training pipeline to incorporate high-resolution information. Furthermore, to handle complex motion in the UAV view, a motion decoupling (MD) module is developed to process object motion and platform motion separately. Experimental results on the VisDrone2019 and UAVDT datasets indicate the effectiveness and real-time capability of SuperMOT, which achieves state-of-the-art performance.
format Article
id doaj-art-c2260a3314ec4597bb6658125aa4bb58
institution OA Journals
issn 1939-1404
2151-1535
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling doaj-art-c2260a3314ec4597bb6658125aa4bb58
2025-08-20T02:32:41Z
eng
IEEE
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
1939-1404
2151-1535
2025-01-01
Volume 18, pages 14188-14202
DOI 10.1109/JSTARS.2025.3563060
10972048
SuperMOT: Decoupling Motion and Fusing Temporal Pyramid Features for UAV Multiobject Tracking
Libo Ren, https://orcid.org/0009-0006-0375-183X, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
Wenxin Yin, https://orcid.org/0000-0003-0157-7947, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
Wenhui Diao, https://orcid.org/0000-0002-3931-3974, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
Kun Fu, https://orcid.org/0000-0002-0450-6469, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
Xian Sun, https://orcid.org/0000-0002-0038-9816, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
https://ieeexplore.ieee.org/document/10972048/
Multiobject tracking (MOT)
super resolution (SR)
temporal feature fusion
uncrewed aerial vehicle (UAV) video
title SuperMOT: Decoupling Motion and Fusing Temporal Pyramid Features for UAV Multiobject Tracking
topic Multiobject tracking (MOT)
super resolution (SR)
temporal feature fusion
uncrewed aerial vehicle (UAV) video
url https://ieeexplore.ieee.org/document/10972048/