CMDN: Pre-Trained Visual Representations Boost Adversarial Robustness for UAV Tracking

Visual object tracking is widely adopted to unmanned aerial vehicle (UAV)-related applications, which demand reliable tracking precision and real-time performance. However, UAV trackers are highly susceptible to adversarial attacks, while research on developing effective adversarial defense methods...

Bibliographic Details
Main Authors: Ruilong Yu, Zhewei Wu, Qihe Liu, Shijie Zhou, Min Gou, Bingchen Xiang
Format: Article
Language: English
Published: MDPI AG 2024-10-01
Series: Drones
Subjects: unmanned aerial vehicle; adversarial defense; visual object tracking
Online Access: https://www.mdpi.com/2504-446X/8/11/607
_version_ 1850145185749532672
author Ruilong Yu
Zhewei Wu
Qihe Liu
Shijie Zhou
Min Gou
Bingchen Xiang
author_facet Ruilong Yu
Zhewei Wu
Qihe Liu
Shijie Zhou
Min Gou
Bingchen Xiang
author_sort Ruilong Yu
collection DOAJ
description Visual object tracking is widely adopted to unmanned aerial vehicle (UAV)-related applications, which demand reliable tracking precision and real-time performance. However, UAV trackers are highly susceptible to adversarial attacks, while research on developing effective adversarial defense methods for UAV tracking remains limited. To tackle these challenges, we propose CMDN, a novel pre-processing defense network that effectively purifies adversarial perturbations by reconstructing video frames. This network learns robust visual representations from video frames, guided by meaningful features from both the search region and the template. Comprehensive experiments on three benchmarks demonstrate that CMDN is capable of enhancing a UAV tracker’s adversarial robustness in both adaptive and non-adaptive attack scenarios. In addition, CMDN maintains stable defense effectiveness when transferred to heterogeneous trackers. Real-world tests on the UAV platform also validate its reliable defense effectiveness and real-time performance, with CMDN achieving 27 FPS on NVIDIA Jetson Orin 16 GB (25 W mode).
format Article
id doaj-art-56f3d39bfad2430f9c4a7ac9f68ff07f
institution OA Journals
issn 2504-446X
language English
publishDate 2024-10-01
publisher MDPI AG
record_format Article
series Drones
spelling doaj-art-56f3d39bfad2430f9c4a7ac9f68ff07f
2025-08-20T02:28:09Z
eng
MDPI AG
Drones
2504-446X
2024-10-01
8
11
607
10.3390/drones8110607
CMDN: Pre-Trained Visual Representations Boost Adversarial Robustness for UAV Tracking
Ruilong Yu (0)
Zhewei Wu (1)
Qihe Liu (2)
Shijie Zhou (3)
Min Gou (4)
Bingchen Xiang (5)
(0-5) School of Information and Software Engineering, University of Electronic Science and Technology of China, No. 4, Section 2 Jianshebei Road, Chengdu 610000, China
Visual object tracking is widely adopted to unmanned aerial vehicle (UAV)-related applications, which demand reliable tracking precision and real-time performance. However, UAV trackers are highly susceptible to adversarial attacks, while research on developing effective adversarial defense methods for UAV tracking remains limited. To tackle these challenges, we propose CMDN, a novel pre-processing defense network that effectively purifies adversarial perturbations by reconstructing video frames. This network learns robust visual representations from video frames, guided by meaningful features from both the search region and the template. Comprehensive experiments on three benchmarks demonstrate that CMDN is capable of enhancing a UAV tracker’s adversarial robustness in both adaptive and non-adaptive attack scenarios. In addition, CMDN maintains stable defense effectiveness when transferred to heterogeneous trackers. Real-world tests on the UAV platform also validate its reliable defense effectiveness and real-time performance, with CMDN achieving 27 FPS on NVIDIA Jetson Orin 16 GB (25 W mode).
https://www.mdpi.com/2504-446X/8/11/607
unmanned aerial vehicle
adversarial defense
visual object tracking
spellingShingle Ruilong Yu
Zhewei Wu
Qihe Liu
Shijie Zhou
Min Gou
Bingchen Xiang
CMDN: Pre-Trained Visual Representations Boost Adversarial Robustness for UAV Tracking
Drones
unmanned aerial vehicle
adversarial defense
visual object tracking
title CMDN: Pre-Trained Visual Representations Boost Adversarial Robustness for UAV Tracking
title_full CMDN: Pre-Trained Visual Representations Boost Adversarial Robustness for UAV Tracking
title_fullStr CMDN: Pre-Trained Visual Representations Boost Adversarial Robustness for UAV Tracking
title_full_unstemmed CMDN: Pre-Trained Visual Representations Boost Adversarial Robustness for UAV Tracking
title_short CMDN: Pre-Trained Visual Representations Boost Adversarial Robustness for UAV Tracking
title_sort cmdn pre trained visual representations boost adversarial robustness for uav tracking
topic unmanned aerial vehicle
adversarial defense
visual object tracking
url https://www.mdpi.com/2504-446X/8/11/607
work_keys_str_mv AT ruilongyu cmdnpretrainedvisualrepresentationsboostadversarialrobustnessforuavtracking
AT zheweiwu cmdnpretrainedvisualrepresentationsboostadversarialrobustnessforuavtracking
AT qiheliu cmdnpretrainedvisualrepresentationsboostadversarialrobustnessforuavtracking
AT shijiezhou cmdnpretrainedvisualrepresentationsboostadversarialrobustnessforuavtracking
AT mingou cmdnpretrainedvisualrepresentationsboostadversarialrobustnessforuavtracking
AT bingchenxiang cmdnpretrainedvisualrepresentationsboostadversarialrobustnessforuavtracking