LandNet: Combine CNN and Transformer to Learn Absolute Camera Pose for the Fixed-Wing Aircraft Approach and Landing
Camera localization approaches often degrade in challenging environments characterized by illumination variations and significant viewpoint changes, presenting critical limitations for fixed-wing aircraft landing applications. To address these challenges, we propose LandNet—a novel absolute camera p...
| Main Authors: | Siyuan Shen, Guanfeng Yu, Lei Zhang, Youyu Yan, Zhengjun Zhai |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-02-01 |
| Series: | Remote Sensing |
| Subjects: | absolute camera regression; transformer; fixed-wing aircraft landing |
| Online Access: | https://www.mdpi.com/2072-4292/17/4/653 |
| _version_ | 1850081138870059008 |
|---|---|
| author | Siyuan Shen, Guanfeng Yu, Lei Zhang, Youyu Yan, Zhengjun Zhai |
| author_sort | Siyuan Shen |
| collection | DOAJ |
| description | Camera localization approaches often degrade in challenging environments characterized by illumination variations and significant viewpoint changes, which critically limits their use for fixed-wing aircraft landing. To address these challenges, we propose LandNet, a novel absolute camera pose estimation network designed specifically for airborne scenarios. Our framework processes images from forward-looking aircraft cameras to directly predict 6-DoF camera poses, from which the aircraft pose is then determined through a rigid transformation. First, we design two encoders, one based on a Transformer and one on a CNN, to capture complementary spatial–temporal features. A novel Feature Interactive Block (FIB) then makes full use of the spatial cues from the CNN encoder and the temporal cues from the Transformer encoder. We also introduce a novel Attentional Convtrans Fusion Block (ACFB) that fuses the feature maps from the CNN encoder and the Transformer encoder, enhancing the image representation and improving the accuracy of the estimated camera pose. Finally, two Multi-Layer Perceptron (MLP) heads estimate the camera position and orientation, respectively, which together form the 6-DoF pose. The position and orientation estimated by LandNet can then be used to obtain the position and orientation of the aircraft through the rigid connection between the airborne camera and the aircraft. Experimental results on simulation and real flight data demonstrate the effectiveness of the proposed method. |
| format | Article |
| id | doaj-art-aaa95f3bd91b4b5bac873a058db301e5 |
| institution | DOAJ |
| issn | 2072-4292 |
| language | English |
| publishDate | 2025-02-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Remote Sensing |
| spelling | doaj-art-aaa95f3bd91b4b5bac873a058db301e5; 2025-08-20T02:44:47Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2025-02-01; vol. 17, no. 4, art. 653; DOI 10.3390/rs17040653; LandNet: Combine CNN and Transformer to Learn Absolute Camera Pose for the Fixed-Wing Aircraft Approach and Landing; Siyuan Shen (School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China), Guanfeng Yu (AVIC Xi’an Aeronautics Computing Technique Research Institute, Xi’an 710068, China), Lei Zhang (AVIC Xi’an Aeronautics Computing Technique Research Institute, Xi’an 710068, China), Youyu Yan (School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China), Zhengjun Zhai (School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China); https://www.mdpi.com/2072-4292/17/4/653; absolute camera regression, transformer, fixed-wing aircraft landing |
| title | LandNet: Combine CNN and Transformer to Learn Absolute Camera Pose for the Fixed-Wing Aircraft Approach and Landing |
| topic | absolute camera regression; transformer; fixed-wing aircraft landing |
| url | https://www.mdpi.com/2072-4292/17/4/653 |
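
The description states that the aircraft pose follows from the predicted camera pose through the rigid connection between the airborne camera and the aircraft. A minimal sketch of that composition is given here, under assumptions that are not from the record: poses are written as a rotation matrix plus a translation, the network output is taken to be the camera pose in a world frame (`R_wc`, `t_wc`), and the mounting extrinsic (`R_cb`, `t_cb`, the aircraft body frame expressed in the camera frame) is a made-up calibration. The function name `aircraft_pose_from_camera` and all numeric values are hypothetical.

```python
import math


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(a, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def rot_z(angle_rad):
    """Rotation matrix for a rotation about the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def aircraft_pose_from_camera(R_wc, t_wc, R_cb, t_cb):
    """Compose the camera pose with the fixed camera-to-body extrinsic.

    T_wb = T_wc * T_cb, i.e.
        R_wb = R_wc @ R_cb
        t_wb = R_wc @ t_cb + t_wc
    """
    R_wb = mat_mul(R_wc, R_cb)
    t_wb = [r + t for r, t in zip(mat_vec(R_wc, t_cb), t_wc)]
    return R_wb, t_wb


# Hypothetical camera pose (stand-in for a network prediction):
# camera yawed 90 degrees, located at (100, 50, 30) in the world frame.
R_wc = rot_z(math.pi / 2)
t_wc = [100.0, 50.0, 30.0]

# Hypothetical mounting calibration: body frame aligned with the camera,
# body origin offset by (1, 0, -2) in the camera frame.
R_cb = rot_z(0.0)
t_cb = [1.0, 0.0, -2.0]

R_wb, t_wb = aircraft_pose_from_camera(R_wc, t_wc, R_cb, t_cb)
print([round(x, 6) for x in t_wb])
```

Because the extrinsic is fixed by the mounting, it only needs to be calibrated once; every predicted camera pose can then be mapped to an aircraft pose with the same two operations.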