Fine-Tuned Visual Transformer Masked Autoencoder Applied for Anomaly Detection in Satellite Images
Anomaly detection is a process in which outlier samples can be detected in a given dataset. The purpose of this study is to implement, test, and evaluate the possibility of using deep learning methods for outlier detection with the use of a fine-tuning approach. A Transformer Masked Autoencoder was...
| Main Authors: | Jakub Gajda, Joanna Kwiecień |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Applied Sciences |
| Subjects: | transformer models; autoencoders; anomaly detection; deep learning; satellite images |
| Online Access: | https://www.mdpi.com/2076-3417/15/11/6286 |
| _version_ | 1850129479863631872 |
|---|---|
| author | Jakub Gajda; Joanna Kwiecień |
| collection | DOAJ |
| description | Anomaly detection is a process in which outlier samples can be detected in a given dataset. The purpose of this study is to implement, test, and evaluate the possibility of using deep learning methods for outlier detection with the use of a fine-tuning approach. A Transformer Masked Autoencoder was fine-tuned for a custom satellite image dataset after being pre-trained on the ImageNet subset. The first process of training included building an internal representation of images from a normal class. After adjusting the model weights for this task, a custom dataset with normal and abnormal samples was used for the reconstruction error calculation. The results obtained in this study show that it is possible to distinguish between normal class representatives and outliers using the proposed approach. However, this is not sufficient for the model to be employed in real-life applications. With a given level of precision, the model requires additional knowledge about the subject to correctly classify the sample. To the best of our knowledge, this study is the first to apply ViTMAE for a custom satellite image database. An analysis of the misclassified samples shows that the model tends to generalize the image content and is not sufficiently robust for image noise. As a result of the analysis, a new anomaly indicator is proposed for further study. |
| format | Article |
| id | doaj-art-bb4560e4821a45f092bebe5280015d18 |
| institution | OA Journals |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-bb4560e4821a45f092bebe5280015d18; 2025-08-20T02:32:57Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-06-01; volume 15, issue 11, article 6286; doi:10.3390/app15116286; Fine-Tuned Visual Transformer Masked Autoencoder Applied for Anomaly Detection in Satellite Images; Jakub Gajda, Joanna Kwiecień (both: Department of Automatic Control and Robotics, AGH University of Krakow, al. Mickiewicza 30, 30-059 Krakow, Poland); https://www.mdpi.com/2076-3417/15/11/6286; transformer models; autoencoders; anomaly detection; deep learning; satellite images |
| title | Fine-Tuned Visual Transformer Masked Autoencoder Applied for Anomaly Detection in Satellite Images |
| topic | transformer models; autoencoders; anomaly detection; deep learning; satellite images |
| url | https://www.mdpi.com/2076-3417/15/11/6286 |