Fine-Tuned Visual Transformer Masked Autoencoder Applied for Anomaly Detection in Satellite Images

Anomaly detection is a process in which outlier samples can be detected in a given dataset. The purpose of this study is to implement, test, and evaluate the possibility of using deep learning methods for outlier detection with the use of a fine-tuning approach. A Transformer Masked Autoencoder was...

Bibliographic Details
Main Authors: Jakub Gajda, Joanna Kwiecień
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Applied Sciences
Subjects: transformer models; autoencoders; anomaly detection; deep learning; satellite images
Online Access: https://www.mdpi.com/2076-3417/15/11/6286
author Jakub Gajda
Joanna Kwiecień
collection DOAJ
description Anomaly detection is a process in which outlier samples can be detected in a given dataset. The purpose of this study is to implement, test, and evaluate the possibility of using deep learning methods for outlier detection with the use of a fine-tuning approach. A Transformer Masked Autoencoder was fine-tuned for a custom satellite image dataset after being pre-trained on the ImageNet subset. The first process of training included building an internal representation of images from a normal class. After adjusting the model weights for this task, a custom dataset with normal and abnormal samples was used for the reconstruction error calculation. The results obtained in this study show that it is possible to distinguish between normal class representatives and outliers using the proposed approach. However, this is not sufficient for the model to be employed in real-life applications. With a given level of precision, the model requires additional knowledge about the subject to correctly classify the sample. To the best of our knowledge, this study is the first to apply ViTMAE for a custom satellite image database. An analysis of the misclassified samples shows that the model tends to generalize the image content and is not sufficiently robust for image noise. As a result of the analysis, a new anomaly indicator is proposed for further study.
format Article
id doaj-art-bb4560e4821a45f092bebe5280015d18
institution OA Journals
issn 2076-3417
citation Jakub Gajda and Joanna Kwiecień. Fine-Tuned Visual Transformer Masked Autoencoder Applied for Anomaly Detection in Satellite Images. Applied Sciences 15(11), 6286. MDPI AG, 2025-06-01. ISSN 2076-3417.
doi 10.3390/app15116286
affiliation Department of Automatic Control and Robotics, AGH University of Krakow, al. Mickiewicza 30, 30-059 Krakow, Poland (both authors)
title Fine-Tuned Visual Transformer Masked Autoencoder Applied for Anomaly Detection in Satellite Images
topic transformer models
autoencoders
anomaly detection
deep learning
satellite images
url https://www.mdpi.com/2076-3417/15/11/6286
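
The abstract describes scoring samples by reconstruction error after fine-tuning on the normal class. The sketch below illustrates that general idea only and is not the authors' implementation: the per-image mean squared error, the 95th-percentile threshold, the function names, and the synthetic arrays standing in for images and model reconstructions are all assumptions, since this record does not state the error metric or decision rule used in the article.

```python
import numpy as np


def reconstruction_errors(images, reconstructions):
    """Per-image mean squared error between inputs and their reconstructions."""
    images = np.asarray(images, dtype=np.float64)
    reconstructions = np.asarray(reconstructions, dtype=np.float64)
    diff = (images - reconstructions).reshape(len(images), -1)
    return (diff ** 2).mean(axis=1)


def fit_threshold(normal_errors, percentile=95.0):
    """Pick a cut-off from errors measured on normal-class samples (assumed rule)."""
    return float(np.percentile(normal_errors, percentile))


def flag_anomalies(errors, threshold):
    """Mark samples whose reconstruction error exceeds the threshold."""
    return np.asarray(errors) > threshold


# Synthetic stand-ins for images and autoencoder outputs (hypothetical data).
rng = np.random.default_rng(0)
normal = rng.random((100, 8, 8, 3))
normal_recon = normal + rng.normal(0.0, 0.01, normal.shape)  # well reconstructed
odd = rng.random((5, 8, 8, 3))
odd_recon = odd + rng.normal(0.0, 0.2, odd.shape)  # poorly reconstructed

threshold = fit_threshold(reconstruction_errors(normal, normal_recon))
flags = flag_anomalies(reconstruction_errors(odd, odd_recon), threshold)
print(threshold, flags)
```

In practice the reconstructions would come from the fine-tuned masked autoencoder rather than from added noise, and the threshold would be tuned on held-out data.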