Fine-Tuned Visual Transformer Masked Autoencoder Applied for Anomaly Detection in Satellite Images

Bibliographic Details
Main Authors:	Jakub Gajda, Joanna Kwiecień
Format:	Article
Language:	English
Published:	MDPI AG 2025-06-01
Series:	Applied Sciences
Subjects:	transformer models; autoencoders; anomaly detection; deep learning; satellite images
Online Access:	https://www.mdpi.com/2076-3417/15/11/6286

author	Jakub Gajda, Joanna Kwiecień
collection	DOAJ
description	Anomaly detection is the process of identifying outlier samples in a given dataset. The purpose of this study is to implement, test, and evaluate the feasibility of deep learning methods for outlier detection using a fine-tuning approach. A Transformer Masked Autoencoder was fine-tuned on a custom satellite image dataset after being pre-trained on an ImageNet subset. The first stage of training built an internal representation of images from the normal class. After the model weights were adjusted for this task, a custom dataset with normal and abnormal samples was used to calculate the reconstruction error. The results obtained in this study show that it is possible to distinguish between normal class representatives and outliers using the proposed approach. However, this is not sufficient for the model to be employed in real-life applications. At the given level of precision, the model requires additional knowledge about the subject to classify a sample correctly. To the best of our knowledge, this study is the first to apply ViTMAE to a custom satellite image database. An analysis of the misclassified samples shows that the model tends to generalize the image content and is not sufficiently robust to image noise. As a result of the analysis, a new anomaly indicator is proposed for further study.
format	Article
id	doaj-art-bb4560e4821a45f092bebe5280015d18
institution	OA Journals
issn	2076-3417
language	English
publishDate	2025-06-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj-art-bb4560e4821a45f092bebe5280015d18 | 2025-08-20T02:32:57Z | eng | MDPI AG | Applied Sciences | 2076-3417 | 2025-06-01 | 15 | 11 | 6286 | 10.3390/app15116286 | Fine-Tuned Visual Transformer Masked Autoencoder Applied for Anomaly Detection in Satellite Images | Jakub Gajda (0); Joanna Kwiecień (1) | (0), (1): Department of Automatic Control and Robotics, AGH University of Krakow, al. Mickiewicza 30, 30-059 Krakow, Poland | https://www.mdpi.com/2076-3417/15/11/6286 | transformer models; autoencoders; anomaly detection; deep learning; satellite images
title	Fine-Tuned Visual Transformer Masked Autoencoder Applied for Anomaly Detection in Satellite Images
topic	transformer models; autoencoders; anomaly detection; deep learning; satellite images
url	https://www.mdpi.com/2076-3417/15/11/6286
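The description in this record outlines scoring samples by reconstruction error after training on the normal class. The following is a minimal sketch of that general idea only, not the authors' implementation: the function names, the use of per-image mean squared error, the 95th-percentile threshold, and the synthetic arrays are all assumptions made for illustration, and the reconstructions here are simulated rather than produced by a ViTMAE model.

```python
import numpy as np


def reconstruction_errors(originals, reconstructions):
    """Per-image mean squared error for arrays of shape (N, H, W, C)."""
    diff = np.asarray(originals, dtype=np.float64) - np.asarray(
        reconstructions, dtype=np.float64
    )
    return (diff ** 2).reshape(diff.shape[0], -1).mean(axis=1)


def fit_threshold(normal_errors, percentile=95.0):
    """Pick a threshold from errors measured on normal samples only.

    The percentile is a hypothetical choice; the record does not state
    how the study separates normal samples from outliers.
    """
    return float(np.percentile(normal_errors, percentile))


def flag_anomalies(errors, threshold):
    """Mark samples whose reconstruction error exceeds the threshold."""
    return np.asarray(errors) > threshold


# Synthetic stand-in data: 8 tiny "images" with simulated reconstructions.
rng = np.random.default_rng(0)
images = rng.random((8, 4, 4, 3))
recon = images + 0.01          # close reconstructions for "normal" samples
recon[-1] = images[-1] + 0.5   # a poorly reconstructed sample

errors = reconstruction_errors(images, recon)
threshold = fit_threshold(errors[:-1], percentile=95.0)
flags = flag_anomalies(errors, threshold)
```

In this sketch the threshold is fitted on the normal samples alone, so a fraction of normal samples will exceed it by construction; the poorly reconstructed last sample is flagged because its error is far larger.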


Similar Items