Memory-augment graph transformer based unsupervised detection model for identifying performance anomalies in highly-dynamic cloud environments

Bibliographic Details
Main Authors: Huangyining Gao, Ruyue Xin, Peng Chen, Xi Li, Ning Lu, Peng You
Format: Article
Language: English
Published: SpringerOpen 2025-07-01
Series: Journal of Cloud Computing: Advances, Systems and Applications
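Subjects: Cloud Computing Systems; Performance Anomalies; Monitoring Metrics; Spatio-temporal Feature Learning; Dynamic Memory Module; Concurrent Noise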
Online Access: https://doi.org/10.1186/s13677-025-00766-5
_version_ 1849331766945382400
author Huangyining Gao
Ruyue Xin
Peng Chen
Xi Li
Ning Lu
Peng You
author_sort Huangyining Gao
collection DOAJ
description Abstract Cloud computing systems provide highly available and scalable computing, storage, and network resources to meet various service demands. Anomaly detection based on monitoring metrics plays a crucial role in identifying system defects and abnormal behaviors, ensuring the reliability and stability of cloud services. However, as the complexity of data increases and concurrent noise impacts cloud environments, server failures and abnormal events rise, making anomaly detection more challenging. To address these issues, we propose MemGT, an unsupervised multivariate time series anomaly detection method that offers high accuracy, good robustness, and noise resistance for various data patterns in complex cloud environments. Our approach utilizes a Transformer encoder and dynamic graph structure learning to extract spatio-temporal features of monitoring metrics in cloud computing systems in parallel. Additionally, we introduce a novel dynamic gated memory module to guide the Transformer encoder in extracting hidden features, thereby enhancing the model’s robustness to varying data patterns in dynamic cloud environments. To accurately distinguish between concurrent noise and real anomalies, we utilize a window-wise graph learning method, further improving the model’s noise resistance. We compared the detection performance of MemGT with 15 baseline methods across 8 public datasets. The experimental results demonstrate that our method achieves an average F1 score of 95.04%, surpassing state-of-the-art baseline methods by 24.80%.
format Article
id doaj-art-c7f2d26d21414cd8899ccf881b360279
institution Kabale University
issn 2192-113X
language English
publishDate 2025-07-01
publisher SpringerOpen
record_format Article
series Journal of Cloud Computing: Advances, Systems and Applications
spelling doaj-art-c7f2d26d21414cd8899ccf881b360279
2025-08-20T03:46:24Z
eng
SpringerOpen
Journal of Cloud Computing: Advances, Systems and Applications
2192-113X
2025-07-01
vol. 14, no. 1, pp. 1-18
10.1186/s13677-025-00766-5
Memory-augment graph transformer based unsupervised detection model for identifying performance anomalies in highly-dynamic cloud environments
Huangyining Gao (School of Computer and Software Engineering)
Ruyue Xin (School of Artificial Intelligence)
Peng Chen (School of Computer and Software Engineering)
Xi Li (School of Computer and Software Engineering)
Ning Lu (School of Computer and Software Engineering)
Peng You (School of Computer and Software Engineering)
https://doi.org/10.1186/s13677-025-00766-5
Cloud Computing Systems
Performance Anomalies
Monitoring Metrics
Spatio-temporal Feature Learning
Dynamic Memory Module
Concurrent Noise
title Memory-augment graph transformer based unsupervised detection model for identifying performance anomalies in highly-dynamic cloud environments
title_sort memory augment graph transformer based unsupervised detection model for identifying performance anomalies in highly dynamic cloud environments
topic Cloud Computing Systems
Performance Anomalies
Monitoring Metrics
Spatio-temporal Feature Learning
Dynamic Memory Module
Concurrent Noise
url https://doi.org/10.1186/s13677-025-00766-5