Memory-augment graph transformer based unsupervised detection model for identifying performance anomalies in highly-dynamic cloud environments
Abstract Cloud computing systems provide highly available and scalable computing, storage, and network resources to meet various service demands. Anomaly detection based on monitoring metrics plays a crucial role in identifying system defects and abnormal behaviors, ensuring the reliability and stab...
| Main Authors: | Huangyining Gao, Ruyue Xin, Peng Chen, Xi Li, Ning Lu, Peng You |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | SpringerOpen, 2025-07-01 |
| Series: | Journal of Cloud Computing: Advances, Systems and Applications |
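| Subjects: | Cloud Computing Systems; Performance Anomalies; Monitoring Metrics; Spatio-temporal Feature Learning; Dynamic Memory Module; Concurrent Noise |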
| Online Access: | https://doi.org/10.1186/s13677-025-00766-5 |
| _version_ | 1849331766945382400 |
|---|---|
| author | Huangyining Gao; Ruyue Xin; Peng Chen; Xi Li; Ning Lu; Peng You |
| author_facet | Huangyining Gao Ruyue Xin Peng Chen Xi Li Ning Lu Peng You |
| author_sort | Huangyining Gao |
| collection | DOAJ |
| description | Abstract Cloud computing systems provide highly available and scalable computing, storage, and network resources to meet various service demands. Anomaly detection based on monitoring metrics plays a crucial role in identifying system defects and abnormal behaviors, ensuring the reliability and stability of cloud services. However, as the complexity of data increases and concurrent noise impacts cloud environments, server failures and abnormal events rise, making anomaly detection more challenging. To address these issues, we propose MemGT, an unsupervised multivariate time series anomaly detection method that offers high accuracy, good robustness, and noise resistance for various data patterns in complex cloud environments. Our approach utilizes a Transformer encoder and dynamic graph structure learning to extract spatio-temporal features of monitoring metrics in cloud computing systems in parallel. Additionally, we introduce a novel dynamic gated memory module to guide the Transformer encoder in extracting hidden features, thereby enhancing the model’s robustness to varying data patterns in dynamic cloud environments. To accurately distinguish between concurrent noise and real anomalies, we utilize a window-wise graph learning method, further improving the model’s noise resistance. We compared the detection performance of MemGT with 15 baseline methods across 8 public datasets. The experimental results demonstrate that our method achieves an average F1 score of 95.04%, surpassing state-of-the-art baseline methods by 24.80%. |
| format | Article |
| id | doaj-art-c7f2d26d21414cd8899ccf881b360279 |
| institution | Kabale University |
| issn | 2192-113X |
| language | English |
| publishDate | 2025-07-01 |
| publisher | SpringerOpen |
| record_format | Article |
| series | Journal of Cloud Computing: Advances, Systems and Applications |
| spelling | doaj-art-c7f2d26d21414cd8899ccf881b360279; 2025-08-20T03:46:24Z; eng; SpringerOpen; Journal of Cloud Computing: Advances, Systems and Applications; 2192-113X; 2025-07-01; 141118; 10.1186/s13677-025-00766-5; Memory-augment graph transformer based unsupervised detection model for identifying performance anomalies in highly-dynamic cloud environments; Huangyining Gao (School of Computer and Software Engineering); Ruyue Xin (School of Artificial Intelligence); Peng Chen (School of Computer and Software Engineering); Xi Li (School of Computer and Software Engineering); Ning Lu (School of Computer and Software Engineering); Peng You (School of Computer and Software Engineering); https://doi.org/10.1186/s13677-025-00766-5; Cloud Computing Systems; Performance Anomalies; Monitoring Metrics; Spatio-temporal Feature Learning; Dynamic Memory Module; Concurrent Noise |
| title | Memory-augment graph transformer based unsupervised detection model for identifying performance anomalies in highly-dynamic cloud environments |
| topic | Cloud Computing Systems; Performance Anomalies; Monitoring Metrics; Spatio-temporal Feature Learning; Dynamic Memory Module; Concurrent Noise |
| url | https://doi.org/10.1186/s13677-025-00766-5 |
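
The description above reports detection quality as an average F1 score. For readers unfamiliar with the metric, the following is a minimal sketch of how F1 is computed from binary anomaly labels. It is not taken from the article: the function names `f1_from_counts` and `point_f1` and the sample labels are hypothetical, and the record does not say which evaluation protocol (for example, point-wise or segment-adjusted scoring) the authors used.

```python
from typing import Sequence


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    tp, fp, fn are true-positive, false-positive and false-negative counts.
    Returns 0.0 when there are no true positives (precision/recall undefined or zero).
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def point_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Point-wise F1 over two equal-length 0/1 label sequences (1 = anomaly).

    Assumption: plain point-wise scoring; the article may use a different protocol.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return f1_from_counts(tp, fp, fn)


if __name__ == "__main__":
    # Hypothetical labels for five time steps, for illustration only.
    truth = [0, 1, 1, 0, 1]
    predicted = [0, 1, 0, 1, 1]
    print(f"F1 = {point_f1(truth, predicted):.4f}")
```

An average F1 such as the one quoted in the description would then be the mean of per-dataset scores, assuming each dataset is weighted equally.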