A Real-Time Semi-Supervised Log Anomaly Detection Framework for ALICE O<sup>2</sup> Facilities
The ALICE (A Large Ion Collider Experiment) detector at the Large Hadron Collider (LHC), operated by the European Organization for Nuclear Research (CERN), is dedicated to heavy-ion collisions. Within ALICE, the application logs of the online computing systems are consolidated through a logging syst...
| Main Authors: | Arnatchai Techaviseschai, Sansiri Tarnpradab, Vasco Chibante Barroso, Phond Phunchongharn |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Applied Sciences |
| Subjects: | ALICE experiment; BERTopic; clustering; FLP cluster; machine learning; topic modeling |
| Online Access: | https://www.mdpi.com/2076-3417/15/11/5901 |
| _version_ | 1849330863345500160 |
|---|---|
| author | Arnatchai Techaviseschai; Sansiri Tarnpradab; Vasco Chibante Barroso; Phond Phunchongharn |
| author_facet | Arnatchai Techaviseschai; Sansiri Tarnpradab; Vasco Chibante Barroso; Phond Phunchongharn |
| author_sort | Arnatchai Techaviseschai |
| collection | DOAJ |
| description | The ALICE (A Large Ion Collider Experiment) detector at the Large Hadron Collider (LHC), operated by the European Organization for Nuclear Research (CERN), is dedicated to heavy-ion collisions. Within ALICE, the application logs of the online computing systems are consolidated through a logging system known as InfoLogger, which integrates data from various sources. Shifters in the control room manually review these logs for anomalies, a task that requires significant expertise and is made harder by the frequent onboarding of new personnel. To address this issue, we propose a real-time semi-supervised log anomaly detection framework designed to automatically detect anomalies in ALICE operations. The framework leverages BERTopic, a topic modeling technique, to give shifters real-time insights into incoming log messages. It includes an analytical dashboard that shows the anomaly status of log messages, supporting informative monitoring. Through evaluation on the InfoLogger and BGL (BlueGene/L supercomputer) datasets, we analyze the effects of word embeddings, clustering algorithms, and HDBSCAN hyperparameters on model performance. The results demonstrate that BERTopic can improve log anomaly detection over traditional topic models, attaining F1-scores of 0.957 and 0.958 for the InfoLogger and BGL datasets, respectively, even without preprocessing. |
| format | Article |
| id | doaj-art-1b3572c204c742a288395eb36edea080 |
| institution | Kabale University |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-1b3572c204c742a288395eb36edea080; 2025-08-20T03:46:47Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-05-01; 15; 11; 5901; 10.3390/app15115901; A Real-Time Semi-Supervised Log Anomaly Detection Framework for ALICE O<sup>2</sup> Facilities; Arnatchai Techaviseschai (Department of Computer Engineering, King Mongkut’s University of Technology Thonburi, Bangkok 10140, Thailand); Sansiri Tarnpradab (Department of Computer Engineering, King Mongkut’s University of Technology Thonburi, Bangkok 10140, Thailand); Vasco Chibante Barroso (Experimental Physics Department, European Organization for Nuclear Research (CERN), 1211 Geneva, Switzerland); Phond Phunchongharn (Department of Computer Engineering, King Mongkut’s University of Technology Thonburi, Bangkok 10140, Thailand); https://www.mdpi.com/2076-3417/15/11/5901; ALICE experiment; BERTopic; clustering; FLP cluster; machine learning; topic modeling |
| spellingShingle | Arnatchai Techaviseschai; Sansiri Tarnpradab; Vasco Chibante Barroso; Phond Phunchongharn; A Real-Time Semi-Supervised Log Anomaly Detection Framework for ALICE O<sup>2</sup> Facilities; Applied Sciences; ALICE experiment; BERTopic; clustering; FLP cluster; machine learning; topic modeling |
| title | A Real-Time Semi-Supervised Log Anomaly Detection Framework for ALICE O<sup>2</sup> Facilities |
| title_full | A Real-Time Semi-Supervised Log Anomaly Detection Framework for ALICE O<sup>2</sup> Facilities |
| title_fullStr | A Real-Time Semi-Supervised Log Anomaly Detection Framework for ALICE O<sup>2</sup> Facilities |
| title_full_unstemmed | A Real-Time Semi-Supervised Log Anomaly Detection Framework for ALICE O<sup>2</sup> Facilities |
| title_short | A Real-Time Semi-Supervised Log Anomaly Detection Framework for ALICE O<sup>2</sup> Facilities |
| title_sort | real time semi supervised log anomaly detection framework for alice o sup 2 sup facilities |
| topic | ALICE experiment; BERTopic; clustering; FLP cluster; machine learning; topic modeling |
| url | https://www.mdpi.com/2076-3417/15/11/5901 |
| work_keys_str_mv | AT arnatchaitechaviseschai arealtimesemisupervisedloganomalydetectionframeworkforaliceosup2supfacilities AT sansiritarnpradab arealtimesemisupervisedloganomalydetectionframeworkforaliceosup2supfacilities AT vascochibantebarroso arealtimesemisupervisedloganomalydetectionframeworkforaliceosup2supfacilities AT phondphunchongharn arealtimesemisupervisedloganomalydetectionframeworkforaliceosup2supfacilities AT arnatchaitechaviseschai realtimesemisupervisedloganomalydetectionframeworkforaliceosup2supfacilities AT sansiritarnpradab realtimesemisupervisedloganomalydetectionframeworkforaliceosup2supfacilities AT vascochibantebarroso realtimesemisupervisedloganomalydetectionframeworkforaliceosup2supfacilities AT phondphunchongharn realtimesemisupervisedloganomalydetectionframeworkforaliceosup2supfacilities |