A Real-Time Semi-Supervised Log Anomaly Detection Framework for ALICE O² Facilities

The ALICE (A Large Ion Collider Experiment) detector at the Large Hadron Collider (LHC), operated by the European Organization for Nuclear Research (CERN), is dedicated to heavy-ion collisions. Within ALICE, the application logs of the online computing systems are consolidated through a logging syst...

Bibliographic Details
Main Authors: Arnatchai Techaviseschai, Sansiri Tarnpradab, Vasco Chibante Barroso, Phond Phunchongharn
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Applied Sciences
Subjects: ALICE experiment; BERTopic; clustering; FLP cluster; machine learning; topic modeling
Online Access: https://www.mdpi.com/2076-3417/15/11/5901
author Arnatchai Techaviseschai
Sansiri Tarnpradab
Vasco Chibante Barroso
Phond Phunchongharn
collection DOAJ
description The ALICE (A Large Ion Collider Experiment) detector at the Large Hadron Collider (LHC), operated by the European Organization for Nuclear Research (CERN), is dedicated to heavy-ion collisions. Within ALICE, the application logs of the online computing systems are consolidated through a logging system known as InfoLogger, which integrates data from various sources. Shifters in the control room manually review these logs for potential anomalies, a task that requires significant expertise and is made harder by the frequent onboarding of new personnel. To address this issue, we propose a real-time semi-supervised log anomaly detection framework designed to automatically detect anomalies in ALICE operations. The framework leverages BERTopic, a topic modeling technique, to give shifters real-time insights into incoming log messages. It includes an analytical dashboard that displays the anomaly status of log messages to support monitoring by shifters. Through evaluation on the InfoLogger and BGL (BlueGene/L supercomputer) datasets, we analyze the effects of word embeddings, clustering algorithms, and HDBSCAN hyperparameters on model performance. The results demonstrate that BERTopic can improve log anomaly detection over traditional topic models, attaining F1-scores of 0.957 and 0.958 on the InfoLogger and BGL datasets, respectively, even without preprocessing.
format Article
id doaj-art-1b3572c204c742a288395eb36edea080
institution Kabale University
issn 2076-3417
language English
publishDate 2025-05-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-1b3572c204c742a288395eb36edea080 2025-08-20T03:46:47Z eng
MDPI AG, Applied Sciences, ISSN 2076-3417, 2025-05-01, volume 15, issue 11, article 5901
doi 10.3390/app15115901
Arnatchai Techaviseschai (0), Sansiri Tarnpradab (1), Vasco Chibante Barroso (2), Phond Phunchongharn (3)
0 Department of Computer Engineering, King Mongkut’s University of Technology Thonburi, Bangkok 10140, Thailand
1 Department of Computer Engineering, King Mongkut’s University of Technology Thonburi, Bangkok 10140, Thailand
2 Experimental Physics Department, European Organization for Nuclear Research (CERN), 1211 Geneva, Switzerland
3 Department of Computer Engineering, King Mongkut’s University of Technology Thonburi, Bangkok 10140, Thailand
title A Real-Time Semi-Supervised Log Anomaly Detection Framework for ALICE O² Facilities
topic ALICE experiment
BERTopic
clustering
FLP cluster
machine learning
topic modeling
url https://www.mdpi.com/2076-3417/15/11/5901