Systems monitoring platform integrating artificial intelligence for incident response in servers

The increasing complexity of IT management and the need to monitor critical infrastructure metrics, such as CPU usage, memory, storage, and service logs, detect failures, and respond quickly to alerts, imply the adoption of advanced technologies that enable comprehensive monitoring and efficient re...

Full description

Saved in:
Bibliographic Details
Main Authors: Bruno Hiroshi Espinosa-Luna, Johann Castillo-Oliva, Willy Francisco García-Gutiérrez, Alberto Carlos Mendoza-de-los-Santos
Format: Article
Language:Spanish
Published: Universidad Nacional de San Martín 2025-07-01
Series:Revista Científica de Sistemas e Informática
Subjects:
Online Access:https://revistas.unsm.edu.pe/index.php/rcsi/article/view/811
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The increasing complexity of IT management and the need to monitor critical infrastructure metrics, such as CPU usage, memory, storage, and service logs, detect failures, and respond quickly to alerts, imply the adoption of advanced technologies that enable comprehensive monitoring and efficient response. This work developed a server monitoring system with alerts sent via Telegram. Additionally, it integrates artificial intelligence to provide immediate solutions to server incidents, using tools such as Grafana and Prometheus for metric collection and Grafana Loki for log management. The OpenAI API was incorporated to analyze the logs and enhance alerts with a detailed diagnosis. A total of 311 tests were conducted, where the results showed that the system notified incidents in an average of 1.02 seconds, while the GPT model completed the analysis in an average of 2.17 seconds, allowing root causes of problems to be identified and timely recommendations for resolution to be generated. It is concluded that the integration of artificial intelligence and proactive monitoring improves incident management, suggesting future applications in IoT environments to enrich monitoring.
ISSN:2709-992X