Performance Analysis: Discovering Semi-Markov Models From Event Logs

Bibliographic Details
Main Authors: Anna Kalenkova, Lewis Mitchell, Matthew Roughan
Format: Article
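Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Event logs; Gaussian mixture models; performance analysis; process mining; semi-Markov processes; time distributions
Online Access: https://ieeexplore.ieee.org/document/10904251/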
author Anna Kalenkova
Lewis Mitchell
Matthew Roughan
collection DOAJ
description Process mining is a well-established discipline of data analysis focused on discovering process models from the event logs of information systems. An emerging subarea, stochastic process discovery, also takes the frequencies of events in the event data into account and thereby allows for a more comprehensive analysis. In particular, when the event log records the durations of activities, performance characteristics of the discovered stochastic models can be analyzed; for example, the overall process execution time can be estimated. Existing performance analysis techniques usually discover stochastic process models from event data and then simulate these models to evaluate their execution times, so their results are empirical. This paper proposes analytical techniques for performance analysis that derive statistical characteristics of the overall process execution time when event times follow arbitrary distributions and the process is modeled as a semi-Markov process. The proposed methods include express analysis, which estimates the mean execution time, and full analysis, which builds probability density functions (PDFs) of process execution times in both continuous and discrete forms. The methods are implemented and tested on real-world event data, demonstrating their potential for what-if analysis without resorting to simulation. Specifically, the discrete approach is shown to be more time-efficient than simulation when the support of the duration distributions is small. Furthermore, the continuous approach, with PDFs represented as Gaussian mixture models (GMMs), facilitates the discovery of more compact and interpretable models.
format Article
id doaj-art-4bafc9f7715440abbf1c7dc9998cebd9
institution DOAJ
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-4bafc9f7715440abbf1c7dc9998cebd9 | 2025-08-20T03:15:38Z | eng | IEEE | IEEE Access | 2169-3536 | 2025-01-01 | 13 | 38035 | 38053 | 10.1109/ACCESS.2025.3546033 | 10904251
Performance Analysis: Discovering Semi-Markov Models From Event Logs
Anna Kalenkova | 0 | https://orcid.org/0000-0002-5088-7602
Lewis Mitchell | 1 | https://orcid.org/0000-0001-8191-1997
Matthew Roughan | 2 | https://orcid.org/0000-0002-7882-7329
Adelaide Data Science Centre (ADSC), School of Computer and Mathematical Sciences, The University of Adelaide, North Terrace Campus, Adelaide, SA, Australia (listed once for each of the three authors)
https://ieeexplore.ieee.org/document/10904251/
Event logs | Gaussian mixture models | performance analysis | process mining | semi-Markov processes | time distributions
title Performance Analysis: Discovering Semi-Markov Models From Event Logs
topic Event logs
Gaussian mixture models
performance analysis
process mining
semi-Markov processes
time distributions
url https://ieeexplore.ieee.org/document/10904251/