Simulation Study on How Input Data Affects Time-Series Classification Model Results

Bibliographic Details
Main Authors: Maria Sadowska, Krzysztof Gajowniczek
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Entropy
Online Access: https://www.mdpi.com/1099-4300/27/6/624
Description
Summary: This paper reports a study of how input data characteristics affect the performance of time-series classification models. The experiment used 82 synthetically generated time-series datasets, created from predefined functions with added noise. The datasets varied in structure, including the number of classes and the noise level, while the series length and total number of observations were held constant. This design allowed a systematic assessment of how dataset characteristics influence classification outcomes. Seven classification models were evaluated and compared on accuracy metrics, training time, and memory requirements. The CNN Classifier achieved the best results and was the most robust to increasing numbers of classes and noise; the Catch22 Classifier was the least effective. Overall, the study concludes that as the number of classes and the level of noise increase, all classification models become less effective and achieve lower accuracy metrics.
ISSN: 1099-4300
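
Illustration: the summary describes datasets built from predefined functions with added noise, varying the number of classes and the noise level while keeping the series length and total number of observations fixed. The sketch below shows one way such a dataset could be generated. It is not the authors' generator: the sine base functions, the Gaussian noise model, the name `make_dataset`, and all default values are assumptions chosen for illustration.

```python
import math
import random


def make_dataset(n_classes, noise_std, n_series=60, length=100, seed=0):
    """Return (X, y): noisy series with one base function per class.

    Hypothetical setup: class k is a sine wave with k + 1 cycles over the
    series, and the noise is additive Gaussian with std `noise_std`.
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_series):
        label = i % n_classes  # spread the series evenly across classes
        freq = label + 1
        series = [
            math.sin(2 * math.pi * freq * t / length) + rng.gauss(0.0, noise_std)
            for t in range(length)
        ]
        X.append(series)
        y.append(label)
    return X, y


# Vary only the number of classes and the noise level; the series length
# and the total number of series stay fixed, as in the described design.
X, y = make_dataset(n_classes=3, noise_std=0.5)
```

Calling `make_dataset` over a grid of `n_classes` and `noise_std` values would yield a family of datasets that differ only in those two characteristics.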