Simulation Study on How Input Data Affects Time-Series Classification Model Results
This paper discusses the results of a study investigating how input data characteristics affect the performance of time-series classification models. In this experiment, we used 82 synthetically generated time-series datasets, created based on predefined functions with added noise. These datasets va...
| Main Authors: | Maria Sadowska, Krzysztof Gajowniczek |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Entropy |
| Subjects: | time series, classification, synthetic data |
| Online Access: | https://www.mdpi.com/1099-4300/27/6/624 |
| _version_ | 1849472190382080000 |
|---|---|
| author | Maria Sadowska; Krzysztof Gajowniczek |
| author_facet | Maria Sadowska; Krzysztof Gajowniczek |
| author_sort | Maria Sadowska |
| collection | DOAJ |
| description | This paper discusses the results of a study investigating how input data characteristics affect the performance of time-series classification models. In this experiment, we used 82 synthetically generated time-series datasets, created based on predefined functions with added noise. These datasets varied in structure, including differences in the number of classes and noise levels, while maintaining a consistent length and total number of observations. This design allowed us to systematically assess the influence of dataset characteristics on classification outcomes. Seven classification models were evaluated and their performance was compared using accuracy metrics, training time and memory requirements. According to the evaluation, the CNN Classifier achieved the best results, demonstrating the highest robustness to an increasing number of classes and noise. In contrast, the least effective model was the Catch22 Classifier. Overall, the performed research leads to the conclusion that as the number of classes and the level of noise in the data increase, all classification models become less effective, achieving lower accuracy metrics. |
| format | Article |
| id | doaj-art-58e2b387202c4ebb9e9943d40d24aec2 |
| institution | Kabale University |
| issn | 1099-4300 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Entropy |
| spelling | doaj-art-58e2b387202c4ebb9e9943d40d24aec2; 2025-08-20T03:24:36Z; eng; MDPI AG; Entropy; 1099-4300; 2025-06-01; volume 27, issue 6, article 624; 10.3390/e27060624; Simulation Study on How Input Data Affects Time-Series Classification Model Results; Maria Sadowska (Institute of Information Technology, Warsaw University of Life Sciences-SGGW, 02-787 Warszawa, Poland); Krzysztof Gajowniczek (Institute of Information Technology, Warsaw University of Life Sciences-SGGW, 02-787 Warszawa, Poland); https://www.mdpi.com/1099-4300/27/6/624; time series; classification; synthetic data |
| spellingShingle | Maria Sadowska; Krzysztof Gajowniczek; Simulation Study on How Input Data Affects Time-Series Classification Model Results; Entropy; time series; classification; synthetic data |
| title | Simulation Study on How Input Data Affects Time-Series Classification Model Results |
| topic | time series; classification; synthetic data |
| url | https://www.mdpi.com/1099-4300/27/6/624 |