PRESEE: An MDL/MML Algorithm to Time-Series Stream Segmenting
Time-series streams are one of the most common data types in the data mining field. They are prevalent in fields such as the stock market, ecology, and medical care. Segmentation is a key step in accelerating time-series stream mining. Previous algorithms for segmenting mainly focused on th...
Main Authors: | Kaikuo Xu, Yexi Jiang, Mingjie Tang, Changan Yuan, Changjie Tang |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2013-01-01 |
Series: | The Scientific World Journal |
Online Access: | http://dx.doi.org/10.1155/2013/386180 |
_version_ | 1832563455125618688 |
---|---|
author | Kaikuo Xu, Yexi Jiang, Mingjie Tang, Changan Yuan, Changjie Tang |
author_sort | Kaikuo Xu |
collection | DOAJ |
description | Time-series streams are one of the most common data types in the data mining field. They are prevalent in fields such as the stock market, ecology, and medical care. Segmentation is a key step in accelerating time-series stream mining. Previous segmenting algorithms focused mainly on improving precision and paid little attention to efficiency. Moreover, the performance of these algorithms depends heavily on parameters that are hard for users to set. In this paper, we propose PRESEE (parameter-free, real-time, and scalable time-series stream segmenting algorithm), which greatly improves the efficiency of time-series stream segmenting. PRESEE is based on both the MDL (minimum description length) and MML (minimum message length) methods, which allow it to segment the data automatically. To evaluate the performance of PRESEE, we conduct several experiments on time-series streams of different types and compare it with the state-of-the-art algorithm. The empirical results show that PRESEE is very efficient on real-time stream datasets, improving segmenting speed nearly tenfold. The novelty of this algorithm is further demonstrated by applying PRESEE to segment real-time stream datasets from the ChinaFLUX sensor network data stream. |
format | Article |
id | doaj-art-c9a4c8906a7946a8add413ee32719121 |
institution | Kabale University |
issn | 1537-744X |
language | English |
publishDate | 2013-01-01 |
publisher | Wiley |
record_format | Article |
series | The Scientific World Journal |
spelling | doaj-art-c9a4c8906a7946a8add413ee32719121; 2025-02-03T01:20:15Z; eng; Wiley; The Scientific World Journal; 1537-744X; 2013-01-01; 2013; 10.1155/2013/386180; 386180; PRESEE: An MDL/MML Algorithm to Time-Series Stream Segmenting; Kaikuo Xu (College of Computer Science & Technology, Chengdu University of Information Technology, Chengdu 610225, China); Yexi Jiang (School of Computing and Information Sciences, Florida International University, Miami, FL 33199, USA); Mingjie Tang (Department of Computer Science, Purdue University, West Lafayette, IN 47996, USA); Changan Yuan (Guangxi Teachers Education University, Nanning 530001, China); Changjie Tang (School of Computer Science, Sichuan University, Chengdu 610065, China); http://dx.doi.org/10.1155/2013/386180 |
spellingShingle | Kaikuo Xu Yexi Jiang Mingjie Tang Changan Yuan Changjie Tang PRESEE: An MDL/MML Algorithm to Time-Series Stream Segmenting The Scientific World Journal |
title | PRESEE: An MDL/MML Algorithm to Time-Series Stream Segmenting |
title_sort | presee an mdl mml algorithm to time series stream segmenting |
url | http://dx.doi.org/10.1155/2013/386180 |