A Novel Depth-First Search Algorithm for Partial Periodic-Frequent Pattern Mining in Temporal Databases

Bibliographic Details
Main Authors: Pamalla Veena, Vanitha Kattumuri, Yutaka Watanobe, Rage Uday Kiran, So Nakamura, Palla Likhitha, Koji Zettsu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/11045405/
Description
Summary: Partial periodic-frequent pattern mining is a critical technique in the data mining field. It finds all frequent patterns that exhibit partial periodicity within temporal databases. Despite its potential, broader industrial adoption of this technique has been limited by two significant challenges. First, there is currently no algorithm capable of efficiently identifying the required patterns in columnar temporal databases, which are increasingly prevalent in real-world applications. Second, the existing algorithms have high runtime and memory requirements, which makes them unsuitable for analyzing large-scale datasets. To address these challenges, this paper proposes the Generalized Partial Periodic-Frequent Depth-First Search (GPPF-DFS) algorithm. Unlike GPF-growth, which is limited to row-based databases, and ECLAT, which cannot handle shared timestamps, GPPF-DFS supports both row-based and columnar formats and can process multiple transactions with identical timestamps. It compresses temporal databases into a unified dictionary structure, enabling efficient data traversal and recursive mining. Experimental results across various datasets demonstrate that the proposed algorithm significantly outperforms GPF-growth in terms of effectiveness and computational efficiency. A scalability test was also conducted to assess the performance of the proposed algorithm. Finally, the practical applicability of GPPF-DFS is demonstrated through case studies on air pollution and retail analytics. In air pollution data, the algorithm identifies periodic patterns that offer valuable insights for environmental monitoring. In retail analytics, it uncovers patterns that inform business strategies and enhance customer experience.
ISSN: 2169-3536
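
The summary describes compressing a temporal database into a dictionary and mining it recursively in depth-first order. The following is a minimal sketch of that general idea, not the authors' GPPF-DFS implementation. Everything named here is an assumption made for illustration: the function names, the thresholds `min_sup`, `max_per` and `min_pr`, and a simplified periodic-ratio defined as the number of inter-arrival times no larger than `max_per` divided by the number of transactions minus one. The dictionary is keyed by item and stores transaction ids rather than raw timestamps, which is one possible way to keep transactions with identical timestamps distinct.

```python
from typing import Dict, List, Tuple

Transaction = Tuple[int, List[str]]  # (timestamp, items)


def build_dictionary(db: List[Transaction]) -> Tuple[List[int], Dict[str, List[int]]]:
    """Compress a row-based temporal database into item -> sorted transaction ids."""
    rows = sorted(db, key=lambda row: row[0])  # order transactions by timestamp
    timestamps = [ts for ts, _ in rows]        # timestamps[tid] = time of transaction tid
    tidlists: Dict[str, List[int]] = {}
    for tid, (_, items) in enumerate(rows):
        for item in set(items):
            tidlists.setdefault(item, []).append(tid)
    return timestamps, tidlists


def periodic_count(tids: List[int], timestamps: List[int], max_per: int) -> int:
    """Count consecutive occurrences whose inter-arrival time is at most max_per."""
    return sum(1 for a, b in zip(tids, tids[1:])
               if timestamps[b] - timestamps[a] <= max_per)


def mine_partial_periodic_frequent(db: List[Transaction], min_sup: int,
                                   max_per: int, min_pr: float):
    """Return {pattern: (support, periodic_ratio)} under the assumed definitions."""
    timestamps, tidlists = build_dictionary(db)
    denom = max(len(timestamps) - 1, 1)  # assumed normaliser for the periodic-ratio
    results: Dict[Tuple[str, ...], Tuple[int, float]] = {}

    def ratio(tids: List[int]) -> float:
        return periodic_count(tids, timestamps, max_per) / denom

    def interesting(tids: List[int]) -> bool:
        # Both measures can only shrink when a pattern grows, so pruning is safe.
        return len(tids) >= min_sup and ratio(tids) >= min_pr

    def dfs(prefix: Tuple[str, ...], candidates: List[Tuple[str, List[int]]]) -> None:
        for i, (item, tids) in enumerate(candidates):
            pattern = prefix + (item,)
            results[pattern] = (len(tids), ratio(tids))
            extensions = []
            for other, other_tids in candidates[i + 1:]:
                joint = sorted(set(tids) & set(other_tids))  # transactions with both
                if interesting(joint):
                    extensions.append((other, joint))
            dfs(pattern, extensions)  # go deeper before moving to the next sibling

    roots = [(item, tids) for item, tids in sorted(tidlists.items()) if interesting(tids)]
    dfs((), roots)
    return results
```

As a usage illustration, take a hypothetical database with two transactions sharing timestamp 3: `[(1, ["a", "b"]), (2, ["a", "b", "c"]), (3, ["a"]), (3, ["b", "c"]), (5, ["a", "b"]), (9, ["a", "c"])]`. Because the dictionary stores transaction ids, the pattern `("a", "b")` gets support 3. Intersecting raw timestamps instead would wrongly count timestamp 3 as a co-occurrence.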