Product Description
Real-time stream data mining based on CanTree and GTree
Abstract: There is an increasing need to discover knowledge from data streams in real time. Real-time stream data mining requires a compact data structure that stores the transactions of the most recent sliding window in a single scan, and an efficient algorithm that discovers frequent itemsets from that structure. In this paper, we propose a novel data mining algorithm, called CanTree-GTree, which discovers the complete set of frequent itemsets from real-time transactions based on sliding windows. The algorithm uses two data structures: CanTree and GTree. CanTree compactly represents all transactions in a sliding window in one scan and serves as the base-tree. The algorithm maintains the base-tree efficiently by adding new transactions and removing old transactions without any reconstruction phase. A novel data structure, called GTree (Group Tree), serves as a projection-tree for each data item. To build the projection-tree, the algorithm traverses each node of the base-tree only once using a top-down tree traversal method, and it discovers frequent itemsets at low processing cost. The proposed algorithm is therefore effective for discovering frequent itemsets in real-time stream data. Our performance evaluation experiments against algorithms based on CPSTree and CanTree-FPTree show that our algorithm outperforms them on the synthetic data set by about 35% and 26% in run-time cost, respectively. We also confirm that the proposed algorithm shows excellent results on real-world data sets.
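The base-tree idea described in the abstract can be illustrated with a minimal sketch. It assumes a lexicographic canonical item order and a window counted in transactions; the names `Node`, `SlidingCanTree`, and `support` are hypothetical and not taken from the paper. Because the item order never depends on frequencies, adding or evicting a transaction only touches one root-to-node path and needs no reconstruction. The GTree projection step is not shown, since the abstract does not specify its layout; the `support` helper below is a plain top-down count used only to check the tree.

```python
from collections import deque


class Node:
    """One tree node: an item, a transaction count, and child links."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


class SlidingCanTree:
    """Sketch of a canonical-order prefix tree over a sliding window.

    Assumption: the canonical order is plain lexicographic order and the
    window holds the last `window_size` transactions.
    """

    def __init__(self, window_size):
        self.root = Node()
        self.window_size = window_size
        self.window = deque()  # transactions currently in the window

    @staticmethod
    def _canonical(transaction):
        # Fixed order, independent of item frequency.
        return tuple(sorted(set(transaction)))

    def add(self, transaction):
        items = self._canonical(transaction)
        node = self.root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item)
                node.children[item] = child
            child.count += 1
            node = child
        self.window.append(items)
        if len(self.window) > self.window_size:
            self._remove(self.window.popleft())

    def _remove(self, items):
        # Decrement counts along the path of the evicted transaction.
        node = self.root
        for item in items:
            child = node.children[item]
            child.count -= 1
            if child.count == 0:
                # No other transaction uses this branch, so prune it whole.
                del node.children[item]
                return
            node = child

    def support(self, itemset):
        """Count window transactions containing every item of `itemset`."""
        target = self._canonical(itemset)
        if not target:
            return len(self.window)
        total = 0
        stack = [(self.root, 0)]
        while stack:
            node, matched = stack.pop()
            for item, child in node.children.items():
                if item > target[matched]:
                    continue  # the needed item cannot occur deeper on this path
                m = matched + 1 if item == target[matched] else matched
                if m == len(target):
                    total += child.count
                else:
                    stack.append((child, m))
        return total
```

For example, with `window_size=3`, adding a fourth transaction evicts the oldest one by decrementing its path, and `support` then reflects only the three most recent transactions.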