Mining frequent patterns from dynamic data streams with data load management
Abstract—In many applications, data-stream sources are prone to dramatic spikes in volume, which makes load shedding necessary for data-stream processing systems. In this research, we study the load-shedding problem for frequent-pattern discovery in transactional data streams. We propose a load-controllable mining system with an ε-deficient mining algorithm and three dedicated load-shedding schemes. When the system is overloaded, a load-shedding scheme is executed to prune a fraction of the unprocessed data. The experimental results show that load shedding can indeed lighten the system workload while preserving mining accuracy at an acceptable level.
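To make the idea of pruning unprocessed data concrete, the following is a minimal sketch of one generic approach: uniform random sampling of a buffer of pending transactions. It is not claimed to be any of the three schemes proposed here. The names `shed_load`, `buffer`, and `capacity` are hypothetical and chosen only for illustration.

```python
import random
from typing import FrozenSet, List, Optional, Sequence

# A transaction is modeled as a set of item identifiers (an assumption).
Transaction = FrozenSet[str]


def shed_load(
    buffer: Sequence[Transaction],
    capacity: int,
    rng: Optional[random.Random] = None,
) -> List[Transaction]:
    """Keep at most `capacity` unprocessed transactions.

    If the buffer fits within capacity, nothing is dropped. Otherwise a
    uniform random subset of size `capacity` is kept, in arrival order.
    """
    if capacity < 0:
        raise ValueError("capacity must be non-negative")
    if len(buffer) <= capacity:
        return list(buffer)
    if rng is None:
        rng = random.Random()
    # Sample indices rather than items so arrival order can be preserved.
    kept_indices = sorted(rng.sample(range(len(buffer)), capacity))
    return [buffer[i] for i in kept_indices]


# Example: ten pending transactions, but only four can be processed.
pending = [frozenset({f"item{i}", "common"}) for i in range(10)]
survivors = shed_load(pending, capacity=4, rng=random.Random(0))
```

Because the sample is uniform, the relative support of each pattern in the surviving transactions is, in expectation, the same as in the full buffer. This is one reason random dropping is a common baseline for load shedding.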