Sequence-Growth: A Scalable and Effective Frequent Itemset Mining Algorithm for Big Data Based on MapReduce Framework
Abstract: Frequent itemset mining (FIM) is an important research topic because it is widely applied in the real world to find frequent itemsets and to mine human behavior patterns. The FIM process is both memory- and compute-intensive. As data grows exponentially every day, the problems of efficiency and scalability become more severe. We propose a new distributed FIM algorithm, called Sequence-Growth, and implement it on the MapReduce framework. Our algorithm applies the idea of lexicographical order to construct a tree, called the "lexicographical sequence tree", that allows us to find all frequent itemsets without an exhaustive search over the transaction database. In addition, the breadth-wide support-based pruning strategy is an important contributor to the efficiency and scalability of our algorithm.
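To make the two ideas in the abstract concrete, the following is a minimal single-machine sketch of level-by-level mining over lexicographically ordered itemsets with support-based pruning. It is not the Sequence-Growth implementation and does not use MapReduce; the function name `frequent_itemsets`, the parameter `min_support` (an absolute count), and the toy transactions are assumptions made for illustration only. Each pass over the transactions plays the role of one counting round: candidates of length k are counted only if their length k-1 prefix was frequent in the previous round.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset_tuple: support} for all itemsets meeting min_support.

    Illustrative sketch only: a level-wise search over sorted itemsets,
    not the authors' distributed algorithm.
    """
    # Sort and de-duplicate each transaction so every itemset has exactly
    # one canonical form: its items in lexicographic order.
    rows = [tuple(sorted(set(t))) for t in transactions]

    # Level 1: count single items.
    counts = Counter((item,) for row in rows for item in row)
    frequent = {c: n for c, n in counts.items() if n >= min_support}

    result = {}
    k = 1
    while frequent:
        result.update(frequent)
        k += 1
        counts = Counter()
        for row in rows:
            # combinations() of a sorted row yields sorted k-tuples.
            for cand in combinations(row, k):
                # Support-based pruning: an itemset cannot be frequent
                # if its (k-1)-item prefix was not frequent.
                if cand[:-1] in frequent:
                    counts[cand] += 1
        frequent = {c: n for c, n in counts.items() if n >= min_support}
    return result


# Hypothetical toy data, not taken from the paper.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c"],
]
found = frequent_itemsets(transactions, min_support=3)
```

With these five toy transactions and a minimum support of 3, every single item and every pair is frequent, while `("a", "b", "c")` appears in only two transactions and is dropped. In a distributed setting the per-level counting loop would typically be split into a map step that emits candidates and a reduce step that sums their counts, but how the paper partitions this work is not stated in the abstract.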