Product Description
Parallel Frequent Itemset Mining with Spark RDD Framework for Disease Prediction
Abstract: Frequent itemset mining aims to find all itemsets whose support meets a given minimum threshold. Many well-known algorithms exist for this task, including Apriori, Eclat, RElim, SaM, and FP-Growth. Although each of these algorithms is well designed and works in different scenarios, their main drawback is that they were designed to run on small datasets. This limitation reflects the time at which they were developed, before big data became commonplace, so these algorithms do not perform well at current data volumes. We therefore propose implementing these well-known algorithms in a parallelized manner so that they can handle such data. The proposed work parallelizes a dynamic frequent itemset mining algorithm, Faster-IAPI, with the Spark RDD framework. Apache Spark was selected because it overcomes a limitation of the Hadoop architecture, which was designed for parallel big data processing but does not handle iterative algorithms well; Spark handles them well. In this approach, the algorithm is applied to find correlations between different patient symptoms quickly and efficiently, and it supports predicting the occurrence of disease based on those symptoms.
Keywords: Frequent itemset mining, Faster-IAPI algorithm, Spark RDD, Parallel computing, Disease prediction
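To make the notion of minimum support and partitioned counting concrete, here is a minimal sketch in plain Python. It is not the Faster-IAPI algorithm and does not use Spark; it only mimics the map-then-reduce pattern a parallel framework would apply, where each partition counts candidate itemsets independently and the counts are then merged. The function names, the symptom records, and the threshold are all hypothetical and chosen for illustration.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def count_partition(transactions, size):
    """Map step: count every itemset of the given size within one partition."""
    counts = Counter()
    for transaction in transactions:
        # Sort and de-duplicate so each itemset has one canonical tuple form.
        for itemset in combinations(sorted(set(transaction)), size):
            counts[itemset] += 1
    return counts


def frequent_itemsets(partitions, size, min_support):
    """Reduce step: merge per-partition counts, keep itemsets meeting min_support."""
    per_partition = (count_partition(p, size) for p in partitions)
    total = reduce(lambda a, b: a + b, per_partition, Counter())
    return {itemset: n for itemset, n in total.items() if n >= min_support}


# Hypothetical symptom records, already split into two partitions.
partitions = [
    [["fever", "cough"], ["fever", "cough", "fatigue"]],
    [["cough", "fatigue"], ["fever", "cough"]],
]

pairs = frequent_itemsets(partitions, size=2, min_support=3)
print(pairs)  # {('cough', 'fever'): 3}
```

Here support is an absolute count of records. Of the three symptom pairs in the sample data, only ("cough", "fever") appears in at least three records, so it is the only frequent pair. In an actual Spark job the per-partition counting and the merge would be expressed as RDD transformations rather than a local `reduce`.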