Improvement of the Apriori Algorithm under the MapReduce Framework
Wang Wang Junhong Yu Jiao Gedommei
Mining massive data sets with the traditional Apriori algorithm wastes considerable storage space and communication resources, which makes the algorithm inefficient. We therefore propose an improved Apriori algorithm under the MapReduce framework. First, the transaction database is divided by horizontal partitioning into n separate data blocks, which are then distributed to m worker nodes with dynamic load balancing. Each node scans its own data block and produces local candidate frequent itemsets. The support of each candidate itemset is then calculated and compared with the minimum support threshold to determine the final frequent itemsets. The improved algorithm reduces the data flow between nodes and needs only two scans of the transaction database to mine all frequent itemsets, which saves scanning time and storage space and improves mining efficiency.
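The two-scan idea can be illustrated with a minimal single-process sketch in Python. It is not the authors' implementation: it follows the well-known partition-based two-pass scheme, which may differ in detail from the proposed algorithm, and it models neither a real MapReduce runtime nor dynamic load balancing. The function names (local_frequent_itemsets, count_candidates, two_scan_apriori) and the round-robin split into blocks are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def local_frequent_itemsets(block, min_support):
    """Scan 1 (map side): level-wise Apriori on one data block."""
    threshold = min_support * len(block)
    transactions = [frozenset(t) for t in block]
    # Frequent 1-itemsets of this block.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s for s, c in counts.items() if c >= threshold}
    frequent = set(current)
    k = 2
    while current:
        # Join step: unite frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be locally frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        counts = Counter(c for t in transactions for c in candidates if c <= t)
        current = {s for s, c in counts.items() if c >= threshold}
        frequent |= current
        k += 1
    return frequent


def count_candidates(block, candidates):
    """Scan 2 (map side): count each global candidate within one block."""
    return Counter(c for t in map(frozenset, block) for c in candidates if c <= t)


def two_scan_apriori(transactions, min_support, n_blocks):
    # Horizontal partition: every block holds whole transactions.
    # Round-robin assignment is an assumption, not the paper's scheme.
    blocks = [transactions[i::n_blocks] for i in range(n_blocks)]
    blocks = [b for b in blocks if b]
    # Scan 1: the union of local results forms the global candidate set.
    candidates = set()
    for block in blocks:
        candidates |= local_frequent_itemsets(block, min_support)
    # Scan 2 (reduce side): sum per-block counts, then filter globally.
    total = Counter()
    for block in blocks:
        total.update(count_candidates(block, candidates))
    threshold = min_support * len(transactions)
    return {s: c for s, c in total.items() if c >= threshold}


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"],
            ["b", "c"], ["a", "b", "c"], ["a"]]
    for itemset, count in two_scan_apriori(data, 0.5, 2).items():
        print(sorted(itemset), count)
```

In this sketch only itemsets and their counts would travel between nodes, never raw transactions, which is the source of the reduced data flow. The first scan cannot miss a globally frequent itemset, because an itemset whose relative support reaches the threshold over the whole database must reach it in at least one block. It can admit false positives, which the second scan removes.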