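Association Rules. In text, association rules capture the co-occurrence relationship between words regardless of their order; sequence mining, by contrast, takes the order of the sequence into account. Basic concepts: an association rule is an implication of the form X -> Y, where X and Y are itemsets with no items in common (their intersection is empty). The support count of an itemset is the number of transactions that contain it.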
Metrics for measuring the strength of an association rule: Support: if the support is too low, the rule may occur only by chance, and a rule that covers too few transactions is rarely useful. Confidence: this measures predictability; if the confidence is too low, Y cannot be reliably inferred from X.
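The two metrics can be illustrated with a small Python sketch. The transactions and item names below are made up for illustration, and the function names are my own, not taken from any library. Support is computed as the fraction of transactions containing both X and Y, and confidence as the fraction of transactions containing X that also contain Y.

```python
# Toy transactions; item names are hypothetical, chosen only for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(x, y, transactions):
    """Fraction of all transactions that contain both X and Y."""
    return support_count(set(x) | set(y), transactions) / len(transactions)


def confidence(x, y, transactions):
    """Among transactions containing X, the fraction that also contain Y."""
    x_count = support_count(x, transactions)
    if x_count == 0:
        return 0.0  # X never occurs, so the rule cannot be evaluated
    return support_count(set(x) | set(y), transactions) / x_count


# Rule {diapers} -> {beer}: 3 of 5 transactions contain both items,
# and 3 of the 4 transactions containing diapers also contain beer.
rule_support = support({"diapers"}, {"beer"}, transactions)
rule_confidence = confidence({"diapers"}, {"beer"}, transactions)
print(rule_support, rule_confidence)
```

A rule with high confidence but very low support would still be suspect, which is why both thresholds are applied together.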
Goal: find all association rules whose support and confidence meet the user-specified minimum support and minimum confidence, respectively.
Algorithms: the Apriori algorithm and the FP-tree frequent-set algorithm. Apriori algorithm: its core is a recursive method based on the two-stage frequent-set idea. In the usual classification, the rules it mines are single-dimensional, single-level, Boolean association rules. Here, all itemsets whose support meets the minimum support are called frequent itemsets, or frequent sets for short. The basic idea of the algorithm is to first find all frequent sets, that is, the itemsets that occur at least as often as the predefined minimum support. Strong association rules, which must satisfy both the minimum support and the minimum confidence, are then generated from the frequent sets. In this second stage, the frequent sets found in the first stage are used to produce all rules that contain only items of a given frequent set, with exactly one item on the right-hand side of each rule (this is the definition of a rule used here). Once these rules are generated, only those that meet the minimum confidence given by the user are kept. A recursive method is used to generate all the frequent sets.
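The second stage (rule generation with a single item on the right-hand side) can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name, the input format (a dict from frozenset to support count) and the example counts are all assumptions. It also assumes the dict holds every frequent itemset, so that each antecedent can be looked up, and it treats "meets the minimum confidence" as "at least".

```python
def generate_rules(freq, n_transactions, min_conf):
    """Build rules X -> {item} from frequent itemsets.

    `freq` maps each frequent itemset (a frozenset) to its support count.
    Returns (antecedent, consequent, support, confidence) tuples.
    """
    rules = []
    for itemset, count in freq.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty left and right side
        for item in itemset:
            antecedent = itemset - {item}
            # Every subset of a frequent itemset is frequent, so the
            # antecedent is present as long as `freq` is complete.
            conf = count / freq[antecedent]
            if conf >= min_conf:
                rules.append(
                    (antecedent, frozenset([item]), count / n_transactions, conf)
                )
    return rules


# Hypothetical support counts taken from 5 transactions.
freq = {
    frozenset({"diapers"}): 4,
    frozenset({"beer"}): 3,
    frozenset({"diapers", "beer"}): 3,
}
rules = generate_rules(freq, n_transactions=5, min_conf=0.8)
for antecedent, consequent, sup, conf in rules:
    print(set(antecedent), "->", set(consequent), sup, conf)
```

With these counts only {beer} -> {diapers} survives (confidence 3/3), while {diapers} -> {beer} is dropped (confidence 3/4 is below 0.8).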
The two major drawbacks of the Apriori algorithm are that it may generate a large number of candidate sets and that it may need to scan the database repeatedly.
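To make these two costs concrete, here is a minimal Python sketch of a level-wise frequent-set search in the style of Apriori. The function name, the toy transactions and the use of an absolute count threshold (min_count) are assumptions for illustration. Note that every level builds a candidate set and then scans all transactions again.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_count):
    """Return {frozenset: support count} for itemsets with count >= min_count."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1 candidates: every individual item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # One full scan of the data per level to count the candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)

        # Join frequent k-itemsets into (k+1)-candidates and prune any
        # candidate that has an infrequent k-subset.
        k += 1
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


# Toy transactions; item names are hypothetical.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori_frequent_itemsets(transactions, min_count=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The nested loop over transactions and candidates is the part that becomes expensive when the candidate set is large, which is the motivation for candidate-free methods.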
Apriori is a method for discovering frequent itemsets. It relies on the principle that if an itemset is frequent, then all of its subsets are also frequent.
Tip: mining proceeds in two phases. First, generate all frequent itemsets, that is, those whose support meets the minimum support. Second, generate all high-confidence association rules from the frequent itemsets, that is, those whose confidence meets the minimum confidence.
Steps for finding the frequent itemsets:
1. Generate a list of all individual items as the initial candidate itemsets.
2. Scan the transactions to see which itemsets meet the minimum support, and remove those that do not.
3. Combine the remaining itemsets to form itemsets of two elements (on later passes, itemsets with one more element).
4. Rescan the transactions and remove itemsets that do not meet the minimum support; repeat steps 3 and 4 until no itemsets remain.
FP-tree frequent-set algorithm: J. Han et al. proposed a method that mines frequent itemsets without generating candidates. It uses a divide-and-conquer strategy: after the first scan, the frequent items in the database are compressed into a frequent pattern tree (FP-tree) that still retains the association information. The FP-tree is then divided into a number of conditional databases, each associated with one frequent itemset of length 1, and these conditional databases are mined separately.
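The FP-tree construction described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not J. Han's published implementation: the class and function names are hypothetical, ties between equally frequent items are broken alphabetically, and only tree construction and extraction of one conditional database are shown (the recursive mining step is omitted). The sketch uses one pass to count items and a second pass to insert transactions into the tree.

```python
from collections import Counter


class FPNode:
    """One node of the tree: an item, its count, and links up and down."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_count):
    # Pass 1: count items and keep only the frequent ones.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in item_counts.items() if c >= min_count}

    root = FPNode(None, None)
    header = {item: [] for item in frequent}  # item -> nodes carrying it

    # Pass 2: insert each transaction with its frequent items ordered by
    # descending count, so that common prefixes share tree nodes.
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header


def conditional_pattern_base(item, header):
    """Prefix paths that lead to `item`, each with that node's count."""
    base = []
    for node in header[item]:
        path = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base.append((path[::-1], node.count))
    return base


# Toy transactions; item names are hypothetical.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
root, header = build_fp_tree(transactions, min_count=3)
beer_base = conditional_pattern_base("beer", header)
for path, count in beer_base:
    print(path, count)
```

Each entry of the conditional database for an item is a prefix path together with a count; mining that smaller database separately is what replaces candidate generation.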
Research directions for association rules: extending the range of problems that classical association rules can solve, improving the efficiency of classical association rule mining algorithms, and improving measures of rule interestingness. Introductory reading: http://www.36dsj.com/archives/14243
Association Rules - Web Data Mining Learning 2