Association analysis (association rule learning): finding hidden relationships between items in a large data set.
Apriori algorithm: a frequent-itemset algorithm for mining association rules. Its core idea is to find frequent itemsets by generating candidate sets level by level and pruning them with the downward-closure property. It is regarded as the most influential algorithm for mining the frequent itemsets of Boolean association rules.
Disadvantages of the Apriori algorithm: ① it may generate a large number of candidate sets; ② it may need to scan the database repeatedly.
Frequent itemsets: collections of items that often appear together.
Association rules: indicate that there may be a strong relationship between two items.
Support of an itemset: the proportion of records in the data set that contain the itemset.
Confidence: defined for an association rule such as {diaper} → {wine}; the confidence of this rule is support({diaper, wine}) / support({diaper}).
Support and confidence are the measures used to quantify how successful an association analysis is.
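To make the two definitions concrete, here is a minimal Python sketch that computes support and confidence directly from their definitions. The transaction list and the function names support and confidence are made up for illustration; they are not taken from any particular library.

```python
# Hypothetical toy transactions, invented purely for illustration.
transactions = [
    {"diaper", "wine", "milk"},
    {"diaper", "wine"},
    {"diaper", "bread"},
    {"milk", "bread"},
]

def support(itemset, transactions):
    """Proportion of transactions that contain every item of itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent ∪ consequent) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # assumption: treat a rule whose antecedent never occurs as 0
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / base

# {diaper, wine} appears in 2 of 4 transactions, {diaper} in 3 of 4.
sup_dw = support({"diaper", "wine"}, transactions)        # 0.5
conf_dw = confidence({"diaper"}, {"wine"}, transactions)  # 0.5 / 0.75
print(sup_dw, conf_dw)
```

Note that confidence is not symmetric: the rule {wine} → {diaper} divides by support({wine}) instead, so it generally gives a different value.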
Apriori principle: if an itemset is frequent, then all of its subsets are also frequent; conversely, if an itemset is infrequent, then all of its supersets are also infrequent. Using this principle avoids exponential growth in the number of itemsets to examine, so frequent itemsets can be computed in a reasonable time.
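The principle can be sketched as a level-wise search: build candidate k-itemsets from the frequent (k-1)-itemsets, discard any candidate that has an infrequent (k-1)-subset, and only then scan the data to count support. The following is a minimal sketch under assumptions, not a tuned implementation; the function name apriori, the min_support parameter, and the sample data are all illustrative.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frequent itemset: support} using level-wise candidate generation."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}

    def keep_frequent(candidates):
        # One scan of the data per level: count each candidate's support.
        result = {}
        for c in candidates:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= min_support:
                result[c] = s
        return result

    # Level 1: every single item is a candidate.
    level = keep_frequent({frozenset([i]) for t in transactions for i in t})
    all_frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (Apriori principle): every (k-1)-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = keep_frequent(candidates)
        all_frequent.update(level)
        k += 1
    return all_frequent

# Hypothetical toy data, invented for illustration.
data = [
    {"diaper", "wine", "milk"},
    {"diaper", "wine"},
    {"diaper", "bread"},
    {"milk", "bread"},
]
for itemset, s in sorted(apriori(data, 0.5).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With a minimum support of 0.5, the four single items and the pair {diaper, wine} survive. No 3-item candidate is ever counted, because only one frequent pair exists and the join step therefore produces nothing. The sketch still rescans the data once per level, which is the second disadvantage listed earlier.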
This article is from the "Shangwei Super" blog; reprinting is not permitted.
--------The Apriori algorithm in "Machine Learning in Action"