Item set: the set of all items, denoted I. For example: milk, bread, apple, etc.
Transaction: a transaction T is a set of items from I. Each transaction has a TID as its identifier.
Support of an itemset X
Sup(X) = Count(X) / |D|
Support of an association rule X -> Y
Sup(X -> Y) = Count(X -> Y) / |D|
Confidence of an association rule X -> Y
Conf(X -> Y) = Count(X -> Y) / Count(X)
Here D is the set of all transactions, Count(X) is the number of transactions that contain X, and Count(X -> Y) is the number of transactions that contain both X and Y.
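The definitions above can be checked on a small example. The following is a minimal sketch; the toy transactions, the TIDs, and the helper names count, support, and confidence are assumptions made up for illustration.

```python
# Toy transaction database D (TID -> set of items); the data is made up.
D = {
    "T1": {"milk", "bread"},
    "T2": {"milk", "apple"},
    "T3": {"milk", "bread", "apple"},
    "T4": {"bread"},
}


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for items in transactions.values() if itemset <= items)


def support(itemset, transactions):
    """Sup(X) = Count(X) / |D|."""
    return count(itemset, transactions) / len(transactions)


def confidence(x, y, transactions):
    """Conf(X -> Y) = Count(X -> Y) / Count(X)."""
    return count(set(x) | set(y), transactions) / count(x, transactions)


print(support({"milk"}, D))                # 0.75 (3 of 4 transactions)
print(support({"milk", "bread"}, D))       # 0.5  (T1 and T3)
print(confidence({"milk"}, {"bread"}, D))  # 2/3: 2 of the 3 milk transactions
```

Note that Sup(X -> Y) is just the support of the combined itemset, so support({"milk", "bread"}, D) is also the support of the rule milk -> bread.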
Steps:
1. Find all frequent itemsets
2. Generate strong association rules from the frequent itemsets
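Step 2 can be sketched as follows, assuming step 1 has already produced a count for every frequent itemset. The function name strong_rules, the min_conf parameter, and the counts are hypothetical; the sketch relies on the fact that every non-empty subset of a frequent itemset is itself frequent, so its count is available.

```python
from itertools import combinations


def strong_rules(freq_counts, min_conf):
    """Return (X, Y, confidence) for every rule X -> Y with confidence >= min_conf.

    freq_counts maps each frequent itemset (a frozenset) to its Count.
    """
    rules = []
    for itemset, n in freq_counts.items():
        # Split the itemset into a non-empty left side X and right side Y.
        for k in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), k):
                x = frozenset(lhs)
                conf = n / freq_counts[x]  # Count(X -> Y) / Count(X)
                if conf >= min_conf:
                    rules.append((x, itemset - x, conf))
    return rules


# Hypothetical counts, as if taken from 4 transactions.
freq_counts = {
    frozenset({"milk"}): 3,
    frozenset({"bread"}): 3,
    frozenset({"milk", "bread"}): 2,
}

rules = strong_rules(freq_counts, min_conf=0.6)
for x, y, conf in rules:
    print(set(x), "->", set(y), round(conf, 2))
```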
Algorithm: Apriori
Core idea
Find the frequent 1-itemsets, then the association rules, then prune
Then the frequent 2-itemsets, then the association rules, then prune
Then the frequent 3-itemsets, then the association rules, then prune
......
Repeat this process until no further frequent itemsets can be found.
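The level-by-level loop above can be sketched roughly as follows. This is a minimal illustration rather than a full implementation: it covers only the frequent-itemset part (rule generation is left out), and the function name apriori, the min_sup parameter, and the toy transactions are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_sup):
    """Return {itemset: count} for all itemsets with support >= min_sup."""
    min_count = min_sup * len(transactions)

    def frequent(candidates):
        # Count each candidate and keep those that reach the threshold.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_count}

    items = {item for t in transactions for item in t}
    level = frequent({frozenset([i]) for i in items})  # frequent 1-itemsets
    result = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: drop any candidate with an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = frequent(candidates)
        result.update(level)
        k += 1
    return result


transactions = [
    {"milk", "bread"},
    {"milk", "apple"},
    {"milk", "bread", "apple"},
    {"bread"},
]
freq = apriori(transactions, min_sup=0.5)
for itemset, n in sorted(freq.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

With these four transactions, {bread, apple} appears only once and is dropped at level 2, so the candidate {milk, bread, apple} is pruned without being counted and the loop stops.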
Algorithm implementation: to be written up separately later.
Improvement: FP-Growth (Frequent Pattern Growth)
Multilevel Association Rules
Basic idea:
Search for frequent itemsets at each concept level from the top down. At each level, ① mine the association rules at that level and ② prune appropriately, then move down to the next, more specific level.
Possible strategies:
1. Uniform minimum support: the same threshold at every level
2. Reduced minimum support: a lower threshold at lower levels
3. Level-by-level independent search
4. Cross-level filtering by single item
5. Cross-level filtering by k-itemset
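As a rough illustration of strategies 2 and 4 combined, the sketch below uses a two-level concept hierarchy, a lower threshold at the deeper level, and examines a specific item only if its parent category was frequent one level up. The hierarchy, the transactions, the thresholds, and the helper frequent_items are all made up for illustration, and only single items are counted to keep it short.

```python
# Hypothetical two-level concept hierarchy: specific item -> parent category.
parent = {
    "skim milk": "milk",
    "whole milk": "milk",
    "white bread": "bread",
    "wheat bread": "bread",
    "green apple": "fruit",
}

transactions = [
    {"skim milk", "white bread"},
    {"whole milk", "wheat bread"},
    {"skim milk", "wheat bread"},
    {"skim milk", "green apple"},
]

# Reduced minimum support: the threshold is lower at the deeper level.
min_sup = {1: 0.5, 2: 0.3}


def frequent_items(transactions, threshold):
    """Return {item: support} for single items with support >= threshold."""
    n = len(transactions)
    items = {i for t in transactions for i in t}
    sup = {i: sum(1 for t in transactions if i in t) / n for i in items}
    return {i: s for i, s in sup.items() if s >= threshold}


# Level 1: replace every item by its category, then count.
level1_tx = [{parent[i] for i in t} for t in transactions]
level1 = frequent_items(level1_tx, min_sup[1])

# Level 2: keep only items whose parent category is frequent at level 1
# (cross-level filtering by single item), then count with the lower threshold.
level2_tx = [{i for i in t if parent[i] in level1} for t in transactions]
level2 = frequent_items(level2_tx, min_sup[2])

print(sorted(level1))  # ['bread', 'milk']
print(sorted(level2))  # ['skim milk', 'wheat bread']
```

Here the category fruit falls below the level-1 threshold, so green apple is never examined at level 2.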
Multidimensional Association Rules
Association rules involving two or more dimensions
E.g. age(X, "IBM desktop computer") ^ occupation(X, "driver") => buys(X, "laptop")
Notes on Multidimensional Association Rules
Strong association rules are not necessarily interesting. For example:
buys(X, "computer games") => buys(X, "videos") [support = 40%, confidence = 66%]
The rule may satisfy the minimum support and confidence thresholds, yet it is not interesting and is in fact misleading.
In fact, 75% of all customers buy videos whether or not they buy computer games, while only 66% of the customers who buy computer games also buy videos. Buying computer games therefore lowers the likelihood of buying videos, which indicates that computer games and videos are negatively correlated.
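One common way to quantify this is lift, the rule's confidence divided by the overall probability of the right-hand side; a value below 1 indicates negative correlation. The sketch below assumes hypothetical transaction counts chosen only so that they reproduce the percentages in the example; lift itself is a standard measure not named in these notes.

```python
# Hypothetical counts chosen to match the percentages in the example.
n_total = 10_000   # all transactions
n_games = 6_000    # transactions containing computer games
n_videos = 7_500   # transactions containing videos
n_both = 4_000     # transactions containing both

support = n_both / n_total      # 0.40
confidence = n_both / n_games   # about 0.67 (quoted as 66% above)
p_videos = n_videos / n_total   # 0.75
lift = confidence / p_videos    # below 1 means negative correlation

print(f"support={support:.2f} confidence={confidence:.2f} "
      f"P(videos)={p_videos:.2f} lift={lift:.2f}")
# prints: support=0.40 confidence=0.67 P(videos)=0.75 lift=0.89
```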