1. Support (Support)
Support is the probability that the itemset {X, Y} appears among all itemsets, that is, the fraction of all itemsets that contain both X and Y. The formula is:
Support(X→Y) = P(X, Y) / P(I) = P(X∪Y) / P(I) = num(X∪Y) / num(I)
where I is the total set of itemsets, num() denotes the number of itemsets, and num(X∪Y) counts the itemsets that contain both X and Y.
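As a minimal sketch of the counting form num(X∪Y) / num(I), the Python function below computes support over a list of baskets. The function name `support` and the four toy baskets are illustrative assumptions, not part of the original article.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        raise ValueError("support is undefined for an empty transaction list")
    itemset = set(itemset)
    # num(X∪Y): transactions that contain all items of X and Y together
    hits = sum(1 for t in transactions if itemset <= set(t))
    # num(I): all transactions
    return hits / len(transactions)


# Hypothetical toy data: four baskets, two of which contain both tea and coffee.
baskets = [
    {"tea", "coffee"},
    {"tea"},
    {"coffee", "milk"},
    {"tea", "coffee", "milk"},
]
print(support(baskets, {"tea", "coffee"}))  # 2 of 4 baskets -> 0.5
```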
2. Confidence (Confidence)
Confidence is the probability that the consequent Y, derived by the association rule "X→Y", occurs given that the antecedent X occurs. In other words, it is the likelihood that an itemset containing X also contains Y. The formula is:
Confidence(X→Y) = P(Y|X) = P(X, Y) / P(X) = P(X∪Y) / P(X)
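The conditional probability above reduces to a ratio of two counts: itemsets containing both X and Y, divided by itemsets containing X. The sketch below illustrates this; the helper `count`, the function `confidence`, and the toy baskets are assumed names and data chosen for illustration.

```python
def count(transactions, itemset):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))


def confidence(transactions, x, y):
    """Estimate P(Y|X) as num(X∪Y) / num(X)."""
    n_x = count(transactions, x)
    if n_x == 0:
        raise ValueError("confidence is undefined when X never occurs")
    return count(transactions, set(x) | set(y)) / n_x


# Hypothetical toy data: tea appears in 3 baskets, tea and coffee together in 2.
baskets = [
    {"tea", "coffee"},
    {"tea"},
    {"coffee", "milk"},
    {"tea", "coffee", "milk"},
]
print(confidence(baskets, {"tea"}, {"coffee"}))  # 2/3
```

Note that confidence is not symmetric: swapping X and Y changes the denominator, so Confidence(X→Y) and Confidence(Y→X) generally differ.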
3. Lift (Lift)
Lift is the ratio of the probability that an itemset contains Y given that it contains X to the overall probability that an itemset contains Y:
Lift(X→Y) = P(Y|X) / P(Y)
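Since P(Y|X) = P(X, Y) / P(X), lift can equivalently be written as P(X, Y) / (P(X) · P(Y)), which makes it symmetric in X and Y. The following sketch computes it directly from counts; the names `count` and `lift` and the toy baskets are assumptions made for this illustration.

```python
def count(transactions, itemset):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))


def lift(transactions, x, y):
    """Estimate P(Y|X) / P(Y) from raw counts."""
    n = len(transactions)
    n_x = count(transactions, x)
    n_y = count(transactions, y)
    if n_x == 0 or n_y == 0:
        raise ValueError("lift is undefined when X or Y never occurs")
    n_xy = count(transactions, set(x) | set(y))
    # (n_xy / n_x) / (n_y / n), rearranged to a single division
    return (n_xy * n) / (n_x * n_y)


# Hypothetical toy data: P(coffee | tea) = 2/3 and P(coffee) = 3/4.
baskets = [
    {"tea", "coffee"},
    {"tea"},
    {"coffee", "milk"},
    {"tea", "coffee", "milk"},
]
print(lift(baskets, {"tea"}, {"coffee"}))  # (2/3) / (3/4) = 8/9, slightly below 1
```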
Example 1. 1000 customers doing their Spring Festival shopping are divided into two groups of 500 people each. In the first group, all 500 people bought tea, and 450 of them also bought coffee; in the second group, 450 people bought coffee (the second group is taken to have bought no tea). The purchases are shown in Table (1):
Table (1) Shopping list
Find: 1) The support of "tea → coffee"
2) The confidence of "tea → coffee"
3) The lift of "tea → coffee"
Analysis:
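Let X = {buys tea} and Y = {buys coffee}. The rule "tea → coffee" then means "a customer who bought tea also bought coffee". By the definition of support, the denominator is the total number of itemsets, here all 1000 customers, of whom 450 bought both tea and coffee. The support of "tea → coffee" is therefore
Support(X→Y) = 450/1000 = 45%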
The confidence of "tea → coffee" is
Confidence(X→Y) = 450/500 = 90%
The lift of "tea → coffee" is
Lift(X→Y) = Confidence(X→Y) / P(Y) = 90% / ((450+450)/1000) = 90% / 90% = 1
Because Lift(X→Y) = 1, X and Y are independent of each other: the presence of X has no effect on the occurrence of Y. In other words, whether a customer buys coffee has nothing to do with whether they buy tea. The rule "tea → coffee" therefore does not hold, or the association is negligible. Even though its support is high and its confidence reaches 90%, it is not a strong association rule.
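The figures in Example 1 can be checked numerically. The sketch below rebuilds the 1000 customers as Python sets and recomputes the three measures. It assumes that nobody in the second group bought tea, which the example does not state outright but which the 450/500 confidence calculation implies. Support follows the definition in section 1, with all 1000 customers in the denominator. The variable and function names are illustrative.

```python
# Each customer is represented by the set of items they bought.
# Assumption: the second group bought no tea (implied by the 450/500 confidence).
transactions = (
    [{"tea", "coffee"}] * 450  # first group: tea and coffee
    + [{"tea"}] * 50           # first group: tea only
    + [{"coffee"}] * 450       # second group: coffee only
    + [set()] * 50             # second group: neither item
)


def count(itemset):
    """Number of customers whose purchases include every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


n = len(transactions)                                    # 1000 customers
support = count({"tea", "coffee"}) / n                   # 450 / 1000
confidence = count({"tea", "coffee"}) / count({"tea"})   # 450 / 500
p_coffee = count({"coffee"}) / n                         # (450 + 450) / 1000
lift = confidence / p_coffee                             # 0.9 / 0.9

print(support, confidence, lift)  # 0.45 0.9 1.0
```

A lift of exactly 1 means that knowing a customer bought tea does not change the estimated probability that they bought coffee, which is why a high confidence alone does not make the rule a strong one.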
Support, confidence, and lift in association analysis