Support, confidence, and lift in association analysis

Source: Internet
Author: User

Reprinted from: http://m.blog.csdn.net/blog/sanqima/42746419

1. Support

Support is the probability that the itemset {X, Y} appears in the full set of transactions. The formula is:

Support(X→Y) = P(X, Y) = P(X∪Y) = num(X∪Y) / num(I)

Here I is the full transaction set, and num() is the number of transactions in which a given itemset occurs.

For example, num(I) is the total number of transactions, and

num(X∪Y) is the number of transactions that contain both X and Y.
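The support formula can be sketched in a few lines of Python. This is only an illustration: the function name support and the toy baskets list are assumptions made for the example, not part of the original article.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        raise ValueError("transactions must not be empty")
    itemset = set(itemset)
    # num(X∪Y): transactions that contain all items of the itemset
    hits = sum(1 for t in transactions if itemset <= set(t))
    # num(I): total number of transactions
    return hits / len(transactions)

# Hypothetical toy data: each transaction is the set of items in one basket.
baskets = [
    {"tea", "coffee"},
    {"tea"},
    {"coffee"},
    {"tea", "coffee", "milk"},
]
print(support(baskets, {"tea", "coffee"}))  # 2 of the 4 baskets -> 0.5
```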

2. Confidence

Confidence is the probability that Y also occurs, as predicted by the association rule "X→Y", given that the antecedent X occurs. That is, among the transactions that contain X, it is the proportion that also contain Y. The formula is:

Confidence(X→Y) = P(Y|X) = P(X, Y) / P(X) = P(X∪Y) / P(X)
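As a rough sketch, confidence can be computed by counting only the transactions that contain X. The helper count_containing, the function confidence, and the toy data are hypothetical names chosen for this example.

```python
def count_containing(transactions, itemset):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))

def confidence(transactions, x, y):
    """Confidence(X→Y) = num(X∪Y) / num(X)."""
    n_x = count_containing(transactions, x)
    if n_x == 0:
        raise ValueError("the antecedent X never occurs")
    return count_containing(transactions, set(x) | set(y)) / n_x

# Hypothetical toy data: each transaction is the set of items in one basket.
baskets = [
    {"tea", "coffee"},
    {"tea"},
    {"coffee"},
    {"tea", "coffee", "milk"},
]
# 3 baskets contain tea, and 2 of those also contain coffee.
print(confidence(baskets, {"tea"}, {"coffee"}))
```

Dividing the two counts gives the same result as dividing the two probabilities, because num(I) cancels out.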

3. Lift

Lift is the ratio of the probability that Y occurs given that X occurs to the overall probability that Y occurs:

Lift(X→Y) = P(Y|X) / P(Y)
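The lift formula can also be evaluated directly from raw counts. The sketch below assumes a hypothetical function lift_from_counts and made-up counts that are unrelated to the example later in this article.

```python
def lift_from_counts(n_xy, n_x, n_y, n_total):
    """Lift(X→Y) = P(Y|X) / P(Y), computed from transaction counts."""
    if n_x == 0 or n_y == 0 or n_total == 0:
        raise ValueError("n_x, n_y and n_total must all be positive")
    p_y_given_x = n_xy / n_x   # confidence of X→Y
    p_y = n_y / n_total        # overall probability of Y
    return p_y_given_x / p_y

# Hypothetical counts: 100 transactions, 40 contain X, 50 contain Y,
# and 30 contain both.
print(lift_from_counts(n_xy=30, n_x=40, n_y=50, n_total=100))  # 0.75 / 0.5 = 1.5
```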

Example 1. Suppose 1000 customers are shopping for the Spring Festival and are divided into two groups of 500 people each. In the first group, all 500 bought tea and 450 of them also bought coffee; in the second group, 450 bought coffee, as shown in Table (1):

Table (1) Shopping list

Group               Bought tea    Bought coffee
Group 1 (500)       500           450
Group 2 (500)       0             450

Find:

1) The support of "tea → coffee"

2) The confidence of "tea → coffee"

3) The lift of "tea → coffee"

Analysis:

Let X = {buy tea} and Y = {buy coffee}. The rule "tea → coffee" means "a customer who bought tea also bought coffee". Since 450 of the 1000 customers bought both, the support of "tea → coffee" is

Support(X→Y) = 450/1000 = 45%

The confidence of "tea → coffee" is

Confidence(X→Y) = 450/500 = 90%

The lift of "tea → coffee" is

Lift(X→Y) = Confidence(X→Y) / P(Y) = 90% / ((450+450)/1000) = 90% / 90% = 1

Because Lift(X→Y) = 1, X and Y are independent of each other: the presence of X has no effect on whether Y appears. In other words, whether a customer buys coffee is unrelated to whether that customer buys tea. The rule "tea → coffee" therefore does not hold as a useful association. Although its support is 45% and its confidence is as high as 90%, it is not a valid association rule.
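The three quantities for Example 1 can be checked by rebuilding the data as 1000 baskets and applying the definitions from sections 1 to 3, with support taken over all 1000 transactions. This is a sketch under two assumptions: nobody in the second group bought tea, and the 50 customers per group not counted as coffee buyers bought no coffee. The names baskets and frac are made up for the illustration.

```python
# Example 1 rebuilt as 1000 baskets (assumption: the second group bought no tea).
baskets = (
    [{"tea", "coffee"}] * 450   # first group: tea and coffee
    + [{"tea"}] * 50            # first group: tea only
    + [{"coffee"}] * 450        # second group: coffee only
    + [set()] * 50              # second group: neither item
)

def frac(itemset):
    """Fraction of all baskets that contain every item in `itemset`."""
    return sum(1 for b in baskets if itemset <= b) / len(baskets)

support = frac({"tea", "coffee"})        # P(X∪Y)
confidence = support / frac({"tea"})     # P(Y|X)
lift = confidence / frac({"coffee"})     # P(Y|X) / P(Y)

print(round(support, 2), round(confidence, 2), round(lift, 2))  # 0.45 0.9 1.0
```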

Rules that satisfy both a minimum support threshold and a minimum confidence threshold are called "strong association rules". Strong association rules, however, can be either valid or invalid.

If Lift(X→Y) > 1, then the rule "X→Y" is a valid strong association rule.

If Lift(X→Y) <= 1, then the rule "X→Y" is an invalid strong association rule.

In particular, if Lift(X→Y) = 1, then X and Y are independent of each other.
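These criteria can be expressed as a small decision function. The sketch below is illustrative only: the function name classify_rule and the threshold values are assumptions, since the article does not prescribe specific minimum support or confidence values.

```python
def classify_rule(support, confidence, lift, min_support, min_confidence):
    """Classify a rule using the criteria described above."""
    if support < min_support or confidence < min_confidence:
        return "not a strong rule"
    if lift > 1:
        return "valid strong rule"
    return "invalid strong rule"

# Hypothetical thresholds: minimum support 30%, minimum confidence 80%.
print(classify_rule(0.45, 0.90, 1.0, min_support=0.30, min_confidence=0.80))
# -> invalid strong rule
```

In practice an exact comparison with 1 is fragile for floating-point lift values, so a small tolerance around 1 may be preferable when testing for independence.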
