Data Mining concepts and technologies chapter 2 (6th) Mining of Large Databases

Source: Internet
Author: User

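Item set: a set of items. The set of all items is denoted I, e.g. milk, bread, apple.

Transaction: a transaction T is a set of items from I. Each transaction has a TID as its identifier. The set of all transactions is denoted D.

Support of an item set X

Sup(X) = Count(X) / |D|

where Count(X) is the number of transactions in D that contain X.

Support of an association rule X -> Y

Sup(X -> Y) = Count(X -> Y) / |D|

where Count(X -> Y) is the number of transactions in D that contain both X and Y.

Confidence of an association rule X -> Y

Conf(X -> Y) = Count(X -> Y) / Count(X)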
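The definitions above can be checked on a small example. The following is a minimal sketch: the transaction database `D`, its TIDs, and the function names `count`, `sup`, and `conf` are made up for illustration and only mirror the notation used in the formulas.

```python
# Toy transaction database D, keyed by TID (illustrative data only).
D = {
    "T1": {"milk", "bread"},
    "T2": {"milk", "apple"},
    "T3": {"milk", "bread", "apple"},
    "T4": {"bread"},
}

def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for items in transactions.values() if itemset <= items)

def sup(itemset, transactions):
    """Sup(X) = Count(X) / |D|."""
    return count(itemset, transactions) / len(transactions)

def conf(x, y, transactions):
    """Conf(X -> Y) = Count(X and Y together) / Count(X)."""
    return count(x | y, transactions) / count(x, transactions)

print(sup({"milk"}, D))               # 3 of 4 transactions contain milk
print(sup({"milk", "bread"}, D))      # support of the rule milk -> bread
print(conf({"milk"}, {"bread"}, D))   # 2 of the 3 milk transactions contain bread
```

Note that the support of the rule milk -> bread equals the support of the item set {milk, bread}, since both count the transactions that contain the two items together.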

Steps:

1. Find all frequent item sets

2. Generate strong association rules from the frequent item sets
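Step 2 can be sketched as follows, assuming step 1 has already produced the support counts. The function name `strong_rules`, the `min_conf` parameter, and the sample counts are hypothetical; the sketch assumes that "strong" means a rule whose confidence reaches a chosen minimum threshold.

```python
from itertools import combinations

def strong_rules(support_counts, min_conf):
    """Return (X, Y, confidence) for every rule X -> Y with confidence >= min_conf.

    `support_counts` maps each frequent item set (a frozenset) to the number of
    transactions containing it. It must also contain every non-empty subset of
    each item set, which holds for the output of a frequent item set search.
    """
    rules = []
    for itemset, n in support_counts.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty left and right side
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                x = frozenset(lhs)
                confidence = n / support_counts[x]
                if confidence >= min_conf:
                    rules.append((x, itemset - x, confidence))
    return rules

# Illustrative counts, as if produced by step 1.
counts = {
    frozenset({"milk"}): 3,
    frozenset({"bread"}): 3,
    frozenset({"milk", "bread"}): 2,
}

for x, y, c in strong_rules(counts, min_conf=0.6):
    print(sorted(x), "->", sorted(y), round(c, 2))
```

Because every frequent item set already meets the minimum support, only the confidence has to be checked at this stage.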

Algorithm: Apriori

Core Idea

Find the frequent 1-item sets, derive association rules, then prune.
Then find the frequent 2-item sets, derive association rules, then prune.
Then find the frequent 3-item sets, derive association rules, then prune.
......

Repeat this process until no further frequent item sets can be found.
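The level-by-level loop described above can be sketched as follows. This is a minimal illustration rather than the author's implementation: it covers only the search for frequent item sets and the pruning step, leaves rule derivation out, and the names `apriori` and `min_count` as well as the sample transactions are assumptions.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {item set: count} for all item sets found in >= min_count transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that occurs anywhere.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate and keep only the frequent ones at this level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Join: combine frequent k-item sets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: a candidate survives only if all its (k-1)-item subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

data = [
    {"milk", "bread"},
    {"milk", "apple"},
    {"milk", "bread", "apple"},
    {"bread"},
]
result = apriori(data, min_count=2)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

In this example the candidate {milk, bread, apple} is pruned without being counted, because its subset {bread, apple} is not frequent. The loop then stops since no candidates remain.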

Algorithm Implementation: to be written up separately later.

Improvement: FP-growth (Frequent Pattern Growth)

Multilevel Association Rules

Basic Idea:

Search for frequent item sets at each concept level, from top to bottom. At each level, ① mine the association rules at that level, ② prune as appropriate, and then move down to the next, more specific level.

Options:

1. Uniform minimum support at every level

2. Reduced minimum support at lower levels

3. Level-by-level independent search

4. Level-cross filtering by single item

5. Level-cross filtering by k-item set
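As a rough illustration, the sketch below combines options 2 and 4 on a two-level (two-layer) hierarchy: the lower level uses a smaller minimum support, and a specific item is examined only if its parent category is frequent one level up. The hierarchy `PARENT`, the thresholds in `MIN_SUP`, the transactions, and the helper `frequent_items` are all hypothetical.

```python
# Hypothetical concept hierarchy: specific item -> category.
PARENT = {
    "skim milk": "milk",
    "whole milk": "milk",
    "white bread": "bread",
    "wheat bread": "bread",
    "green apple": "fruit",
}
# Assumed thresholds: the minimum support is reduced at the more specific level.
MIN_SUP = {1: 0.5, 2: 0.25}

D = [
    {"skim milk", "white bread"},
    {"whole milk", "wheat bread"},
    {"skim milk", "green apple"},
    {"white bread"},
]

def frequent_items(transactions, min_sup):
    """Return {item: support} for single items with support >= min_sup."""
    counts = {}
    for t in transactions:
        for item in t:
            counts[item] = counts.get(item, 0) + 1
    n = len(transactions)
    return {item: c / n for item, c in counts.items() if c / n >= min_sup}

# Level 1: generalize every transaction to its categories, then search.
level1 = frequent_items([{PARENT[i] for i in t} for t in D], MIN_SUP[1])

# Level 2: search the specific items, but keep an item only if its
# parent category was frequent at level 1 (filtering by single item).
level2 = {
    item: s
    for item, s in frequent_items(D, MIN_SUP[2]).items()
    if PARENT[item] in level1
}

print(level1)
print(level2)
```

Here "green apple" meets the lower threshold on its own but is filtered out, because its category "fruit" is not frequent at the upper level.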

Multidimensional Association Rules

Association rules involving two or more dimensions

E.g. age(X, "IBM desktop computer") ^ occupation(X, "driver") => buys(X, "laptop")

A Caveat on Association Rules

Strong association rules are not necessarily interesting. For example,

buys(X, "computer games") => buys(X, "videos") [support = 40%, confidence = 66%]

The rule may satisfy the minimum support and confidence thresholds and still be misleading rather than interesting.

In fact, the overall share of customers who buy videos may be 75%, while only 66% of the customers who buy computer games also buy videos. Buying computer games therefore lowers the likelihood of buying videos, which indicates that computer games and videos are negatively correlated.
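One common way to quantify this effect is lift, a measure that is not named in the text above and is introduced here only as an illustration. The sketch reads the 75% figure as the overall share of customers who buy videos, that is, as Sup(Y), and divides the rule's confidence by it. A value below 1 points to negative correlation. The function name `lift` is an assumption.

```python
def lift(confidence, consequent_support):
    """Lift of X -> Y, computed as Conf(X -> Y) / Sup(Y).

    A value of 1 means X and Y are independent, above 1 means they are
    positively correlated, and below 1 means negatively correlated.
    """
    return confidence / consequent_support

# Figures from the example: confidence 66%, videos bought by 75% overall.
value = lift(0.66, 0.75)
print(round(value, 2))
print("negatively correlated" if value < 1 else "not negatively correlated")
```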
