Proof of the head-item problem in association rules with multiple minimum supports

Source: Internet
Author: User

In data mining, Apriori is a widely used association rule mining algorithm. The single-minimum-support algorithm can be regarded as a special case of the multiple-minimum-support algorithm, and in practical applications the multiple-minimum-support algorithm is used relatively often. Many data mining books give detailed pseudocode for the MS-Apriori algorithm. In this algorithm, only the support counts of frequent itemsets are recorded. When association rules are generated, however, the support counts of frequent itemsets alone are not always enough. This leads to the so-called head-item problem. A simple example illustrates it:

E.g.: MIS(bread) = 2%, MIS(clothes) = 0.2%, MIS(shoes) = 0.1%. The actual support of the itemset {clothes, bread} is 0.15%, and the actual support of {clothes, shoes, bread} is 0.12%. According to the MS-Apriori algorithm, {clothes, bread} is not a frequent itemset, because 0.15% is below its smallest MIS value of 0.2%, whereas {clothes, shoes, bread} is a frequent itemset, because 0.12% is at least its smallest MIS value of 0.1%. Therefore, the support count of the former is not saved, and the support count of the latter is saved.
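
The frequency test used in this example can be sketched in Python as follows. This is only a minimal illustration of the test, not the MS-Apriori algorithm itself; the names MIS, min_mis, and is_frequent are chosen here for illustration, and the supports are the figures quoted in the example.

```python
# Minimum item supports (MIS) from the example, as fractions of all transactions.
MIS = {"bread": 0.02, "clothes": 0.002, "shoes": 0.001}


def min_mis(itemset):
    """Return the smallest MIS value among the items of an itemset."""
    return min(MIS[item] for item in itemset)


def is_frequent(itemset, support):
    """An itemset is frequent when its support reaches its smallest MIS value."""
    return support >= min_mis(itemset)


# Supports quoted in the example.
print(is_frequent({"clothes", "bread"}, 0.0015))           # False
print(is_frequent({"clothes", "shoes", "bread"}, 0.0012))  # True
```

Adding shoes lowers the threshold from 0.2% to 0.1%, which is why the larger itemset is frequent even though its subset is not.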

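As a result, we cannot compute the confidence of the rule {clothes, bread} -> {shoes}, whose antecedent {clothes, bread} is not frequent, nor of the rules {clothes} -> {shoes, bread} and {bread} -> {clothes, shoes}, because {clothes} and {bread} may not be frequent itemsets either.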

We define the head-item problem as follows: when the item with the smallest MIS value in a frequent itemset appears in the consequent of a rule, we may not be able to compute the confidence of that rule.
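
A rough Python sketch of how the problem shows up during rule generation is given below. It assumes the support table holds only frequent itemsets. The 0.12% figure is taken from the example; the support of {shoes} is a hypothetical value added only so that one rule can be evaluated, and the names recorded_support and confidence are illustrative.

```python
# Support table as MS-Apriori would leave it: only frequent itemsets are stored.
# The 0.12% figure comes from the example; the support of {shoes} is a
# hypothetical value added only so that one rule can be evaluated.
recorded_support = {
    frozenset({"clothes", "shoes", "bread"}): 0.0012,
    frozenset({"shoes"}): 0.005,  # hypothetical
}


def confidence(antecedent, consequent):
    """Return conf(antecedent -> consequent), or None if a support is missing."""
    whole = frozenset(antecedent) | frozenset(consequent)
    sup_whole = recorded_support.get(whole)
    sup_antecedent = recorded_support.get(frozenset(antecedent))
    if sup_whole is None or sup_antecedent is None:
        return None  # a required support count was never stored
    return sup_whole / sup_antecedent


# shoes has the smallest MIS, so it is the head item of {clothes, shoes, bread}.
print(confidence({"clothes", "bread"}, {"shoes"}))  # None: head item in consequent
print(confidence({"shoes"}, {"clothes", "bread"}))  # computable: head item in antecedent
```

Returning None rather than raising an error lets a rule generator skip such rules, or fall back to recounting the antecedent in the data, which is a design choice of this sketch rather than something the source prescribes.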

Finally, we prove the head-item claim by way of its contrapositive: if the head item appears in the antecedent of a rule, the problem cannot occur. Let F be a frequent itemset and let a be the item with the smallest MIS value in F (a is called the head item). By definition, MIS(F) = MIS(a). We need to show that any rule X -> Y with X ∪ Y = F, X ∩ Y = ∅, and a ∈ X is free of the head-item problem. Since a ∈ X and X ⊆ F, a also has the smallest MIS value in X, so MIS(X) = MIS(a) = MIS(F). Because X ⊆ F, the support of X is at least the support of F, which in turn is at least MIS(F) = MIS(X). Hence X is also a frequent itemset and its support count is retained. F is a frequent itemset, so its support count is retained as well, and the confidence of X -> Y, which is sup(F) / sup(X), can be computed directly. Therefore, when a is in the antecedent of a rule, the head-item problem does not occur.
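
The statement proved here can also be checked by brute force on a small data set. The sketch below uses hypothetical toy transactions and made-up MIS values (none of them come from the source). It enumerates every frequent itemset F and asserts that every proper subset X containing the head item is itself frequent.

```python
from itertools import combinations

# Hypothetical toy data: five transactions and made-up MIS values.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
MIS = {"a": 0.4, "b": 0.6, "c": 0.9}
items = sorted(MIS)


def support(itemset):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def is_frequent(itemset):
    """Frequent means support >= the smallest MIS value among its items."""
    return support(itemset) >= min(MIS[i] for i in itemset)


def head_item(itemset):
    """The item with the smallest MIS value in the itemset."""
    return min(itemset, key=lambda i: MIS[i])


def check_head_item_property():
    """Return how many antecedents were checked; raise if one is infrequent."""
    checked = 0
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            f = frozenset(combo)
            if not is_frequent(f):
                continue
            a = head_item(f)
            # Every proper non-empty subset X of F that contains the head item.
            for k in range(1, size):
                for x in combinations(combo, k):
                    if a in x:
                        assert is_frequent(frozenset(x)), (f, x)
                        checked += 1
    return checked


print(check_head_item_property())  # 6
```

In this toy data {b, c} is frequent while {c} is not, so the rule c -> b, whose head item b sits in the consequent, is exactly the kind of rule whose confidence cannot be computed from the stored counts.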
