"Machine Learning" (5): Bayesian decision-making

In the previous section, we introduced the overall framework and basic points of supervised learning. Following that line of thinking, we now turn to the corresponding algorithms. Today, let's look at how Bayes' theorem is applied in machine learning. The main points of this chapter are:

1. Bayes' theorem;

2. Bayes' theorem in classification;

3. Risk and utility measures;

4. Association rules;


First, Bayes' theorem

Bayes' theorem derives from conditional probability in statistics. It relates the conditional probabilities of two variables to each other; the basic formula is as follows:

$$P(C \mid x) = \frac{P(C)\, P(x \mid C)}{P(x)}$$

Here P(C|x) is the conditional probability that event C occurs given that the data x has been observed; we call it the posterior probability. P(C) = P(C=1) is the probability that the event C=1 occurs, called the prior probability, because it is the knowledge about C that we have before observing the data x. P(x|C) is called the likelihood; in contrast to P(C|x), it is the probability that a sample belonging to C takes the observed value x. P(x) is the evidence, which is the marginal probability of observing x, namely:

$$P(x) = \sum_{C} P(x, C) = P(x \mid C=1)\,P(C=1) + P(x \mid C=0)\,P(C=0)$$

The marginal probability here is the joint probability of x and C, that is, the probability that both occur together, summed over all values of C. Each joint term expands as P(x, C) = P(x|C) P(C) by the multiplication rule, which gives the formula above.
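The two-class case can be sketched in a few lines of Python. This is only an illustration: the function name `posterior_two_class` and the numbers are assumptions made for the example, not values from the text.

```python
def posterior_two_class(prior_c1, lik_x_given_c1, lik_x_given_c0):
    """Return P(C=1 | x) for a two-class problem via Bayes' theorem."""
    prior_c0 = 1.0 - prior_c1
    # Evidence: sum the joint P(x, C) over both values of C.
    evidence = lik_x_given_c1 * prior_c1 + lik_x_given_c0 * prior_c0
    return lik_x_given_c1 * prior_c1 / evidence


# Hypothetical numbers: P(C=1) = 0.2, P(x | C=1) = 0.9, P(x | C=0) = 0.3
p = posterior_two_class(0.2, 0.9, 0.3)
print(round(p, 4))  # 0.18 / 0.42, about 0.4286
```

Even though x is three times as likely under C=1, the small prior keeps the posterior below 0.5.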


Second, Bayes' theorem in classification

In classification problems, Bayes' theorem is mainly used to compute class probabilities, that is, the probability that the observed sample x belongs to a class C. In general, we assume K mutually exclusive and exhaustive classes C_i, i = 1, ..., K, whose prior probabilities satisfy:

$$P(C_i) \ge 0 \quad \text{and} \quad \sum_{i=1}^{K} P(C_i) = 1$$

Given the observed sample x, we can compute the posterior probability of each class, namely:

$$P(C_i \mid x) = \frac{P(x \mid C_i)\,P(C_i)}{P(x)} = \frac{P(x \mid C_i)\,P(C_i)}{\sum_{k=1}^{K} P(x \mid C_k)\,P(C_k)}$$

To minimize the classification error, the Bayes classifier (Bayes' classifier) chooses the class with the highest posterior probability, i.e.:

$$\text{choose } C_i \text{ if } P(C_i \mid x) = \max_{k} P(C_k \mid x)$$
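A minimal sketch of this decision rule for K classes follows. It assumes the likelihoods P(x|C_i) for one observed x are already available as a list; the helper names `posteriors` and `bayes_classify` and all numbers are hypothetical.

```python
def posteriors(priors, likelihoods):
    """Return [P(C_i | x)] given priors[i] = P(C_i) and likelihoods[i] = P(x | C_i)."""
    if any(p < 0 for p in priors) or abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must be non-negative and sum to 1")
    # Joint P(x, C_i) = P(x | C_i) P(C_i); the evidence P(x) is their sum.
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    evidence = sum(joint)
    return [j / evidence for j in joint]


def bayes_classify(priors, likelihoods):
    """Return the index of the class with the highest posterior."""
    post = posteriors(priors, likelihoods)
    return max(range(len(post)), key=post.__getitem__)


# Hypothetical three-class example
priors = [0.5, 0.3, 0.2]
likelihoods = [0.1, 0.4, 0.5]
post = posteriors(priors, likelihoods)
best = bayes_classify(priors, likelihoods)
print(best)  # index 1: joint terms are 0.05, 0.12, 0.10
```

Because the evidence is the same for every class, comparing the joint terms alone would pick the same class; the division only matters when the posterior values themselves are needed.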


Third, risk and utility measures

With Bayes' theorem, we can also measure the risk involved in a decision. Let the action α_i denote the decision to assign the input to class C_i, and let λ_ik denote the loss incurred by taking action α_i when the input actually belongs to class C_k. The expected risk of taking action α_i is then:

$$R(\alpha_i \mid x) = \sum_{k=1}^{K} \lambda_{ik}\, P(C_k \mid x)$$

Our goal is to choose the action with the smallest expected risk. Similarly, we can define a utility function. Writing U_ik for the utility of taking action α_i when the input actually belongs to class C_k, the expected utility is:

$$EU(\alpha_i \mid x) = \sum_{k=1}^{K} U_{ik}\, P(C_k \mid x)$$

Here, in contrast to the risk measure, we choose the action α_i that maximizes the expected utility.
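The two decision rules can be sketched as follows. The loss matrices, the posterior values, and the function names are assumptions chosen for illustration; `loss[i][k]` plays the role of λ_ik and `utility[i][k]` the role of U_ik.

```python
def expected_values(matrix, post):
    """Return [sum_k matrix[i][k] * post[k]] for every action i."""
    return [sum(m_ik * p_k for m_ik, p_k in zip(row, post)) for row in matrix]


def min_risk_action(loss, post):
    """Index of the action with the smallest expected risk."""
    risks = expected_values(loss, post)
    return min(range(len(risks)), key=risks.__getitem__)


def max_utility_action(utility, post):
    """Index of the action with the largest expected utility."""
    utils = expected_values(utility, post)
    return max(range(len(utils)), key=utils.__getitem__)


# Hypothetical posteriors P(C_0 | x), P(C_1 | x)
post = [0.7, 0.3]
zero_one = [[0, 1], [1, 0]]     # every misclassification costs 1
asymmetric = [[0, 10], [1, 0]]  # choosing C_0 when the truth is C_1 costs 10

print(min_risk_action(zero_one, post))    # 0: risks are 0.3 and 0.7
print(min_risk_action(asymmetric, post))  # 1: risks are 3.0 and 0.7
```

With the 0/1 loss, the minimum-risk action coincides with the maximum-posterior class. With the asymmetric loss, the decision flips to the less probable class because the costly error is avoided. Taking the utility as the negative of the loss makes `max_utility_action` return the same action as `min_risk_action`.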


Fourth, association rules

Association analysis is also an important part of machine learning, and it builds on the same conditional probabilities used in Bayes' theorem. Take the common "shopping basket" example, where X and Y each denote the purchase of one of two kinds of goods. The association between them is measured by the following three key quantities:

1. The confidence of the association rule X -> Y, that is, the proportion of customers buying X who also buy Y:

$$\text{Confidence}(X \to Y) = P(Y \mid X) = \frac{P(X, Y)}{P(X)} = \frac{\#\{\text{customers who bought } X \text{ and } Y\}}{\#\{\text{customers who bought } X\}}$$

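2. The lift of the association rule X -> Y, also known as interest, which measures how much buying X changes the likelihood of buying Y (a value above 1 means X makes Y more likely, a value below 1 less likely, and a value of 1 means the two are independent):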

$$\text{Lift}(X \to Y) = \frac{P(X, Y)}{P(X)\,P(Y)} = \frac{P(Y \mid X)}{P(Y)}$$

3. The support of the association rule X -> Y, which indicates the statistical significance of the rule:

$$\text{Support}(X, Y) = P(X, Y) = \frac{\#\{\text{customers who bought } X \text{ and } Y\}}{\#\{\text{customers}\}}$$
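The three measures above can be estimated from counts, as in this small sketch. The basket data and the function name `association_measures` are made up for illustration, and the code assumes that X and Y each appear in at least one basket (otherwise the divisions fail).

```python
def association_measures(baskets, x, y):
    """Return (support, confidence, lift) of the rule x -> y from a list of item sets."""
    n = len(baskets)
    n_x = sum(1 for b in baskets if x in b)
    n_y = sum(1 for b in baskets if y in b)
    n_xy = sum(1 for b in baskets if x in b and y in b)
    support = n_xy / n               # estimate of P(X, Y)
    confidence = n_xy / n_x          # estimate of P(Y | X)
    lift = confidence / (n_y / n)    # estimate of P(Y | X) / P(Y)
    return support, confidence, lift


# Hypothetical transactions
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk"},
    {"bread"},
    {"eggs"},
]
support, confidence, lift = association_measures(baskets, "milk", "bread")
print(support, confidence, lift)
```

In this toy data, 2 of 5 baskets contain both items (support 0.4), 2 of the 3 milk baskets contain bread (confidence 2/3), and the lift is slightly above 1, so buying milk makes buying bread a little more likely than its base rate of 3/5.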



All right, that's all for today; we'll continue tomorrow!


Reference:

Ethem Alpaydin, Introduction to Machine Learning, China Machine Press


This article is from the "Run Yang Hang" blog, make sure to keep this source http://windhawk.blog.51cto.com/729863/1639552