Coursera Machine Learning Study notes (13)

Source: Internet
Author: User

VI. Logistic Regression (Week 3)

-Classification

In a classification problem, what we try to predict is whether the result belongs to a certain class (for example, correct or incorrect). Examples of classification problems include determining whether an e-mail message is spam and judging whether a financial transaction is fraudulent.

We start with a binary classification problem.

We refer to the two classes that the dependent variable might belong to as the negative class and the positive class. The dependent variable y then takes values in {0, 1}, where 0 denotes the negative class and 1 the positive class.

-Hypothesis representation

Below, we revisit the breast-cancer classification problem mentioned earlier. We could use linear regression to fit a straight line to the data.

A linear regression model can only predict continuous values, but for a classification problem we need to output 0 or 1, so we could predict: y = 1 when the model's output is at least 0.5, and y = 0 otherwise.

For the data shown, such a linear model seems to do a good job on the classification task. However, suppose we observe a very large malignant tumour and add it as an example to our training set; this gives us a new fitted line.

At this point, it is inappropriate to use 0.5 as a threshold to predict whether a tumour is benign or malignant. As can be seen, the linear regression model is not suitable for solving such problems, because its predicted values can fall outside the [0, 1] range.

So, we introduce a new model, logistic regression, whose output is always between 0 and 1. The hypothesis is h_theta(x) = g(theta^T x), where g is the sigmoid (logistic) function g(z) = 1 / (1 + e^(-z)).
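As a quick sanity check, the sigmoid function can be sketched in a few lines of Python; the function name here is my own choice:

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# The output is always strictly between 0 and 1:
print(sigmoid(-10))  # close to 0
print(sigmoid(0))    # exactly 0.5
print(sigmoid(10))   # close to 1
```

Note that g(0) = 0.5, which is why the 0.5 threshold on h_theta(x) corresponds to a threshold of 0 on theta^T x.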


The effect of this is that, for a given input x, the model computes the probability that the output variable equals 1 (the estimated probability) based on the chosen parameters, i.e. h_theta(x) = P(y = 1 | x; theta).

For example, if for a given x the model, with already-determined parameters, computes h_theta(x) = 0.7, this indicates a 70% chance that y is the positive class; correspondingly, the probability that y is the negative class is 1 - 0.7 = 0.3.
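This probability interpretation can be sketched as follows; the function name and the particular parameter value (chosen so that the output lands near 0.7) are illustrative assumptions, not from the course:

```python
import math

def predict_proba(theta, x):
    """Estimated probability P(y = 1 | x; theta) = g(theta^T x)."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameters and input chosen so that h_theta(x) is about 0.7:
p1 = predict_proba([0.847], [1.0])  # P(y = 1 | x; theta), roughly 0.7
p0 = 1.0 - p1                       # P(y = 0 | x; theta), roughly 0.3
```

The two probabilities always sum to 1, since y can only be 0 or 1.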

-Decision Boundary

In logistic regression, we predict: y = 1 when h_theta(x) >= 0.5, and y = 0 when h_theta(x) < 0.5.

From the graph of the S-shaped function drawn earlier, we know that g(z) >= 0.5 exactly when z >= 0.

Equivalently, since z = theta^T x: the model predicts y = 1 when theta^T x >= 0, and y = 0 when theta^T x < 0.

For example, suppose we have the model h_theta(x) = g(theta_0 + theta_1 x1 + theta_2 x2), and the parameter vector is [-3 1 1].

Then when -3 + x1 + x2 >= 0, that is, when x1 + x2 >= 3, the model predicts y = 1.

We can draw the straight line x1 + x2 = 3; this is our model's decision boundary, separating the region predicted as 1 from the region predicted as 0.

What kind of model would be appropriate if our data looked like the following?

Because a curve is required to separate the regions where y = 0 and y = 1, we need quadratic features: h_theta(x) = g(theta_0 + theta_1 x1 + theta_2 x2 + theta_3 x1^2 + theta_4 x2^2).

Assuming the parameter vector is [-1 0 0 1 1], the decision boundary we get is exactly the circle centered at the origin with radius 1.
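With those parameters, theta^T x reduces to -1 + x1^2 + x2^2, which a small sketch makes concrete (the function name is my own):

```python
def predict_circle(x1, x2):
    """With theta = [-1, 0, 0, 1, 1] and features [1, x1, x2, x1^2, x2^2],
    theta^T x = -1 + x1^2 + x2^2, so y = 1 outside the unit circle."""
    z = -1 + x1 ** 2 + x2 ** 2
    return 1 if z >= 0 else 0

print(predict_circle(0, 0))  # inside the circle, predict 0
print(predict_circle(1, 1))  # outside the circle, predict 1
```

Points inside the unit circle satisfy x1^2 + x2^2 < 1 and are predicted as 0; points on or outside it are predicted as 1.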

By fitting very complex models, we can obtain decision boundaries of very complex shapes.

