[Machine Learning] A personal understanding of Logistic Regression

Source: Internet
Author: User

1. Use it for classification, not for regression

Despite its name, logistic regression is a classification method: the goal is to assign a label of 0 or 1 to each row of the test data.

2. A case

We have a set of data with only one feature, the tumor size (tumorsize). We want to determine whether each tumor is benign (0) or malignant (1).

The data is assumed as follows:

We can use a linear function h(x) to divide this space, with benign on one side and malignant on the other:

$$h(x) = \omega^T x = \omega_0 x_0 + \omega_1 x_1, \quad x_0 = 1.$$

The coefficient ω can be obtained through gradient descent.


Because of a single outlier on the far right, the fitted line is likely to be pulled toward it, so that many instances in the training set end up on the wrong side. Using a linear model for classification is therefore not always a good idea.

Furthermore, the value of h(x) may be greater than 1 or less than 0, whereas we want 0 <= h(x) <= 1.
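Both problems can be checked numerically. The following is a minimal sketch, assuming an ordinary least-squares fit and made-up tumor sizes; neither the fitting method nor the data is given in the text, and the names fit_line and boundary are illustrative only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for h(x) = w0 + w1 * x with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w1 = sxy / sxx
    w0 = mean_y - w1 * mean_x
    return w0, w1


def boundary(w0, w1):
    """Tumor size at which h(x) crosses the 0.5 threshold."""
    return (0.5 - w0) / w1


# Hypothetical data: four benign tumors (0) and four malignant ones (1).
sizes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

w0, w1 = fit_line(sizes, labels)
clean_boundary = boundary(w0, w1)
h_at_smallest = w0 + w1 * 1.0  # h(x) for the smallest tumor

# Add one very large malignant tumor: an outlier on the far right.
w0_out, w1_out = fit_line(sizes + [30.0], labels + [1])
outlier_boundary = boundary(w0_out, w1_out)
h_at_outlier = w0_out + w1_out * 30.0

print("boundary without outlier:", clean_boundary)
print("boundary with outlier:   ", outlier_boundary)
print("h(1) without outlier:    ", h_at_smallest)
print("h(30) with outlier:      ", h_at_outlier)
```

With this data the threshold moves to the right once the outlier is added, so the malignant tumor of size 5 falls on the benign side, and h(x) leaves the interval [0, 1] at both ends of the range.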

Therefore, we introduced the sigmoid function.

3. Sigmoid Function

 

The sigmoid function is defined as

$$\mathrm{sig}(x) = \frac{1}{1 + e^{-x}}.$$

Its domain is the entire set of real numbers and its range is (0, 1). When x -> +infinity, sig(x) -> 1; when x -> -infinity, sig(x) -> 0.
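As a small illustration, here is one way to compute the standard sigmoid, sig(x) = 1 / (1 + e^(-x)), in Python. This is a sketch rather than the author's code; the split into two branches is an assumption made here to avoid floating-point overflow for large negative inputs.

```python
import math


def sigmoid(x):
    """sig(x) = 1 / (1 + e^(-x)), evaluated without overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # x < 0, so e lies in [0, 1) and cannot overflow
    return e / (1.0 + e)


for value in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(value, sigmoid(value))
```

In exact arithmetic the output never reaches 0 or 1; in floating point it rounds to those values for inputs of very large magnitude.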

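Let

$$h'(x) = \mathrm{sig}(\omega^T x) = \frac{1}{1 + e^{-\omega^T x}}.$$

In fact, the output of this function can be viewed as P(y = 1 | x, ω). If the labels are written as y = -1 and y = 1 instead:

$$P(y = 1 \mid x, \omega) = \frac{1}{1 + e^{-\omega^T x}}, \qquad P(y = -1 \mid x, \omega) = 1 - P(y = 1 \mid x, \omega) = \frac{1}{1 + e^{\omega^T x}}.$$

That is:

$$P(y \mid x, \omega) = \frac{1}{1 + e^{-y\,\omega^T x}}.$$

Plotted against ω^T x, the curves of the two probabilities are mirror images of each other.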

This gives a new hypothesis whose output lies in (0, 1). When h'(x) > 0.5, we consider the tumor malignant (1); when h'(x) < 0.5, we consider it benign (0). When h'(x) = 0.5, the two labels are equally likely, so the choice is arbitrary.
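The decision rule can be sketched as follows. The weights w0 = -4.5 and w1 = 1.0 are hypothetical values chosen only for illustration, and returning 0 in the tie case h'(x) = 0.5 is an arbitrary choice made here, not something the text prescribes.

```python
import math


def h_prime(x, w0, w1):
    """Sigmoid of the linear score w0 + w1 * x, read as P(y = 1 | x)."""
    z = w0 + w1 * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def classify(x, w0, w1, threshold=0.5):
    """Return 1 (malignant) if h'(x) exceeds the threshold, else 0 (benign)."""
    return 1 if h_prime(x, w0, w1) > threshold else 0


# Hypothetical weights: the score is zero at a tumor size of 4.5.
W0, W1 = -4.5, 1.0

for size in (3.0, 4.5, 6.0):
    print(size, h_prime(size, W0, W1), classify(size, W0, W1))
```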

4. Training Algorithm

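For logistic regression without regularization, we can use the gradient descent algorithm to minimize the negative log-likelihood. With labels y_i in {-1, 1} and P(y | x, ω) = 1 / (1 + e^{-y ω^T x}), this is

$$\mathrm{NLL}(\omega) = -\sum_{i=1}^{N} \log P(y_i \mid x_i, \omega) = \sum_{i=1}^{N} \log\left(1 + e^{-y_i \omega^T x_i}\right).$$

Adding a prior probability on ω implements regularization. Assuming that ω follows a zero-mean normal distribution, the objective becomes

$$\sum_{i=1}^{N} \log\left(1 + e^{-y_i \omega^T x_i}\right) + \frac{\lambda}{2} \lVert \omega \rVert^2,$$

where λ > 0 is inversely proportional to the variance of the prior. The purpose of regularization is to prevent overfitting.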

Take the expression above as the objective to minimize, and use gradient descent to obtain the minimizer ω'. This is the final parameter vector of the trained model.
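The training step can be sketched in plain Python. This is an illustration under several assumptions that are not in the text: labels are coded as -1 and +1, the data is made up, the step size, penalty weight and step count are arbitrary, and the intercept w[0] is left unpenalized (a common convention, whereas the prior described in the text covers all of ω).

```python
import math


def sigmoid(z):
    """Numerically stable 1 / (1 + e^(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softplus(t):
    """log(1 + e^t), computed without overflow."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def objective(w, xs, ys, lam):
    """Negative log-likelihood plus (lam / 2) * w1^2, labels in {-1, +1}."""
    nll = sum(softplus(-y * (w[0] + w[1] * x)) for x, y in zip(xs, ys))
    return nll + 0.5 * lam * w[1] ** 2


def gradient(w, xs, ys, lam):
    """Gradient of the objective with respect to (w0, w1)."""
    g0, g1 = 0.0, lam * w[1]  # assumption: intercept is not penalized
    for x, y in zip(xs, ys):
        margin = y * (w[0] + w[1] * x)
        coeff = -y * sigmoid(-margin)  # d/d(score) of log(1 + e^(-margin))
        g0 += coeff
        g1 += coeff * x
    return g0, g1


def train(xs, ys, lam=0.1, lr=0.01, steps=20000):
    """Plain gradient descent from w = (0, 0) for a fixed number of steps."""
    w = [0.0, 0.0]
    for _ in range(steps):
        g0, g1 = gradient(w, xs, ys, lam)
        w[0] -= lr * g0
        w[1] -= lr * g1
    return w


# Hypothetical tumor sizes: benign = -1, malignant = +1.
sizes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
labels = [-1, -1, -1, -1, 1, 1, 1, 1]

weights = train(sizes, labels)
start_loss = objective([0.0, 0.0], sizes, labels, 0.1)
final_loss = objective(weights, sizes, labels, 0.1)
print("weights:", weights)
print("objective:", start_loss, "->", final_loss)
print("P(malignant | size = 7):", sigmoid(weights[0] + weights[1] * 7.0))
```

The loop runs a fixed number of steps and does not test for convergence; with an unscaled feature like this one, gradient descent moves slowly, which is why the step count is large.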

For new data x', we compute f(x') = h'(x', ω'). If f(x') > 0.5, the prediction is malignant; if f(x') < 0.5, the prediction is benign. The function value is also the predicted probability that the tumor is malignant.
