Fundamentals of Machine Learning (IV.) Logistic regression

Source: Internet
Author: User

Starting with this section we enter "formal" machine learning. It is "formal" because from here on we build a cost function, optimize that cost function to find the weights, and then test and validate the result; the whole machine-learning workflow passes through these steps. Today's topic is logistic regression, which is another supervised learning method. Logistic regression is generally used for prediction, and can also be used for classification; the prediction is a category ^.^! Linear regression is familiar to everyone: y = kx + b. Given a pile of data points, fit the values of k and b, and the next time you are given x you can compute y. That is regression. Logistic regression is a bit different: it uses a nonlinear function with strong fitting power, and that function is continuous and differentiable. This matters, because a function that cannot be differentiated is hard to use in machine learning. The early Heaviside step function was replaced by the sigmoid function for exactly this reason: differentiability means we can find the extreme points quickly, which is one of the key ideas of optimization methods: compute the gradient by differentiation, then update the parameters with gradient descent.

Let's take a look at the sigmoid function used in logistic regression, as shown in (Figure I):

(Figure I)

(Figure I) above shows the shape of the sigmoid function on the domain [-5, 5], and the lower plot shows its shape on the domain [-60, 60]. As you can see from the two graphs, the sharper the separation between the two levels, the better suited the function is to two-class regression. The sigmoid function is shown in (Formula I):

(Formula I)

    sigmoid(z) = 1 / (1 + e^(-z))
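As a concrete illustration, here is a minimal Python sketch of the sigmoid function (the function name is mine, not from the original article):

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))
```

Note how quickly it saturates: sigmoid(0) is exactly 0.5, while already at z = 60 the output is indistinguishable from 1 in floating point, which is the "two-level" behavior visible in the plots above.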

Now that we have a function suitable for two-class regression, we can map the features onto this model. The sigmoid takes a single input z, so assuming our feature vector is X = [x0, x1, x2, ..., xn], let

    z = w0*x0 + w1*x1 + ... + wn*xn = W^T X

Given a large number of training samples with features X, we only need to find weights W = [w0, w1, w2, ..., wn] that correctly map each sample's features onto the two levels of the sigmoid function; doing that correctly completes the classification. Then for a test sample, multiply its features by the weights, feed the result through the sigmoid, and the computed value is the prediction. Very simple. So how do we find the weights W?
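The prediction step just described (multiply weights by features, push through the sigmoid, threshold at 0.5) can be sketched as follows; the weight and sample values below are made-up illustrative numbers:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    """Classify one sample: z = w . x, then sigmoid, then threshold at 0.5."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if sigmoid(z) >= 0.5 else 0

# Hypothetical trained weights; x[0] = 1.0 acts as the bias feature.
w = [0.0, 2.0, -1.0]
print(predict(w, [1.0, 3.0, 1.0]))  # z = 5  -> class 1
print(predict(w, [1.0, 0.0, 3.0]))  # z = -3 -> class 0
```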

To calculate W, we enter the optimization phase, using gradient descent or stochastic gradient descent. Speaking of gradient descent, what is a gradient in general? The gradient is the direction in which a function rises fastest, so by moving along the gradient direction we can quickly reach an extreme point. Which extremum are we looking for? Think about it carefully: of course it is the extremum of the training error. When the total error between the predicted values and the true values of the training samples is minimal, the model parameters at that point are the ones we want. Of course, driving the error too low can lead to overfitting; more on that later. We first set up the model's training-error cost function, as shown in (Formula II):

(Formula II)

    J(theta) = -(1/m) * sum_{i=1..m} [ y_i * log(h(x_i)) + (1 - y_i) * log(1 - h(x_i)) ],
    where h(x) = sigmoid(theta^T x)
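Assuming (Formula II) is the usual cross-entropy cost of logistic regression, it can be computed like this; the pure-Python loop is chosen for clarity, not speed:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost(theta, X, y):
    """Cross-entropy cost J(theta) averaged over m samples.
    X: list of feature vectors, y: list of 0/1 labels."""
    m = len(X)
    total = 0.0
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * x for t, x in zip(theta, xi)))
        total += -yi * math.log(h) - (1 - yi) * math.log(1 - h)
    return total / m
```

With theta all zero, every prediction is 0.5 and the cost is log(2) regardless of the labels, which is a handy sanity check when implementing this from scratch.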

In (Formula II), y represents the true label of a training sample. The theta that minimizes J(theta) gives the weights of the model we seek. Note that J(theta) is a convex function, so its minimum is the global minimum. Differentiating it gives the gradient, as shown in (Formula III):

(Formula III)

    dJ(theta)/dtheta_j = (1/m) * sum_{i=1..m} ( h(x_i) - y_i ) * x_ij
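Putting the gradient to work, here is a minimal batch gradient descent sketch (the toy dataset, learning rate, and iteration count below are my own choices for illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient_descent(X, y, alpha=0.1, iters=1000):
    """Batch gradient descent on the cross-entropy cost.
    Update rule: theta_j <- theta_j - alpha * (1/m) * sum((h - y) * x_j)."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(iters):
        grad = [0.0] * n
        for xi, yi in zip(X, y):
            h = sigmoid(sum(t * x for t, x in zip(theta, xi)))
            for j in range(n):
                grad[j] += (h - yi) * xi[j]
        theta = [t - alpha * g / m for t, g in zip(theta, grad)]
    return theta

# Toy 1-D problem with a bias feature x0 = 1: labels flip between x = 1 and x = 2.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0, 0, 1, 1]
theta = gradient_descent(X, y)
```

Stochastic gradient descent, mentioned above, differs only in that it updates theta after each individual sample instead of after a full pass over the data.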
