1. The main idea of linear regression is to fit a straight line through historical data and then use that line to predict new data. (For example: samples of classes A and B lie on opposite sides of a linear function.)
2. In the real world an event (outcome) is influenced by many factors, so we need a multivariate linear function to describe it.
3. Multivariate linear function: a multivariable analysis of the relationship between a two-class observation and several influential factors (x1, x2, x3, ..., xn). For example, in medicine, a patient's symptoms are used to determine whether the patient suffers from a certain disease.
4. Multivariate linear regression formula:
   z = θ0 + θ1·x1 + θ2·x2 + ... + θn·xn = θ^T x
5. Sigmoid function:
   g(z) = 1 / (1 + e^(-z))
Substituting the multivariate linear function z into the sigmoid function gives the generalized linear regression (logistic regression) model:
   hθ(x) = g(θ^T x) = 1 / (1 + e^(-θ^T x))
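The sigmoid and the model hθ(x) can be sketched in a few lines of Python (the function names `sigmoid` and `h` are illustrative, not from the original):

```python
import math

def sigmoid(z):
    """Sigmoid g(z) = 1 / (1 + e^(-z)); maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def h(theta, x):
    """Logistic model h_theta(x) = g(theta^T x), with theta[0] as the intercept."""
    z = theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))
    return sigmoid(z)
```

For example, `h([0.0, 1.0], [0.0])` evaluates the sigmoid at z = 0 and returns 0.5, the midpoint.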
6. The output of the sigmoid function lies in (0, 1) with midpoint 0.5, so the sigmoid output can be interpreted as a probability for the sample data.
Because the output hθ(x) lies in (0, 1), it indicates the probability that the data belongs to a certain class, for example:
hθ(x) < 0.5 indicates that the current data belongs to class A
hθ(x) > 0.5 indicates that the current data belongs to class B
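The 0.5 threshold rule above can be sketched as follows (the class labels and the tie-breaking toward 'A' are illustrative choices):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def classify(theta, x):
    """Return 'B' when h_theta(x) > 0.5, otherwise 'A'."""
    z = theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))
    p = sigmoid(z)
    return 'B' if p > 0.5 else 'A'
```

With theta = [0.0, 1.0], a positive feature value gives z > 0, hence a probability above 0.5 and class 'B'.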
7. How to use the generalized linear regression model
Consider the vector x = (x1, x2, x3, ..., xn) of n independent variables, and let the conditional probability P(y=1|x) = p be the probability that the event occurs given the observations. Then the logistic regression model can be expressed as
   p = P(y=1|x) = 1 / (1 + e^(-(w0 + w1·x1 + ... + wn·xn)))
So the ratio of the probability that the event occurs to the probability that it does not occur is
   p / (1 - p) = e^(w0 + w1·x1 + ... + wn·xn)
This ratio is called the odds of the event; taking its logarithm gives
   ln(p / (1 - p)) = w0 + w1·x1 + ... + wn·xn
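The relation between the odds and the linear part can be checked numerically: when p comes from the sigmoid, the log-odds recover the linear term z exactly (the value z = 1.7 below is illustrative):

```python
import math

def log_odds(p):
    """Logit: ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# For a probability produced by the sigmoid, the logit recovers z.
z = 1.7
p = 1.0 / (1.0 + math.exp(-z))   # sigmoid(z)
```

Here `log_odds(p)` equals z up to floating-point error, illustrating that the sigmoid and the logit are inverses of each other.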
If there are m observation samples with observed values y1, y2, y3, ..., ym, and pi = P(yi = 1 | xi) is the probability that yi = 1 under the given conditions, then the probability that yi = 0 is P(yi = 0 | xi) = 1 - pi, so the probability of obtaining one observation is
   P(yi) = pi^yi · (1 - pi)^(1 - yi)
Because the observation samples are independent of each other, their joint distribution is the product of the marginal distributions, which gives the likelihood function
   L(w) = ∏ pi^yi · (1 - pi)^(1 - yi)   (product over i = 1, ..., m)
Our goal is then the maximum likelihood estimate of the parameters: find w0, w1, w2, w3, ..., wn such that L(w) takes its maximum value. Taking the logarithm of L(w) gives
   ln L(w) = Σ [ yi·ln(pi) + (1 - yi)·ln(1 - pi) ]   (sum over i = 1, ..., m)
Substituting pi = hθ(xi), the final form is
   ln L(w) = Σ [ yi·ln(hθ(xi)) + (1 - yi)·ln(1 - hθ(xi)) ]
where yi is the true value and hθ(xi) is the predicted value.
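The log-likelihood above can be computed directly; a minimal sketch (function and variable names are illustrative):

```python
import math

def log_likelihood(w, X, y):
    """ln L(w) = sum_i [ y_i ln p_i + (1 - y_i) ln(1 - p_i) ],
    where p_i = sigmoid(w0 + w . x_i)."""
    total = 0.0
    for xi, yi in zip(X, y):
        z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
        p = 1.0 / (1.0 + math.exp(-z))
        total += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total
```

A weight vector that fits the data better yields a higher (less negative) log-likelihood, which is exactly what maximum likelihood estimation exploits.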
8. Determining the optimal regression coefficients, i.e., the process of training on the data set.
The steps to find the best regression coefficients are as follows:
1. List the classification function: when h(x) > 0 the data is class B, and when h(x) < 0 it is class A (h(x) > 0 corresponds to a sigmoid output above 0.5).
(θ refers to the regression coefficients; in practice the result is often passed through a sigmoid transformation.)
2. Give the error estimate function corresponding to the classification function:
   J(θ) = -(1/m) · Σ [ yi·ln(hθ(xi)) + (1 - yi)·ln(1 - hθ(xi)) ]
(m is the number of samples)
Only the θ vector that makes the error estimate function J(θ) take its minimum value is the best regression coefficient vector.
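A sketch of an error estimate function of this kind, assuming the cross-entropy form, i.e. the negative average log-likelihood (the name `cost` is illustrative):

```python
import math

def cost(theta, X, y):
    """J(theta) = -(1/m) * sum_i [ y_i ln h(x_i) + (1 - y_i) ln(1 - h(x_i)) ]."""
    m = len(X)
    total = 0.0
    for xi, yi in zip(X, y):
        z = theta[0] + sum(t * xj for t, xj in zip(theta[1:], xi))
        hx = 1.0 / (1.0 + math.exp(-z))   # sigmoid of the linear part
        total += yi * math.log(hx) + (1 - yi) * math.log(1 - hx)
    return -total / m
```

With all-zero coefficients every prediction is 0.5, so J(θ) equals ln 2 regardless of the labels; training should drive J(θ) below this baseline.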
3. Use the gradient descent method (or least squares) to obtain the value of θ that minimizes the error function. The update from the previous state to the next state is
   θj := θj - α · ∂J(θ)/∂θj
(α is the step size, i.e., the learning rate).
For convenience, the update above is written for a single sample; in practice the gradient is a sum over multiple samples (unless the stochastic gradient ascent algorithm described later is used). If the minus sign in the error function of step 2 is dropped, the minimization problem becomes a maximization problem, and the gradient descent method becomes a gradient ascent method.