Theoretical knowledge Section:
1 The hypothesis function of logistic regression
In linear regression, if the variable y to be predicted is a discrete value, then we have a classification problem. If y can only take the values 0 or 1, it is a binary classification problem. We can still consider using a regression method to solve it, but since we now know that y \in {0, 1} rather than the whole real line R, we should modify the form of the hypothesis function h_\theta(x) and use the logistic function to map any real number into the range [0, 1]. That is,

h_\theta(x) = g(\theta^T x) = 1 / (1 + e^{-\theta^T x}), where g(z) = 1 / (1 + e^{-z}).
Here we first form a linear combination of all the features, i.e. \theta^T x = \theta_0 x_0 + \theta_1 x_1 + \theta_2 x_2 + ..., and then feed this value into the logistic function (also called the sigmoid function) g(z) = 1 / (1 + e^{-z}), which maps it to a value within [0, 1]. The logistic function has an S-shaped curve: as z -> +infinity the function value -> 1, and as z -> -infinity the function value -> 0. Therefore the new hypothesis function h_\theta(x) always lies within the interval [0, 1]. We also add a feature x_0 = 1 to simplify the vector representation. The derivative of the logistic function can be expressed in terms of the function itself, i.e.

g'(z) = g(z)(1 - g(z)).
This conclusion will be used when learning the parameter \theta later.
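As a quick sanity check of this identity, here is a minimal NumPy sketch (my own illustration, not part of the book's listing below) that compares g(z)(1 - g(z)) with a finite-difference estimate of g'(z):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-5.0, 5.0, 11)
analytic = sigmoid(z) * (1.0 - sigmoid(z))                    # g(z)(1 - g(z))
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)   # central-difference estimate of g'(z)
print(np.allclose(analytic, numeric))                         # True: g'(z) = g(z)(1 - g(z))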
2 Learning the model parameters \theta of logistic regression with maximum likelihood estimation and gradient ascent
Given the new hypothesis function h_\theta(x), how do we learn the parameter \theta from the training samples? From a probabilistic point of view, we can use maximum likelihood estimation (MLE) to fit the data (MLE plays the same role here that minimizing the cost function plays in the LMS algorithm). We assume that

P(y = 1 | x; \theta) = h_\theta(x)
P(y = 0 | x; \theta) = 1 - h_\theta(x)
That is, the hypothesis function h_\theta(x) gives the probability that y = 1, and 1 - h_\theta(x) gives the probability that y = 0. This probabilistic assumption can be written more compactly as

p(y | x; \theta) = (h_\theta(x))^y (1 - h_\theta(x))^{1 - y}
Suppose we observe m training samples whose generation processes are independent and identically distributed; then we can write the likelihood function

L(\theta) = \prod_{i=1}^{m} p(y^{(i)} | x^{(i)}; \theta) = \prod_{i=1}^{m} (h_\theta(x^{(i)}))^{y^{(i)}} (1 - h_\theta(x^{(i)}))^{1 - y^{(i)}}
Taking the logarithm turns it into the log-likelihood

\ell(\theta) = \log L(\theta) = \sum_{i=1}^{m} [ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) ]
We now want to find the parameter \theta that maximizes the log-likelihood \ell(\theta). Equivalently, if we define the cost function J(\theta) = -\ell(\theta), we need to minimize J(\theta).
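To make the objective concrete, here is a minimal sketch (the toy data and parameter values are my own, not taken from the text) that evaluates the log-likelihood \ell(\theta) and the cost J(\theta) = -\ell(\theta) for a small data set:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(theta, X, y):
    # l(theta) = sum_i [ y_i * log h(x_i) + (1 - y_i) * log(1 - h(x_i)) ]
    h = sigmoid(X @ theta)
    return np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

# toy data: the first column is the added feature x0 = 1
X = np.array([[1.0,  0.5,  1.2],
              [1.0, -1.0,  0.3],
              [1.0,  2.0, -0.7]])
y = np.array([1, 0, 1])
theta = np.zeros(3)

print(log_likelihood(theta, X, y))    # 3 * log(0.5), since h = 0.5 everywhere when theta = 0
print(-log_likelihood(theta, X, y))   # the cost J(theta) that we would minimize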
Similar to the gradient descent method we used to learn the linear regression parameters, we can use gradient ascent to maximize the log-likelihood. Assuming we have only one training sample (x, y), differentiating gives the stochastic gradient ascent (SGA, also called incremental gradient ascent) rule

\partial \ell(\theta) / \partial \theta_j = (y - h_\theta(x)) x_j

The derivation uses the property of the logistic function's derivative, g' = g(1 - g). So we obtain the parameter update rule

\theta_j := \theta_j + \alpha (y - h_\theta(x)) x_j
Here a quantity is added to \theta_j at each step, which is exactly what gradient ascent does; \alpha is the learning rate. This update rule looks formally identical to the LMS update rule of linear regression, but it is different in substance because the hypothesis function h_\theta(x) is different: in linear regression h_\theta(x) is just a linear combination of all the features, whereas in logistic regression the linear combination of all the features is first formed and then mapped into the interval [0, 1] by the logistic function, so h_\theta(x) is no longer a linear function. In fact, both algorithms are special cases of generalized linear models.
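A minimal sketch of this per-sample update (my own toy example; the book's batch implementation follows in the next section) could look like this:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sga_epoch(theta, X, y, alpha=0.01):
    # one pass over the data, applying theta_j := theta_j + alpha * (y - h_theta(x)) * x_j per sample
    for x_i, y_i in zip(X, y):
        h = sigmoid(np.dot(theta, x_i))
        theta = theta + alpha * (y_i - h) * x_i
    return theta

X = np.array([[1.0,  0.5,  1.2],
              [1.0, -1.0,  0.3],
              [1.0,  2.0, -0.7]])
y = np.array([1, 0, 1])
theta = np.zeros(3)
for _ in range(200):                  # repeat a few epochs over the toy data
    theta = sga_epoch(theta, X, y)
print(theta)                          # weights that roughly separate the two classes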
Python implementation section (from Machine Learning in Action, Chapter 5):
from numpy import *

def loadDataSet():
    dataMat = []; labelMat = []
    fr = open('testSet.txt')
    for line in fr.readlines():
        lineArr = line.strip().split()
        # x0 = 1.0 is the added intercept feature; x1 and x2 come from the file
        dataMat.append([1.0, float(lineArr[0]), float(lineArr[1])])
        labelMat.append(int(lineArr[2]))
    return dataMat, labelMat

def sigmoid(inX):
    return 1.0 / (1 + exp(-inX))

def gradAscent(dataMatIn, classLabels):
    dataMatrix = mat(dataMatIn)                   # m x n matrix of samples
    labelMat = mat(classLabels).transpose()       # m x 1 column vector of labels
    m, n = shape(dataMatrix)
    alpha = 0.001                                 # learning rate
    maxIterations = 500                           # number of gradient-ascent steps (500 is the value used in the book)
    weights = ones((n, 1))
    for k in range(maxIterations):
        h = sigmoid(dataMatrix * weights)         # h_theta(x) for all samples at once
        error = labelMat - h                      # (y - h_theta(x))
        weights = weights + alpha * dataMatrix.transpose() * error   # batch gradient-ascent step
    return weights

def plotBestFit(weights):
    import matplotlib.pyplot as plt
    dataMat, labelMat = loadDataSet()
    dataArr = array(dataMat)
    n = shape(dataArr)[0]
    xcord1 = []; ycord1 = []
    xcord2 = []; ycord2 = []
    for i in range(n):
        # collect the (x1, x2) coordinates of each class separately
        if int(labelMat[i]) == 1:
            xcord1.append(dataArr[i, 1]); ycord1.append(dataArr[i, 2])
        else:
            xcord2.append(dataArr[i, 1]); ycord2.append(dataArr[i, 2])
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(xcord1, ycord1, s=30, c='red', marker='s')
    ax.scatter(xcord2, ycord2, s=30, c='green')
    x = arange(-3.0, 3.0, 0.1)
    # decision boundary: w0 + w1*x1 + w2*x2 = 0, solved for x2
    y = (-weights[0] - weights[1] * x) / weights[2]
    ax.plot(x, y)
    plt.xlabel('X1'); plt.ylabel('X2')
    plt.show()

if __name__ == "__main__":
    dataArr, labelMat = loadDataSet()
    print(dataArr)
    print(labelMat)
    weights = gradAscent(dataArr, labelMat)
    print(weights)
    plotBestFit(weights.getA())
The most important step in gradAscent is the \theta update derived in the preceding formula, weights = weights + alpha * dataMatrix.transpose() * error, applied to all samples at once. plotBestFit draws the resulting decision boundary over the scattered data points.
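Once gradAscent has been run, the learned weights can also be used to classify a new point. The classify helper below is my own addition (it is not part of the book's listing) and assumes the functions above are defined in the same session:

def classify(point, weights):
    # point = [1.0, x1, x2]; weights is the (n, 1) matrix returned by gradAscent;
    # predict 1 if sigmoid(w0 + w1*x1 + w2*x2) > 0.5, i.e. the point lies above the boundary
    prob = sigmoid(sum(array(point) * weights.getA().flatten()))
    return 1 if prob > 0.5 else 0

dataArr, labelMat = loadDataSet()
weights = gradAscent(dataArr, labelMat)
print(classify([1.0, 0.5, 1.2], weights))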