Logistic regression and Python implementation

Source: Internet
Author: User

Theory section:

1 The hypothesis function of logistic regression

If the variable y to be predicted takes discrete values, rather than continuous values as in linear regression, we have a classification problem. If y can only take the values 0 or 1, it is a binary classification problem. We can still use a regression approach to solve binary classification. But since we now know that y \in {0,1} rather than the entire real line R, we should modify the form of the hypothesis function h_\theta(x) and use the logistic function to map any real number into the interval (0,1). That is,

    h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}},  where  g(z) = \frac{1}{1 + e^{-z}}

Here we first form a linear combination of all features, i.e. \theta^T x = \theta_0 x_0 + \theta_1 x_1 + \theta_2 x_2 + ..., and then pass the value of this linear combination through the logistic function (also called the sigmoid function), which maps it to a value in (0,1). The graph of the logistic function g(z) is an S-shaped curve that rises from 0 to 1 and passes through g(0) = 0.5.

As z -> +\infty, g(z) -> 1; as z -> -\infty, g(z) -> 0. Therefore the new hypothesis function h_\theta(x) always lies in the interval (0,1). We also add a feature x_0 = 1 so that the intercept \theta_0 fits into the vector representation. The derivative of the logistic function can be expressed in terms of the function itself, i.e.

    g'(z) = g(z) (1 - g(z))

This conclusion will be used when learning the parameter \theta later.
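
The hypothesis function and the derivative identity can be checked numerically. The following is a minimal sketch; the function names `sigmoid` and `hypothesis` and the example values of `theta` and `x` are assumptions made for illustration and do not come from the original text.

```python
import numpy as np


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x); x must already contain x_0 = 1."""
    return sigmoid(np.dot(theta, x))


# Limits: g(z) approaches 0 for very negative z and 1 for very positive z.
print(sigmoid(np.array([-20.0, 0.0, 20.0])))

# Check g'(z) = g(z) * (1 - g(z)) against a central finite difference.
z = np.linspace(-5.0, 5.0, 11)
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
analytic = sigmoid(z) * (1 - sigmoid(z))
max_gap = np.max(np.abs(numeric - analytic))
print(max_gap)

# Hypothetical parameter and feature values (not from the source).
theta = np.array([-1.0, 2.0, 0.5])
x = np.array([1.0, 0.3, 0.8])  # x_0 = 1 comes first
print(hypothesis(theta, x))
```

The finite-difference gap should be tiny, which is consistent with the identity g' = g(1 - g) used in the derivation of the update rule.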

2 Learning the model parameters \theta of logistic regression with maximum likelihood estimation and gradient ascent

Given the new hypothesis function h_\theta(x), how do we learn the parameter \theta from the training samples? Starting from probabilistic assumptions, we can fit the data by maximum likelihood estimation (MLE); in linear regression, MLE is equivalent to minimizing the cost function of the LMS algorithm. We assume that:

    P(y = 1 | x; \theta) = h_\theta(x)
    P(y = 0 | x; \theta) = 1 - h_\theta(x)

That is, the hypothesis function h_\theta(x) represents the probability that y = 1, and 1 - h_\theta(x) represents the probability that y = 0. This assumption can be written more compactly as

    p(y | x; \theta) = (h_\theta(x))^y (1 - h_\theta(x))^{1 - y}

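Suppose we observe m training samples that were generated independently and are identically distributed. Then we can write the likelihood function as

    L(\theta) = \prod_{i=1}^{m} p(y^{(i)} | x^{(i)}; \theta) = \prod_{i=1}^{m} (h_\theta(x^{(i)}))^{y^{(i)}} (1 - h_\theta(x^{(i)}))^{1 - y^{(i)}}
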
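Taking the logarithm turns it into the log-likelihood:

    l(\theta) = \log L(\theta) = \sum_{i=1}^{m} [ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) ]
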
We now want to find the parameter \theta that maximizes the log-likelihood l(\theta). Equivalently, if we define the cost function J(\theta) = -l(\theta), we need to minimize the cost function J(\theta).
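
To make the relationship between the log-likelihood and the cost function concrete, here is a small sketch. The helper names `log_likelihood` and `cost`, the toy data set, and the two candidate parameter vectors are all made up for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_likelihood(theta, X, y):
    """l(theta) = sum_i [ y_i log h(x_i) + (1 - y_i) log(1 - h(x_i)) ]."""
    h = sigmoid(X @ theta)
    return np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))


def cost(theta, X, y):
    """J(theta) = -l(theta); minimizing J maximizes l."""
    return -log_likelihood(theta, X, y)


# Toy data, made up for illustration; the first column is x_0 = 1.
X = np.array([[1.0, -2.0],
              [1.0, -1.0],
              [1.0,  1.0],
              [1.0,  2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

theta_zero = np.zeros(2)           # predicts 0.5 for every sample
theta_good = np.array([0.0, 2.0])  # separates the two classes

ll_zero = log_likelihood(theta_zero, X, y)
ll_good = log_likelihood(theta_good, X, y)
print(ll_zero, ll_good)
print(cost(theta_zero, X, y), cost(theta_good, X, y))
```

With theta_zero every sample has probability 0.5, so the log-likelihood equals 4 * log(0.5); the parameter vector that separates the classes gives a larger log-likelihood and therefore a smaller cost.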

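Similar to the gradient descent method used to learn the linear regression parameters, we can use gradient ascent to maximize the log-likelihood. Assuming we have only one training sample (x, y), we can derive the stochastic gradient ascent (SGA, also called incremental gradient ascent) rule from the partial derivative of l(\theta):

    \frac{\partial}{\partial \theta_j} l(\theta)
        = ( \frac{y}{g(\theta^T x)} - \frac{1 - y}{1 - g(\theta^T x)} ) \frac{\partial}{\partial \theta_j} g(\theta^T x)
        = ( \frac{y}{g(\theta^T x)} - \frac{1 - y}{1 - g(\theta^T x)} ) g(\theta^T x) (1 - g(\theta^T x)) x_j
        = ( y (1 - g(\theta^T x)) - (1 - y) g(\theta^T x) ) x_j
        = (y - h_\theta(x)) x_j

The derivation uses the property of the derivative of the logistic function, g' = g(1 - g). So we obtain the parameter update rule

    \theta_j := \theta_j + \alpha (y^{(i)} - h_\theta(x^{(i)})) x_j^{(i)}
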
Here a quantity is added at each step rather than subtracted, because this is gradient ascent; \alpha is the learning rate. The update rule has the same form as the LMS update rule of linear regression, but it is a different algorithm, because the hypothesis function h_\theta(x) is different. In linear regression, h_\theta(x) is just a linear combination of all features; in logistic regression, the features are first combined linearly and the result is then mapped into the interval (0,1) by the logistic function, so h_\theta(x) is no longer a linear function of x. In fact, both algorithms are special cases of generalized linear models.
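
The single-sample update can be sketched as follows. This is only an illustration under stated assumptions: the helper `sga_step`, the synthetic data, the learning rate 0.05, and the 20 passes over the data are hypothetical choices, not values from the original text.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sga_step(theta, x, y, alpha):
    """One update: theta_j := theta_j + alpha * (y - h_theta(x)) * x_j."""
    h = sigmoid(np.dot(theta, x))
    return theta + alpha * (y - h) * x


def log_likelihood(theta, X, y):
    h = sigmoid(X @ theta)
    return np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))


# Synthetic data: labels are drawn from a logistic model, so classes overlap.
rng = np.random.default_rng(0)
m = 200
X = np.column_stack([np.ones(m), rng.normal(size=m)])  # x_0 = 1, one feature
true_theta = np.array([0.5, 2.0])                       # assumed for the demo
y = (rng.random(m) < sigmoid(X @ true_theta)).astype(float)

theta = np.zeros(2)
before = log_likelihood(theta, X, y)
for epoch in range(20):
    for i in range(m):
        # Each step looks at a single training sample (x^(i), y^(i)).
        theta = sga_step(theta, X[i], y[i], alpha=0.05)
after = log_likelihood(theta, X, y)

print(theta)
print(before, after)
```

Because each step moves \theta in the direction that increases the likelihood of one sample, the log-likelihood over the whole data set should end up higher than at the starting point.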

Python implementation section (from Chapter 5 of Machine Learning in Action):

import numpy as np
import matplotlib.pyplot as plt


def loaddataset():
    """Read testSet.txt; each line holds x1, x2 and a 0/1 class label."""
    datamat = []
    labelmat = []
    with open('testSet.txt') as fr:
        for line in fr:
            linearr = line.strip().split()
            # Prepend x_0 = 1.0 so that weights[0] acts as the intercept.
            datamat.append([1.0, float(linearr[0]), float(linearr[1])])
            labelmat.append(int(linearr[2]))
    return datamat, labelmat


def sigmoid(inx):
    return 1.0 / (1 + np.exp(-inx))


def gradascent(datamatin, classlabels):
    # Plain ndarrays with @ replace the matrix type of the original listing.
    datamatrix = np.array(datamatin)                  # shape (m, n)
    labelmat = np.array(classlabels).reshape(-1, 1)   # shape (m, 1)
    m, n = datamatrix.shape
    alpha = 0.001
    maxiteration = 500  # value missing in the source text; 500 is an assumption
    weights = np.ones((n, 1))
    for k in range(maxiteration):
        h = sigmoid(datamatrix @ weights)
        error = labelmat - h
        # Batch gradient ascent: theta := theta + alpha * X^T (y - h)
        weights = weights + alpha * datamatrix.T @ error
    return weights


def plotbestfit(weights):
    datamat, labelmat = loaddataset()
    dataarr = np.array(datamat)
    n = dataarr.shape[0]
    xcord1 = []; ycord1 = []
    xcord2 = []; ycord2 = []
    for i in range(n):
        # Split the points by class label so each class gets its own marker.
        if int(labelmat[i]) == 1:
            xcord1.append(dataarr[i, 1]); ycord1.append(dataarr[i, 2])
        else:
            xcord2.append(dataarr[i, 1]); ycord2.append(dataarr[i, 2])
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(xcord1, ycord1, s=30, c='red', marker='s')
    ax.scatter(xcord2, ycord2, s=30, c='green')
    x = np.arange(-3.0, 3.0, 0.1)
    # Decision boundary: w0 + w1 * x1 + w2 * x2 = 0, solved for x2.
    y = (-weights[0] - weights[1] * x) / weights[2]
    ax.plot(x, y)
    plt.xlabel('X1')
    plt.ylabel('X2')
    plt.show()


if __name__ == "__main__":
    dataarr, labelmat = loaddataset()
    print(dataarr)
    print(labelmat)
    weights = gradascent(dataarr, labelmat)
    print(weights)
    plotbestfit(weights)


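The most important step in the gradascent function is the weight update inside the loop. It is the vectorized (batch) form of the \theta update rule derived above: the term (y - h_\theta(x)) x_j is summed over all training samples in each iteration. The plotbestfit function draws the decision boundary.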
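
Since testSet.txt is not included here, the batch update can also be tried on generated data. The following is a sketch under assumptions: the two Gaussian blobs stand in for the real data file, `n_iter=500` is an assumed iteration count, and the `predict` helper with its 0.5 threshold (equivalently \theta^T x >= 0, the same boundary that is plotted) is a hypothetical addition.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def grad_ascent(X, y, alpha=0.001, n_iter=500):
    """Batch gradient ascent: theta := theta + alpha * X^T (y - h)."""
    theta = np.ones(X.shape[1])
    for _ in range(n_iter):
        h = sigmoid(X @ theta)
        theta = theta + alpha * X.T @ (y - h)
    return theta


def predict(theta, X):
    """Label 1 where h_theta(x) >= 0.5, i.e. where theta^T x >= 0."""
    return (X @ theta >= 0).astype(int)


# Synthetic stand-in for testSet.txt: two Gaussian blobs in the (x1, x2) plane.
rng = np.random.default_rng(1)
n = 100
class0 = rng.normal(loc=[-1.5, -1.5], scale=1.0, size=(n, 2))
class1 = rng.normal(loc=[1.5, 1.5], scale=1.0, size=(n, 2))
X = np.column_stack([np.ones(2 * n), np.vstack([class0, class1])])  # x_0 = 1
y = np.concatenate([np.zeros(n), np.ones(n)])

theta = grad_ascent(X, y)
accuracy = np.mean(predict(theta, X) == y)
print(theta)
print(accuracy)
```

The learned line \theta_0 + \theta_1 x_1 + \theta_2 x_2 = 0 is the boundary on which h_\theta(x) = 0.5, so thresholding at 0.5 assigns each point to the side of the line it falls on.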
