Logistic regression model (Logistic Regression) and Python implementation
http://www.cnblogs.com/sumai
1. Model
In classification problems, such as deciding whether an email is spam or whether a tumor is malignant, the target variable is discrete and takes only two values, usually encoded as 0 and 1. Suppose we have a single feature x and plot a scatter plot of the data. We could fit a line with linear regression, h_θ(x) = θ0 + θ1x, and predict 1 when h_θ(x) ≥ 0.5 and 0 otherwise. Such a model can classify, but it has serious shortcomings, such as poor robustness and low accuracy. Logistic regression is much better suited to this kind of problem.
The logistic regression hypothesis function applies a function g to θ^T x, mapping it into the range 0 to 1; g is called the sigmoid function or logistic function:

h_θ(x) = g(θ^T x) = 1 / (1 + e^(−θ^T x))

Its graph is an S-shaped curve. When we feed in the features of a sample, the resulting h_θ(x) is the probability that this sample belongs to class 1. In other words, logistic regression gives the probability that a sample belongs to a class.
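A minimal sketch of the sigmoid function in NumPy (the name sigmoid and the sample inputs are illustrative, not from the original post); it shows how any real-valued score θ^T x is squashed into (0, 1):

import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z)): maps any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(np.array([-10.0, 0.0, 10.0])))
# [~4.5e-05, 0.5, ~0.99995]: large negative scores -> near 0, large positive -> near 1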
2. Evaluation
Recall the squared-error loss function used in the earlier post on linear regression:

J(θ) = (1/2m) · Σ_{i=1..m} (h_θ(x^(i)) − y^(i))²
If this loss function is reused directly in logistic regression, the resulting J is a non-convex function with many local minima and is hard to optimize, so we need a different cost function. Redefine the per-sample cost as follows:

Cost(h_θ(x), y) = −log(h_θ(x)) if y = 1;  −log(1 − h_θ(x)) if y = 0

which can be combined into a single expression and averaged over the m samples:

J(θ) = −(1/m) · Σ_{i=1..m} [ y^(i)·log(h_θ(x^(i))) + (1 − y^(i))·log(1 − h_θ(x^(i))) ]
When a sample actually belongs to class 1 and the predicted probability is also 1, the loss is 0 and the prediction is correct. Conversely, if the predicted probability is 0, the loss goes to infinity. A loss function constructed this way is reasonable, and it is convex, so it is straightforward to find the parameters θ that minimize J.
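A minimal sketch of this cross-entropy loss in NumPy (the name cross_entropy and the example probabilities are illustrative); it shows that confident correct predictions give a loss near 0, while confident wrong ones give a very large loss:

import numpy as np

def cross_entropy(y, h):
    # J(theta) = -(1/m) * sum( y*log(h) + (1-y)*log(1-h) )
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

y = np.array([1.0, 1.0, 0.0])
print(cross_entropy(y, np.array([0.99, 0.99, 0.01])))  # ~0.01: confident and correct
print(cross_entropy(y, np.array([0.01, 0.01, 0.99])))  # ~4.6:  confident and wrong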
3. Optimization
We have defined the loss function J(θ); the next task is to solve for the parameters. The goal is clear: find the θ that minimizes J(θ). The two most commonly used methods are batch gradient descent and Newton's method. Both obtain a numerical solution by iterating, but Newton's method converges faster.
Batch gradient descent updates the parameters using the gradient computed over the whole training set:

θ := θ − α · X^T (h_θ(X) − y)

where α is the learning rate (the 1/m factor can be absorbed into α).
Newton's method (H is the Hessian matrix) also uses second-order information:

θ := θ − H^(−1) ∇J(θ),  with H = X^T A X and A = diag( h_θ(x^(i)) · (1 − h_θ(x^(i))) )
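A minimal sketch of one update step of each method on toy data (the toy X, y and the learning rate are illustrative, not from the original post); the full fit on the iris data follows in section 4:

import numpy as np
from numpy.linalg import inv

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy data: one feature plus an intercept column, four samples
X = np.array([[1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]])
y = np.array([[0.0], [0.0], [1.0], [1.0]])
theta = np.zeros((2, 1))

h = sigmoid(X.dot(theta))            # predicted probabilities
grad = X.T.dot(h - y)                # gradient (1/m factor absorbed into alpha)

theta_gd = theta - 0.01 * grad       # one batch gradient descent step, alpha = 0.01

A = np.diag((h * (1 - h)).ravel())   # A = diag(h(1-h))
H = X.T.dot(A).dot(X)                # Hessian, H = X'AX
theta_nt = theta - inv(H).dot(grad)  # one Newton step

print(theta_gd.ravel(), theta_nt.ravel())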
4. Python Code Implementation
# -*- coding: utf-8 -*-
"""
Created on Wed Feb 11:04:11

@author: sumaiwong
"""
import numpy as np
import pandas as pd
from numpy import dot
from numpy.linalg import inv

iris = pd.read_csv(r'D:\iris.csv')
dummy = pd.get_dummies(iris['Species'])  # create dummy variables for Species
iris = pd.concat([iris, dummy], axis=1)
iris = iris.iloc[0:100, :]  # keep the first 100 rows of samples

# Build a logistic regression to classify whether the species is setosa: setosa ~ Sepal.Length
# Y = g(BX) = 1 / (1 + exp(-BX))
def logit(x):
    return 1. / (1 + np.exp(-x))

temp = pd.DataFrame(iris.iloc[:, 0])
temp['x0'] = 1.
X = temp.iloc[:, [1, 0]]
Y = iris['setosa'].values.reshape(len(iris), 1)  # assemble the X matrix and Y vector

# Batch gradient descent
m, n = X.shape                                    # matrix size
alpha = 0.0065                                    # learning rate
theta_g = np.zeros((n, 1))                        # initialize parameters
maxCycles = 3000                                  # number of iterations
J = pd.Series(np.arange(maxCycles, dtype=float))  # loss function values

for i in range(maxCycles):
    h = logit(dot(X, theta_g))                    # estimated probabilities
    J[i] = -(1 / 100.) * np.sum(Y * np.log(h) + (1 - Y) * np.log(1 - h))  # loss
    error = h - Y                                 # error
    grad = dot(X.T, error)                        # gradient
    theta_g -= alpha * grad
print(theta_g)
J.plot()  # plot the loss curve (requires matplotlib)

# Newton's method
theta_n = np.zeros((n, 1))                        # initialize parameters
maxCycles = 10                                    # number of iterations
C = pd.Series(np.arange(maxCycles, dtype=float))  # loss function values

for i in range(maxCycles):
    h = logit(dot(X, theta_n))                    # estimated probabilities
    C[i] = -(1 / 100.) * np.sum(Y * np.log(h) + (1 - Y) * np.log(1 - h))  # loss
    error = h - Y                                 # error
    grad = dot(X.T, error)                        # gradient
    A = h * (1 - h) * np.eye(len(X))              # A = diag(h(1-h)) via broadcasting
    H = dot(dot(X.T, A), X)                       # Hessian matrix, H = X'AX
    theta_n -= dot(inv(H), grad)
print(theta_n)
C.plot()  # plot the loss curve (requires matplotlib)
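As an optional sanity check (not part of the original post), the same fit can be reproduced with scikit-learn, assuming it is installed; this sketch uses the iris dataset bundled with scikit-learn instead of the downloaded CSV, and a very large C to approximate the unregularized fit above:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

data = load_iris()
X_sk = data.data[:100, [0]]                  # sepal length of the first 100 samples
y_sk = (data.target[:100] == 0).astype(int)  # 1 = setosa, 0 = versicolor

clf = LogisticRegression(C=1e6, max_iter=1000)  # large C ~= no regularization
clf.fit(X_sk, y_sk)
print(clf.coef_, clf.intercept_)  # should be close to theta_g / theta_n above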
Data used by the code: http://files.cnblogs.com/files/sumai/iris.rar