Logistic regression is a type of generalized linear model and a method of classification analysis; it is probably one of the most common classification methods.

Sigmoid Function

In logistic regression the dependent variable is binary, so the value the model estimates must be a probability lying between 0 and 1. We therefore need a function with exactly this property, which is where the sigmoid function enters the picture.
The sigmoid function is defined as:

    sigmoid(x) = 1 / (1 + e^(-x))
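A quick numerical check of this definition (a small sketch of my own, using NumPy):

```python
import numpy as np

def sigmoid(inX):
    # sigmoid(x) = 1 / (1 + e^(-x))
    return 1.0 / (1 + np.exp(-inX))

print(sigmoid(0))     # exactly 0.5
print(sigmoid(10))    # very close to 1
print(sigmoid(-10))   # very close to 0
```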
When x = 0 the sigmoid function equals exactly 0.5; as x increases its value approaches 1, and as x decreases its value approaches 0.

Principle
To build a logistic regression classifier, we multiply each feature by a regression coefficient, sum all the results, and feed that sum into the sigmoid function, obtaining a value between 0 and 1. Any input whose output is greater than 0.5 is assigned to class 1, and any input whose output is less than 0.5 is assigned to class 0.

Gradient Ascent Method
To determine the optimal regression coefficients we need an optimization algorithm, and the most commonly used one is gradient ascent. The idea: to find the maximum of a function, the best way is to move along the function's gradient. Writing the gradient as Grad (or ∇), the gradient of a function f(x, y) is:
    ∇f(x, y) = (∂f/∂x, ∂f/∂y)
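As a toy illustration (my own example, not from the article), gradient ascent quickly climbs to the maximum of the concave function f(x, y) = -(x - 1)^2 - (y + 2)^2, which sits at (1, -2):

```python
import numpy as np

def grad(x, y):
    # gradient of f(x, y) = -(x - 1)**2 - (y + 2)**2
    return np.array([-2 * (x - 1), -2 * (y + 2)])

w = np.array([0.0, 0.0])   # starting point
alpha = 0.1                # step size
for _ in range(100):
    w = w + alpha * grad(w[0], w[1])   # move along the gradient direction

print(w)   # converges to the maximum at (1, -2)
```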
The gradient operator always points in the direction in which the function value grows fastest. The distance moved along the gradient at each step is called the step size, written α. In vector form, the iterative update of the gradient ascent algorithm is:

    w := w + α ∇f(w)

The formula is iterated until a stopping condition is reached, such as a specified number of iterations or the error falling within an allowed tolerance.

Python Code Implementation

Sigmoid Function Implementation
# -*- coding: utf-8 -*-
__author__ = 'kestiny'

import numpy as np
import matplotlib.pyplot as plt
import random

def sigmoid(inX):
    return 1.0 / (1 + np.exp(-inX))
Finding the Optimal Regression Coefficients by Gradient Ascent Iteration
def gradAscent(dataMat, labelMat):
    m, n = np.shape(dataMat)
    alpha = 0.001              # step size of the gradient ascent
    maxCycles = 500            # number of iterations, i.e. the termination condition
    weights = np.ones((n, 1))  # initialize every regression coefficient to 1
    for k in range(maxCycles):
        h = sigmoid(np.dot(dataMat, weights))   # matrix multiplication
        error = labelMat - h
        # adjust the regression coefficients in the direction of the error
        weights = weights + alpha * np.dot(dataMat.transpose(), error)
    return weights
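As a quick sanity check (my own synthetic example, not the article's testSet.txt), the routine above should recover a separating boundary on linearly separable data:

```python
import numpy as np

def sigmoid(inX):
    return 1.0 / (1 + np.exp(-inX))

def gradAscent(dataMat, labelMat):
    m, n = np.shape(dataMat)
    alpha = 0.001
    maxCycles = 500
    weights = np.ones((n, 1))
    for k in range(maxCycles):
        h = sigmoid(np.dot(dataMat, weights))
        error = labelMat - h
        weights = weights + alpha * np.dot(dataMat.transpose(), error)
    return weights

# Synthetic, linearly separable data: class 1 clustered around x1 = +2,
# class 0 clustered around x1 = -2, with x2 as pure noise.
rng = np.random.default_rng(0)
m = 100
x1 = np.concatenate([rng.normal(2, 0.5, m // 2), rng.normal(-2, 0.5, m // 2)])
x2 = rng.normal(0, 1, m)
labels = np.concatenate([np.ones(m // 2), np.zeros(m // 2)]).reshape(-1, 1)
dataMat = np.column_stack([np.ones(m), x1, x2])   # prepend the constant feature x0 = 1

weights = gradAscent(dataMat, labels)
preds = (sigmoid(np.dot(dataMat, weights)) > 0.5).astype(int)
accuracy = np.mean(preds == labels)
print('accuracy:', accuracy)
```

With two well-separated clusters the accuracy should come out close to 1; the exact value depends on the random draw.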
Test
def plotBestFit(weights, dataMat, labelMat):
    n = dataMat.shape[0]
    xcord1 = []; ycord1 = []   # coordinates of the class-1 samples
    xcord2 = []; ycord2 = []   # coordinates of the class-0 samples
    for i in range(n):
        if int(labelMat[i]) == 1:
            xcord1.append(dataMat[i, 1])
            ycord1.append(dataMat[i, 2])
        else:
            xcord2.append(dataMat[i, 1])
            ycord2.append(dataMat[i, 2])
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(xcord1, ycord1, s=30, c='red', marker='s')
    ax.scatter(xcord2, ycord2, s=30, c='green')
    x = np.arange(-5.0, 5.0, 0.1)
    # the decision boundary, i.e. the line where w0 + w1*x1 + w2*x2 = 0
    y = (-weights[0] - weights[1] * x) / weights[2]
    plt.plot(x, y)
    plt.xlabel('X1')
    plt.ylabel('X2')
    plt.show()

if __name__ == '__main__':
    file = 'testSet.txt'
    data = np.loadtxt(file, dtype=float, delimiter='\t', encoding='utf-8')
    dataMat, labelMat = np.split(data, (2,), axis=1)
    columns = dataMat.shape[0]
    # prepend the constant feature x0 = 1 so that weights[0] acts as the intercept
    dataMat = np.insert(dataMat, 0, values=np.ones((1, columns)), axis=1)
    weights = gradAscent(dataMat, labelMat)
    print('coefficients:', weights)
    plotBestFit(weights, dataMat, labelMat)
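One detail in the test code worth spelling out: np.split separates the two feature columns from the label column, and np.insert prepends a constant column x0 = 1 so that weights[0] plays the role of the intercept. A minimal sketch on a tiny hand-made array:

```python
import numpy as np

data = np.array([[1.0, 2.0, 1.0],
                 [3.0, 4.0, 0.0]])   # two samples: x1, x2, label
dataMat, labelMat = np.split(data, (2,), axis=1)   # split features from labels
columns = dataMat.shape[0]                          # number of samples
dataMat = np.insert(dataMat, 0, values=np.ones((1, columns)), axis=1)
print(dataMat)
# [[1. 1. 2.]
#  [1. 3. 4.]]
```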
Effect
Stochastic Gradient Ascent
Stochastic gradient ascent solves the awkward problem that batch gradient ascent has with large amounts of data. Since gradient ascent must traverse the entire dataset every time the regression coefficients are updated, its computational cost becomes too high once there are hundreds of millions of samples and thousands of features. Instead, we can update the regression coefficients using only one sample at a time, or a small batch of n samples (say 10 or 100); this is called stochastic gradient ascent. Because the classifier can be updated incrementally as new samples arrive, stochastic gradient ascent is an online learning algorithm.

Python Code Implementation of Stochastic Gradient Ascent
def stocGradAscent(dataMat, labelMat, times=100):
    m, n = np.shape(dataMat)
    weights = np.ones(n)
    for j in range(times):
        dataIndex = range(m)
        for i in range(m):
            # shrink the step size as the iterations proceed, with a floor of 0.01
            alpha = 4 / (1.0 + j + i) + 0.01
            # pick a sample at random to reduce periodic fluctuations
            randIndex = int(random.uniform(0, len(dataIndex)))
            h = sigmoid(np.sum(np.dot(dataMat[randIndex], weights)))
            error = labelMat[randIndex] - h
            weights = weights + alpha * error * dataMat[randIndex]
            print('times: %d h=%s error=%s weights=%s' % (m * j + i, h, error, weights))
    return weights
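A note on the step-size schedule used above: alpha = 4 / (1.0 + j + i) + 0.01 decays as the iterations proceed but never drops below 0.01, so late samples still influence the coefficients. A quick look at its values (my own illustration):

```python
def alpha_schedule(j, i):
    # step size used in the stochastic version: decays with iteration, floor of 0.01
    return 4 / (1.0 + j + i) + 0.01

print(alpha_schedule(0, 0))     # 4.01 at the very first update
print(alpha_schedule(0, 99))    # 0.05 after one pass over 100 samples
print(alpha_schedule(99, 99))   # about 0.03 near the end, never below 0.01
```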
Test
if __name__ == '__main__':
    file = 'testSet.txt'
    data = np.loadtxt(file, dtype=float, delimiter='\t', encoding='utf-8')
    dataMat, labelMat = np.split(data, (2,), axis=1)
    columns = dataMat.shape[0]
    # prepend the constant feature x0 = 1 so that weights[0] acts as the intercept
    dataMat = np.insert(dataMat, 0, values=np.ones((1, columns)), axis=1)
    weights = stocGradAscent(dataMat, labelMat)
    print('coefficients:', weights)
    plotBestFit(weights, dataMat, labelMat)
Effect
We can see that, apart from the iteration scheme, stochastic gradient ascent gives results basically consistent with batch gradient ascent. Two details are worth noting: first, alpha is adjusted at every iteration, which damps high-frequency fluctuations in the data; second, samples are selected at random, which reduces periodic fluctuations in the coefficients.

Summary
That is all for logistic regression. Studying machine learning over the past few days, I have found it really is quite interesting; my biggest takeaway is that the calculus and probability theory learned back in school really are useful after all...