Regression analysis is a statistical method for studying the quantitative relationship between variables, and it has a wide range of applications.
Logistic regression model

Linear regression
We start with the linear regression model. Linear regression is the most basic regression model: it uses a linear function to describe the relationship between variables, mapping continuous or discrete independent variables onto the continuous real line.
Mathematical form of the model:
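In the usual notation, with parameter vector W = (w0, w1, ..., wn) and x0 = 1 as the constant (intercept) term:

    h_W(x) = W^T x = w0 + w1*x1 + w2*x2 + ... + wn*xn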
A loss function (also known as an error function) is introduced to describe how well the model fits the data:
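A standard choice is the sum of squared errors over the m training samples (any constant scaling factor leaves the minimizer unchanged):

    J(W) = (1/2) * Σ_{i=1..m} (h_W(x_i) − y_i)²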
The best parameters are those that minimize J(W); training thus reduces to solving an optimization problem.
Logistic regression
Logistic regression (also called logit regression) is sometimes rendered as "logic regression", but the model has little to do with logic; that rendering is merely a transliteration. In terms of content, its most fitting name is logit regression.
The logistic regression model is most often used as a probabilistic classifier. Linear regression maps the independent variables to a continuous real number, but in many problems the dependent variable is confined to a finite interval, most commonly the (0, 1) interval of a probability.
The sigmoid function provides a mapping from the real line onto the interval (0, 1):
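    σ(z) = 1 / (1 + e^(−z))

As z goes from −∞ to +∞, σ(z) rises smoothly from 0 to 1, passing through 1/2 at z = 0.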
Composing it with the linear model gives a way to map the output into (0, 1); in mathematical form:
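    p = h_W(x) = 1 / (1 + e^(−W^T x))

so the linear score W^T x is squashed into a probability p ∈ (0, 1).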
Inverse transformation:
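    W^T x = ln( p / (1 − p) )

The quantity ln(p/(1−p)) is the log-odds, or logit, of p.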
This transformation, known as the logit transformation, is probably the source of the model's name.
Logistic regression is often used as a probabilistic classifier, with p = 0.5 as the decision boundary.
Least squares method for solving the regression model
The least squares method is a fully analytical approach: the global optimum is derived mathematically and expressed directly by a closed-form formula.
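For linear regression with the squared-error loss above, that closed-form solution is the normal equation (X is the matrix whose rows are the samples, y the vector of targets):

    W = (X^T X)^(−1) X^T y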
The least squares method yields the global optimum, but inverting X^T X becomes prohibitively expensive when the matrix is very large, which makes the method hard to apply in practice.
Gradient descent (ascent) method:
Gradient descent is a typical greedy algorithm: it starts from an arbitrary set of parameters and repeatedly adjusts them in the direction that decreases the objective function, until the objective stops decreasing.
In multivariate calculus, the gradient is the vector pointing in the direction in which the function value changes fastest. The gradient descent method cannot guarantee a global optimum; it may converge to a local one.
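In symbols, each step updates the parameters with step size α:

    W := W − α ∇J(W)    (descent, to minimize a loss)
    W := W + α ∇ℓ(W)    (ascent, to maximize a likelihood)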
Gradient descent comes in two variants: batch gradient descent and stochastic gradient descent.
Batch gradient descent (ascent) method
Algorithm flow of batch gradient descent method:
Initialize the regression coefficients to 1
Repeat until convergence {
    compute the gradient over the entire data set
    update the regression coefficients by the recurrence formula
}
Return the optimal regression coefficient values
Taking the partial derivatives of the loss function J(W) yields the gradient of J(W), given here in matrix form:
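With X the m×(n+1) data matrix (one sample per row, including the constant column), y the column vector of labels, and the sigmoid σ applied element-wise:

    ∇J(W) = X^T (σ(X W) − y)

(For logistic regression this is the gradient of the negative log-likelihood; the code below equivalently performs gradient ascent on the log-likelihood, so its error term carries the opposite sign.)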
Here α (alpha) is the step size, and the iterative update formula is:
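    W := W − α ∇J(W) = W + α X^T (y − σ(X W))

This is exactly the line weights += alpha * dataMat.transpose() * error in batchGradAscent below, where error = labelMat − h.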
Stochastic gradient descent (ascent) method
Algorithm flow of stochastic gradient descent method:
Initialize the regression coefficients to 1
Repeat until convergence {
    for each training sample {
        compute the gradient of that single sample
        update the regression coefficients by the recurrence formula
    }
}
Return the optimal regression coefficient values
To speed up convergence, two improvements are made (see the sketch after this list):
(1) Adjust the update step α at each iteration: as the iterations progress, α becomes smaller.
(2) Change the order in which samples are visited on each pass, i.e. randomly select the sample used to update the regression coefficients.
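A minimal sketch of both tweaks, matching stocGradAscent2 in the full listing below (j is the outer pass, i the step within a pass, and dataIndex the pool of sample indices not yet used in this pass):

    alpha = 4 / (1.0 + j + i) + 0.0001                  # shrinks over time, never reaches 0
    randIndex = int(random.uniform(0, len(dataIndex)))  # pick a remaining sample at random
    # ... update the weights using that sample ...
    del dataIndex[randIndex]                            # use each sample at most once per pass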
Implementation of logistic regression
The training data testSet.txt contains M rows and n+1 columns:
Each of the M rows is one sample; the first n columns of each row hold the n feature values, and column n+1 holds the class label (0 or 1).
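An illustrative fragment (the numbers are made up, to show the format only; here n = 2):

    1.2   -0.5   1
    0.3    2.1   0
    -0.7   0.4   1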
Python:
The classifier is encapsulated in a class:
from numpy import *
import matplotlib.pyplot as plt

def sigmoid(x):
    return 1.0 / (1 + exp(-x))

class LogRegressClassifier(object):
    def __init__(self):
        self.dataMat = list()
        self.labelMat = list()
        self.weights = list()

    def loadDataSet(self, filename):
        fr = open(filename)
        for line in fr.readlines():
            lineArr = line.strip().split()
            dataLine = [1.0]              # prepend the constant term x0 = 1
            for i in lineArr:
                dataLine.append(float(i))
            label = dataLine.pop()        # pop the last column, which holds the label
            self.dataMat.append(dataLine)
            self.labelMat.append(int(label))
        self.dataMat = mat(self.dataMat)
        self.labelMat = mat(self.labelMat).transpose()

    def train(self):
        self.weights = self.stocGradAscent1()

    def batchGradAscent(self):
        m, n = shape(self.dataMat)
        alpha = 0.001
        maxCycles = 500                   # number of full passes over the data
        weights = ones((n, 1))
        for k in range(maxCycles):        # heavy on matrix operations
            h = sigmoid(self.dataMat * weights)             # matrix mult
            error = self.labelMat - h                       # vector subtraction
            weights += alpha * self.dataMat.transpose() * error  # matrix mult
        return weights

    def stocGradAscent1(self):
        m, n = shape(self.dataMat)
        alpha = 0.01
        weights = ones((n, 1))            # initialize to all ones
        for i in range(m):                # one pass, one sample at a time
            h = sigmoid(sum(self.dataMat[i] * weights))
            error = self.labelMat[i] - h
            weights += (alpha * error * self.dataMat[i]).transpose()
        return weights

    def stocGradAscent2(self):
        numIter = 2
        m, n = shape(self.dataMat)
        weights = ones((n, 1))            # initialize to all ones
        for j in range(numIter):
            dataIndex = list(range(m))
            for i in range(m):
                # alpha decreases with iteration but never reaches 0
                # because of the constant term
                alpha = 4 / (1.0 + j + i) + 0.0001
                randIndex = int(random.uniform(0, len(dataIndex)))
                h = sigmoid(sum(self.dataMat[randIndex] * weights))
                error = self.labelMat[randIndex] - h
                weights += (alpha * error * self.dataMat[randIndex]).transpose()
                del dataIndex[randIndex]  # each sample is picked at most once per pass
        return weights

    def classify(self, X):
        prob = sigmoid(sum(X * self.weights))
        if prob > 0.5:
            return 1.0
        else:
            return 0.0

    def test(self):
        self.loadDataSet('testData.dat')
        weights0 = self.batchGradAscent()
        weights1 = self.stocGradAscent1()
        weights2 = self.stocGradAscent2()
        print('batchGradAscent:', weights0)
        print('stocGradAscent1:', weights1)
        print('stocGradAscent2:', weights2)

if __name__ == '__main__':
    lr = LogRegressClassifier()
    lr.test()
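A minimal usage sketch (the file name testSet.txt and the sample values are illustrative; note that classify expects the leading 1.0 intercept term that loadDataSet prepends):

    lr = LogRegressClassifier()
    lr.loadDataSet('testSet.txt')
    lr.train()                                # uses stocGradAscent1 by default
    print(lr.classify(mat([1.0, 0.3, 2.1])))  # prints 1.0 or 0.0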
MATLAB
The Python code above is not hard to port to MATLAB (one only needs to drop the class wrapper), but MATLAB's generalized linear model toolbox already provides an implementation of the logistic model.
trainData = [0 1; -1 0; 2 2; 3 3; -2 -1; -4.5 -4; 2 -1; -1 -3];
group = [1 1 0 0 1 1 0 0]';
testData = [5 2; 3 1; -4 -3];
[testNum, attrNum] = size(testData);
testData2 = [ones(testNum, 1), testData];
B = glmfit(trainData, [group ones(size(group))], 'binomial', 'link', 'logit')
p = 1.0 ./ (1 + exp(- testData2 * B))
B = glmfit(X, [Y N], 'binomial', 'link', 'logit')
Here X is the matrix whose rows are feature vectors, Y is the response column vector, and N is a vector of the same size as Y giving the number of trials; each Y(i) takes values in the range [0, N(i)].
B contains the coefficients for [1, x1, x2, ...], which is why a column of ones is prepended to the test data:
p = 1.0 ./ (1 + exp(- testData2 * B))
Substituting into the sigmoid function yields the predicted probabilities.