R Language Data Analysis Series Nine: Logistic Regression
--by Comaple.zhang
This installment covers logistic regression and its implementation in R. Logistic regression (LR) is in fact a generalized regression model: depending on the type and distribution of the dependent variable, generalized regression includes the familiar multivariate linear regression model as well as logistic regression. In logistic regression the dependent variable is discrete and takes values in the two-class set {0, 1}; if the discrete variable takes more than two values, the problem becomes multi-class classification. The LR model is therefore a binary classifier and can be used for tasks such as CTR (click-through rate) prediction. Let us now return to how logistic regression solves the two-class problem.
Problem Introduction
In multivariate linear regression, our model formula looks like this (see the first two installments of this series):
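(The original formula image did not survive; a reconstruction consistent with the surrounding text, where w is the weight vector and b the intercept:)

f(x, w) = w^T x + b = w_0 + w_1 x_1 + \dots + w_n x_n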
Here f(x, w) is a continuous variable. What if our dependent variable is discrete? Suppose, for example, we have data like this:
library(ggplot2)
x <- seq(-3, 3, by = 0.01)
y <- 1 / (1 + exp(-x))
gdf <- data.frame(x = x, y = y)
# a straight line y = x + 0.5 as a stand-in linear model
ggplot(gdf, aes(x = x, y = x + 0.5)) + geom_line(col = 'green')
This straight line obviously cannot fit a {0, 1} output. To be able to fit the discrete {0, 1} output, we introduce the sigmoid function as follows:
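(The formula image is missing here; from the code above, the sigmoid is:)

\sigma(x) = \frac{1}{1 + e^{-x}}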
Using R, we can plot this function as follows:

ggplot(gdf, aes(x = x, y = y)) + geom_line(col = 'blue') + geom_vline(xintercept = c(0), col = 'red') + geom_hline(yintercept = c(0, 1), lty = 2)
With this function we can easily convert the linear output into something that fits a discrete {0, 1} output; overlaying the two curves makes this visible:
ggplot(gdf, aes(x = x, y = y)) + geom_line(col = 'blue') + geom_vline(xintercept = c(0), col = 'red') + geom_hline(yintercept = c(0, 1), lty = 2) + geom_line(aes(x = x, y = x + 0.5), col = 'green')
So our class probabilities can be expressed as:
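(Reconstructing the missing formulas; substituting the linear model into the sigmoid gives:)

P(y = 1 \mid x; w) = \frac{1}{1 + e^{-f(x, w)}}
P(y = 0 \mid x; w) = 1 - P(y = 1 \mid x; w)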
With this transformation complete, the model finally reduces to the following form:
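(A reconstruction consistent with the text; the two class probabilities combine into one compact expression, with h(x) = 1 / (1 + e^{-w^T x}):)

P(y \mid x; w) = h(x)^y \, (1 - h(x))^{1 - y}, \quad y \in \{0, 1\}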
Loss function of LR (cost function)
The above introduced the sigmoid function and put it to use in our model. How, then, should we define the loss function? Simply taking differences (a squared-error loss) is not the only option. For a model over discrete variables, what we really want is that as many examples as possible are classified correctly, i.e. that the joint probability (the likelihood) of the observed labels is maximal:
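(Reconstructing the missing likelihood formula for m training examples, writing p_i = P(y_i = 1 \mid x_i; w):)

L(w) = \prod_{i=1}^{m} p_i^{y_i} (1 - p_i)^{1 - y_i}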
That is, we want to maximize L(w). To optimize it, we take the negative log-likelihood of L(w), thereby converting the maximization problem into a minimization problem:
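(Reconstructed negative log-likelihood:)

J(w) = -\log L(w) = -\sum_{i=1}^{m} \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right]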
Next, we optimize this loss function to find the w that makes L(w) smallest.
Available optimization methods include Newton's method, gradient descent, and L-BFGS. We will not go into their details here; they will come up later in this series.
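As a quick taste of the simplest of these, here is a minimal batch gradient-descent sketch in R for the loss J(w) above. This is an illustration only, not the method used below; the function name lr_gd, the step size alpha, and the iteration count are arbitrary choices:

sigmoid <- function(z) 1 / (1 + exp(-z))

lr_gd <- function(X, y, alpha = 0.1, iters = 1000) {
  # X: numeric matrix whose first column is all 1s (intercept); y: 0/1 vector
  w <- rep(0, ncol(X))
  for (i in seq_len(iters)) {
    p <- sigmoid(X %*% w)                # predicted probabilities p_i
    grad <- t(X) %*% (p - y) / nrow(X)   # gradient of the averaged loss J(w)
    w <- as.vector(w - alpha * grad)     # one descent step
  }
  w
}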
The implementation of LR in R
We use the iris dataset, which ships with R, for a binary logistic-regression test. It contains four flower measurements per sample and a species label with three classes. We implement logistic regression with the glm function, which provides regression for various distribution families of the exponential family, such as normal (gaussian), Gamma, inverse Gaussian, Poisson, and binomial. The logistic regression we need uses the binomial family.
# drop the setosa rows so that only two classes remain
index <- which(iris$Species == 'setosa')
ir <- iris[-index, ]
# drop the now-unused 'setosa' factor level
ir$Species <- droplevels(ir$Species)
# sample two thirds of the 100 remaining rows
split <- sample(100, 100 * (2/3))
# generate the training set
ir_train <- ir[split, ]
# generate the test set
ir_test <- ir[-split, ]
fit <- glm(Species ~ ., family = binomial(link = 'logit'), data = ir_train)
summary(fit)
real <- ir_test$Species
pred <- predict(fit, type = 'response', newdata = ir_test)
res <- data.frame(real, predict = factor(ifelse(pred > 0.5, 'virginica', 'versicolor')))
# inspect model performance
plot(res)
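Beyond the plot, a confusion matrix gives a quick numeric summary of the classifier (a small addition to the code above):

# cross-tabulate true vs. predicted labels
table(res$real, res$predict)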