Logistic regression analysis in R

Source: Internet
Author: User

In theory, regression analysis models a target variable that is continuous, so it cannot directly handle a target variable that is categorical.

The idea of logistic regression is to convert the categorical variable ("opens VIP") into a continuous variable ("probability of opening VIP"), and then use regression methods to study the classification problem indirectly.

I. The principle

Assume that the variable vip is categorical and takes only the values 0 and 1. A variable of this type cannot be modeled directly by regression analysis.

However, the probability that vip equals 1 is a continuous variable (prob.vip), so a first attempt is to model prob.vip with a linear regression:

prob.vip = k1*x1 + k2*x2 + k3*x3 + k4*x4 + b

Since k1*x1 + k2*x2 + k3*x3 + k4*x4 + b ranges over (-∞, +∞) while prob.vip must lie in [0, 1], the linear expression is passed through the function y = 1/(1 + exp(-x)), which maps any real number into (0, 1):

prob.vip = 1/(1 + exp(-(k1*x1 + k2*x2 + k3*x3 + k4*x4 + b)))
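As a small illustration of the conversion above, the following sketch evaluates y = 1/(1 + exp(-x)) at a few points. The helper name sigmoid is an assumption made for this example only; base R's plogis() computes the same function.

```r
# Logistic (sigmoid) function: maps any real number into the interval (0, 1).
# `sigmoid` is a hypothetical helper name used only for this illustration.
sigmoid <- function(x) 1 / (1 + exp(-x))

x <- c(-10, -1, 0, 1, 10)
p <- sigmoid(x)
print(round(p, 4))

# Base R's plogis() implements the same transformation.
print(all.equal(p, plogis(x)))
```

Large negative inputs give values close to 0, large positive inputs give values close to 1, and an input of 0 gives exactly 0.5, which is why 0.5 is a natural classification threshold.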

When prob.vip > 0.5, the prediction is vip.predict = 1; otherwise it is 0.

Note: Ordinary regression analysis fits the model parameters by least squares, whereas logistic regression estimates them by maximum likelihood.
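To make the maximum likelihood remark concrete, here is a minimal sketch on simulated data. The data, the single predictor x1, and the names neg.loglik, fit.ml and fit.glm are all assumptions for illustration. It minimizes the negative log-likelihood with the general-purpose optimizer optim() and compares the result with glm(); it is not how glm() works internally, since glm() uses iteratively reweighted least squares.

```r
# Simulated data (an assumption for illustration): one predictor, binary outcome.
set.seed(1)
n  <- 500
x1 <- rnorm(n)
y  <- rbinom(n, 1, plogis(0.5 + 1.2 * x1))

# Average negative log-likelihood of the logistic model; par = c(b, k1).
neg.loglik <- function(par) {
  eta <- par[1] + par[2] * x1
  -mean(dbinom(y, size = 1, prob = plogis(eta), log = TRUE))
}

# Generic numerical optimisation of the likelihood.
fit.ml  <- optim(c(0, 0), neg.loglik, method = "BFGS")

# The same model fitted with glm().
fit.glm <- glm(y ~ x1, family = binomial("logit"))

print(fit.ml$par)
print(coef(fit.glm))
```

The two sets of estimates should agree to several decimal places, because both maximize the same likelihood.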

II. Implementation in R

glm() is the core function for logistic regression analysis in R.

Parameters:

formula: specifies the form of the linear model to fit

family: the distribution family of the GLM. For logistic regression, set family to binomial("logit")

data: the sample data

Code:

(1) Building a logistic regression model
data.glm <- glm(vip ~ ., data = vip.data, family = binomial("logit"))
summary(data.glm)
The model can be refined with the step function, which performs stepwise variable selection:
data.glm <- step(data.glm)
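Because the article does not include the vip.data sample, the following sketch simulates one so that the model-building step can be run end to end. The columns x1, x2, x3, the coefficients used to generate the data, and the name data.glm.step are assumptions for illustration.

```r
# Simulated stand-in for vip.data (an assumption; the real data is not given).
set.seed(42)
n <- 300
vip.data <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
# x3 has no true effect, so stepwise selection may drop it.
vip.data$vip <- rbinom(n, 1, plogis(-0.5 + 1.5 * vip.data$x1 - 1.0 * vip.data$x2))

# Fit the logistic regression on all predictors.
data.glm <- glm(vip ~ ., data = vip.data, family = binomial("logit"))
print(summary(data.glm))

# Stepwise selection by AIC; trace = 0 suppresses the step-by-step log.
data.glm.step <- step(data.glm, trace = 0)
print(formula(data.glm.step))
```

Here the stepwise result is stored under a separate name so that the full and the reduced model can be compared; overwriting data.glm as in the text works equally well.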

(2) Output items of the glm model
Model parameters: data.glm$coefficients
Values of the linear predictor: data.glm$linear.predictors
Probability prob.vip that vip equals 1: data.glm$fitted.values
Working residuals of the fit: data.glm$residuals
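The relationship between these components can be checked directly. The sketch below uses simulated data (an assumption, as are the column names x1 and x2) and verifies that fitted.values is the logistic transform of linear.predictors, that linear.predictors is the design matrix multiplied by the coefficients, and that residuals holds the working residuals (y - mu) / (mu * (1 - mu)) for the logit link.

```r
# Simulated stand-in for vip.data (an assumption for illustration).
set.seed(7)
n <- 200
vip.data <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
vip.data$vip <- rbinom(n, 1, plogis(0.3 + vip.data$x1 - 0.8 * vip.data$x2))
data.glm <- glm(vip ~ ., data = vip.data, family = binomial("logit"))

print(data.glm$coefficients)

# fitted.values = 1 / (1 + exp(-linear.predictors))
eta <- data.glm$linear.predictors
mu  <- data.glm$fitted.values
print(all.equal(mu, 1 / (1 + exp(-eta))))

# linear.predictors = X %*% coefficients
X <- model.matrix(data.glm)
print(all.equal(as.vector(X %*% data.glm$coefficients), unname(eta)))

# residuals are working residuals, not the raw differences y - mu
working <- (vip.data$vip - mu) / (mu * (1 - mu))
print(all.equal(unname(data.glm$residuals), unname(working)))
```

For residuals on the probability scale, residuals(data.glm, type = "response") returns y - mu instead.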

(3) Model prediction
Predict on the sample data used to fit the model:
predict.vip <- ifelse(data.glm$fitted.values >= 0.5, 1, 0)
predict.vip <- as.factor(predict.vip)

Predict on new data:
new.predict.vip <- predict(data.glm, newdata = test.vip.data)  # value of the linear predictor
new.predict.vip <- 1/(1 + exp(-new.predict.vip))  # probability
new.predict.vip <- as.factor(ifelse(new.predict.vip >= 0.5, 1, 0))  # final predicted class
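As a cross-check of the manual conversion, predict() for glm objects also accepts type = "response", which returns the probability directly. The sketch below uses simulated training data and a hypothetical three-row test.vip.data; the intermediate names link.value, prob.manual and prob.direct are likewise assumptions for illustration.

```r
# Simulated stand-in for vip.data (an assumption for illustration).
set.seed(3)
n <- 200
vip.data <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
vip.data$vip <- rbinom(n, 1, plogis(0.2 + 1.1 * vip.data$x1))
data.glm <- glm(vip ~ ., data = vip.data, family = binomial("logit"))

# Hypothetical new observations to score.
test.vip.data <- data.frame(x1 = c(-2, 0, 2), x2 = c(0.5, 0, -0.5))

# Default type = "link": value of the linear predictor, converted by hand.
link.value  <- predict(data.glm, newdata = test.vip.data)
prob.manual <- 1 / (1 + exp(-link.value))

# type = "response": probability returned directly.
prob.direct <- predict(data.glm, newdata = test.vip.data, type = "response")
print(all.equal(prob.manual, prob.direct))

# Threshold at 0.5 to obtain the predicted class.
new.predict.vip <- factor(ifelse(prob.direct >= 0.5, 1, 0), levels = c(0, 1))
print(new.predict.vip)
```

Fixing levels = c(0, 1) keeps both classes in the factor even when every prediction falls in a single class.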

(4) Model performance measurement
performance <- length(which((predict.vip == vip.data$vip) == TRUE)) / nrow(vip.data)  # accuracy
Here length(which((predict.vip == vip.data$vip) == TRUE)) is the number of samples whose predicted value equals the actual value, so performance is the proportion of correct predictions.
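The accuracy computation can be written more compactly, since the mean of a logical vector is the proportion of TRUE values. The sketch below, on simulated data (an assumption, as are the names performance.short and confusion), confirms that both forms agree and adds a confusion table that shows where the errors fall.

```r
# Simulated stand-in for vip.data (an assumption for illustration).
set.seed(11)
n <- 200
vip.data <- data.frame(x1 = rnorm(n))
vip.data$vip <- rbinom(n, 1, plogis(1.5 * vip.data$x1))
data.glm <- glm(vip ~ ., data = vip.data, family = binomial("logit"))
predict.vip <- ifelse(data.glm$fitted.values >= 0.5, 1, 0)

# Accuracy as computed in the text.
performance <- length(which((predict.vip == vip.data$vip) == TRUE)) / nrow(vip.data)

# Equivalent shorter form.
performance.short <- mean(predict.vip == vip.data$vip)
print(c(performance, performance.short))

# Cross-tabulate predicted against actual classes.
confusion <- table(predicted = predict.vip, actual = vip.data$vip)
print(confusion)
```

Note that accuracy measured on the same data used to fit the model tends to be optimistic; a held-out test set gives a fairer estimate.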
