5 Logistic regression (Part 2)


5.2.4 Training algorithm: stochastic gradient ascent

Gradient ascent algorithm: the entire data set must be traversed every time the regression coefficients are updated, so the algorithm is too expensive on data sets with billions of samples.

Improved method: the stochastic gradient ascent algorithm, which updates the regression coefficients using only one sample point at a time.

Because the classifier can be updated incrementally as new samples arrive, stochastic gradient ascent is an online learning algorithm. By contrast, processing all of the data at once is called batch processing.

```python
# Listing 5-3: stochastic gradient ascent
from numpy import exp, ones, shape

def sigmoid(inX):                      # defined earlier, in Listing 5-1
    return 1.0 / (1 + exp(-inX))

def stocGradAscent0(dataMatrix, classLabels):
    m, n = shape(dataMatrix)
    alpha = 0.01                       # fixed step size
    weights = ones(n)                  # initialize all coefficients to 1
    for i in range(m):                 # one update per sample point
        h = sigmoid(sum(dataMatrix[i] * weights))  # h is a scalar here
        error = classLabels[i] - h                 # error is a scalar too
        weights = weights + alpha * error * dataMatrix[i]
    return weights
```

Differences between stochastic gradient ascent and gradient ascent: 1. in the former, the variables h and error are both scalars, while in the latter they are vectors; 2. the former has no matrix conversion step, and all variables use NumPy array types.

The fit is not as good as with gradient ascent: the classifier here misclassifies roughly one third of the samples. To be fair, though, the gradient ascent result was obtained only after iterating over the entire data set 500 times.

A reliable way to judge an optimization algorithm is to check whether it converges, that is, whether the parameters reach stable values or keep changing.

To examine this, the stochastic gradient ascent of Listing 5-3 is modified so that it runs 200 passes over the entire data set, and the three regression coefficients are then plotted against the iteration number as follows:

Figure 5-6

X2 reaches a stable value after about 50 iterations, but X0 and X1 need many more. This happens because some sample points cannot be classified correctly (the data set is not linearly separable), and they cause drastic coefficient changes on every pass. We want the algorithm to avoid this back-and-forth movement and converge to a fixed value; the convergence also needs to be faster.
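A minimal sketch of the modification described above: repeat the Listing 5-3 update for several passes while recording the weight vector after every single update, so the coefficient trajectories can be plotted. NumPy is assumed, and the function name is illustrative, not from the book.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stoc_grad_ascent_history(data, labels, num_passes=200):
    """Run the plain stochastic gradient ascent update repeatedly,
    keeping a copy of the weights after every update so convergence
    can be inspected or plotted."""
    m, n = data.shape
    weights = np.ones(n)
    alpha = 0.01                      # fixed step size, as in Listing 5-3
    history = []
    for _ in range(num_passes):
        for i in range(m):
            h = sigmoid(np.dot(data[i], weights))
            error = labels[i] - h
            weights = weights + alpha * error * data[i]
            history.append(weights.copy())
    return np.array(history)          # shape: (num_passes * m, n)
```

Plotting history[:, k] for each coefficient k reproduces the kind of curves shown in Figure 5-6.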

```python
# Listing 5-4: improved stochastic gradient ascent
import random
from numpy import exp, ones, shape
# uses sigmoid() from Listing 5-3

def stocGradAscent1(dataMatrix, classLabels, numIter=150):
    m, n = shape(dataMatrix)
    weights = ones(n)
    for j in range(numIter):                      # j: iteration (pass) count
        dataIndex = list(range(m))                # list() so del() works in Python 3
        for i in range(m):                        # i: sample-point subscript
            alpha = 4 / (1.0 + j + i) + 0.01      # alpha is adjusted on every update
            # randIndex: position of the sample in the matrix, chosen at random
            randIndex = int(random.uniform(0, len(dataIndex)))
            h = sigmoid(sum(dataMatrix[randIndex] * weights))
            error = classLabels[randIndex] - h
            weights = weights + alpha * error * dataMatrix[randIndex]
            del(dataIndex[randIndex])             # remove the slot before the next update
    return weights
```

Improvements:

1. alpha is adjusted on every iteration, which dampens the high-frequency fluctuations of the coefficients. Although alpha decreases as the iterations proceed, it never reaches 0 because of the constant term (0.01); this guarantees that new data still has some influence even after many iterations. alpha decreases like 1/(j+i), where j is the pass count and i is the sample-point subscript within a pass. When j << max(i), alpha is not strictly decreasing from one update to the next.

2. The samples used to update the regression coefficients are chosen at random, which reduces the periodic fluctuations. The implementation is similar to Chapter 3: pick a value at random from a list of remaining indices, then delete that value from the list before the next update.
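The alpha schedule described in point 1 can be sketched on its own. The generator below (an illustrative helper, not from the book) simply enumerates the step sizes that Listing 5-4 would use; note the floor at 0.01 and the jump back up at the start of each pass, which is why alpha is not strictly decreasing.

```python
def alpha_schedule(num_iter, m):
    """Step sizes produced by the rule in Listing 5-4:
    alpha = 4/(1.0 + j + i) + 0.01 for pass j and update i within the pass.
    The first value is 4.01; no value ever drops to the 0.01 floor."""
    for j in range(num_iter):
        for i in range(m):
            yield 4 / (1.0 + j + i) + 0.01
```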

Figure 5-7

This method converges faster than the fixed-alpha version, mainly because: 1. the random sample selection in stocGradAscent1() avoids periodic fluctuations; 2. the decaying alpha lets the coefficients settle sooner. Only 20 passes over the data set were made this time, compared with 500 for the earlier method.

5.3 Example: estimating horse fatalities from colic

(1) Collect data

(2) Prepare the data

(3) Analysis data

(4) Training algorithm: Use optimization algorithm to find the best coefficient

(5) Test the algorithm: to quantify the regression effect, observe the error rate. Based on the error rate, decide whether to go back to the training stage and obtain better regression coefficients by changing the number of iterations and the step size.

(6) Use the algorithm

5.3.1 Preparing data: Handling missing values in the data

Preprocessing needs to do two things:

1. Missing values must be replaced with a real value, because NumPy arrays do not allow missing entries. Here 0 is chosen to replace all missing values, which happens to suit logistic regression: what is needed is a value that does not affect the coefficients when the weights are updated. Since the update is weights = weights + alpha * error * dataMatrix[i], a feature value of 0 leaves its coefficient unchanged; in addition, sigmoid(0) = 0.5 is non-informative, so a 0 feature introduces no bias toward either class.

2. If the category label in the dataset is missing, discard the data.
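A minimal sketch of point 1, assuming missing fields in a raw line are marked with "?" (the actual marker in the data files may differ; the function name is illustrative):

```python
def parse_line(line, missing_marker="?"):
    """Split a tab-separated record and replace missing feature values
    with 0.0, so the resulting NumPy array has no gaps. A zero feature
    leaves its coefficient untouched during the weight update, and
    sigmoid(0) = 0.5 carries no information toward either class."""
    fields = line.strip().split("\t")
    return [0.0 if f == missing_marker else float(f) for f in fields]
```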

5.3.2 Test algorithm: Classification with logistic regression

What logistic regression needs to do at test time: multiply each feature of a test vector by the corresponding regression coefficient, sum all the products, and feed the sum into the sigmoid function. If the sigmoid value is greater than 0.5, the predicted class label is 1; otherwise it is 0.

```python
# Listing 5-5: logistic regression classification functions
from numpy import array
# uses sigmoid() from Listing 5-3 and stocGradAscent1() from Listing 5-4

def classifyVector(inX, weights):        # (feature vector, regression coefficients)
    prob = sigmoid(sum(inX * weights))
    if prob > 0.5:
        return 1.0
    else:
        return 0.0

def colicTest():
    # open the training and test sets
    frTrain = open('HorseColicTraining.txt')
    frTest = open('HorseColicTest.txt')
    trainingSet = []; trainingLabels = []
    for line in frTrain.readlines():
        currLine = line.strip().split('\t')
        lineArr = []
        for i in range(21):              # columns 0-20: features; column 21: class label
            lineArr.append(float(currLine[i]))
        trainingSet.append(lineArr)
        trainingLabels.append(float(currLine[21]))
    # compute the regression coefficients
    trainWeights = stocGradAscent1(array(trainingSet), trainingLabels, 500)
    errorCount = 0; numTestVec = 0.0
    for line in frTest.readlines():      # read the test set, compute the error rate
        numTestVec += 1.0
        currLine = line.strip().split('\t')
        lineArr = []
        for i in range(21):
            lineArr.append(float(currLine[i]))
        if int(classifyVector(array(lineArr), trainWeights)) != int(currLine[21]):
            errorCount += 1
    errorRate = float(errorCount) / numTestVec
    print("the error rate of this test is: %f" % errorRate)
    return errorRate

def multiTest():                         # call colicTest() 10 times and average
    numTests = 10; errorSum = 0.0
    for k in range(numTests):
        errorSum += colicTest()
    print("after %d iterations the average error rate is: %f"
          % (numTests, errorSum / float(numTests)))
```

5.4 Summary

The purpose of logistic regression is to find the best-fit parameters of the nonlinear sigmoid function, and the solution can be obtained with an optimization algorithm. Gradient ascent is the most common optimization algorithm here, and it can be simplified to stochastic gradient ascent.

Stochastic gradient ascent achieves results comparable to gradient ascent while consuming far fewer computing resources. In addition, stochastic gradient ascent is an online algorithm: it can update the coefficients as new data arrives, without re-reading the entire data set for batch processing.

