Logistic regression is a type of generalized linear model and a method of classification analysis; it is probably one of the most common classification methods.

Sigmoid Function

In logistic regression the dependent variable is binary, so the value the model estimates must be a probability lying between 0 and 1. We therefore need a function with exactly this property, which is where the sigmoid function enters the picture.
The sigmoid function is defined as:

    sigmoid(x) = 1 / (1 + e^(-x))
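A quick numerical check of this definition (a small sketch of my own, using NumPy):

```python
import numpy as np

def sigmoid(inX):
    # sigmoid(x) = 1 / (1 + e^(-x))
    return 1.0 / (1 + np.exp(-inX))

print(sigmoid(0))     # exactly 0.5
print(sigmoid(10))    # very close to 1
print(sigmoid(-10))   # very close to 0
```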
When x = 0 the sigmoid function equals exactly 0.5; as x increases its value approaches 1, and as x decreases its value approaches 0.

Principle
To build a logistic regression classifier, we multiply each feature by a regression coefficient, sum all the results, and feed that sum into the sigmoid function, obtaining a value between 0 and 1. Any input whose output is greater than 0.5 is assigned to class 1, and any input whose output is less than 0.5 is assigned to class 0.

Gradient Ascent Method
To determine the optimal regression coefficients we need an optimization algorithm, and the most commonly used one is gradient ascent. The idea: to find the maximum of a function, the best way is to move along the function's gradient. Writing the gradient as Grad (or ∇), the gradient of a function f(x, y) is:
    ∇f(x, y) = (∂f/∂x, ∂f/∂y)
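As a toy illustration (my own example, not from the article), gradient ascent quickly climbs to the maximum of the concave function f(x, y) = -(x - 1)^2 - (y + 2)^2, which sits at (1, -2):

```python
import numpy as np

def grad(x, y):
    # gradient of f(x, y) = -(x - 1)**2 - (y + 2)**2
    return np.array([-2 * (x - 1), -2 * (y + 2)])

w = np.array([0.0, 0.0])   # starting point
alpha = 0.1                # step size
for _ in range(100):
    w = w + alpha * grad(w[0], w[1])   # move along the gradient direction

print(w)   # converges to the maximum at (1, -2)
```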
The gradient operator always points in the direction in which the function value grows fastest. The distance moved along the gradient at each step is called the step size, written α. In vector form, the iterative update of the gradient ascent algorithm is:

    w := w + α ∇f(w)

The formula is iterated until a stopping condition is reached, such as a specified number of iterations or the error falling within an allowed tolerance.

Python Code Implementation

Sigmoid Function Implementation
# -*- coding: utf-8 -*-
__author__ = 'kestiny'

import numpy as np
import matplotlib.pyplot as plt
import random

def sigmoid(inX):
    return 1.0 / (1 + np.exp(-inX))
Finding the Optimal Regression Coefficients by Gradient Ascent Iteration
def gradAscent(dataMat, labelMat):
    m, n = np.shape(dataMat)
    alpha = 0.001              # step size of the gradient ascent
    maxCycles = 500            # number of iterations, i.e. the termination condition
    weights = np.ones((n, 1))  # initialize every regression coefficient to 1
    for k in range(maxCycles):
        h = sigmoid(np.dot(dataMat, weights))   # matrix multiplication
        error = labelMat - h
        # adjust the regression coefficients in the direction of the error
        weights = weights + alpha * np.dot(dataMat.transpose(), error)
    return weights
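As a quick sanity check (my own synthetic example, not the article's testSet.txt), the routine above should recover a separating boundary on linearly separable data:

```python
import numpy as np

def sigmoid(inX):
    return 1.0 / (1 + np.exp(-inX))

def gradAscent(dataMat, labelMat):
    m, n = np.shape(dataMat)
    alpha = 0.001
    maxCycles = 500
    weights = np.ones((n, 1))
    for k in range(maxCycles):
        h = sigmoid(np.dot(dataMat, weights))
        error = labelMat - h
        weights = weights + alpha * np.dot(dataMat.transpose(), error)
    return weights

# Synthetic, linearly separable data: class 1 clustered around x1 = +2,
# class 0 clustered around x1 = -2, with x2 as pure noise.
rng = np.random.default_rng(0)
m = 100
x1 = np.concatenate([rng.normal(2, 0.5, m // 2), rng.normal(-2, 0.5, m // 2)])
x2 = rng.normal(0, 1, m)
labels = np.concatenate([np.ones(m // 2), np.zeros(m // 2)]).reshape(-1, 1)
dataMat = np.column_stack([np.ones(m), x1, x2])   # prepend the constant feature x0 = 1

weights = gradAscent(dataMat, labels)
preds = (sigmoid(np.dot(dataMat, weights)) > 0.5).astype(int)
accuracy = np.mean(preds == labels)
print('accuracy:', accuracy)
```

With two well-separated clusters the accuracy should come out close to 1; the exact value depends on the random draw.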
Test
def plotBestFit(weights, dataMat, labelMat):
    n = dataMat.shape[0]
    xcord1 = []; ycord1 = []   # coordinates of the class-1 samples
    xcord2 = []; ycord2 = []   # coordinates of the class-0 samples
    for i in range(n):
        if int(labelMat[i]) == 1:
            xcord1.append(dataMat[i, 1])
            ycord1.append(dataMat[i, 2])
        else:
            xcord2.append(dataMat[i, 1])
            ycord2.append(dataMat[i, 2])
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(xcord1, ycord1, s=30, c='red', marker='s')
    ax.scatter(xcord2, ycord2, s=30, c='green')
    x = np.arange(-5.0, 5.0, 0.1)
    # the decision boundary, i.e. the line where w0 + w1*x1 + w2*x2 = 0
    y = (-weights[0] - weights[1] * x) / weights[2]
    plt.plot(x, y)
    plt.xlabel('X1')
    plt.ylabel('X2')
    plt.show()

if __name__ == '__main__':
    file = 'testSet.txt'
    data = np.loadtxt(file, dtype=float, delimiter='\t', encoding='utf-8')
    dataMat, labelMat = np.split(data, (2,), axis=1)
    columns = dataMat.shape[0]
    # prepend the constant feature x0 = 1 so that weights[0] acts as the intercept
    dataMat = np.insert(dataMat, 0, values=np.ones((1, columns)), axis=1)
    weights = gradAscent(dataMat, labelMat)
    print('coefficients:', weights)
    plotBestFit(weights, dataMat, labelMat)
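One detail in the test code worth spelling out: np.split separates the two feature columns from the label column, and np.insert prepends a constant column x0 = 1 so that weights[0] plays the role of the intercept. A minimal sketch on a tiny hand-made array:

```python
import numpy as np

data = np.array([[1.0, 2.0, 1.0],
                 [3.0, 4.0, 0.0]])   # two samples: x1, x2, label
dataMat, labelMat = np.split(data, (2,), axis=1)   # split features from labels
columns = dataMat.shape[0]                          # number of samples
dataMat = np.insert(dataMat, 0, values=np.ones((1, columns)), axis=1)
print(dataMat)
# [[1. 1. 2.]
#  [1. 3. 4.]]
```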
Effect
Stochastic Gradient Ascent
Stochastic gradient ascent solves the awkward problem that batch gradient ascent has with large amounts of data. Since gradient ascent must traverse the entire dataset every time the regression coefficients are updated, its computational cost becomes too high once there are hundreds of millions of samples and thousands of features. Instead, we can update the regression coefficients using only one sample at a time, or a small batch of n samples (say 10 or 100); this is called stochastic gradient ascent. Because the classifier can be updated incrementally as new samples arrive, stochastic gradient ascent is an online learning algorithm.

Python Code Implementation of Stochastic Gradient Ascent
def stocGradAscent(dataMat, labelMat, times=100):
    m, n = np.shape(dataMat)
    weights = np.ones(n)
    for j in range(times):
        dataIndex = range(m)
        for i in range(m):
            # shrink the step size as the iterations proceed, with a floor of 0.01
            alpha = 4 / (1.0 + j + i) + 0.01
            # pick a sample at random to reduce periodic fluctuations
            randIndex = int(random.uniform(0, len(dataIndex)))
            h = sigmoid(np.sum(np.dot(dataMat[randIndex], weights)))
            error = labelMat[randIndex] - h
            weights = weights + alpha * error * dataMat[randIndex]
            print('times: %d h=%s error=%s weights=%s' % (m * j + i, h, error, weights))
    return weights
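A note on the step-size schedule used above: alpha = 4 / (1.0 + j + i) + 0.01 decays as the iterations proceed but never drops below 0.01, so late samples still influence the coefficients. A quick look at its values (my own illustration):

```python
def alpha_schedule(j, i):
    # step size used in the stochastic version: decays with iteration, floor of 0.01
    return 4 / (1.0 + j + i) + 0.01

print(alpha_schedule(0, 0))     # 4.01 at the very first update
print(alpha_schedule(0, 99))    # 0.05 after one pass over 100 samples
print(alpha_schedule(99, 99))   # about 0.03 near the end, never below 0.01
```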
Test
if __name__ == '__main__':
    file = 'testSet.txt'
    data = np.loadtxt(file, dtype=float, delimiter='\t', encoding='utf-8')
    dataMat, labelMat = np.split(data, (2,), axis=1)
    columns = dataMat.shape[0]
    # prepend the constant feature x0 = 1 so that weights[0] acts as the intercept
    dataMat = np.insert(dataMat, 0, values=np.ones((1, columns)), axis=1)
    weights = stocGradAscent(dataMat, labelMat)
    print('coefficients:', weights)
    plotBestFit(weights, dataMat, labelMat)
Effect
We can see that, apart from the iteration scheme, stochastic gradient ascent gives results basically consistent with batch gradient ascent. Two details are worth noting: first, alpha is adjusted at every iteration, which damps high-frequency fluctuations in the data; second, samples are selected at random, which reduces periodic fluctuations in the coefficients.

Summary
That is all for logistic regression. Studying machine learning over the past few days, I have found it really is quite interesting; my biggest takeaway is that the calculus and probability theory learned back in school really are useful after all...