Rookie Notes on Python 3 Machine Learning (II): The Logistic Regression Algorithm


Resources

Python Machine Learning, Chapter 3:
A Tour of Machine Learning Classifiers Using Scikit-learn

Introduction

When we classify, the feature values in a sample are generally real numbers, but what we usually want is a probability-like value in [0, 1]. Also, so that large differences between features do not distort the result, for example when one feature value is particularly large while the others are very small, we need to normalize the data. In other words, before learning we process the feature matrix with an injective mapping from R into [0, 1]. When that mapping is the sigmoid function, the resulting machine learning algorithm is called logistic regression.
PS: Logistic regression is used for classification, not for linear regression! The inverse of the sigmoid function is called the logit function, which is the origin of the name "logistic regression"; it has nothing to do with logic...

sigmoid function

This is an S-shaped function whose domain is all of R and whose range lies in (0, 1).
Its output phi(z) can be read as the probability that y = 1; the probability that y = 0 is then 1 - phi(z).
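Written out, the sigmoid and its inverse (the logit mentioned above) are:

\phi(z) = \frac{1}{1 + e^{-z}}, \qquad \mathrm{logit}(p) = \ln\frac{p}{1 - p}, \quad p \in (0, 1)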
Let's plot it:

#!/usr/bin/python
# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.arange(-10, 10, 0.1)
p = sigmoid(z)
plt.plot(z, p)
# draw a vertical line; if x is not set, the default is 0
plt.axvline(x=0, color='k')
plt.axhspan(0.0, 1.0, facecolor='0.7', alpha=0.4)
# draw horizontal lines; if y is not set, the default is 0
plt.axhline(y=1, ls='dotted', color='0.4')
plt.axhline(y=0, ls='dotted', color='0.4')
plt.axhline(y=0.5, ls='dotted', color='k')
plt.ylim(-0.1, 1.1)
# set the y-axis ticks
plt.yticks([0.0, 0.5, 1.0])
plt.ylabel(r'$\phi (z)$')
plt.xlabel('z')
ax = plt.gca()
ax.grid(True)
plt.show()
The logistic regression algorithm
    • Basic principle
      The logistic regression algorithm is similar to the Adaline linear adaptive algorithm; the only difference is that the activation function changes from the identity mapping y = z to y = sigmoid(z).
    • The loss function in logistic regression
      Recall the loss function used in the Adaline gradient-descent model: the sum of squared errors.

      This is the loss function of linear regression.
      But with the S-shaped sigmoid as the activation, the gradient of this loss gets particularly close to 0 as the output approaches 0 or 1, so it is a poor choice here.
      For logistic regression we therefore define a different loss function:
      the log-likelihood loss function (cross-entropy); both losses are written out right after this list.
      PS: all the logs here are actually ln.
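With y^(i) in {0, 1} and phi(z^(i)) the sigmoid output for sample i, the two losses read, in standard notation:

J_{\mathrm{SSE}}(w) = \frac{1}{2}\sum_i \left(y^{(i)} - \phi(z^{(i)})\right)^2

J(w) = -\sum_i \left[\, y^{(i)} \ln \phi(z^{(i)}) + \left(1 - y^{(i)}\right) \ln\left(1 - \phi(z^{(i)})\right) \right]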


How did this loss function come about? From the maximum likelihood method.
First, define the likelihood function (each sample is considered independent):
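In the usual notation, with Bernoulli-distributed labels y in {0, 1}:

L(w) = \prod_i P\left(y^{(i)} \mid x^{(i)}; w\right) = \prod_i \phi(z^{(i)})^{\,y^{(i)}} \left(1 - \phi(z^{(i)})\right)^{1 - y^{(i)}}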

The likelihood function can be regarded as a conditional probability.
For the concept of a likelihood function, you can refer to kevinGao's blog:

http://www.cnblogs.com/kevinGaoblog/archive/2012/03/29/2424346.html

By the principle of maximum likelihood, the parameters that maximize the likelihood function are the most plausible ones. So we want to maximize the likelihood, but this form is still not convenient to work with: it is a product, so we take its logarithm.
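Taking the logarithm turns the product into a sum:

l(w) = \ln L(w) = \sum_i \left[\, y^{(i)} \ln \phi(z^{(i)}) + \left(1 - y^{(i)}\right) \ln\left(1 - \phi(z^{(i)})\right) \right]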


Well, now we know: the weight vector w that makes l(w) largest is the most plausible w.
So we define the loss function J as: J(w) = -l(w).

For a better understanding, let's look at the loss function of a single sample:


Take y = 1 as an example: when the predicted value is close to the correct value, J converges to 0.
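A sketch of the single-sample loss referred to here:

J(\phi(z), y; w) = -\ln \phi(z) \quad \text{if } y = 1, \qquad J(\phi(z), y; w) = -\ln\left(1 - \phi(z)\right) \quad \text{if } y = 0

The y = 0 case mirrors the y = 1 case: the loss vanishes as phi(z) approaches 0 and grows without bound as phi(z) approaches 1.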

    • Weight update
      As with gradient descent, we update the weights by moving in the direction of the negative gradient of J.
      After working out the derivative, we get the formula for the weight update,
      and it is just like Adaline's. Surprised?
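For reference, a sketch of the result of that calculation, with learning rate η:

\frac{\partial J}{\partial w_j} = -\sum_i \left(y^{(i)} - \phi(z^{(i)})\right) x_j^{(i)}

\Delta w_j = -\eta \frac{\partial J}{\partial w_j} = \eta \sum_i \left(y^{(i)} - \phi(z^{(i)})\right) x_j^{(i)}, \qquad w_j := w_j + \Delta w_j

The only difference from Adaline is that phi is now the sigmoid rather than the identity.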


This means that to write a separate LogisticRegression class, we only need to take the Adaline class and redefine the activation function phi.
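A minimal sketch of that idea. It assumes an AdalineGD-style gradient-descent class like the one discussed above; the names used here (AdalineGD, w_, net_input, activation) are illustrative, not taken verbatim from the book:

import numpy as np

class AdalineGD:
    """Gradient-descent Adaline (simplified sketch)."""
    def __init__(self, eta=0.01, n_iter=50):
        self.eta = eta          # learning rate
        self.n_iter = n_iter    # number of passes over the training set

    def net_input(self, X):
        # weighted sum z = w^T x + w_0
        return np.dot(X, self.w_[1:]) + self.w_[0]

    def activation(self, z):
        # Adaline uses the identity: phi(z) = z
        return z

    def fit(self, X, y):
        self.w_ = np.zeros(1 + X.shape[1])
        for _ in range(self.n_iter):
            output = self.activation(self.net_input(X))
            errors = y - output
            # the update rule derived above: w += eta * sum((y - phi(z)) * x)
            self.w_[1:] += self.eta * X.T.dot(errors)
            self.w_[0] += self.eta * errors.sum()
        return self

class LogisticRegressionGD(AdalineGD):
    """Identical training loop; only phi changes to the sigmoid."""
    def activation(self, z):
        return 1.0 / (1.0 + np.exp(-z))

    def predict(self, X):
        # predict class 1 when the estimated probability phi(z) >= 0.5
        return np.where(self.activation(self.net_input(X)) >= 0.5, 1, 0)

For example, LogisticRegressionGD(eta=0.05, n_iter=1000).fit(X_std, y) would train it on standardized features with labels in {0, 1}.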

Practice

Let's practice on the Iris data set, implementing it with sklearn just as we did for the Perceptron:

__author__ = 'Administrator'
#!/usr/bin/python
# -*- coding: utf-8 -*-
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.cross_validation import train_test_split  # sklearn.model_selection in newer versions
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
from pdc import plot_decision_regions  # the author's own helper module providing plot_decision_regions
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
import numpy as np

# load the Iris data and keep only petal length and petal width
iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# standardize the features
sc = StandardScaler()
sc.fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

# train the logistic regression classifier
lr = LogisticRegression(C=1000.0, random_state=0)
lr.fit(X_train_std, y_train)

# plot the decision regions over the combined train/test data
X_combined_std = np.vstack((X_train_std, X_test_std))
y_combined = np.hstack((y_train, y_test))
plot_decision_regions(X=X_combined_std, y=y_combined, classifier=lr, test_idx=range(105, 150))
plt.xlabel('petal length [standardized]')
plt.ylabel('petal width [standardized]')
plt.legend(loc='upper left')
plt.savefig('iris.png')
plt.show()

# class-membership probabilities of the first test sample
print(X_test_std[0, :])
a = lr.predict_proba(X_test_std[0, :].reshape(1, -1))  # predict_proba expects a 2-D array
print(a)
Over-fitting, under-fitting and regularization

Over-fitting and under-fitting are two common problems in machine learning

    • Over-fitting
      Colloquially, "thinking too much." To fit the training set well, the model uses too many parameters and becomes overly complex, even fitting noise and errors as if they were classes. Such a model reproduces the training set very well but is particularly unreliable when predicting new data. We say this model has high variance.
    • Under-fitting
      Correspondingly, "thinking too little." The model is too simple to make reliable predictions.
      We say such a model has high bias.
    • Regularization
      Regularization is a common method to prevent over-fitting. Simply put, it introduces an extra bias (penalty) to reduce the influence of extreme weight values.
      The most common form is L2 regularization, which appends a penalty term to the end of the loss function.

      λ is called the regularization parameter.
      The loss function then becomes:
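In the usual notation, the L2 penalty is λ/2 times the squared norm of the weights, appended to the cross-entropy loss:

J(w) = -\sum_i \left[\, y^{(i)} \ln \phi(z^{(i)}) + \left(1 - y^{(i)}\right) \ln\left(1 - \phi(z^{(i)})\right) \right] + \frac{\lambda}{2} \lVert w \rVert^2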
lr = LogisticRegression(C=1000.0, random_state=0)

The parameter C of the LogisticRegression class comes from a related convention in support vector machines (SVMs), which we won't expand on here; in scikit-learn, C is the inverse of the regularization strength.


The final form of the loss function:
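Since C = 1/λ, dividing the regularized loss through by λ gives the equivalent form (a sketch of the convention scikit-learn follows):

J(w) = C \left( -\sum_i \left[\, y^{(i)} \ln \phi(z^{(i)}) + \left(1 - y^{(i)}\right) \ln\left(1 - \phi(z^{(i)})\right) \right] \right) + \frac{1}{2} \lVert w \rVert^2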

    • Effect of the value of C on the fit
      Take 10 different powers of 10, from 10^-5 through 10^4, as values of C, and look at the effect on the weights:
weights, params = [], []
for c in list(range(-5, 5)):
    lr = LogisticRegression(C=10.0**c, random_state=0)
    lr.fit(X_train_std, y_train)
    weights.append(lr.coef_[1])   # weight vector of the second class
    params.append(10.0**c)
weights = np.array(weights)
plt.plot(params, weights[:, 0], label='petal length')
plt.plot(params, weights[:, 1], linestyle='--', label='petal width')
plt.ylabel('weight coefficient')
plt.xlabel('C')
plt.legend(loc='upper left')
plt.xscale('log')
plt.show()
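As C decreases, that is, as the regularization gets stronger, the weight coefficients shrink toward zero, which is exactly the shrinking effect regularization is designed to produce.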


Ling Yu Live
Links: https://www.jianshu.com/p/9db03938ea72
Source: Jianshu
Copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please indicate the source.
