Resources
<PYTHON_MACHINE_LEARNING> Chapter 3
A Tour of Machine Learning
Classifiers Using Scikit-learn
Introduction
When we classify, the feature values in a sample are generally distributed over the real numbers, but what we often want is a probability-like value in [0, 1]. Also, to keep large differences between features from causing interference (for example, when one feature value is particularly large while the others are very small), we need to normalize the data. That is, before machine learning we need a mapping from R to [0, 1] to process the feature matrix. When the mapping used is the sigmoid function, we call the resulting machine learning algorithm logistic regression.
PS: Logistic regression is used for classification!!! Not for linear regression! The inverse of the sigmoid function is called the logit function, which is the origin of the name "logistic regression"; it has nothing to do with logic ...
sigmoid function
This function is an S-shaped curve with domain R and range (0, 1).
At the same time, φ(z) represents the probability that y = 1; the probability that y = 0 is 1 - φ(z).
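The formula itself was an image in the original post and did not survive extraction; the standard definition is:

```latex
\phi(z) = \frac{1}{1 + e^{-z}}, \qquad z = w^{T}x
```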
Let's plot it to illustrate:
```python
#!/usr/bin/python
# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


z = np.arange(-10, 10, 0.1)
p = sigmoid(z)
plt.plot(z, p)
# draw a vertical line; if x is not set, it defaults to 0
plt.axvline(x=0, color='k')
plt.axhspan(0.0, 1.0, facecolor='0.7', alpha=0.4)
# draw horizontal lines; if y is not set, it defaults to 0
plt.axhline(y=1, ls='dotted', color='0.4')
plt.axhline(y=0, ls='dotted', color='0.4')
plt.axhline(y=0.5, ls='dotted', color='k')
plt.ylim(-0.1, 1.1)
# set the y-axis ticks
plt.yticks([0.0, 0.5, 1.0])
plt.ylabel(r'$\phi (z)$')
plt.xlabel('z')
ax = plt.gca()
ax.grid(True)
plt.show()
```
The logistic regression algorithm
- Basic principle
The logistic regression algorithm is similar to the Adaline linear adaptive algorithm, except that the activation function changes from the **identity mapping y = z** to **y = sigmoid(z)**.
- The loss function in logistic regression
Recall the loss function used in the gradient-descent Adaline model: the sum of squared errors.
This is the loss function of linear regression.
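The equation referenced here (an image in the original) is the standard sum-of-squared-errors loss:

```latex
J(w) = \frac{1}{2} \sum_{i} \left( y^{(i)} - \phi\left(z^{(i)}\right) \right)^{2}
```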
But with the S-shaped sigmoid as the activation function, the gradient of this loss is nearly 0 wherever φ(z) saturates close to 0 or 1, and the loss surface is no longer convex.
For logistic regression we therefore define a different loss function:
the log-likelihood loss function (cross-entropy).
PS: all the logs here are actually ln.
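The cross-entropy loss (the equation image was lost; this is its standard form):

```latex
J(w) = \sum_{i} \left[ -\,y^{(i)} \ln\left(\phi\left(z^{(i)}\right)\right)
       - \left(1 - y^{(i)}\right) \ln\left(1 - \phi\left(z^{(i)}\right)\right) \right]
```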
Where does this loss function come from? From the maximum likelihood method.
The likelihood function is defined first (each sample is considered independent):
The likelihood function can be regarded as a conditional probability.
For the concept of the likelihood function, you can refer to kevinGao's blog:
http://www.cnblogs.com/kevinGaoblog/archive/2012/03/29/2424346.html
By the idea of maximum likelihood, the parameters that maximize the likelihood function are the most reasonable ones. We want to maximize the likelihood function, but this form is still not convenient, after all it is a product, so we take the logarithm.
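The likelihood and its logarithm (reconstructed in their standard form, since the original equations were images):

```latex
L(w) = \prod_{i=1}^{n} P\left(y^{(i)} \mid x^{(i)}; w\right)
     = \prod_{i=1}^{n} \left(\phi\left(z^{(i)}\right)\right)^{y^{(i)}}
       \left(1 - \phi\left(z^{(i)}\right)\right)^{1 - y^{(i)}}

l(w) = \ln L(w) = \sum_{i=1}^{n} \left[ y^{(i)} \ln\left(\phi\left(z^{(i)}\right)\right)
       + \left(1 - y^{(i)}\right) \ln\left(1 - \phi\left(z^{(i)}\right)\right) \right]
```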
Now we know: when the weight vector w makes l largest, w is the most reasonable. So we define the J function:

J = -l
For a better understanding, let's look at the loss function of a single sample.
Taking y = 1 as an example: when the predicted value approaches the correct value, J converges to 0.
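To see this numerically, here is a small sketch (the helper name `single_sample_cost` is mine, not from the book):

```python
import numpy as np

def single_sample_cost(phi_z, y):
    """Cross-entropy cost of one sample: -y*ln(phi) - (1-y)*ln(1-phi)."""
    return -y * np.log(phi_z) - (1 - y) * np.log(1 - phi_z)

# For y = 1, the cost shrinks toward 0 as the predicted probability nears 1,
# while a confident wrong prediction is punished heavily.
print(single_sample_cost(0.9, 1))    # modest cost
print(single_sample_cost(0.999, 1))  # near 0
print(single_sample_cost(0.01, 1))   # large cost
```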
- Weight update
As with the gradient descent method, we take the partial derivative of the loss function with respect to each weight.
After the calculation, we get the weight-update formula, and it turns out to be the same as Adaline's.
Surprised?
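The update referred to here (standard form, consistent with Adaline's rule):

```latex
\Delta w_{j} = -\eta \frac{\partial J}{\partial w_{j}}
             = \eta \sum_{i} \left( y^{(i)} - \phi\left(z^{(i)}\right) \right) x_{j}^{(i)},
\qquad w := w + \Delta w
```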
This means that when writing a separate LogisticRegression class, you only need to inherit from the Adaline class and redefine the activation function φ.
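As a rough sketch of that idea (this is my simplified toy code, not the book's: both classes use {0, 1} labels, batch gradient descent, and a 0.5 decision threshold; the class names are hypothetical):

```python
import numpy as np

class AdalineGD(object):
    """Toy Adaline: identity activation, batch gradient descent."""
    def __init__(self, eta=0.01, n_iter=50):
        self.eta = eta
        self.n_iter = n_iter

    def activation(self, z):
        # Adaline: phi(z) = z (identity mapping)
        return z

    def net_input(self, X):
        # z = w^T x + bias
        return np.dot(X, self.w_[1:]) + self.w_[0]

    def fit(self, X, y):
        self.w_ = np.zeros(1 + X.shape[1])
        for _ in range(self.n_iter):
            # same update rule for both models: eta * (y - phi(z)) * x
            errors = y - self.activation(self.net_input(X))
            self.w_[1:] += self.eta * np.dot(X.T, errors)
            self.w_[0] += self.eta * errors.sum()
        return self

    def predict(self, X):
        return np.where(self.activation(self.net_input(X)) >= 0.5, 1, 0)

class LogisticRegressionGD(AdalineGD):
    """Only the activation changes: phi(z) = sigmoid(z)."""
    def activation(self, z):
        return 1.0 / (1.0 + np.exp(-z))

# tiny 1-D demo: negative points are class 0, positive points are class 1
X = np.array([[-2.0], [-1.5], [-1.0], [1.0], [1.5], [2.0]])
y = np.array([0, 0, 0, 1, 1, 1])
lr = LogisticRegressionGD(eta=0.1, n_iter=100).fit(X, y)
print(lr.predict(X))
```

Everything except `activation` is shared with the parent class, which is exactly the point the text makes.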
Practice
Let's practice logistic regression with sklearn on the Iris data set, just as we did with the Perceptron in the previous chapter.
```python
#!/usr/bin/python
# -*- coding: utf-8 -*-
__author__ = 'Administrator'

from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.cross_validation import train_test_split  # sklearn.model_selection in newer versions
from sklearn.preprocessing import StandardScaler
from pdc import plot_decision_regions
import matplotlib.pyplot as plt
import numpy as np

iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
sc = StandardScaler()
sc.fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)
lr = LogisticRegression(C=1000.0, random_state=0)
lr.fit(X_train_std, y_train)
X_combined_std = np.vstack((X_train_std, X_test_std))
y_combined = np.hstack((y_train, y_test))
plot_decision_regions(X=X_combined_std, y=y_combined,
                      classifier=lr, test_idx=range(105, 150))
plt.xlabel('petal length [standardized]')
plt.ylabel('petal width [standardized]')
plt.legend(loc='upper left')
plt.savefig('iris.png')
plt.show()
print(X_test_std[0, :])
# predict_proba expects a 2-D array, so reshape the single sample
a = lr.predict_proba(X_test_std[0, :].reshape(1, -1))
print(a)
```
Over-fitting, under-fitting and regularization
Over-fitting and under-fitting are two common problems in machine learning.
- Over-fitting
Commonly known as thinking too much. To fit the training set well, the model uses too many parameters and becomes particularly complex, even classifying noise and errors. Such a model fits the training set very well, but is particularly unreliable for predicting new data. We say this model has high variance.
- Under-fitting
Correspondingly, thinking too simply. The model is too simple to be reliable for predicting new data.
We say this model has high bias.
- Regularization
Regularization is a common method to prevent over-fitting. Simply put, regularization introduces an additional penalty to reduce the influence of extreme weights.
The most common form is L2 regularization, which adds a penalty term to the end of the loss function.
λ (lambda) is called the regularization parameter.
The loss function then becomes:
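The L2 penalty and the regularized loss (reconstructed in their standard form; the original equations were images):

```latex
\frac{\lambda}{2} \lVert w \rVert^{2} = \frac{\lambda}{2} \sum_{j} w_{j}^{2}

J(w) = \sum_{i} \left[ -\,y^{(i)} \ln\left(\phi\left(z^{(i)}\right)\right)
       - \left(1 - y^{(i)}\right) \ln\left(1 - \phi\left(z^{(i)}\right)\right) \right]
       + \frac{\lambda}{2} \lVert w \rVert^{2}
```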
```python
lr = LogisticRegression(C=1000.0, random_state=0)
```
The parameter C in the LogisticRegression class comes from related concepts in support vector machines (SVMs), which we will not expand on here.
The final form of the loss function:
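In scikit-learn, C is the inverse of the regularization strength (C = 1/λ), so the loss can be written as:

```latex
J(w) = C \sum_{i} \left[ -\,y^{(i)} \ln\left(\phi\left(z^{(i)}\right)\right)
       - \left(1 - y^{(i)}\right) \ln\left(1 - \phi\left(z^{(i)}\right)\right) \right]
       + \frac{1}{2} \lVert w \rVert^{2}
```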
- Effect of the C value on the fit
Taking 10 different powers of 10, from 10^-5 to 10^4, as values of C, let's look at the effect on the weights:
```python
weights, params = [], []
for c in range(-5, 5):
    lr = LogisticRegression(C=10**c, random_state=0)
    lr.fit(X_train_std, y_train)
    weights.append(lr.coef_[1])
    params.append(10**c)
weights = np.array(weights)
plt.plot(params, weights[:, 0], label='petal length')
plt.plot(params, weights[:, 1], linestyle='--', label='petal width')
plt.ylabel('weight coefficient')
plt.xlabel('C')
plt.legend(loc='upper left')
plt.xscale('log')
plt.show()
```
Ling Yu Live
Links: https://www.jianshu.com/p/9db03938ea72
Source: Jianshu
Copyright belongs to the author. Commercial reprint please contact the author for authorization, non-commercial reprint please specify the source.
Rookie Note python3--machine learning (ii) logistic regression algorithm