Logistic Regression in Machine Learning
In the previous chapter we learned about ordinary linear regression; now let's see what exactly logistic regression is.
In fact, from this point of view, logistic regression isn't really regression at all but a classification method. So why is it still called regression? Because it is built on a regression-style score passed through the logistic (sigmoid) function:
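As a sketch, this is the standard sigmoid (logistic) function, the same one implemented in the sigmoid code further down:

$$ g(z) = \frac{1}{1 + e^{-z}}, \qquad h_\theta(x) = g(\theta^\top x) = \frac{1}{1 + e^{-\theta^\top x}} $$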
If the prediction is greater than 0.5 we classify the sample as positive, otherwise as negative. As with any regression, the essential ingredient is the loss function; for ordinary linear regression the loss function is the squared error:
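A sketch of that standard squared-error cost from the linear regression chapter:

$$ J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)^2 $$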
If we simply plug the sigmoid hypothesis into that same squared-error loss, the logistic regression cost becomes:
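Presumably something like the following, where the non-linear sigmoid inside the square is what destroys convexity:

$$ J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \bigl( g(\theta^\top x^{(i)}) - y^{(i)} \bigr)^2 $$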
This is not a convex function, so it is hard to find the global optimum. This is where the mathematicians come in: they replace the squared error with the log (cross-entropy) cost, which heavily penalizes confident wrong predictions and is convex.
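Written out (a reconstruction that matches the costFunction code below), the cross-entropy cost is:

$$ J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \Bigl[ y^{(i)} \log h_\theta(x^{(i)}) + \bigl(1 - y^{(i)}\bigr) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \Bigr] $$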
Now let's see what logistic regression looks like in practice.
First read in the data:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize

def loaddata(file, delimeter):
    data = np.loadtxt(file, delimiter=delimeter)
    print('Dimensions: ', data.shape)
    print(data[1:6, :])
    return data
We also need a helper function to plot the points:
def plotData(data, label_x, label_y, label_pos, label_neg, axes=None):
    neg = data[:, 2] == 0
    pos = data[:, 2] == 1
    if axes is None:
        axes = plt.gca()
    axes.scatter(data[pos][:, 0], data[pos][:, 1], marker='+', c='k', s=60, linewidth=2, label=label_pos)
    axes.scatter(data[neg][:, 0], data[neg][:, 1], c='y', s=60, label=label_neg)
    axes.set_xlabel(label_x)
    axes.set_ylabel(label_y)
    axes.legend(frameon=True, fancybox=True)
Let's read the data:
data = loaddata('data1.txt', ',')
Let's see what the data looks like.
X = np.c_[np.ones((data.shape[0], 1)), data[:, 0:2]]
y = np.c_[data[:, 2]]
plotData(data, 'Exam 1 score', 'Exam 2 score', 'Pass', 'Fail')
So our job is to find a decision boundary that separates the two classes of points well.
Defining Logistic regression
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
The definition of the loss function is this:
def costFunction(theta, X, y):
    m = y.size
    h = sigmoid(X.dot(theta))
    J = -1.0*(1.0/m)*(np.log(h).T.dot(y) + np.log(1-h).T.dot(1-y))
    if np.isnan(J[0]):
        return np.inf
    return J[0]
Using the chain rule, we can take the derivative of this loss function:
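Differentiating term by term gives the familiar gradient; this matches the gradient code just below:

$$ \frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr) x_j^{(i)}, \qquad \nabla_\theta J(\theta) = \frac{1}{m} X^\top (h - y) $$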
So we have the derivative function:
def gradient(theta, X, y):
    m = y.size
    h = sigmoid(X.dot(theta.reshape(-1, 1)))
    grad = (1.0/m)*X.T.dot(h - y)
    return grad.flatten()
Let's take a look at the error and gradient of the initial value:
initial_theta = np.zeros(X.shape[1])
cost = costFunction(initial_theta, X, y)   # 0.69314718055994529
grad = gradient(initial_theta, X, y)       # [ -0.1, -12.00921659, -11.26284221]
So we're going to minimize this function:
res = minimize(costFunction, initial_theta, args=(X, y), jac=gradient, options={'maxiter': 400})
# Here we use scipy's optimization library; if you are not familiar with it, the official docs are quite simple.
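As a quick sanity check, the OptimizeResult that scipy returns can be inspected directly (a minimal sketch; the exact numbers depend on the data):

print(res.success)   # did the optimizer report convergence?
print(res.fun)       # final value of the cost function
print(res.x)         # the fitted parameters theta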
Our prediction function looks like this:
def predict(theta, X, threshold=0.5):
    p = sigmoid(X.dot(theta.T)) >= threshold
    return p.astype('int')
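For example, a minimal sketch of checking the training-set accuracy with this function (the exact percentage depends on the data and the optimizer):

p = predict(res.x, X)
print('Train accuracy: {:.2f}%'.format(100.0 * np.mean(p == y.ravel())))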
Now let's draw the decision boundary on top of the data set:
plt.scatter(45, 85, s=60, c='r', marker='v', label='(45, 85)')
plotData(data, 'Exam 1 score', 'Exam 2 score', 'Admitted', 'Not admitted')
x1_min, x1_max = X[:, 1].min(), X[:, 1].max()
x2_min, x2_max = X[:, 2].min(), X[:, 2].max()
xx1, xx2 = np.meshgrid(np.linspace(x1_min, x1_max), np.linspace(x2_min, x2_max))
h = sigmoid(np.c_[np.ones((xx1.ravel().shape[0], 1)), xx1.ravel(), xx2.ravel()].dot(res.x))
h = h.reshape(xx1.shape)
plt.contour(xx1, xx2, h, [0.5], linewidths=1, colors='b')
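Note that the contour is drawn at h = 0.5, which is exactly where θᵀx = 0; since that is a linear equation in the two exam scores, the resulting decision boundary is a straight line.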
Logistic regression with regularization
So far we have used plain logistic regression without any regularization term; now let's look at how regularization is added.
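In formula form, regularization simply adds a penalty on the parameters, leaving the bias term θ₀ out of the penalty, exactly as the costFunctionReg code below does:

$$ J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \Bigl[ y^{(i)} \log h_\theta(x^{(i)}) + \bigl(1 - y^{(i)}\bigr) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \Bigr] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2 $$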
data2 = loaddata('data2.txt', ',')
X = data2[:, 0:2]
y = np.c_[data2[:, 2]]
plotData(data2, 'Microchip Test 1', 'Microchip Test 2', 'y = 1', 'y = 0')
plt.show()
We need to find a boundary that classifies these points well. As you can see, the two classes are not cleanly separable by a straight line, so we build higher-order polynomial features of the two inputs to fit the data set.
from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(6)
XX = poly.fit_transform(data2[:, 0:2])
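As a small sanity check: PolynomialFeatures(6) maps the two original features to every monomial of degree at most 6, including the bias column, which gives 28 columns in total:

print(XX.shape)   # (number_of_samples, 28)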
def costFunctionReg(theta, reg, *args):
    m = y.size
    h = sigmoid(XX.dot(theta))
    J = -1.0*(1.0/m)*(np.log(h).T.dot(y) + np.log(1-h).T.dot(1-y)) + (reg/(2.0*m))*np.sum(np.square(theta[1:]))
    if np.isnan(J[0]):
        return np.inf
    return J[0]
def gradientReg(theta, reg, *args):
    m = y.size
    h = sigmoid(XX.dot(theta.reshape(-1, 1)))
    grad = (1.0/m)*XX.T.dot(h - y) + (reg/m)*np.r_[[[0]], theta[1:].reshape(-1, 1)]
    return grad.flatten()
Let's look at the effect of the different regularization coefficients on the results:
initial_theta = np.zeros(XX.shape[1])
costFunctionReg(initial_theta, 1, XX, y)

fig, axes = plt.subplots(1, 3, sharey=True, figsize=(17, 5))

# Decision boundaries: let's see what happens when the regularization
# coefficient lambda is too small or too large.
# lambda = 0:   no regularization, so the model overfits
# lambda = 1:   about right
# lambda = 100: regularization is too aggressive, so the decision boundary underfits
for i, C in enumerate([0.0, 1.0, 100.0]):
    # Optimize costFunctionReg
    res2 = minimize(costFunctionReg, initial_theta, args=(C, XX, y), jac=gradientReg, options={'maxiter': 3000})

    # Accuracy
    accuracy = 100.0*sum(predict(res2.x, XX) == y.ravel())/y.size

    # Scatter plot of X, y
    plotData(data2, 'Microchip Test 1', 'Microchip Test 2', 'y = 1', 'y = 0', axes.flatten()[i])

    # Draw the decision boundary
    x1_min, x1_max = X[:, 0].min(), X[:, 0].max()
    x2_min, x2_max = X[:, 1].min(), X[:, 1].max()
    xx1, xx2 = np.meshgrid(np.linspace(x1_min, x1_max), np.linspace(x2_min, x2_max))
    h = sigmoid(poly.fit_transform(np.c_[xx1.ravel(), xx2.ravel()]).dot(res2.x))
    h = h.reshape(xx1.shape)
    axes.flatten()[i].contour(xx1, xx2, h, [0.5], linewidths=1, colors='g')
    axes.flatten()[i].set_title('Train accuracy {}% with lambda = {}'.format(np.round(accuracy, decimals=2), C))
That wraps up logistic regression. We'll continue with the next topic in the following chapter; if you spot any mistakes, please point them out!