Logistic Regression in Machine Learning


In the previous chapter we looked at ordinary linear regression; now let's see what logistic regression actually is.

Strictly speaking, logistic regression is not regression at all; it is a classification method. So why does "regression" appear in the name? Because the model is still built on a regression-style function: the sigmoid applied to a linear combination of the inputs,

$$h_\theta(x) = \sigma(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$$

If the predicted value is greater than 0.5 we classify the example as positive, otherwise as negative.
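Putting the hypothesis and the 0.5 threshold together, here is a minimal self-contained sketch (the parameter values below are made up purely for illustration; the real fit comes later in this article):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

theta = np.array([-3.0, 0.5, 0.5])   # hypothetical, untrained parameters
x = np.array([1.0, 4.0, 3.0])        # one sample: intercept term plus two features
probability = sigmoid(x.dot(theta))  # a value strictly between 0 and 1
label = int(probability >= 0.5)      # 1 if probability >= 0.5, otherwise 0
print(probability, label)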

As with any regression, the essential object is the loss function. Ordinary linear regression minimizes the squared error

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2,$$

and if we simply plug the sigmoid hypothesis into that same squared error, the loss function of logistic regression becomes

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( \sigma(\theta^T x^{(i)}) - y^{(i)} \right)^2$$

This is not a convex function, so it is hard to find a global optimum. This is where the mathematicians come in: they worked out a loss that is convex for this model, the cross-entropy (log) loss:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right) \log\!\left(1 - h_\theta(x^{(i)})\right) \right]$$

Per training example this cost behaves the way a classification loss should: it is small when the prediction agrees with the label and grows without bound when the model is confidently wrong.
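To see where the combined expression comes from (a standard way to write it, given here for reference), the per-example cost is defined piecewise and the two branches fold into a single line:

$$\mathrm{cost}\left(h_\theta(x), y\right) =
\begin{cases}
-\log\left(h_\theta(x)\right) & \text{if } y = 1 \\
-\log\left(1 - h_\theta(x)\right) & \text{if } y = 0
\end{cases}
\;=\; -\,y \log\left(h_\theta(x)\right) - (1 - y)\log\left(1 - h_\theta(x)\right)$$

Averaging this over the m training examples gives exactly the J(θ) above and the costFunction implemented below.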

Now let's see how logistic regression works in practice.

First read in the data:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize

def loaddata(file, delimeter):
    data = np.loadtxt(file, delimiter=delimeter)
    print('Dimensions: ', data.shape)
    print(data[1:6, :])
    return data

We also need a helper to plot the points:

def plotData(data, label_x, label_y, label_pos, label_neg, axes=None):
    neg = data[:, 2] == 0
    pos = data[:, 2] == 1
    if axes is None:
        axes = plt.gca()
    axes.scatter(data[pos][:, 0], data[pos][:, 1], marker='+', c='k', s=60, linewidth=2, label=label_pos)
    axes.scatter(data[neg][:, 0], data[neg][:, 1], c='y', s=60, label=label_neg)
    axes.set_xlabel(label_x)
    axes.set_ylabel(label_y)
    axes.legend(frameon=True, fancybox=True)

Let's read the data:

data = loaddata('data1.txt', ',')

Let's see what the data looks like.

X = np.c_[np.ones((data.shape[0], 1)), data[:, 0:2]]
y = np.c_[data[:, 2]]
plotData(data, 'Exam 1 score', 'Exam 2 score', 'Pass', 'Fail')
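A quick note on shapes before moving on: np.c_ prepends a column of ones to the two exam scores, so the intercept is treated as just another parameter.

# X now has three columns: [1, exam 1 score, exam 2 score]; y is a column vector of labels
print(X.shape, y.shape)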

Our job is to find a dividing line (a decision boundary) that cleanly separates the two classes of points.

Defining logistic regression

def sigmoid(z):
    return 1 / (1 + np.exp(-z))
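A quick sanity check of the sigmoid (outputs shown as comments, rounded):

print(sigmoid(0))                        # 0.5
print(sigmoid(np.array([-10.0, 10.0])))  # approximately [0.0000454, 0.9999546]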

The definition of the loss function is this:

def costFunction(theta, X, y):
    m = y.size
    h = sigmoid(X.dot(theta))
    J = -1.0 * (1.0 / m) * (np.log(h).T.dot(y) + np.log(1 - h).T.dot(1 - y))
    if np.isnan(J[0]):
        return np.inf
    return J[0]

Let's use the chain rule to take the derivative of this loss function.
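Carrying out the differentiation (a standard derivation, using the fact that the sigmoid satisfies $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$):

$$\frac{\partial J(\theta)}{\partial \theta_j}
= -\frac{1}{m} \sum_{i=1}^{m} \left( \frac{y^{(i)}}{h_\theta(x^{(i)})} - \frac{1 - y^{(i)}}{1 - h_\theta(x^{(i)})} \right) h_\theta(x^{(i)}) \left(1 - h_\theta(x^{(i)})\right) x_j^{(i)}
= \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$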

So we have the gradient function:

def gradient(theta, X, y):
    m = y.size
    h = sigmoid(X.dot(theta.reshape(-1, 1)))
    grad = (1.0 / m) * X.T.dot(h - y)
    return grad.flatten()

Let's look at the cost and gradient at the initial value (an all-zero theta):

initial_theta = np.zeros(X.shape[1])
cost = costFunction(initial_theta, X, y)   # 0.69314718055994529
grad = gradient(initial_theta, X, y)       # [ -0.1, -12.00921659, -11.26284221]

So we're going to minimize this function:

res = minimize(costFunction, initial_theta, args=(X, y), jac=gradient, options={'maxiter': 400})
# This uses scipy's optimizer; if you are not familiar with it, have a look at the official documentation, it is quite simple.
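It can be useful to inspect what the optimizer returned; a small sketch using the standard attributes of scipy's OptimizeResult object:

print(res.success)   # did the optimizer report convergence?
print(res.fun)       # final value of the cost function
print(res.x)         # the fitted parameters theta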

This is what our prediction function looks like:

def predict(theta, X, threshold=0.5):
    p = sigmoid(X.dot(theta.T)) >= threshold
    return p.astype('int')
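As a quick check (a sketch; the article itself only computes accuracy in the regularized section below), we can measure how well the fitted parameters classify the training set:

# Training-set accuracy of the fitted model res.x
p = predict(res.x, X)
print('Train accuracy: {}%'.format(100.0 * np.mean(p == y.ravel())))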

Now we plot the data set together with the decision boundary:

plt.scatter(45, 85, s=60, c='r', marker='v', label='(45, 85)')
plotData(data, 'Exam 1 score', 'Exam 2 score', 'Admitted', 'Not admitted')
x1_min, x1_max = X[:, 1].min(), X[:, 1].max()
x2_min, x2_max = X[:, 2].min(), X[:, 2].max()
xx1, xx2 = np.meshgrid(np.linspace(x1_min, x1_max), np.linspace(x2_min, x2_max))
h = sigmoid(np.c_[np.ones((xx1.ravel().shape[0], 1)), xx1.ravel(), xx2.ravel()].dot(res.x))
h = h.reshape(xx1.shape)
plt.contour(xx1, xx2, h, [0.5], linewidths=1, colors='b');
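The highlighted point at (45, 85) can also be scored directly with the fitted parameters; a small sketch (the exact probability depends on the fit):

# Predicted probability for a student with scores 45 and 85 (intercept term included)
prob = sigmoid(np.array([1, 45, 85]).dot(res.x))
print(prob)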

Logistic regression with regularization

So far we have used plain logistic regression without a regularization term; now let's look at how regularization is added.

data2 = loaddata('data2.txt', ',')
X = data2[:, 0:2]
y = np.c_[data2[:, 2]]
plotData(data2, 'Microchip Test 1', 'Microchip Test 2', 'y = 1', 'y = 0')
plt.show()

Again we need a boundary that classifies these points well. As you can see, the two classes cannot be separated by a straight line, so we generate higher-order polynomial features from the two test scores and fit the model on those.

from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(6)
XX = poly.fit_transform(data2[:, 0:2])
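With two inputs and degree 6, fit_transform expands each sample into 28 features: the bias column plus every monomial of the two test scores up to total degree 6. The regularized cost function and gradient below operate on this expanded matrix XX.

print(XX.shape)   # (number of samples, 28)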

def costFunctionReg(theta, reg, *args):
    m = y.size
    h = sigmoid(XX.dot(theta))
    J = -1.0 * (1.0 / m) * (np.log(h).T.dot(y) + np.log(1 - h).T.dot(1 - y)) + (reg / (2.0 * m)) * np.sum(np.square(theta[1:]))
    if np.isnan(J[0]):
        return np.inf
    return J[0]
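In equation form this is the log loss from before plus an L2 penalty; note that the bias parameter $\theta_0$ is not penalized (the sum starts at $j = 1$, which is why the code uses theta[1:]):

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right) \log\!\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$$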

def gradientReg(theta, reg, *args):
    m = y.size
    h = sigmoid(XX.dot(theta.reshape(-1, 1)))
    grad = (1.0 / m) * XX.T.dot(h - y) + (reg / m) * np.r_[[[0]], theta[1:].reshape(-1, 1)]
    return grad.flatten()
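The corresponding gradient gains a $(\lambda/m)\,\theta_j$ term for every parameter except the bias, which is what the np.r_[[[0]], ...] trick in the code above implements:

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} + \frac{\lambda}{m}\,\theta_j \qquad (j \geq 1;\ \text{no penalty term for } j = 0)$$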

Let's look at the effect of the different regularization coefficients on the results:

initial_theta = np.zeros(XX.shape[1])
costFunctionReg(initial_theta, 1, XX, y)

fig, axes = plt.subplots(1, 3, sharey=True, figsize=(17, 5))

# Decision boundaries: let's see what happens when the regularization coefficient lambda
# is too small or too large.
# lambda = 0  : no regularization, so the model overfits
# lambda = 1  : about right
# lambda = 100: the penalty is too aggressive, so the model barely fits the data at all
for i, C in enumerate([0.0, 1.0, 100.0]):
    # Minimize costFunctionReg
    res2 = minimize(costFunctionReg, initial_theta, args=(C, XX, y), jac=gradientReg, options={'maxiter': 3000})

    # Training accuracy
    accuracy = 100.0 * sum(predict(res2.x, XX) == y.ravel()) / y.size

    # Scatter plot of the data
    plotData(data2, 'Microchip Test 1', 'Microchip Test 2', 'y = 1', 'y = 0', axes.flatten()[i])

    # Draw the decision boundary
    x1_min, x1_max = X[:, 0].min(), X[:, 0].max()
    x2_min, x2_max = X[:, 1].min(), X[:, 1].max()
    xx1, xx2 = np.meshgrid(np.linspace(x1_min, x1_max), np.linspace(x2_min, x2_max))
    h = sigmoid(poly.fit_transform(np.c_[xx1.ravel(), xx2.ravel()]).dot(res2.x))
    h = h.reshape(xx1.shape)
    axes.flatten()[i].contour(xx1, xx2, h, [0.5], linewidths=1, colors='g')
    axes.flatten()[i].set_title('Train accuracy {}% with lambda = {}'.format(np.round(accuracy, decimals=2), C))

That wraps up our study of regression. See you in the next stage of learning, and if anything here is wrong, please point it out!
