How to Implement Common Machine Learning Algorithms in Python (Part 1)


I recently studied Python implementations of common machine learning algorithms on GitHub.

Contents

    • I. Linear Regression

1. Cost function
2. Gradient descent algorithm
3. Mean normalization
4. Final running results
5. Implementation with the linear model in the scikit-learn library

    • II. Logistic Regression

1. Cost function
2. Gradient
3. Regularization
4. Sigmoid function
5. Mapping to polynomial features
6. Using an optimization method
7. Running results
8. Implementation with the logistic regression model in the scikit-learn library

    • Logistic Regression: Handwritten Digit Recognition (One-vs-All)

1. Randomly display 100 digits
2. One-vs-All
3. Handwritten digit recognition
4. Prediction
5. Running results
6. Implementation with the logistic regression model in the scikit-learn library

    • III. BP Neural Network

1. Neural network model
2. Cost function
3. Regularization
4. Backpropagation
5. Why backpropagation can compute the gradient
6. Gradient checking
7. Random initialization of the weights
8. Prediction
9. Output results

    • IV. SVM (Support Vector Machine)

1. Cost function
2. Large margin
3. SVM kernels
4. Using the model code
5. Running results

    • V. K-Means Clustering

1. Clustering process
2. Objective function
3. Choosing the initial cluster centers
4. Choosing the number of clusters K
5. Application: image compression
6. Clustering with the scikit-learn library
7. Running results

    • VI. PCA, Principal Component Analysis (Dimensionality Reduction)

1. Uses
2. 2D --> 1D, nD --> kD
3. The difference between PCA and linear regression
4. The PCA dimensionality-reduction process
5. Data recovery
6. Choosing the number of principal components (i.e., the target dimension)
7. Usage suggestions
8. Running results
9. Dimensionality reduction with PCA in the scikit-learn library

    • VII. Anomaly Detection

1. Gaussian distribution (normal distribution)
2. Anomaly detection algorithm
3. Evaluating performance and selecting the threshold
4. Choosing the features (univariate Gaussian distribution)
5. Multivariate Gaussian distribution
6. Univariate vs. multivariate Gaussian distribution
7. Program running results

I. Linear Regression

1. Cost function

The cost function is

J(θ) = (1/(2m)) · Σ_{i=1..m} (h_θ(x^(i)) − y^(i))²

where m is the number of training examples and h_θ(x) is the equation we fit. The goal is to find the θ that minimizes the cost, i.e., that makes the fitted equation as close as possible to the real values over all m data points. The difference is squared because it can be negative, and the factor 1/2 is chosen so that it cancels against the 2 produced by differentiation in the gradient below.

Code implementation:

import numpy as np

# Compute the cost function J(theta)
def computeCost(X, y, theta):
    m = len(y)                                  # number of training examples
    h = np.dot(X, theta)                        # predictions of the fitted equation
    J = np.sum(np.square(h - y)) / (2 * m)      # squared-error cost
    return J

Note that the X here is the real data with a column of ones prepended, because of the intercept term theta(0).
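For example, the column of ones can be prepended like this (a minimal sketch; the feature values are hypothetical):

import numpy as np

data = np.array([[2104.0, 3.0],
                 [1600.0, 3.0]])            # hypothetical raw feature matrix
m = data.shape[0]
X = np.hstack((np.ones((m, 1)), data))      # prepend the column of ones for theta(0)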

2. Gradient Descent algorithm

Taking the partial derivative of the cost function with respect to θ_j gives:

∂J(θ)/∂θ_j = (1/m) Σ_{i=1..m} (h_θ(x^(i)) − y^(i)) · x_j^(i)

So the update to θ can be written as:

θ_j := θ_j − α · (1/m) Σ_{i=1..m} (h_θ(x^(i)) − y^(i)) · x_j^(i)

Here α is the learning rate, which controls how fast gradient descent moves; typical values are 0.01, 0.03, 0.1, 0.3, and so on.

Why does gradient descent reduce the cost function progressively?

Consider the first-order Taylor expansion of f(x):

f(x + Δx) = f(x) + f'(x) · Δx + o(Δx)

Let Δx = −α · f'(x), i.e., a step in the negative gradient direction scaled by a very small step size α, and substitute it into the expansion:

f(x + Δx) = f(x) − α · [f'(x)]² + o(Δx)

Since α is a very small positive number and [f'(x)]² is non-negative, it follows that f(x + Δx) ≤ f(x): moving along the negative gradient decreases the function. The same argument carries over to the multi-dimensional case with f'(x) replaced by the gradient ∇f(x).
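A quick numerical check of this argument (a hypothetical one-dimensional example with f(x) = x², not from the original):

# f(x) = x^2 has its minimum at x = 0
def f(x):
    return x ** 2

def df(x):
    return 2 * x              # derivative of f

x, alpha = 3.0, 0.1           # starting point and a small positive step size
for _ in range(5):
    x_new = x - alpha * df(x)     # step in the negative gradient direction
    assert f(x_new) <= f(x)       # the function value never increases
    x = x_new
print(x, f(x))                    # x has moved toward the minimum at 0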

# Gradient descent algorithm
def gradientDescent(X, y, theta, alpha, num_iters):
    m = len(y)
    n = len(theta)
    temp = np.matrix(np.zeros((n, num_iters)))   # store the theta of every iteration as columns
    J_history = np.zeros((num_iters, 1))         # record the cost of every iteration

    for i in range(num_iters):                   # loop over the iterations
        h = np.dot(X, theta)                     # predictions (a plain matrix product)
        temp[:, i] = theta - (alpha / m) * np.dot(np.transpose(X), h - y)   # gradient step
        theta = temp[:, i]
        J_history[i] = computeCost(X, y, theta)  # call the cost function defined above
        print('.', end='')
    return theta, J_history

3. Mean normalization

The purpose of mean normalization is to scale the data into a comparable range, making it easier for the gradient descent algorithm to converge.

Each feature is normalized as

x_j := (x_j − μ_j) / s_j

where μ_j is the average of all the data for feature j, and s_j can be either the feature's maximum minus its minimum or the standard deviation of the data for that feature.

Code implementation:

# Feature normalization (mean normalization)
def featureNormaliza(X):
    X_norm = np.array(X, dtype=np.float64)   # convert X to a numpy array for matrix operations
    mu = np.mean(X_norm, 0)                  # mean of each column (0 = by column, 1 = by row)
    sigma = np.std(X_norm, 0)                # standard deviation of each column
    for i in range(X.shape[1]):              # loop over the columns (features)
        X_norm[:, i] = (X_norm[:, i] - mu[i]) / sigma[i]   # normalize
    return X_norm, mu, sigma

Note that the same mean normalization must also be applied to the data when predicting.
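For example, a new sample has to be scaled with the μ and σ computed from the training data before it is fed into the model (a minimal sketch using the functions above; the sample values and the learned theta are assumptions):

X_norm, mu, sigma = featureNormaliza(X)     # X here is the raw feature matrix
x_new = np.array([2104.0, 3.0])             # hypothetical new sample
x_new = (x_new - mu) / sigma                # scale with the training-set mu and sigma
x_new = np.hstack((np.ones(1), x_new))      # prepend the 1 for theta(0)
prediction = np.dot(x_new, theta)           # predict with the learned theta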

4. Final running results

[Figure: the cost J decreasing with the number of iterations]
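The convergence curve can be reproduced from J_history, for example like this (a minimal sketch, assuming matplotlib is installed and that X with the column of ones and y as a column vector have been prepared as above):

import matplotlib.pyplot as plt
import numpy as np

theta0 = np.zeros((X.shape[1], 1))                          # initial theta, all zeros
theta, J_history = gradientDescent(X, y, theta0, 0.01, 400)
plt.plot(J_history)                                         # cost recorded after every iteration
plt.xlabel("number of iterations")
plt.ylabel("cost J")
plt.show()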

5. Implementation with the linear model in the scikit-learn library

# -*- coding: utf-8 -*-
import numpy as np
from sklearn import linear_model
from sklearn.preprocessing import StandardScaler   # import the normalization class

def linearRegression():
    print(u"loading data...\n")
    data = loadtxtAndcsv_data("data.txt", ",", np.float64)  # read the data
    X = np.array(data[:, 0:-1], dtype=np.float64)           # X: every column except the last
    y = np.array(data[:, -1], dtype=np.float64)             # y: the last column

    # normalization
    scaler = StandardScaler()
    scaler.fit(X)
    x_train = scaler.transform(X)
    x_test = scaler.transform(np.array([[1650, 3]]))        # scale the test point the same way

    # fit the linear model
    model = linear_model.LinearRegression()
    model.fit(x_train, y)

    # predict
    result = model.predict(x_test)
    print(model.coef_)        # coefficients of the features in the decision function
    print(model.intercept_)   # intercept (also called bias); 0 if fit_intercept=False
    print(result)             # predicted result

# load txt and csv files
def loadtxtAndcsv_data(fileName, split, dataType):
    return np.loadtxt(fileName, delimiter=split, dtype=dataType)

# load npy files
def loadnpy_data(fileName):
    return np.load(fileName)

if __name__ == "__main__":
    linearRegression()
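Note that StandardScaler performs the same mean normalization described above: it subtracts each feature's mean and divides by its standard deviation, so it plays the role of the hand-written featureNormaliza function.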
