Read about "Python Machine Learning Cookbook" by Chris Albon: the latest news, videos, and discussion topics on this subject from alibabacloud.com.
cityCluster[label[i]].append(cityName[i])  # collect the cities that belong to each cluster
for i in range(len(cityCluster)):
    print("expenses: %.2f" % expenses[i])   # print the average expense of each cluster
    print(cityCluster[i])
Click run and the results come out. n_clusters is the number of classes: cities with a similar consumption level are gathered into the same class. expenses is the sum of the numeric values of the cluster center, i.e. the average consumption level.
Implementation process:
1. Create the project and import the related sklearn packages:
import numpy as np
from sklearn.cl...
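The excerpt shows only the tail of the clustering script; a self-contained sketch of the same flow, with made-up city names and spending figures purely for illustration:

import numpy as np
from sklearn.cluster import KMeans

# Made-up city expense data (name plus a few spending categories)
cityName = ["Beijing", "Shanghai", "Shenzhen", "Shijiazhuang", "Taiyuan", "Lanzhou"]
X = np.array([[2959, 730, 749], [2460, 495, 697], [2545, 824, 979],
              [1258, 311, 350], [1202, 290, 282], [1104, 262, 230]])

km = KMeans(n_clusters=2)                        # group cities with similar consumption levels
label = km.fit_predict(X)                        # cluster index of every city
expenses = np.sum(km.cluster_centers_, axis=1)   # sum of each cluster center = average consumption level

cityCluster = [[] for _ in range(2)]
for i in range(len(cityName)):
    cityCluster[label[i]].append(cityName[i])    # collect the cities that belong to each cluster
for i in range(len(cityCluster)):
    print("expenses: %.2f" % expenses[i])
    print(cityCluster[i])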
...)] = 1
    else:
        print("The word: %s is not in my vocabulary!" % word)
    return returnVec

def trainNBC(trainSamples, trainCategory):
    numTrainSamp = len(trainSamples)
    numWords = len(trainSamples[0])
    pAbusive = sum(trainCategory) / float(numTrainSamp)
    # y = 1 or 0 feature count
    p0Num = np.ones(numWords)
    p1Num = np.ones(numWords)
    # y = 1 or 0 category count
    p0NumTotal = numWords
    p1NumTotal = numWords
    for i in range(numTrainSamp):
        if trainCategory[i] == 1:
            p0Num += trainSamples[i]
            p0NumTotal += sum(trainSamples[i])
        e...
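The trainNBC() function is cut off above. In the widely used textbook version of this routine (trainNB0 from Machine Learning in Action), the class-1 counters are the ones incremented when the label is 1; a sketch of a complete version along those lines, keeping the excerpt's variable names:

import numpy as np

def trainNBC(trainSamples, trainCategory):
    # Naive Bayes training on bag-of-words vectors; trainCategory holds 0/1 labels.
    numTrainSamp = len(trainSamples)
    numWords = len(trainSamples[0])
    pAbusive = np.sum(trainCategory) / float(numTrainSamp)   # P(y = 1)
    p0Num = np.ones(numWords)      # smoothed word counts for class 0
    p1Num = np.ones(numWords)      # smoothed word counts for class 1
    p0NumTotal = numWords          # smoothed total word count for class 0
    p1NumTotal = numWords          # smoothed total word count for class 1
    for i in range(numTrainSamp):
        if trainCategory[i] == 1:
            p1Num += trainSamples[i]
            p1NumTotal += np.sum(trainSamples[i])
        else:
            p0Num += trainSamples[i]
            p0NumTotal += np.sum(trainSamples[i])
    p0Vect = np.log(p0Num / p0NumTotal)   # log P(word | y = 0)
    p1Vect = np.log(p1Num / p1NumTotal)   # log P(word | y = 1)
    return p0Vect, p1Vect, pAbusive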
attribute in the data set. The general situation is somewhere between the two.
D. High-dimensional mapping
Map the attribute into a high-dimensional space. This is the most precise approach: it retains all of the information completely and adds no extra information. For example, in the CTR prediction models of Google and Baidu, preprocessing handles all of the variables this way, producing up to hundreds of millions of dimensions. The benefit is that the entire information of the original data is retained...
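A minimal sketch of such a high-dimensional mapping via one-hot encoding, assuming scikit-learn >= 0.20 (where OneHotEncoder accepts string categories); the city values are made up for illustration:

import numpy as np
from sklearn.preprocessing import OneHotEncoder

# A hypothetical categorical column, e.g. a city field in a CTR data set
cities = np.array([["beijing"], ["shanghai"], ["beijing"], ["shenzhen"]])

encoder = OneHotEncoder()                    # maps each category to its own dimension
X_high_dim = encoder.fit_transform(cities)   # sparse matrix, one column per category
print(encoder.categories_)
print(X_high_dim.toarray())

With millions of categories the result stays a sparse matrix, which is why this mapping remains practical even at hundreds of millions of dimensions.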
:", X) - Print("Y:", Y) - innumiterations=100000 -alpha=0.0005 toTheta=np.ones (x.shape[1]) +Theta=graientdescent (x,y,theta,alpha,x.shape[0],numiterations) - Print(Theta)Operation Result:...... Too many output data to intercept only the next more than 10 linesIteration 99988/cost:3.930135Iteration 99989/cost:3.930135Iteration 99990/cost:3.930135Iteration 99991/cost:3.930135Iteration 99992/cost:3.930135Iteration 99993/cost:3.930135Iteration 99994/cost:3.930135Iteration 99995/cost:3.930135Iterat
... * (xMat.T * (weights * yMat))
    return testPoint * sigma

def lwlrTest(testArr, xArr, yArr, k=1.0):
    m = shape(testArr)[0]
    yHat = zeros(m)
    for i in range(m):
        yHat[i] = lwlr(testArr[i], xArr, yArr, k)
    return yHat
The lwlr() function is the code for locally weighted linear regression, and the lwlrTest() function makes lwlr() traverse the entire data set. We also need to draw a picture to see how well the results fit.
def plotLine1(testArr, xArr, yArr, k=1.0):
    xMat = mat(xArr)
    yMat = ma...
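The opening of lwlr() is missing from the excerpt; a sketch of the usual locally weighted linear regression implementation whose tail matches the lines that are visible (variable names follow the excerpt):

from numpy import mat, eye, exp, shape, linalg

def lwlr(testPoint, xArr, yArr, k=1.0):
    # Locally weighted linear regression for a single query point.
    xMat = mat(xArr); yMat = mat(yArr).T
    m = shape(xMat)[0]
    weights = mat(eye(m))                    # start from an identity weight matrix
    for j in range(m):
        diffMat = testPoint - xMat[j, :]
        weights[j, j] = exp(diffMat * diffMat.T / (-2.0 * k ** 2))   # Gaussian kernel of width k
    xTx = xMat.T * (weights * xMat)
    if linalg.det(xTx) == 0.0:
        print("This matrix is singular, cannot do inverse")
        return
    sigma = xTx.I * (xMat.T * (weights * yMat))   # locally weighted least-squares solution
    return testPoint * sigma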
Scikit-learn is a very powerful Python machine learning toolkit: http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
S1. Import data
Most data sets come formatted as M n-dimensional vectors, split into a training set and a test set. So knowing how to import vector (matrix) data is the most critical point, and we need NumPy to help. Suppose the d...
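A minimal sketch of importing vector data with NumPy and feeding it to sklearn.svm.SVC; the file name and column layout (feature columns followed by a label column) are assumptions for illustration:

import numpy as np
from sklearn.svm import SVC

# Assumed layout of train.csv: each row is "f1,f2,...,fn,label"
data = np.loadtxt("train.csv", delimiter=",")
X_train, y_train = data[:, :-1], data[:, -1]

clf = SVC(kernel="rbf", C=1.0)   # support vector classifier with the default RBF kernel
clf.fit(X_train, y_train)
print(clf.predict(X_train[:5]))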
...(file)  # open the previously saved file
# file.close()
# or the automatically-closing approach:
with open('pickle_exm.pickle', 'rb') as file:
    a_dic = pickle.load(file)
30. Use set to find distinct elements
char_list = ['A', 'B', 'C', 'C']
print(set(char_list))  # set keeps only distinct items; the output is a non-repeating collection, ordered by hash
sentence = 'Welcome to Shijiazhuang'
print(set(sentence))   # picks out the distinct letters of the sentence, each shown once
# 31. Regular expressions (to be added)
import re  # import the regular expression module
pattern1 = "cat"
pattern2 = 'dog'
string = ...
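The regular-expression item is cut off (and marked "to be added"); a small example of where it appears to be heading, with the test string made up for illustration:

import re

pattern1 = "cat"
pattern2 = "dog"
string = "dog runs to cat"          # hypothetical test string; the excerpt ends before it
print(pattern1 in string)           # plain substring test: True
print(re.search(pattern2, string))  # regex search returns a match object, or None if absent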
lr_mse = mean_squared_error(ss_y.inverse_transform(y_test), ss_y.inverse_transform(lr_y_predict))
print("the mean squared error of the linear model is:", lr_mse)
lr_mae = mean_absolute_error(ss_y.inverse_transform(y_test), ss_y.inverse_transform(lr_y_predict))
print("the mean absolute error of the linear model is:", lr_mae)

# evaluation of the SGD model
sgdr_score = sgdr.score(x_test, y_test)
print("the default evaluation value for SGD is:", sgdr_score)
sgdr_r_squared = r2_score(y_test, sgdr_y_predict)
print("...
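The excerpt uses ss_y, lr_y_predict and sgdr_y_predict without showing how they were produced; a sketch of a setup consistent with those names, assuming scikit-learn's StandardScaler, LinearRegression and SGDRegressor and a stand-in regression data set (the original post's data is not shown):

import numpy as np
from sklearn.datasets import make_regression          # stand-in data set (assumption)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, SGDRegressor

X, y = make_regression(n_samples=500, n_features=13, noise=10.0, random_state=33)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=33)

# Standardize features and targets; ss_y is kept so the target scaling can be inverted later
ss_x, ss_y = StandardScaler(), StandardScaler()
x_train = ss_x.fit_transform(x_train)
x_test = ss_x.transform(x_test)
y_train = ss_y.fit_transform(y_train.reshape(-1, 1)).ravel()
y_test = ss_y.transform(y_test.reshape(-1, 1)).ravel()

lr = LinearRegression().fit(x_train, y_train)
lr_y_predict = lr.predict(x_test)

sgdr = SGDRegressor(max_iter=1000).fit(x_train, y_train)
sgdr_y_predict = sgdr.predict(x_test)

Note that recent scikit-learn versions expect a 2-D array in StandardScaler.inverse_transform, so y_test and the predictions may need reshape(-1, 1) before the inverse transforms shown above.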
regression tree is:", Dtr.score (X_test, y_test)) - Print("the r_squared values for the flat regression tree are:", R2_score (Y_test, dtr_y_predict)) - Print("the mean square error of the regression tree is:", Mean_squared_error (Ss_y.inverse_transform (y_test), - Ss_y.inverse_transform (dtr_y_predict))) A Print("the average absolute error of the regression tree is:", Mean_absolute_error (Ss_y.inverse_transform (y_test), + Ss_y.inverse_transform (dtr_y_predict))) the - " " $ the default evalua
.score(X_train_poly2, y_train))  # 0.9816421639597427
The fitted curve of the degree-2 (quadratic) linear regression model: its fit is better than the degree-1 linear fit.
Next, a degree-4 linear regression model is fitted:
# degree-4 linear regression model fitting
poly4 = PolynomialFeatures(degree=4)    # degree-4 polynomial feature generator
X_train_poly4 = poly4.fit_transform(X_train)
# build the model and predict
regressor_poly4 = LinearRegression()
regressor_poly4.fit(X_train_poly4, y_train)
# draw a graph of 2...
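The degree-2 model referenced at the top of the excerpt (X_train_poly2) is not shown; a minimal sketch of that quadratic fit, using made-up one-dimensional training data purely for illustration:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Toy training data standing in for the original post's data set
X_train = np.array([[6], [8], [10], [14], [18]])
y_train = np.array([[7], [9], [13], [17.5], [18]])

poly2 = PolynomialFeatures(degree=2)          # quadratic feature generator
X_train_poly2 = poly2.fit_transform(X_train)

regressor_poly2 = LinearRegression()
regressor_poly2.fit(X_train_poly2, y_train)
print(regressor_poly2.score(X_train_poly2, y_train))   # training R^2 of the quadratic fit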
Return values: the first return value is the low-dimensional matrix, corresponding to input parameter two; the second return value corresponds to the matrix after the coordinate axes have been moved. In the previous figure, green is the raw data and red is the extracted 2-dimensional feature.
3. Code download: please click on my ...
This article is from the blog "Bo Li Garvin". For reprints please cite the source: http://blog.csdn.net/buptgshengod
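The function whose return values are described above is not shown in the excerpt; assuming it is a PCA-style dimensionality-reduction routine (consistent with "the low-dimensional matrix returned" and "the matrix after the axes are moved"), a sketch:

from numpy import mat, mean, cov, argsort, linalg

def pca(data_mat, top_n_feat=2):
    # Project the data onto its top principal components.
    data_mat = mat(data_mat)
    mean_vals = mean(data_mat, axis=0)
    mean_removed = data_mat - mean_vals                  # center the data
    cov_mat = cov(mean_removed, rowvar=False)            # covariance of the features
    eig_vals, eig_vects = linalg.eig(mat(cov_mat))
    eig_val_ind = argsort(eig_vals)[:-(top_n_feat + 1):-1]      # largest eigenvalues first
    red_eig_vects = eig_vects[:, eig_val_ind]
    low_d_data_mat = mean_removed * red_eig_vects               # the low-dimensional matrix
    recon_mat = (low_d_data_mat * red_eig_vects.T) + mean_vals  # data mapped back onto the moved axes
    return low_d_data_mat, recon_mat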
..., 15.0 * np.array(datingLabels))
# plt.show()

# Unit test of func: autoNorm()
# normMat, ranges, minVals = autoNorm(datingDataMat)
# print(normMat)
# print(ranges)
# print(minVals)

datingClassTest()
classifyPerson()
Output:
The classifier came back with: 3, the real answer is: 3
The total error rate is: 0.0%
The classifier came back with: 2, the real answer is: 2
The total error rate is: 0.0%
The classifier came back with: 1, the real answer is: 1
The total error rate is: 0.0%
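autoNorm() is referenced in the commented-out unit test but not shown; a sketch of the min-max normalization such a function usually performs (names follow the excerpt):

import numpy as np

def autoNorm(dataSet):
    # Scale every feature column of dataSet into the [0, 1] range.
    minVals = dataSet.min(0)                  # column-wise minimum
    maxVals = dataSet.max(0)                  # column-wise maximum
    ranges = maxVals - minVals
    m = dataSet.shape[0]
    normDataSet = dataSet - np.tile(minVals, (m, 1))
    normDataSet = normDataSet / np.tile(ranges, (m, 1))
    return normDataSet, ranges, minVals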
Python3: learning to use the API. Using a data set from the Internet, which I downloaded locally; the data sets can be downloaded from my git: https://github.com/linyi0604/MachineLearning
Code:
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn import metrics

"""
K-means algorithm:
1. Randomly select K samples as the centers of the K categories.
2. Every other sample picks the nearest of those K samples and joins its category,
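A minimal sketch of running KMeans and scoring it with sklearn.metrics, using made-up data in place of the data set from the git repository above:

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn import metrics

# Made-up stand-in data: five feature columns plus a true label column
data = pd.DataFrame(np.random.rand(100, 5))
data["label"] = np.random.randint(0, 3, 100)
x = data.drop(columns=["label"])
y = data["label"]

kmeans = KMeans(n_clusters=3)     # cluster the samples into 3 groups
kmeans.fit(x)
y_predict = kmeans.predict(x)

# When ground-truth labels exist, the adjusted Rand index measures the agreement
print(metrics.adjusted_rand_score(y, y_predict))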