Further exploration of mlpy, dimensionality reduction, classification, visualization

Source: Internet
Author: User
Tags: svm

A very common problem is that real-world data is high-dimensional, and too many dimensions make a model extremely complex. The usual compromise is to reduce the dimensionality first and then cluster, classify, or regress. Dimensionality reduction aims to cut the number of dimensions (by selecting the best features) without losing accuracy.

PCA is the most common dimensionality reduction algorithm; it looks for a linearly uncorrelated subset of features (the principal components). Other common methods are LDA (linear discriminant analysis) and MDS (multidimensional scaling). The references at the end of this article are highly recommended; they are good expert summaries comparing these methods.
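Before turning to mlpy, it may help to see what PCA actually computes. A minimal sketch using plain NumPy (my own illustration, not part of the original script): center the data, take the SVD, and keep the top-k right singular vectors as the projection axes.

```python
import numpy as np

def pca_project(x, k=2):
    """Project the rows of x onto the top-k principal components."""
    x_centered = x - x.mean(axis=0)          # PCA requires centered data
    # SVD of the centered data; the rows of vt are the principal axes
    u, s, vt = np.linalg.svd(x_centered, full_matrices=False)
    return x_centered @ vt[:k].T             # coordinates in the reduced space

# Toy data: 3 columns, but almost all variance lies in a 2-d subspace
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 3)) \
    + 0.01 * rng.normal(size=(100, 3))
z = pca_project(x, k=2)
print(z.shape)  # (100, 2)
```

mlpy's `PCA` class wraps the same idea behind `learn`/`transform` calls, as the script below shows.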

Below, the PCA and LIBSVM wrappers in the mlpy module are used to reduce the wine data and classify it. Note: the call mlpy.LibSvm.learn(z, y) raises an error, and I do not yet know how to deal with it. This post is only shared as a record; if you know the fix, please let me know.
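If mlpy's LibSvm keeps failing in your environment, one workaround (my own suggestion, not the author's fix) is to run the same PCA-then-linear-SVM pipeline with scikit-learn, which ships the wine data as a built-in dataset:

```python
import numpy as np
from sklearn.datasets import load_wine    # same UCI wine data the script loads from disk
from sklearn.decomposition import PCA
from sklearn.svm import SVC

wine = load_wine()
x, y = wine.data, wine.target             # 178 samples, 13 attributes, 3 classes

z = PCA(n_components=2).fit_transform(x)  # reduce 13 attributes to 2 components
svm = SVC(kernel='linear').fit(z, y)      # counterpart of mlpy.LibSvm(kernel_type='linear')
print(svm.score(z, y))                    # training accuracy on the 2-d projection
```

The API differs from mlpy (`fit`/`predict` instead of `learn`/`pred`), but the steps map one to one.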

# -*- coding: utf-8 -*-
"""
Created on Fri Oct  9 09:54:54 2018

@author: Luove
"""

import numpy as np
import matplotlib.pyplot as plt
import mlpy
from matplotlib import cm

filepath = r'D:\Analyze\Python matlab\python\datalib Py\wine.data'

def get_data():
    list1 = [line.strip().split(',') for line in open(filepath, 'r').readlines()]
    # first column is the class label, columns 1..13 are the attributes
    return [list2[1:14] for list2 in list1], [list2[0] for list2 in list1]

matrix, labels = get_data()

x1 = []; y1 = []
x2 = []; y2 = []
x3 = []; y3 = []
x = 0; y = 1  # column indices of the alcohol and malic acid attributes

for n, elem in enumerate(matrix):
    if int(labels[n]) == 1:          # labels are strings; convert to int
        x1.append(matrix[n][x])      # alcohol values for this class
        y1.append(matrix[n][y])      # malic acid values for this class
    elif int(labels[n]) == 2:
        x2.append(matrix[n][x])
        y2.append(matrix[n][y])
    elif int(labels[n]) == 3:
        x3.append(matrix[n][x])
        y3.append(matrix[n][y])

plt.scatter(x1, y1, s=50, c='green', label='Class 1')
plt.scatter(x2, y2, s=100, c='red', label='Class 2')
plt.scatter(x3, y3, s=200, c='darkred', label='Class 3')
plt.title('Wine features', fontsize=14)
plt.xlabel('x axis')
plt.ylabel('y axis')
plt.legend()
plt.grid(True, linestyle='--', color='0.0')  # color is a grey level in [0, 1]: lower is darker, higher is lighter
plt.show()

# Dimensionality reduction: PCA (principal component analysis); MDS (multidimensional scaling) is another option
wine = np.loadtxt(filepath, delimiter=',')  # first column is the label, the rest are attribute columns
x, y = wine[:, 1:6], wine[:, 0].astype(int)
print(x.shape)
print(y.shape)

pca = mlpy.PCA()            # build / instantiate
pca.learn(x)                # fit on the data
z = pca.transform(x, k=2)   # project down to 2 dimensions
print(z.shape)

print(cm.cmap_d.keys())
plt.scatter(z[:, 0], z[:, 1], c=y, s=50, cmap=cm.Reds)
plt.xlabel('First component')
plt.ylabel('Second component')
plt.show()

svm = mlpy.LibSvm(kernel_type='linear', gamma=10)
svm.learn(z, y)  # train on the 2-d projection; this is the call that errors in my environment
xmin, xmax = z[:, 0].min() - 0.1, z[:, 0].max() + 0.1
ymin, ymax = z[:, 1].min() - 0.1, z[:, 1].max() + 0.1
xx, yy = np.meshgrid(np.arange(xmin, xmax, 0.01), np.arange(ymin, ymax, 0.01))
grid = np.c_[xx.ravel(), yy.ravel()]  # np.c_ takes square brackets, not parentheses
result = svm.pred(grid)
plt.pcolormesh(xx, yy, result.reshape(xx.shape), cmap=cm.Greys_r)
plt.scatter(z[:, 0], z[:, 1], c=y, s=50, cmap=cm.Reds)
plt.xlabel('First component')
plt.ylabel('Second component')
plt.xlim(xmin, xmax)
plt.ylim(ymin, ymax)
plt.show()
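One concrete bug worth calling out in the listing above: `np.c_` is an indexing helper, so it takes square brackets, not parentheses; writing `np.c_(...)` raises a `TypeError`. A minimal check of the meshgrid-to-grid step on its own:

```python
import numpy as np

# Tiny 2x2 mesh standing in for the decision-boundary grid
xx, yy = np.meshgrid(np.arange(0.0, 1.0, 0.5), np.arange(0.0, 1.0, 0.5))
grid = np.c_[xx.ravel(), yy.ravel()]  # square brackets: np.c_ is an index expression
print(grid.shape)  # (4, 2) -- one (x, y) row per grid point
```

Each row of `grid` can then be fed to the trained classifier, and the predictions reshaped back to `xx.shape` for `pcolormesh`.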

P.S.: when run, the script errors at line 61 (the svm.learn call). If you know how to solve it, please let me know, thank you ~

REF:

Machine learning study notes (1): LDA dimensionality reduction

Multidimensional scaling (MDS, multidimensional scaling) explained in detail

SVM principle and derivation: "Machine Learning" support vector machines

"Practical Data Analysis": the text and mlpy examples can be picked up at: https://github.com/Luove/Data
