Python Machine Learning: Clustering

Source: Internet
Author: User

    • K-means clustering algorithm

    • Test:
# -*- coding: utf-8 -*-
"""
Created on Thu 10:59:20 2017

@author: administrator
"""
"""
The data set holds the average annual consumer spending of urban households
in 31 provinces in 1999, across eight variables: food, clothing, household
equipment and services, health care, transportation and communications,
recreation, education and cultural services, housing, and miscellaneous goods
and services. The 31 provinces are clustered using this data.
"""
import numpy as np
from sklearn.cluster import KMeans


def loadData(filePath):
    """Read one province per line: name first, then its spending figures."""
    retData = []
    retCityName = []
    with open(filePath, 'r') as fr:
        for line in fr.readlines():
            items = line.strip().split(",")
            retCityName.append(items[0])
            retData.append([float(items[i]) for i in range(1, len(items))])
    return retData, retCityName


if __name__ == '__main__':
    # Data directory on the author's machine; point this at the folder holding city.txt.
    fpath = 'F:\\RANJIEWEN\\MachineLearning\\Python Machine learning combat _mooc\\data\\cluster\\'
    data, cityName = loadData(fpath + 'city.txt')

    km = KMeans(n_clusters=4)
    label = km.fit_predict(data)
    # Total spending of each cluster center (sum over the eight variables).
    expenses = np.sum(km.cluster_centers_, axis=1)
    # print(expenses)

    # Group the province names by cluster label.
    CityCluster = [[], [], [], []]
    for i in range(len(cityName)):
        CityCluster[label[i]].append(cityName[i])
    for i in range(len(CityCluster)):
        print("expenses:%.2f" % expenses[i])
        print(CityCluster[i])
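To make clearer what `fit_predict` does in the script above, here is a minimal pure-Python sketch of the standard K-means (Lloyd's) iteration. It is not scikit-learn's implementation: the function `kmeans`, its `init_centers` parameter, and the two-variable spending rows are assumptions made up for illustration, and scikit-learn chooses its own starting centers.

```python
def kmeans(points, init_centers, max_iter=100):
    """Plain Lloyd's algorithm on lists of floats; returns (labels, centers)."""
    centers = [list(c) for c in init_centers]
    k = len(centers)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center
        # (squared Euclidean distance).
        new_labels = []
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Made-up spending rows (two variables instead of eight), for illustration only.
names = ["A", "B", "C", "D", "E", "F"]
rows = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]]

labels, centers = kmeans(rows, init_centers=[rows[0], rows[3]])

# Same post-processing as the script: total spending per center, names per cluster.
expenses = [sum(c) for c in centers]
clusters = [[] for _ in centers]
for name, lab in zip(names, labels):
    clusters[lab].append(name)
for total, members in zip(expenses, clusters):
    print("expenses: %.2f" % total, members)
```

Passing the starting centers explicitly keeps the sketch deterministic. With random starts, the cluster numbering (and sometimes the grouping) can differ between runs, which is also true of the script's `KMeans(n_clusters=4)` call.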
    • DBSCAN density clustering

    • Test
# -*- coding: utf-8 -*-
"""
Created on Thu 11:14:37 2017

@author: administrator
"""
"""
The data are campus network logs covering 290 university students. Each
record holds the user ID, device MAC address, IP address, session start time,
session end time, online duration, and campus network plan.
Objective: use DBSCAN clustering to analyze when students go online and how
long they stay online.
"""
import numpy as np
import sklearn.cluster as skc
from sklearn import metrics
import matplotlib.pyplot as plt

mac2id = dict()
onlinetimes = []
# Data directory on the author's machine; point this at the folder holding TestData.txt.
fpath = 'F:\\RANJIEWEN\\MachineLearning\\Python Machine learning combat _mooc\\data\\cluster\\'
with open(fpath + 'TestData.txt', encoding='utf-8') as f:
    for line in f:
        fields = line.split(',')
        mac = fields[2]
        onlinetime = int(fields[6])
        # Hour of day taken from a "date HH:MM:SS" start timestamp.
        starttime = int(fields[4].split(' ')[1].split(':')[0])
        if mac not in mac2id:
            mac2id[mac] = len(onlinetimes)
            onlinetimes.append((starttime, onlinetime))
        else:
            # Keep the latest record for a MAC address already seen
            # (stored as a plain tuple so the array below stays rectangular).
            onlinetimes[mac2id[mac]] = (starttime, onlinetime)
real_X = np.array(onlinetimes).reshape((-1, 2))

# Cluster on the start hour only.
X = real_X[:, 0:1]
# Data transformation tip for clustering on online duration instead:
# X = np.log(1 + real_X[:, 1:])

db = skc.DBSCAN(eps=0.01, min_samples=20).fit(X)
labels = db.labels_

print('Labels:')
print(labels)
# Label -1 marks noise points.
ratio = len(labels[labels[:] == -1]) / len(labels)
print('Noise ratio:', format(ratio, '.2%'))

n_clusters_ = len(set(labels)) - (1 if -1 in labels else 0)

print('Estimated number of clusters: %d' % n_clusters_)
print("Silhouette coefficient: %0.3f" % metrics.silhouette_score(X, labels))

for i in range(n_clusters_):
    print('Cluster', i, ':')
    print(list(X[labels == i].flatten()))

plt.hist(X, 24)
plt.show()
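The script clusters integer start hours with `eps=0.01`, so two records are neighbors only when they share the same hour. A cluster is then simply an hour with at least `min_samples` records, and everything else is noise. The following is a minimal pure-Python sketch of the DBSCAN labeling rule on one-dimensional data, under that reading. It is not scikit-learn's implementation: the function `dbscan`, the toy `start_hours` list, and `min_samples=3` are assumptions chosen for illustration.

```python
def dbscan(values, eps, min_samples):
    """Label 1-D values: cluster ids 0, 1, ... and -1 for noise."""
    n = len(values)
    # Neighborhood of each point, including the point itself.
    neighbors = [[j for j in range(n) if abs(values[i] - values[j]) <= eps]
                 for i in range(n)]
    is_core = [len(nb) >= min_samples for nb in neighbors]
    labels = [-1] * n
    cluster_id = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        # Grow a new cluster outward from an unlabeled core point.
        labels[i] = cluster_id
        frontier = [i]
        while frontier:
            current = frontier.pop()
            for j in neighbors[current]:
                if labels[j] == -1:
                    labels[j] = cluster_id
                    if is_core[j]:
                        frontier.append(j)  # only core points keep expanding
        cluster_id += 1
    return labels


# Made-up start hours, for illustration only.
start_hours = [22, 8, 22, 23, 8, 22, 8, 13, 23, 23]
labels = dbscan(start_hours, eps=0.01, min_samples=3)

# Same summary statistics as the script.
noise_ratio = labels.count(-1) / len(labels)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("Labels:", labels)
print("Noise ratio:", format(noise_ratio, ".2%"))
print("Estimated number of clusters:", n_clusters)
```

In this toy input the single record at hour 13 has no other record within `eps`, so it ends up labeled -1 and counts toward the noise ratio.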
    • Image segmentation based on clustering

    • Test
# -*- coding: utf-8 -*-
"""
Created on Thu 15:03:11 2017

@author: administrator
"""
"""
Target: use the K-means clustering algorithm to cluster image pixel colors
and achieve a simple image segmentation.
Output: pixels in the same cluster are drawn in the same color, and different
clusters get different colors.
"""
import numpy as np
import PIL.Image as image
from sklearn.cluster import KMeans


def loadData(filePath):
    """Return the pixels as rows of scaled RGB values, plus the image size."""
    data = []
    with open(filePath, 'rb') as f:
        img = image.open(f)
        m, n = img.size  # m is the width, n is the height
        for i in range(m):
            for j in range(n):
                x, y, z = img.getpixel((i, j))
                data.append([x / 256.0, y / 256.0, z / 256.0])
    # A plain ndarray replaces the deprecated np.mat.
    return np.array(data), m, n


# Image directory on the author's machine; point this at the folder holding bull.jpg.
impath = 'F:\\RANJIEWEN\\MachineLearning\\Python Machine Learning combat _mooc\\data\\based on clustering of the whole image division\\'
imgData, row, col = loadData(impath + 'bull.jpg')
label = KMeans(n_clusters=4).fit_predict(imgData)

# Pixels were read with x as the outer loop, so label[i][j] is pixel (i, j).
label = label.reshape([row, col])
pic_new = image.new("L", (row, col))
for i in range(row):
    for j in range(col):
        # One gray level per cluster; clipped because label 0 would give 256.
        pic_new.putpixel((i, j), min(255, int(256 / (label[i][j] + 1))))
pic_new.save("result-bull-4.jpg", "JPEG")
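The last step of the script turns the flat list of cluster labels back into an image, using the formula `int(256 / (label + 1))` to pick one gray level per cluster. The sketch below isolates that step in pure Python so the mapping is easy to inspect. The helper `labels_to_gray` and the six toy labels are assumptions made up for illustration. The sketch also clips the result to 255, because label 0 yields 256, which does not fit in an 8-bit gray value.

```python
def labels_to_gray(flat_labels, width, height):
    """Rebuild a grid indexed [x][y] of 8-bit gray values from flat labels."""
    if len(flat_labels) != width * height:
        raise ValueError("label count does not match image size")
    grid = []
    for x in range(width):
        # Pixels were collected with x as the outer loop, so each run of
        # `height` consecutive labels is one column of the image.
        column = flat_labels[x * height:(x + 1) * height]
        grid.append([min(255, int(256 / (lab + 1))) for lab in column])
    return grid


# Made-up labels for an image 2 pixels wide and 3 pixels high.
flat = [0, 0, 1, 2, 3, 3]
gray = labels_to_gray(flat, width=2, height=3)
for x, column in enumerate(gray):
    print("x =", x, column)
```

With four clusters the gray levels come out as 255, 128, 85, and 64, so the clusters are easy to tell apart, but the spacing between levels is uneven.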
