- K-means Clustering algorithm
# -*- coding: utf-8 -*-
"""
Created on Thu 10:59:20 2017

@author: administrator
"""
"""
The data set holds the average annual consumer spending of urban households
in 31 provinces in 1999, described by eight variables: food, clothing,
household equipment and services, health care, transportation and
communications, recreation, education and cultural services, housing, and
miscellaneous goods and services.
The 31 provinces are clustered using this data.
"""
import numpy as np
from sklearn.cluster import KMeans


def loadData(filePath):
    """Read one province per line: name first, then the spending values."""
    retData = []
    retCityName = []
    with open(filePath, 'r') as fr:
        for line in fr:
            items = line.strip().split(",")
            retCityName.append(items[0])
            retData.append([float(items[i]) for i in range(1, len(items))])
    return retData, retCityName


if __name__ == '__main__':
    # Author's local data directory; change it to match your environment.
    fpath = 'F:\\RANJIEWEN\\MachineLearning\\Python Machine learning combat _mooc\\data\\cluster\\'
    data, cityName = loadData(fpath + 'city.txt')
    km = KMeans(n_clusters=4)
    label = km.fit_predict(data)
    # Total spending of each cluster center (sum over the eight variables).
    expenses = np.sum(km.cluster_centers_, axis=1)
    # print(expenses)
    CityCluster = [[], [], [], []]
    for i in range(len(cityName)):
        CityCluster[label[i]].append(cityName[i])
    for i in range(len(CityCluster)):
        print("Expenses:%.2f" % expenses[i])
        print(CityCluster[i])
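The script above depends on the author's `city.txt`, which is not included here. As a minimal sketch of the same grouping step, the following runs `KMeans` on synthetic data. The four spending levels, the `city_k_j` names and the noise scale are all hypothetical stand-ins, and `n_init` and `random_state` are set only to make the run repeatable.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical stand-in for city.txt: 4 spending levels, 5 "cities" each,
# 8 spending variables per city.
levels = [1000.0, 2000.0, 3000.0, 4000.0]
names, rows = [], []
for k, level in enumerate(levels):
    for j in range(5):
        names.append("city_%d_%d" % (k, j))
        rows.append(level + rng.normal(0.0, 10.0, size=8))
data = np.array(rows)

km = KMeans(n_clusters=4, n_init=10, random_state=0)
label = km.fit_predict(data)

# Total spending of each cluster center (sum over the 8 variables).
expenses = np.sum(km.cluster_centers_, axis=1)

# Group the city names by their cluster label.
clusters = {}
for name, lab in zip(names, label):
    clusters.setdefault(int(lab), []).append(name)

# Print clusters from lowest to highest total spending.
for lab in np.argsort(expenses):
    print("Expenses:%.2f" % expenses[lab], clusters[int(lab)])
```

Sorting by `expenses` before printing is a small convenience: K-means assigns cluster numbers arbitrarily, so ordering by total spending makes runs easier to compare.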
- DBSCAN density clustering
# -*- coding: utf-8 -*-
"""
Created on Thu 11:14:37 2017

@author: administrator
"""
"""
Existing data: campus network logs for 290 university students. Each record
includes user ID, device MAC address, IP address, time the session started,
time the session ended, online duration, and campus network package.
Objective: use DBSCAN clustering to analyze when students go online and how
long they stay online.
"""
import numpy as np
import sklearn.cluster as skc
from sklearn import metrics
import matplotlib.pyplot as plt

mac2id = dict()
onlinetimes = []
# Author's local data directory; change it to match your environment.
fpath = 'F:\\RANJIEWEN\\MachineLearning\\Python Machine learning combat _mooc\\data\\cluster\\'
with open(fpath + 'TestData.txt', encoding='utf-8') as f:
    for line in f:
        fields = line.split(',')
        mac = fields[2]
        onlinetime = int(fields[6])
        # Hour of day at which the session started.
        starttime = int(fields[4].split(' ')[1].split(':')[0])
        if mac not in mac2id:
            mac2id[mac] = len(onlinetimes)
            onlinetimes.append((starttime, onlinetime))
        else:
            # Keep only the latest record per device, stored as a plain tuple
            # so that every row has the same shape.
            onlinetimes[mac2id[mac]] = (starttime, onlinetime)
real_X = np.array(onlinetimes).reshape((-1, 2))

# Cluster on the start hour only.
X = real_X[:, 0:1]
## Data transformation tip for clustering on online duration:
# X = np.log(1 + real_X[:, 1:])
db = skc.DBSCAN(eps=0.01, min_samples=20).fit(X)
labels = db.labels_

print('Labels:')
print(labels)
# Points labelled -1 are noise.
ratio = len(labels[labels[:] == -1]) / len(labels)
print('Noise ratio:', format(ratio, '.2%'))

n_clusters_ = len(set(labels)) - (1 if -1 in labels else 0)
print('Estimated number of clusters: %d' % n_clusters_)
print("Silhouette Coefficient: %0.3f" % metrics.silhouette_score(X, labels))

for i in range(n_clusters_):
    print('Cluster ', i, ':')
    print(list(X[labels == i].flatten()))

plt.hist(X, 24)
plt.show()
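Because the start time is an integer hour and `eps=0.01`, only records with the same hour count as neighbours, so a cluster forms wherever at least `min_samples` students start in one hour. The sketch below illustrates this on made-up start hours; the `hours` list is a hypothetical assumption and not taken from `TestData.txt`.

```python
import numpy as np
import sklearn.cluster as skc

# Hypothetical start hours: two busy hours (8 and 22) plus five isolated ones.
hours = [8] * 30 + [22] * 30 + [1, 3, 5, 13, 17]
X = np.array(hours, dtype=float).reshape(-1, 1)

# Same parameters as the script: neighbours must be within 0.01 of each other,
# and a core point needs at least 20 neighbours (itself included).
db = skc.DBSCAN(eps=0.01, min_samples=20).fit(X)
labels = db.labels_

n_noise = int(np.sum(labels == -1))  # label -1 marks noise points
n_clusters_ = len(set(labels)) - (1 if -1 in labels else 0)

print('Estimated number of clusters: %d' % n_clusters_)
print('Noise ratio:', format(n_noise / len(labels), '.2%'))
for i in range(n_clusters_):
    print('Cluster', i, ':', sorted(set(X[labels == i].flatten())))
```

With these inputs the two busy hours each form a cluster and the five isolated hours are labelled as noise. For the online duration, which spans a much wider range, the commented-out `np.log(1 + ...)` transform in the script compresses the values before clustering, and `eps` would need to be chosen on that log scale.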
- Image segmentation based on clustering
# -*- coding: utf-8 -*-
"""
Created on Thu 15:03:11 2017

@author: administrator
"""
"""
Goal: cluster the pixel colors of an image with the K-means algorithm to
achieve a simple image segmentation.
Output: pixels in the same cluster are drawn with the same gray level, and
different clusters get different gray levels.
"""
import numpy as np
import PIL.Image as image
from sklearn.cluster import KMeans


def loadData(filePath):
    """Return the pixels as rows of scaled RGB values, plus the image size."""
    data = []
    with open(filePath, 'rb') as f:
        img = image.open(f)
        m, n = img.size  # width, height
        for i in range(m):
            for j in range(n):
                x, y, z = img.getpixel((i, j))
                data.append([x / 256.0, y / 256.0, z / 256.0])
    # A plain ndarray is used instead of np.mat, which recent scikit-learn
    # versions no longer accept.
    return np.array(data), m, n


# Author's local data directory; change it to match your environment.
imPath = 'F:\\RANJIEWEN\\MachineLearning\\Python Machine Learning combat _mooc\\data\\based on clustering of the whole image division\\'
imgData, row, col = loadData(imPath + 'bull.jpg')
label = KMeans(n_clusters=4).fit_predict(imgData)

label = label.reshape([row, col])
pic_new = image.new("L", (row, col))
for i in range(row):
    for j in range(col):
        # 255 keeps the value inside the 0..255 range of an 8-bit "L" image.
        pic_new.putpixel((i, j), int(255 / (label[i][j] + 1)))
pic_new.save("result-bull-4.jpg", "JPEG")
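The per-pixel `getpixel`/`putpixel` loops are slow on large images. As a hedged alternative, the same segmentation can be sketched with NumPy array operations. The 40x20 test image built below is a hypothetical stand-in for `bull.jpg`, and `n_init` and `random_state` are set only for repeatability.

```python
import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

# Hypothetical test image: four flat color blocks, 40 pixels wide, 20 high.
pixels = np.zeros((20, 40, 3), dtype=np.uint8)
pixels[:10, :20] = (255, 0, 0)
pixels[:10, 20:] = (0, 255, 0)
pixels[10:, :20] = (0, 0, 255)
pixels[10:, 20:] = (255, 255, 255)
img = Image.fromarray(pixels)

# Array layout is (height, width, 3); scale the channels as the script does.
rgb = np.asarray(img.convert("RGB"), dtype=float) / 256.0
height, width, _ = rgb.shape

# One row per pixel, one column per color channel.
label = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(
    rgb.reshape(-1, 3))
label = label.reshape(height, width)

# Map cluster 0..3 to gray levels 255, 127, 85, 63.
gray = (255 // (label + 1)).astype(np.uint8)
pic_new = Image.fromarray(gray)  # a 2-D uint8 array becomes an "L" image
print(pic_new.size, pic_new.mode)
```

To segment a real photo, replace the synthetic `img` with `Image.open(path)` and finish with `pic_new.save("result.jpg", "JPEG")`. Note that the array here is indexed as (row, column), whereas `getpixel((i, j))` in the script takes (x, y).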
Python machine learning: clustering