1. Extract SURF features from every image (the number of descriptor rows varies per image, but each descriptor has a fixed 70 columns).
2. Split the images into two groups: one for training, one for testing.
3. Stack the training descriptors into one large matrix and cluster it into 30 visual words.
4. Pass each image's descriptors through the cluster predictor to determine which clusters they fall into (if fewer than 30 clusters appear in an image, pad the tail of its histogram with zeros).
5. Represent each image as a 30-dimensional feature vector (the histogram of cluster counts).
6. Feed these vectors to logistic regression for training.
7. Get the training results.
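The core of steps 3 to 5 is turning a variable number of descriptors per image into one fixed-length vector. A minimal sketch with synthetic stand-ins for the SURF output (the random matrices and cluster count are illustrative assumptions, not real image data):

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)

# Hypothetical stand-ins for SURF output: each "image" yields a
# variable number of rows, but always 70 columns.
descriptors_per_image = [rng.normal(size=(n, 70)) for n in (120, 45, 300)]

n_clusters = 30
estimator = MiniBatchKMeans(n_clusters=n_clusters, n_init=3, random_state=0)
# Cluster all training descriptors stacked into one large matrix (step 3)
estimator.fit(np.concatenate(descriptors_per_image))

# Steps 4-5: each image becomes a fixed-length histogram of cluster hits
histograms = []
for desc in descriptors_per_image:
    clusters = estimator.predict(desc)
    # minlength pads with zeros when some clusters never appear in an image
    histograms.append(np.bincount(clusters, minlength=n_clusters))

X = np.array(histograms)
print(X.shape)  # one 30-dimensional vector per image
```

Note that `np.bincount(..., minlength=n_clusters)` handles the zero-padding of step 4 in one call, instead of appending zeros manually.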
```python
# -*- coding: utf-8 -*-
"""Created on Thu 20:51:19 2016
@author: administrator
"""
import glob

import numpy as np
import mahotas as mh
from mahotas.features import surf
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (classification_report, precision_score,
                             recall_score, accuracy_score)
from sklearn.cluster import MiniBatchKMeans

# Get a list of files (cat -> 1, dog -> 0)
all_instance_filenames = []
all_instance_targets = []
for f in glob.glob('train2/*.jpg'):
    target = 1 if 'cat' in f else 0
    all_instance_filenames.append(f)
    all_instance_targets.append(target)

# Extract SURF descriptors for every image; each image yields a
# variable number of rows but a fixed 70 columns.
surf_features = []
counter = 0
for f in all_instance_filenames:
    counter += 1
    print('Reading image:', f, counter / 300.0)
    image = mh.imread(f, as_grey=True)
    temp_image = surf.surf(image)  # [:, 5:]
    print(temp_image.shape)
    surf_features.append(temp_image)

# Split into training and test sets (60% / 40%)
train_len = int(len(all_instance_filenames) * 0.60)
X_train_surf_features = np.concatenate(surf_features[:train_len])
X_test_surf_features = np.concatenate(surf_features[train_len:])
y_train = all_instance_targets[:train_len]
y_test = all_instance_targets[train_len:]

# Cluster the training descriptors into 30 visual words
n_clusters = 30
print('Clustering', len(X_train_surf_features), 'features')
estimator = MiniBatchKMeans(n_clusters=n_clusters)
estimator.fit(X_train_surf_features)
# estimator.cluster_centers_.shape == (30, 70)

# Represent every image as a 30-dimensional histogram of cluster
# assignments, padding with zeros when an image hits fewer than
# 30 clusters.
X_train = []
for instance in surf_features[:train_len]:
    clusters = estimator.predict(instance)
    features = np.bincount(clusters)
    if len(features) < n_clusters:
        features = np.append(features, np.zeros(n_clusters - len(features)))
    X_train.append(features)

X_test = []
for instance in surf_features[train_len:]:
    clusters = estimator.predict(instance)
    features = np.bincount(clusters)
    if len(features) < n_clusters:
        features = np.append(features, np.zeros(n_clusters - len(features)))
    X_test.append(features)

# Train a logistic regression classifier on the histograms
clf = LogisticRegression(C=0.001, penalty='l2')
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)
print(classification_report(y_test, predictions))
print('Precision:', precision_score(y_test, predictions))
print('Recall:', recall_score(y_test, predictions))
print('Accuracy:', accuracy_score(y_test, predictions))
```
In short: a clustering-based classification algorithm that takes some effort to understand at first.