"cs231n" Job 1 question 1 Selection _ code understanding k Nearest Neighbor Algorithm & cross-validation Select parameter parameters

Source: Internet
Author: User

Exploring how NumPy vectorized operations accelerate the k-nearest neighbor algorithm

As Kong Yiji said, the character "hui" in "aniseed beans" has four ways of writing; likewise, the same distance computation can be written in several ways.

The k-nearest neighbor algorithm computes the image distances in three ways:

1. The most basic double loop

2. A single loop, using NumPy's broadcasting mechanism

3. No loops at all, using broadcasting together with the mathematical properties of matrices

Each picture is stretched into a one-dimensional array, so:

X_train: shape (num_train, D)

X: shape (num_test, D)

where D is the number of values in one flattened image.
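As a sketch of the flattening step (toy sizes assumed here; actual CIFAR-10 images are 32x32x3, giving D = 3072):

```python
import numpy as np

# Toy batch: 5 hypothetical "images" of shape 4x4x3
X = np.arange(5 * 4 * 4 * 3).reshape(5, 4, 4, 3)

# Stretch each picture into a one-dimensional row: shape becomes (5, 48)
X_flat = np.reshape(X, (X.shape[0], -1))
print(X_flat.shape)
```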

Method validation
import numpy as np
a = np.array([[1,1,1],[2,2,2],[3,3,3]])
b = np.array([[4,4,4],[5,5,5],[6,6,6],[7,7,7]])
Double loop:
dists = np.zeros((3,4))
for i in range(3):
    for j in range(4):
        dists[i][j] = np.sqrt(np.sum(np.square(a[i] - b[j])))
print(dists)

[[5.19615242 6.92820323 8.66025404 10.39230485]
[3.46410162 5.19615242 6.92820323 8.66025404]
[1.73205081 3.46410162 5.19615242 6.92820323]]

Single loop:

dists = np.zeros((3,4))
for i in range(3):
    dists[i] = np.sqrt(np.sum(np.square(a[i] - b), axis=1))
print(dists)

[[5.19615242 6.92820323 8.66025404 10.39230485]
[3.46410162 5.19615242 6.92820323 8.66025404]
[1.73205081 3.46410162 5.19615242 6.92820323]]

No loops:
r1 = (np.sum(np.square(a), axis=1) * np.ones((b.shape[0], 1))).T
r2 = np.sum(np.square(b), axis=1) * np.ones((a.shape[0], 1))
r3 = -2 * np.dot(a, b.T)
print(np.sqrt(r1 + r2 + r3))

[[5.19615242 6.92820323 8.66025404 10.39230485]
[3.46410162 5.19615242 6.92820323 8.66025404]
[1.73205081 3.46410162 5.19615242 6.92820323]]
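To see why the no-loop form matters for speed, here is a rough timing sketch (array sizes are made up; exact speedups depend on hardware):

```python
import time
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((100, 300))   # pretend test points
b = rng.random((500, 300))   # pretend training points

def two_loops(a, b):
    d = np.zeros((a.shape[0], b.shape[0]))
    for i in range(a.shape[0]):
        for j in range(b.shape[0]):
            d[i, j] = np.sqrt(np.sum(np.square(a[i] - b[j])))
    return d

def no_loops(a, b):
    # Same expansion as r1 + r2 + r3 above, written with broadcasting
    return np.sqrt(-2 * np.dot(a, b.T)
                   + np.sum(np.square(b), axis=1)
                   + np.sum(np.square(a), axis=1)[:, None])

t0 = time.time(); d2 = two_loops(a, b); loop_time = time.time() - t0
t0 = time.time(); d0 = no_loops(a, b); vec_time = time.time() - t0

print(np.allclose(d2, d0))    # same results up to floating-point error
print(loop_time, vec_time)    # the vectorized version is typically far faster
```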

Principle of the no-loop algorithm:

(Note: the variables in the schematic, the validation code, and the implementation do not correspond strictly one to one; there are some adjustments.)
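The identity behind the no-loop version is just the expansion of the squared Euclidean distance between a test point $x_i$ and a training point $y_j$:

```latex
\|x_i - y_j\|^2 = \|x_i\|^2 - 2\, x_i \cdot y_j + \|y_j\|^2
```

Computed for all pairs at once, the cross term becomes the matrix $-2XY^\top$ (r3 in the validation code), while the squared norms $\|x_i\|^2$ and $\|y_j\|^2$ become a column and a row of values broadcast across the whole matrix (r1 and r2).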

The full code is implemented as follows:

import numpy as np

class KNearestNeighbor():
    def __init__(self):
        pass

    def train(self, X, y):
        self.X_train = X
        self.y_train = y

    # Choose how many loop bodies are used to compute the distances
    def predict(self, X, k=1, num_loops=0):
        if num_loops == 0:
            dists = self.compute_distances_no_loops(X)
        elif num_loops == 1:
            dists = self.compute_distances_one_loops(X)
        elif num_loops == 2:
            dists = self.compute_distances_two_loops(X)
        else:
            raise ValueError('Invalid value %d' % num_loops)
        return self.predict_labels(dists, k=k)

    def compute_distances_two_loops(self, X):
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            for j in range(num_train):
                dists[i][j] = np.sqrt(np.sum(np.square(X[i] - self.X_train[j])))
        return dists

    def compute_distances_one_loops(self, X):
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            dists[i] = np.sqrt(np.sum(np.square(X[i] - self.X_train), axis=1))
        return dists

    def compute_distances_no_loops(self, X):
        # num_test = X.shape[0]
        # num_train = self.X_train.shape[0]
        # dists = np.zeros((num_test, num_train))
        dists = np.sqrt(-2 * np.dot(X, self.X_train.T)
                        + np.sum(np.square(self.X_train), axis=1) * np.ones((X.shape[0], 1))
                        + (np.sum(np.square(X), axis=1) * np.ones((self.X_train.shape[0], 1))).T)
        return dists

    # Predict labels
    def predict_labels(self, dists, k=1):
        num_test = dists.shape[0]
        y_pred = np.zeros(num_test)
        for i in range(num_test):
            # Sort indices by distance, take the nearest k, then look up their training labels
            # (labels must be non-negative integers for np.bincount)
            closest_y = self.y_train[np.argsort(dists[i])[:k]]
            # Majority vote; note the handy combination of np.bincount() and np.argmax()
            y_pred[i] = np.argmax(np.bincount(closest_y))
        return y_pred
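The voting line is the least obvious step; here is a minimal sketch of what np.bincount and np.argmax do together (the labels are made up):

```python
import numpy as np

# Hypothetical labels of the k = 5 nearest neighbors
closest_y = np.array([2, 1, 2, 0, 2])

# bincount counts occurrences of each non-negative integer label: [1, 1, 3]
counts = np.bincount(closest_y)

# argmax picks the most frequent label (ties go to the smaller label)
print(np.argmax(counts))
```

Here label 2 appears three times, so the prediction is 2.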
Cross-validation to select the hyperparameter k

We have implemented the k-Nearest Neighbor classifier, but we set the value k = 5 arbitrarily. We will now determine the best value of this hyperparameter with cross-validation.

import numpy as np

num_folds = 5
k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]

X_train_folds = []
y_train_folds = []
################################################################################
# TODO:                                                                        #
# Split up the training data into folds. After splitting, X_train_folds and   #
# y_train_folds should each be lists of length num_folds, where               #
# y_train_folds[i] is the label vector for the points in X_train_folds[i].    #
# Hint: look up the numpy array_split function.                               #
################################################################################
X_train_folds = np.split(X_train, num_folds)
y_train_folds = np.split(y_train, num_folds)
################################################################################
#                               END OF YOUR CODE                              #
################################################################################

# A dictionary holding the accuracies for different values of k that we find
# when running cross-validation. After running cross-validation,
# k_to_accuracies[k] should be a list of length num_folds giving the different
# accuracy values that we found when using that value of k.
k_to_accuracies = {}
################################################################################
# TODO:                                                                        #
# Perform k-fold cross validation to find the best value of k. For each       #
# possible value of k, run the k-nearest-neighbor algorithm num_folds times,  #
# where in each case you use all but one of the folds as training data and    #
# the last fold as a validation set. Store the accuracies for all folds and   #
# all values of k in the k_to_accuracies dictionary.                          #
################################################################################
for k in k_choices:
    k_to_accuracies[k] = np.zeros(num_folds)
    for i in range(num_folds):
        Xtr = np.concatenate((np.array(X_train_folds)[:i], np.array(X_train_folds)[(i + 1):]), axis=0)
        ytr = np.concatenate((np.array(y_train_folds)[:i], np.array(y_train_folds)[(i + 1):]), axis=0)
        Xte = np.array(X_train_folds)[i]
        yte = np.array(y_train_folds)[i]
        # [num_of_folds, num_in_folds, feature_of_x] -> [num_of_pictures, feature_of_x]
        Xtr = np.reshape(Xtr, (X_train.shape[0] * 4 // 5, -1))
        Xte = np.reshape(Xte, (X_train.shape[0] // 5, -1))
        # Labels are kept one-dimensional so that np.bincount() in predict_labels works
        ytr = np.reshape(ytr, (y_train.shape[0] * 4 // 5,))
        yte = np.reshape(yte, (y_train.shape[0] // 5,))

        # classifier is the KNearestNeighbor instance defined above
        classifier.train(Xtr, ytr)
        yte_pred = classifier.predict(Xte, k)
        # bool array summed as float to count correct predictions
        accuracy = np.sum(yte_pred == yte, dtype=float) / len(yte)
        k_to_accuracies[k][i] = accuracy
################################################################################
#                               END OF YOUR CODE                              #
################################################################################

# Print out the computed accuracies
for k in sorted(k_to_accuracies):
    for accuracy in k_to_accuracies[k]:
        print('k = %d, accuracy = %f' % (k, accuracy))
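The splitting and leave-one-fold-out step can be sketched on self-contained toy data (shapes are assumed; 10 points and 5 folds divide evenly, so np.split suffices; for uneven sizes np.array_split would be needed, as the hint suggests):

```python
import numpy as np

# Toy data: 10 points with 2 features each, and integer labels
X_train = np.arange(20).reshape(10, 2)
y_train = np.arange(10) % 3
num_folds = 5

X_folds = np.split(X_train, num_folds)   # 5 folds of 2 points each
y_folds = np.split(y_train, num_folds)

# Hold out fold i as the validation set, train on the remaining folds
i = 1
Xtr = np.concatenate(X_folds[:i] + X_folds[i + 1:], axis=0)
ytr = np.concatenate(y_folds[:i] + y_folds[i + 1:], axis=0)
Xval, yval = X_folds[i], y_folds[i]

print(Xtr.shape, Xval.shape)   # (8, 2) (2, 2)
```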

"cs231n" Job 1 question 1 Selection _ code understanding k Nearest Neighbor Algorithm & cross-validation Select parameter parameters
