"CUDA Parallel Programming (6)": Parallel Implementation of the KNN Algorithm


I wrote two articles before this one: one on the C++ serial implementation of the KNN algorithm, and one on computing the Euclidean distance between vectors with CUDA. This article is essentially a simple integration of those two, so you may want to read them first.


First, generate a data set

First we need to generate n samples of d-dimensional data, each with a class label. The label is determined by the sign of the first dimension of the sample: positive or negative.

#!/usr/bin/python
import re
import sys
import random
import os

filename = "input.txt"
if os.path.exists(filename):
    print("%s exists and del" % filename)
    os.remove(filename)
fout = open(filename, "w")
for i in range(0, int(sys.argv[1])):            # str to int
    x = []
    for j in range(0, int(sys.argv[2])):
        x.append("%.4f" % random.uniform(-1, 1))  # generate random data and limit the digits to 4
        fout.write("%s\t" % x[j])
        # fout.write(x): TypeError: expected a character buffer object
    if x[0][0] == '-':
        fout.write("negative" + "\n")
    else:
        fout.write("positive" + "\n")
fout.close()

Run the program to generate 4,000 samples of 8 dimensions each:


The file input.txt is generated:



Second, serial code:

This code is consistent with the code from the previous article; we select 400 samples as test data and the remaining 3,600 as training data.

knn_2.cc:

#include <iostream>
#include <map>
#include <vector>
#include <stdio.h>
#include <cmath>
#include <cstdlib>
#include <algorithm>
#include <fstream>
using namespace std;

typedef string tLabel;
typedef double tData;
typedef pair<int, double> PAIR;
const int MaxColLen = 10;
const int MaxRowLen = 10000;
ifstream fin;

class KNN
{
private:
    tData dataSet[MaxRowLen][MaxColLen];
    tLabel labels[MaxRowLen];
    tData testData[MaxColLen];
    int rowLen;
    int colLen;
    int k;
    int test_data_num;
    map<int, double> map_index_dis;
    map<tLabel, int> map_label_freq;
    double get_distance(tData *d1, tData *d2);
public:
    KNN(int k, int rowLen, int colLen, char *filename);
    void get_all_distance();
    tLabel get_max_freq_label();
    void auto_norm_data();
    void get_error_rate();
    struct CmpByValue
    {
        bool operator()(const PAIR &lhs, const PAIR &rhs)
        {
            return lhs.second < rhs.second;
        }
    };
    ~KNN();
};

KNN::~KNN()
{
    fin.close();
    map_index_dis.clear();
    map_label_freq.clear();
}

KNN::KNN(int k, int row, int col, char *filename)
{
    this->rowLen = row;
    this->colLen = col;
    this->k = k;
    test_data_num = 0;
    fin.open(filename);
    if (!fin)
    {
        cout << "can not open the file" << endl;
        exit(0);
    }
    // read data from file
    for (int i = 0; i < rowLen; i++)
    {
        for (int j = 0; j < colLen; j++)
        {
            fin >> dataSet[i][j];
        }
        fin >> labels[i];
    }
}

void KNN::get_error_rate()
{
    int i, j, count = 0;
    tLabel label;
    cout << "please input the number of test data : " << endl;
    cin >> test_data_num;
    for (i = 0; i < test_data_num; i++)
    {
        for (j = 0; j < colLen; j++)
        {
            testData[j] = dataSet[i][j];
        }
        get_all_distance();
        label = get_max_freq_label();
        if (label != labels[i])
            count++;
        map_index_dis.clear();
        map_label_freq.clear();
    }
    cout << "the error rate is = " << (double)count / (double)test_data_num << endl;
}

double KNN::get_distance(tData *d1, tData *d2)
{
    double sum = 0;
    for (int i = 0; i < colLen; i++)
    {
        sum += pow((d1[i] - d2[i]), 2);
    }
    cout << "the sum is = " << sum << endl;
    return sqrt(sum);
}

// get distance between testData and all dataSet
void KNN::get_all_distance()
{
    double distance;
    int i;
    for (i = test_data_num; i < rowLen; i++)
    {
        distance = get_distance(dataSet[i], testData);
        map_index_dis[i] = distance;
    }
}

tLabel KNN::get_max_freq_label()
{
    vector<PAIR> vec_index_dis(map_index_dis.begin(), map_index_dis.end());
    sort(vec_index_dis.begin(), vec_index_dis.end(), CmpByValue());
    for (int i = 0; i < k; i++)
    {
        /*
        cout << "the index = " << vec_index_dis[i].first
             << " the distance = " << vec_index_dis[i].second
             << " the label = " << labels[vec_index_dis[i].first]
             << " the coordinate ( ";
        int j;
        for (j = 0; j < colLen - 1; j++)
        {
            cout << dataSet[vec_index_dis[i].first][j] << ",";
        }
        cout << dataSet[vec_index_dis[i].first][j] << " )" << endl;
        */
        map_label_freq[labels[vec_index_dis[i].first]]++;
    }
    map<tLabel, int>::const_iterator map_it = map_label_freq.begin();
    tLabel label;
    int max_freq = 0;
    while (map_it != map_label_freq.end())
    {
        if (map_it->second > max_freq)
        {
            max_freq = map_it->second;
            label = map_it->first;
        }
        map_it++;
    }
    cout << "The test data belongs to the " << label << " label" << endl;
    return label;
}

void KNN::auto_norm_data()
{
    tData maxa[MaxColLen];
    tData mina[MaxColLen];
    tData range[MaxColLen];
    int i, j;
    for (i = 0; i < colLen; i++)
    {
        maxa[i] = max(dataSet[0][i], dataSet[1][i]);
        mina[i] = min(dataSet[0][i], dataSet[1][i]);
    }
    for (i = 2; i < rowLen; i++)
    {
        for (j = 0; j < colLen; j++)
        {
            if (dataSet[i][j] > maxa[j])
                maxa[j] = dataSet[i][j];
            else if (dataSet[i][j] < mina[j])
                mina[j] = dataSet[i][j];
        }
    }
    for (i = 0; i < colLen; i++)
    {
        range[i] = maxa[i] - mina[i];
        // normalize the test data set
        testData[i] = (testData[i] - mina[i]) / range[i];
    }
    // normalize the training data set
    for (i = 0; i < rowLen; i++)
    {
        for (j = 0; j < colLen; j++)
        {
            dataSet[i][j] = (dataSet[i][j] - mina[j]) / range[j];
        }
    }
}

int main(int argc, char **argv)
{
    int k, row, col;
    char *filename;
    if (argc != 5)
    {
        cout << "The input should be like this : ./a.out k row col filename" << endl;
        exit(1);
    }
    k = atoi(argv[1]);
    row = atoi(argv[2]);
    col = atoi(argv[3]);
    filename = argv[4];
    KNN knn(k, row, col, filename);
    knn.auto_norm_data();
    knn.get_error_rate();
    return 0;
}
Makefile

target:
	g++ knn_2.cc
	./a.out 7 4000 8 input.txt
cu:
	nvcc knn.cu
	./a.out 7 4000 8 input.txt

Run result:



Third, parallel implementation

What we parallelize is the computation of the distances from one test sample to the n training samples. Computed serially, the time complexity per test sample is O(n*d); computed in parallel with one thread per training sample, it drops to O(d), where d is the dimensionality of the data.

knn.cu:

#include <iostream>
#include <map>
#include <vector>
#include <stdio.h>
#include <cmath>
#include <cstdlib>
#include <algorithm>
#include <fstream>
using namespace std;

typedef string tLabel;
typedef float tData;
typedef pair<int, double> PAIR;
const int MaxColLen = 10;
const int MaxRowLen = 10010;
const int test_data_num = 400;
ifstream fin;

class KNN
{
private:
    tData dataSet[MaxRowLen][MaxColLen];
    tLabel labels[MaxRowLen];
    tData testData[MaxColLen];
    tData trainingData[3600][8];
    int rowLen;
    int colLen;
    int k;
    map<int, double> map_index_dis;
    map<tLabel, int> map_label_freq;
    double get_distance(tData *d1, tData *d2);
public:
    KNN(int k, int rowLen, int colLen, char *filename);
    void get_all_distance();
    tLabel get_max_freq_label();
    void auto_norm_data();
    void get_error_rate();
    void get_training_data();
    struct CmpByValue
    {
        bool operator()(const PAIR &lhs, const PAIR &rhs)
        {
            return lhs.second < rhs.second;
        }
    };
    ~KNN();
};

KNN::~KNN()
{
    fin.close();
    map_index_dis.clear();
    map_label_freq.clear();
}

KNN::KNN(int k, int row, int col, char *filename)
{
    this->rowLen = row;
    this->colLen = col;
    this->k = k;
    fin.open(filename);
    if (!fin)
    {
        cout << "can not open the file" << endl;
        exit(0);
    }
    for (int i = 0; i < rowLen; i++)
    {
        for (int j = 0; j < colLen; j++)
        {
            fin >> dataSet[i][j];
        }
        fin >> labels[i];
    }
}

void KNN::get_training_data()
{
    for (int i = test_data_num; i < rowLen; i++)
    {
        for (int j = 0; j < colLen; j++)
        {
            trainingData[i - test_data_num][j] = dataSet[i][j];
        }
    }
}

void KNN::get_error_rate()
{
    int i, j, count = 0;
    tLabel label;
    cout << "the test data number is : " << test_data_num << endl;
    get_training_data();
    // get testing data and calculate
    for (i = 0; i < test_data_num; i++)
    {
        for (j = 0; j < colLen; j++)
        {
            testData[j] = dataSet[i][j];
        }
        get_all_distance();
        label = get_max_freq_label();
        if (label != labels[i])
            count++;
        map_index_dis.clear();
        map_label_freq.clear();
    }
    cout << "the error rate is = " << (double)count / (double)test_data_num << endl;
}

// global function
__global__ void cal_dis(tData *train_data, tData *test_data, tData *dis, int pitch, int N, int D)
{
    int tid = blockIdx.x;
    if (tid < N)
    {
        tData temp = 0;
        tData sum = 0;
        for (int i = 0; i < D; i++)
        {
            temp = *((tData *)((char *)train_data + tid * pitch) + i) - test_data[i];
            sum += temp * temp;
        }
        dis[tid] = sum;
    }
}

// parallel calculate the distance
void KNN::get_all_distance()
{
    int height = rowLen - test_data_num;
    tData *distance = new tData[height];
    tData *d_train_data, *d_test_data, *d_dis;
    size_t pitch_d;
    size_t pitch_h = colLen * sizeof(tData);
    // allocate memory on GPU
    cudaMallocPitch((void **)&d_train_data, &pitch_d, colLen * sizeof(tData), height);
    cudaMalloc((void **)&d_test_data, colLen * sizeof(tData));
    cudaMalloc((void **)&d_dis, height * sizeof(tData));
    cudaMemset(d_train_data, 0, height * colLen * sizeof(tData));
    cudaMemset(d_test_data, 0, colLen * sizeof(tData));
    cudaMemset(d_dis, 0, height * sizeof(tData));
    // copy training and testing data from host to device
    cudaMemcpy2D(d_train_data, pitch_d, trainingData, pitch_h,
                 colLen * sizeof(tData), height, cudaMemcpyHostToDevice);
    cudaMemcpy(d_test_data, testData, colLen * sizeof(tData), cudaMemcpyHostToDevice);
    // calculate the distance: one block per training sample
    cal_dis<<<height, 1>>>(d_train_data, d_test_data, d_dis, pitch_d, height, colLen);
    // (the source is truncated here; the rest of get_all_distance is
    // reconstructed: copy the squared distances back, fill map_index_dis,
    // and free the device memory -- the sqrt is unnecessary for ranking)
    cudaMemcpy(distance, d_dis, height * sizeof(tData), cudaMemcpyDeviceToHost);
    for (int i = 0; i < height; i++)
    {
        map_index_dis[i + test_data_num] = distance[i];
    }
    delete[] distance;
    cudaFree(d_train_data);
    cudaFree(d_test_data);
    cudaFree(d_dis);
}

// get_max_freq_label(), auto_norm_data() and main() are the same as in
// knn_2.cc (they are not shown in the truncated source)

Run result:


Because of the memory-allocation problem mentioned in the previous article, it is not convenient to allocate trainingData dynamically, so the training data is placed in a statically allocated array.

As you can see, on the same test and training data sets the result is exactly the same as the serial version. The amount of data is small, so no timing comparison is made here. One further improvement is to load all of testData into device memory at once rather than one sample per call; this reduces the number of times the training data is copied to device memory and improves efficiency.


Author: Yi Solo Show

Email:[email protected]

Annotated Source: http://blog.csdn.net/lavorange/article/details/42172451

