Basic steps of KNN (k-nearest neighbors): compute the distance from the unknown point to every known data point, select the K points with the smallest distances, find which class label is most common among those K points, and predict that class for the unknown data.
1. Create a new knn.py module
NumPy is required:
from numpy import *
import operator
2. Add known data and labels to the module
def createDataSet():
    group = array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
    labels = ['A', 'A', 'B', 'B']
    return group, labels
3. Classify unknown data
def classify0(inX, dataSet, labels, k):
    # Calculate the distance between the input data and all existing data
    dataSetSize = dataSet.shape[0]
    diffMat = tile(inX, (dataSetSize, 1)) - dataSet
    sqDiffMat = diffMat ** 2
    # No axis argument sums everything; axis=0 sums down each column; axis=1 sums across each row
    sqDistances = sqDiffMat.sum(axis=1)
    distances = sqDistances ** 0.5
    # Sort: argsort orders the data from smallest to largest and returns the index values
    sortedDistIndex = distances.argsort()
    # Select the k points with the smallest distances
    classCount = {}  # dictionary type
    for i in range(k):
        voteLabel = labels[sortedDistIndex[i]]
        classCount[voteLabel] = classCount.get(voteLabel, 0) + 1
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    return sortedClassCount[0][0]
4. Test the model
import knn
group, labels = knn.createDataSet()
newLabel = knn.classify0([0, 0], group, labels, 3)
print(newLabel)
The result is B.
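The four steps can also be sketched in one self-contained script. This is only an illustration under a few assumptions: the function name knn_classify is hypothetical, NumPy is imported as np rather than with a star import, broadcasting stands in for tile, and collections.Counter stands in for the dictionary-plus-sorted vote count.

```python
from collections import Counter

import numpy as np


def knn_classify(query, data, labels, k):
    """Return the majority label among the k training points nearest to query.

    Hypothetical helper for illustration; not the book's classify0.
    """
    data = np.asarray(data, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query to every row of data
    # (broadcasting subtracts query from each row, so tile is not needed)
    distances = np.sqrt(((data - query) ** 2).sum(axis=1))
    # Indices of the k smallest distances
    nearest = distances.argsort()[:k]
    # Count the labels of those k points and take the most common one
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


group = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
labels = ['A', 'A', 'B', 'B']
prediction = knn_classify([0, 0], group, labels, 3)
print(prediction)  # B
```

The three nearest neighbors of [0, 0] are the two B points and one A point, so the vote is 2 to 1 for B.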
You can also test other data. This is the simplest KNN example, taken from the book Machine Learning in Action;
I wrote it up mainly for my own understanding and memory (I am a beginner).
Some Python usage notes are recorded here for easy reference.
- shape is a function in numpy.core.fromnumeric that reads the dimensions of a matrix; for example, shape[0] is the length of the first dimension. Its argument can be a single number or a matrix.
Given a number, it returns an empty tuple. It is a quick way to read the shape of an array, and it can also read the length of a single dimension;
shape(3)
Out[4]: ()
e = eye(3)
e
Out[6]:
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
e.shape
Out[7]: (3, 3)
e.shape[0]
Out[8]: 3
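As a runnable version of the shape notes, here is a minimal sketch that assumes NumPy is imported as np; the variable names are only for illustration.

```python
import numpy as np

# A scalar has no dimensions, so its shape is the empty tuple
scalar_shape = np.shape(3)

# eye(3) builds a 3x3 identity matrix
e = np.eye(3)
matrix_shape = e.shape   # shape of the whole array
rows = e.shape[0]        # length of the first dimension only

print(scalar_shape, matrix_shape, rows)
```

In classify0, dataSet.shape[0] is used the same way to get the number of known data points.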
- The tile function is in the Python module numpy.lib.shape_base; it repeats an array. For example, tile(a, n) repeats array a n times to form a new array
a = [0, 1, 2]
b = tile(a, 2)
b
Out[]: array([0, 1, 2, 0, 1, 2])
c = tile(a, (1, 2))
c
Out[]: array([[0, 1, 2, 0, 1, 2]])
d = tile(a, (3, 1))
d
Out[16]:
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])
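To connect tile back to the classifier, the sketch below shows how the input point is repeated once per known data point so the two arrays can be subtracted element by element. It assumes NumPy is imported as np, and the names in_x and data are illustrative. The last lines show that plain broadcasting gives the same differences, which is an alternative and not what the original code does.

```python
import numpy as np

in_x = np.array([0.0, 0.0])
data = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])

# Repeat in_x once per row of data: (rows, 1) means "rows copies down, 1 copy across"
tiled = np.tile(in_x, (data.shape[0], 1))
diff_tile = tiled - data

# Broadcasting stretches in_x across the rows automatically
diff_broadcast = in_x - data

print(tiled.shape)
print(np.array_equal(diff_tile, diff_broadcast))
```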
- sum
sum(iterable[, start])
- iterable: an iterable object, such as a list.
- start: a value added to the total; if it is not set, the default is 0
- .sum
aa = array([[0, 1, 2], [3, 4, 5]])
aa.sum()
Out[]: 15
aa.sum(axis=0)
Out[]: array([3, 5, 7])
aa.sum(axis=1)
Out[]: array([ 3, 12])
With no argument, all elements are added together; axis=0 adds down each column; axis=1 adds across each row.
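The difference between the built-in sum and the array method can be checked with a short sketch. It assumes NumPy is imported as np, so the name sum here still refers to the Python built-in.

```python
import numpy as np

# Built-in sum: adds the items of an iterable, starting from `start`
builtin_total = sum([0, 1, 2], 10)

aa = np.array([[0, 1, 2], [3, 4, 5]])
total = aa.sum()            # every element
col_sums = aa.sum(axis=0)   # one sum per column
row_sums = aa.sum(axis=1)   # one sum per row

print(builtin_total, total, col_sums, row_sums)
```

In classify0 each row holds the squared coordinate differences for one data point, so axis=1 produces one squared distance per point.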
argsort(a, axis=-1, kind='quicksort', order=None)
Examples can be found in: 21875705
a is the array to be sorted; the return value is the array of index values that would sort it
distance = array([0.3, 0.5, 0.8, 0.4])
index = distance.argsort()
index
Out[]: array([0, 3, 1, 2], dtype=int64)
aa = array([[0, 1, 2], [3, 4, 5]])
x = aa.argsort(axis=0)
x
Out[]:
array([[0, 0, 0],
       [1, 1, 1]], dtype=int64)
x = aa.argsort(axis=1)
x
Out[]:
array([[0, 1, 2],
       [0, 1, 2]], dtype=int64)
axis=0 sorts down each column; axis=1 sorts along each row;
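Because argsort returns indices rather than sorted values, slicing its result gives the positions of the k smallest distances, which is what the classifier needs to look up labels. A minimal sketch, assuming NumPy is imported as np and k = 2 chosen only for illustration:

```python
import numpy as np

distance = np.array([0.3, 0.5, 0.8, 0.4])
order = distance.argsort()       # indices from smallest to largest distance

k = 2
nearest = order[:k]              # positions of the k closest points
nearest_distances = distance[nearest]

print(order, nearest, nearest_distances)
```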
A dictionary is another mutable container type and can store objects of any type.
Each key is separated from its value by a colon (key: value), key-value pairs are separated by commas, and the whole dictionary is enclosed in curly braces {}, in the following format:
d = {key1: value1, key2: value2}
dict = {'a': 1, 'b': 2, 'b': '3'}
dict['b']
Out[34]: '3'
dict
Out[35]: {'a': 1, 'b': '3'}
Keys must be unique: if a key is repeated, the later value overwrites the earlier one. Values do not have to be unique;
For a detailed explanation see: http://www.cnblogs.com/woshitianma/p/3222989.html
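The vote count in classify0 relies on dict.get(key, default), which returns the default instead of raising an error when the key is missing. A small sketch with made-up votes:

```python
votes = ['B', 'B', 'A']   # illustrative labels of the k nearest points

class_count = {}
for label in votes:
    # get returns 0 the first time a label is seen, then the running count
    class_count[label] = class_count.get(label, 0) + 1

print(class_count)
```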
sorted(iterable[, cmp[, key[, reverse]]]) (Python 2 signature; Python 3 drops cmp: sorted(iterable, *, key=None, reverse=False))
students = [('John', 'A', 15), ('Jane', 'B', 12), ('Dave', 'B', 10)]
sorted(students, key=lambda student: student[2])
Out[42]: [('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]
from operator import itemgetter, attrgetter
sorted(students, key=itemgetter(2))
Out[44]: [('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]
sorted(students, key=itemgetter(1, 2))
Out[45]: [('John', 'A', 15), ('Dave', 'B', 10), ('Jane', 'B', 12)]
Sorting a dictionary's items returns a list, no longer a dictionary
d = {'data1': 3, 'data2': 1, 'data3': 2, 'data4': 4}
sorted(d.items(), key=itemgetter(1), reverse=True)
Out[47]: [('data4', 4), ('data1', 3), ('data3', 2), ('data2', 1)]
sorted(d.items(), key=itemgetter(1))
Out[48]: [('data2', 1), ('data3', 2), ('data1', 3), ('data4', 4)]
dict = {'Google': 'www.google.com', 'Runoob': 'www.runoob.com', 'Taobao': 'www.taobao.com'}
dict.items()
Out[]: dict_items([('Google', 'www.google.com'), ('Runoob', 'www.runoob.com'), ('Taobao', 'www.taobao.com')])
In Python 3, the dictionary items() method returns a view of the (key, value) tuples that can be iterated over; in Python 2 it returned a list.
Python 3 no longer has iteritems(); use items() instead.
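Putting items() and sorted() together gives the last step of the classifier: rank the labels by vote count and take the first one. The counts below are made up for illustration. The max call at the end is an alternative when only the winner is needed; it is not what the original code uses.

```python
from operator import itemgetter

class_count = {'A': 1, 'B': 2}   # illustrative vote counts

# Sort (label, count) pairs by count, largest first
ranked = sorted(class_count.items(), key=itemgetter(1), reverse=True)
winner = ranked[0][0]

# Alternative: pick the key with the largest value without a full sort
winner_max = max(class_count, key=class_count.get)

print(ranked, winner, winner_max)
```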