Finding the k Nearest Neighbors with a KD Tree: Python Code

Source: Internet
Author: User

Two earlier articles introduced the principle of the KD tree and a Python implementation of KD tree construction and search. For details, see:

  The principle of the KD tree

  Python KD tree search code

KD trees are often used together with the KNN algorithm, which usually searches for the k nearest neighbors rather than only the single nearest one. The code below uses a KD tree to find the k nearest neighbors of a target point.

First, create a class that stores a node's value, its left and right subtrees, and the split axis used to divide the data between them.

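class decisionnode:
    def __init__(self, value=None, col=None, rb=None, lb=None):
        self.value = value  # the point stored at this node
        self.col = col      # index of the split axis
        self.rb = rb        # right branch (subtree)
        self.lb = lb        # left branch (subtree)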

The split point is the median value along the split axis. The following function returns the median of a sequence together with the index of that value in the original sequence.

def median(x):
    n = len(x)
    x = list(x)
    x_order = sorted(x)
    # return the middle value and its position in the unsorted sequence
    return x_order[n // 2], x.index(x_order[n // 2])
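As a quick check of what this function returns, the sketch below repeats it and applies it to a made-up list of values. Note that for a sequence of even length it picks the upper of the two middle values rather than their average.

```python
def median(x):
    n = len(x)
    x = list(x)
    x_order = sorted(x)
    return x_order[n // 2], x.index(x_order[n // 2])

values = [7, 2, 9, 4, 5, 8]      # made-up sample, even length
edge, row = median(values)
print(edge, row)                 # split value and its index in the unsorted list
```

The returned index is what lets the tree builder pick the original row that becomes the node.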

The KD tree is then built so that points smaller than the split point on the current axis go to the left subtree and larger points go to the right subtree. Here x is the input data, given as a matrix with one point per row.

# divide the data by the median of column j: smaller values left, larger values right
# j = node depth % number of columns
def buildtree(x, j=0):
    rb = []
    lb = []
    m, n = x.shape
    if m == 0:
        return None
    edge, row = median(x[:, j].copy())
    for i in range(m):
        if i == row:
            continue  # this row becomes the node itself
        if x[i][j] >= edge:
            rb.append(i)  # ties with the median go right so that no point is dropped
        else:
            lb.append(i)
    rb_x = x[rb, :]
    lb_x = x[lb, :]
    rightbranch = buildtree(rb_x, (j + 1) % n)
    leftbranch = buildtree(lb_x, (j + 1) % n)
    return decisionnode(x[row, :], j, rightbranch, leftbranch)
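To see what the construction produces, here is a condensed restatement applied to six made-up 2-D points. It is a sketch rather than the code above: the names Node, build and dump are hypothetical, and sending ties with the median to the right subtree is an assumption of this sketch.

```python
import numpy as np

class Node:
    def __init__(self, value, col, rb=None, lb=None):
        self.value, self.col, self.rb, self.lb = value, col, rb, lb

def build(x, j=0):
    m, n = x.shape
    if m == 0:
        return None
    column = x[:, j].tolist()
    edge = sorted(column)[m // 2]      # median value on axis j
    row = column.index(edge)           # row that becomes this node
    lb = np.array([i for i in range(m) if x[i, j] < edge], dtype=int)
    rb = np.array([i for i in range(m) if i != row and x[i, j] >= edge], dtype=int)
    return Node(x[row], j, build(x[rb], (j + 1) % n), build(x[lb], (j + 1) % n))

def dump(node, depth=0, side="root"):
    """Print the tree in pre-order, one node per line."""
    if node is None:
        return
    print("  " * depth + f"{side}: {node.value.tolist()} (split on axis {node.col})")
    dump(node.lb, depth + 1, "left")
    dump(node.rb, depth + 1, "right")

points = np.array([[2, 3], [5, 4], [9, 6], [4, 7], [8, 1], [7, 2]], dtype=float)
tree = build(points)
dump(tree)
```

The root splits on axis 0, its children on axis 1, and so on, cycling through the columns with depth.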

Next comes the search for the k nearest neighbors, which closely follows the search for a single nearest neighbor. A dictionary, knears, stores the candidate nearest points together with their Euclidean distances from the target point.
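The dictionary bookkeeping can be tried in isolation. The sketch below uses made-up points, computes one Euclidean distance and stores it under a tuple key, and shows why a list cannot serve as the key.

```python
import numpy as np

def dist(x1, x2):
    """Euclidean distance between two points given as sequences or arrays."""
    return float(((np.array(x1) - np.array(x2)) ** 2).sum() ** 0.5)

aim = (2.0, 4.5)                  # made-up target point
point = np.array([5.0, 4.0])      # made-up data point

knears = {}
# setdefault leaves an existing entry untouched, so a repeated point is stored once
knears.setdefault(tuple(point.tolist()), dist(point, aim))
print(knears)

try:
    knears[point.tolist()] = 0.0  # a list is unhashable
except TypeError as err:
    print("list key rejected:", err)
```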

The search process is:

(1) First, descend the tree to the leaf node of the region that contains the target point.

(2) Then back up from the leaf toward the root, returning to each parent node as in the nearest-neighbor search, and decide whether the region of the other child node could contain any of the k nearest neighbors. Specifically, do the following at each node:

(a) If the dictionary holds fewer than k members, add the node to the dictionary.

(b) If the dictionary holds at least k members, check whether the distance from the node to the target point is no greater than the largest distance stored in the dictionary; if so, add the node to the dictionary.

(c) For a parent node, if the distance from the target point to the node's splitting plane is no greater than the largest distance stored in the dictionary, the parent's other child node must also be visited.

(3) Whenever a new member is added to the dictionary, sort the dictionary in descending order of distance and assign the resulting list to pointlist, so that pointlist[0][1] is the largest distance stored in the dictionary.

(4) When the search has backed up to the root node and finished there, the last k entries of pointlist are the k nearest neighbors of the target point.
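The bookkeeping in steps (a), (b) and (3) can also be illustrated on its own. The sketch below is not the dictionary approach described above: it swaps in a bounded max-heap from the standard heapq module, which evicts the current farthest candidate instead of re-sorting everything. The helper name keep_k_nearest and the candidate distances are made up for illustration.

```python
import heapq

def keep_k_nearest(candidates, k):
    """Return the k (distance, point) pairs with the smallest distance, nearest first."""
    if k < 1:
        return []
    heap = []  # max-heap emulated by storing negated distances
    for point, dis in candidates:
        if len(heap) < k:
            heapq.heappush(heap, (-dis, point))      # step (a): still room
        elif dis <= -heap[0][0]:
            heapq.heapreplace(heap, (-dis, point))   # step (b): evict the farthest
    return sorted((-neg, point) for neg, point in heap)

# made-up (point, distance) pairs in the order a search might visit them
candidates = [((4.0, 7.0), 3.2), ((5.0, 4.0), 3.0), ((2.0, 3.0), 1.5), ((7.0, 2.0), 5.6)]
nearest = keep_k_nearest(candidates, 2)
print(nearest)
```

With this variant the pruning threshold is the distance of the current k-th nearest candidate, which is never looser than the largest distance kept in an ever-growing dictionary.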

The code is as follows:

# search the tree: output the neighbors of the target point
def traveltree(node, aim):
    global pointlist  # sorted candidate neighbors and their distances
    if node is None:
        return pointlist
    col = node.col
    # descend first into the side of the split that contains the target
    if aim[col] >= node.value[col]:
        near, far = node.rb, node.lb
    else:
        near, far = node.lb, node.rb
    traveltree(near, aim)
    dis = dist(node.value, aim)
    if len(knears) < k or dis <= pointlist[0][1]:
        knears.setdefault(tuple(node.value.tolist()), dis)  # a list cannot be a dictionary key
        pointlist = sorted(knears.items(), key=lambda item: item[1], reverse=True)
    # visit the other side if fewer than k candidates are stored, or if the
    # splitting plane is no farther away than the farthest stored candidate
    if far is not None:
        if len(knears) < k or abs(aim[col] - node.value[col]) <= pointlist[0][1]:
            traveltree(far, aim)
    return pointlist
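A brute-force search is a convenient way to check the output of the tree search on small inputs. The sketch below is an assumption-labelled helper (brute_force_knn is a hypothetical name, and the points are made up): it computes every distance with NumPy and keeps the k smallest.

```python
import numpy as np

def brute_force_knn(data, aim, k):
    """Return the k nearest rows of data to aim as (point, distance), nearest first."""
    data = np.asarray(data, dtype=float)
    d = np.sqrt(((data - np.asarray(aim, dtype=float)) ** 2).sum(axis=1))
    order = np.argsort(d, kind="stable")[:k]   # indices of the k smallest distances
    return [(tuple(data[i].tolist()), float(d[i])) for i in order]

points = [[2, 3], [5, 4], [9, 6], [4, 7], [8, 1], [7, 2]]   # made-up sample
result = brute_force_knn(points, (2, 4.5), 2)
print(result)
```

Its result lists the nearest point first, whereas pointlist is sorted with the farthest point first, so compare the two as sets or reverse one of them.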

The full code is as follows:

import numpy as np
from numpy import array

class decisionnode:
    def __init__(self, value=None, col=None, rb=None, lb=None):
        self.value = value
        self.col = col
        self.rb = rb
        self.lb = lb

# read the data (tab-separated numbers, one point per line) and convert it to matrix form
def readdata(filename):
    with open(filename) as f:
        data = f.readlines()
    x = []
    for line in data:
        line = line.strip().split('\t')
        x_i = []
        for num in line:
            num = float(num)
            x_i.append(num)
        x.append(x_i)
    x = array(x)
    return x

# find the median of a sequence and its index in the original sequence
def median(x):
    n = len(x)
    x = list(x)
    x_order = sorted(x)
    return x_order[n // 2], x.index(x_order[n // 2])

# divide the data by the median of column j: smaller values left, larger values right
# j = node depth % number of columns
def buildtree(x, j=0):
    rb = []
    lb = []
    m, n = x.shape
    if m == 0:
        return None
    edge, row = median(x[:, j].copy())
    for i in range(m):
        if i == row:
            continue  # this row becomes the node itself
        if x[i][j] >= edge:
            rb.append(i)  # ties with the median go right so that no point is dropped
        else:
            lb.append(i)
    rb_x = x[rb, :]
    lb_x = x[lb, :]
    rightbranch = buildtree(rb_x, (j + 1) % n)
    leftbranch = buildtree(lb_x, (j + 1) % n)
    return decisionnode(x[row, :], j, rightbranch, leftbranch)

# search the tree: output the neighbors of the target point
def traveltree(node, aim):
    global pointlist  # sorted candidate neighbors and their distances
    if node is None:
        return pointlist
    col = node.col
    # descend first into the side of the split that contains the target
    if aim[col] >= node.value[col]:
        near, far = node.rb, node.lb
    else:
        near, far = node.lb, node.rb
    traveltree(near, aim)
    dis = dist(node.value, aim)
    if len(knears) < k or dis <= pointlist[0][1]:
        knears.setdefault(tuple(node.value.tolist()), dis)  # a list cannot be a dictionary key
        pointlist = sorted(knears.items(), key=lambda item: item[1], reverse=True)
    # visit the other side if fewer than k candidates are stored, or if the
    # splitting plane is no farther away than the farthest stored candidate
    if far is not None:
        if len(knears) < k or abs(aim[col] - node.value[col]) <= pointlist[0][1]:
            traveltree(far, aim)
    return pointlist

def dist(x1, x2):  # Euclidean distance
    return ((np.array(x1) - np.array(x2)) ** 2).sum() ** 0.5

knears = {}
k = int(input('Please enter the value of K'))
if k < 2:
    print("K can't be 1")
pointlist = []
file = input('Please enter the data file address')
data = readdata(file)
tree = buildtree(data)
tmp = input('Please enter a target point')
tmp = tmp.split(',')
aim = []
for num in tmp:
    num = float(num)
    aim.append(num)
aim = tuple(aim)
pointlist = traveltree(tree, aim)
# pointlist is sorted by descending distance, so the last k entries are the nearest
for point in pointlist[-k:]:
    print(point)
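The script reads its data from a text file with one point per line and tab-separated coordinates, and expects the target point as comma-separated numbers. As a minimal sketch of that file layout, assuming a made-up file name and sample points, the snippet below writes such a file to a temporary directory and reads it back with np.loadtxt, which handles the same format as readdata.

```python
import os
import tempfile

import numpy as np

rows = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]   # made-up sample points

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "points.txt")             # hypothetical file name
    with open(path, "w") as handle:
        for row in rows:
            handle.write("\t".join(str(v) for v in row) + "\n")
    # ndmin=2 keeps the result two-dimensional even for a single-line file
    data = np.loadtxt(path, delimiter="\t", ndmin=2)

print(data.shape)
print(data)
```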