The nearest neighbor search algorithm for k-d tree

Source: Internet
Author: User

K-nearest-neighbor search over the data stored in a k-d tree is an important part of feature matching. Its purpose is to retrieve the K points in the k-d tree that are closest to a query point.

Nearest neighbor search is the special case K = 1 of K-nearest-neighbor search, and a 1-nearest-neighbor search is easy to extend to K neighbors. The simplest k-d tree nearest neighbor search algorithm is described below.
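As a reference point for what the tree search has to return, here is a minimal brute-force sketch. It is not the k-d tree algorithm, only a linear scan that is handy for checking results; the function name `brute_force_knn` and the sample points are illustrative assumptions.

```python
import heapq
import math

def brute_force_knn(points, target, k=1):
    """Return the k points closest to target, nearest first (linear scan)."""
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, target))

# Illustrative 2-D sample points.
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]

print(brute_force_knn(points, (2.1, 3.1), k=1))  # [(2, 3)]
print(brute_force_knn(points, (2.1, 3.1), k=3))  # [(2, 3), (5, 4), (4, 7)]
```

With k=1 the same call performs a plain nearest neighbor search, which is the special case mentioned above.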

The basic idea is simple. First, descend the tree as in an ordinary binary search: at each node, compare the query point with the node along the node's split dimension; if the query value is less than or equal to the node's value, enter the left subtree, otherwise enter the right subtree, and continue until a leaf node is reached. Following this "search path" quickly yields an approximate nearest neighbor, namely the leaf node that lies in the same subspace as the query point. Then backtrack along the search path: for each node on the path, determine whether the subspace of its other child could contain a data point closer to the query point. If it could, jump into that subspace and search it, adding the newly visited nodes to the search path. Repeat this process until the search path is empty.

Algorithm: kdtreeFindNearest
Input:  Kd      (the k-d tree)
        target  (the query point)
Output: nearest (the nearest data point)
        dist    (the distance from nearest to target)

1. If Kd is empty, set dist to infinity and return.

2. Search down until a leaf node is reached:

    psearch = &Kd
    while (psearch != NULL) {
        add psearch to search_path
        if (target[psearch->split] <= psearch->dom_elt[psearch->split])
            psearch = psearch->left
        else
            psearch = psearch->right
    }
    remove the last node of search_path and assign its data point to nearest
    dist = Distance(nearest, target)

3. Backtrack along the search path:

    while (search_path is not empty) {
        remove the last node of search_path and assign it to pback
        if (pback->left is empty && pback->right is empty) {    /* leaf node */
            if (Distance(nearest, target) > Distance(pback->dom_elt, target)) {
                nearest = pback->dom_elt
                dist = Distance(pback->dom_elt, target)
            }
        } else {
            s = pback->split
            if (abs(pback->dom_elt[s] - target[s]) < dist) {    /* circle crosses the splitting plane */
                if (Distance(nearest, target) > Distance(pback->dom_elt, target)) {
                    nearest = pback->dom_elt
                    dist = Distance(pback->dom_elt, target)
                }
                if (target[s] <= pback->dom_elt[s])
                    psearch = pback->right    /* the side not yet searched */
                else
                    psearch = pback->left
                if (psearch != NULL)
                    descend from psearch as in step 2, adding each visited node to search_path
            }
        }
    }
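The pseudocode above can be sketched in Python as follows. This is a minimal sketch, not a reference implementation. The field names `dom_elt`, `split`, `left` and `right` follow the pseudocode; `KdNode`, `build` and `descend` are hypothetical helpers, and `build` assumes the tree cycles through the dimensions and splits at the median. Two details are choices of this sketch: when it jumps into the other subtree it descends all the way to a leaf, pushing every node it passes, and it examines the last node of the path inside the backtracking loop rather than popping it beforehand.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class KdNode:
    dom_elt: Tuple[float, ...]        # data point stored at this node
    split: int                        # index of the splitting dimension
    left: Optional["KdNode"] = None
    right: Optional["KdNode"] = None

def build(points, depth=0):
    """Assumed construction: cycle the split dimension, split at the median."""
    if not points:
        return None
    split = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[split])
    mid = len(pts) // 2
    return KdNode(pts[mid], split,
                  build(pts[:mid], depth + 1),
                  build(pts[mid + 1:], depth + 1))

def descend(node, target, path):
    """Step 2: walk down toward target's subspace, pushing each visited node."""
    while node is not None:
        path.append(node)
        if target[node.split] <= node.dom_elt[node.split]:
            node = node.left
        else:
            node = node.right

def find_nearest(root, target):
    """Return (nearest point, distance); (None, inf) for an empty tree."""
    nearest, dist = None, math.inf
    path = []
    descend(root, target, path)
    while path:                                   # step 3: backtrack
        pback = path.pop()
        d = math.dist(pback.dom_elt, target)
        if d < dist:
            nearest, dist = pback.dom_elt, d
        s = pback.split
        # Search the other side only if the circle crosses the splitting plane.
        if abs(pback.dom_elt[s] - target[s]) < dist:
            other = pback.right if target[s] <= pback.dom_elt[s] else pback.left
            descend(other, target, path)
    return nearest, dist

root = build([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(find_nearest(root, (2.1, 3.1)))
print(find_nearest(root, (2, 4.5)))               # ((2, 3), 1.5)
```

An explicit list is used as the search path so that the code mirrors the pseudocode; a recursive formulation would work equally well.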

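The following examples illustrate the nearest neighbor search algorithm. Assume the k-d tree was built earlier from the sample set {(2,3), (5,4), (9,6), (4,7), (8,1), (7,2)}. In the corresponding tree diagram the root is (7,2), which splits on x; its left child is (5,4), which splits on y and has the children (2,3) and (4,7); the remaining points (9,6) and (8,1) lie in the right subtree of the root.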
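The construction step is not repeated in this section, so the sketch below makes an assumption: the tree is built by cycling through the dimensions (x, then y) and splitting at the upper median. Under that assumption the sample set yields a tree that agrees with the walkthroughs that follow. The helper names `build` and `show` and the dictionary layout are illustrative.

```python
def build(points, depth=0):
    """Assumed construction: alternate x/y, split at the upper median."""
    if not points:
        return None
    split = depth % 2
    pts = sorted(points, key=lambda p: p[split])
    mid = len(pts) // 2
    return {"point": pts[mid], "split": split,
            "left": build(pts[:mid], depth + 1),
            "right": build(pts[mid + 1:], depth + 1)}

def show(node, label="root", indent=0):
    """Print the tree, one node per line, children indented under parents."""
    if node is None:
        return
    axis = "xy"[node["split"]]
    print(" " * indent + f"{label}: {node['point']} (split on {axis})")
    show(node["left"], "left", indent + 2)
    show(node["right"], "right", indent + 2)

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
root = build(points)
show(root)
```

A different median rule or split-dimension choice would give a different but equally valid tree for the same points.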

First, search for the query point (2.1,3.1). The test at (7,2) leads to (5,4), and the test at (5,4) leads to (2,3), so search_path contains <(7,2), (5,4), (2,3)>. Remove (2,3) from search_path and take it as the current best node nearest; dist is 0.141.

Then backtrack to (5,4). The circle centered at (2.1,3.1) with radius dist = 0.141 does not intersect the hyperplane y = 4, so there is no need to jump into the right subspace of node (5,4): it cannot contain a closer sample point.

Next, backtrack to (7,2). Likewise, the circle centered at (2.1,3.1) with radius dist = 0.141 does not intersect the hyperplane x = 7, so the right subspace of node (7,2) is not searched.

At this point search_path is empty and the search ends, returning nearest = (2,3) as the nearest neighbor of (2.1,3.1), at a distance of 0.141.
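The "does the circle intersect the hyperplane" test used in this walkthrough is a single comparison along the split dimension. The sketch below reproduces the numbers of the example; the function name `crosses_plane` is an illustrative assumption.

```python
import math

def crosses_plane(target, node_point, split, dist):
    """True if the circle of radius dist around target reaches the node's splitting plane."""
    return abs(node_point[split] - target[split]) < dist

target = (2.1, 3.1)
dist = math.dist(target, (2, 3))

print(round(dist, 3))                          # 0.141
print(crosses_plane(target, (5, 4), 1, dist))  # False: plane y = 4 is 0.9 away
print(crosses_plane(target, (7, 2), 0, dist))  # False: plane x = 7 is 4.9 away
```

Because both tests fail, neither the other subspace of (5,4) nor that of (7,2) needs to be visited.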

For a slightly more complicated example, search for the point (2,4.5). The test at (7,2) leads to (5,4), and the test at (5,4) leads to (4,7), so search_path contains <(7,2), (5,4), (4,7)>. Remove (4,7) from search_path and take it as the current best node nearest; dist is 3.202.

Then backtrack to (5,4). The circle centered at (2,4.5) with radius dist = 3.202 intersects the hyperplane y = 4, so the left subspace of (5,4) must be searched. Add (2,3) to search_path, which now contains <(7,2), (2,3)>. In addition, the distance between (5,4) and (2,4.5) is 3.04 < dist = 3.202, so (5,4) is assigned to nearest and dist = 3.04.

Backtrack to (2,3). Since (2,3) is a leaf node, directly check whether it is closer to (2,4.5): the distance is 1.5, so nearest is updated to (2,3) and dist to 1.5.

Backtrack to (7,2). Likewise, the circle centered at (2,4.5) with radius dist = 1.5 does not intersect the hyperplane x = 7, so the right subspace of node (7,2) is not searched.

At this point search_path is empty and the search ends, returning nearest = (2,3) as the nearest neighbor of (2,4.5), at a distance of 1.5.
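To make the order of events in this walkthrough easier to follow, the sketch below records every node examined together with the best distance after examining it. The tree is written out by hand as nested tuples (point, split, left, right); the placement of (9,6) and (8,1) is an assumption, since the walkthrough never visits them. `TREE`, `leaf`, `descend` and `trace_nearest` are hypothetical names.

```python
import math

def leaf(point, split):
    return (point, split, None, None)

# Hand-written tree: (point, split dimension, left subtree, right subtree).
TREE = ((7, 2), 0,
        ((5, 4), 1, leaf((2, 3), 0), leaf((4, 7), 0)),
        ((9, 6), 1, leaf((8, 1), 0), None))

def descend(node, target, path):
    """Walk down toward target's subspace, pushing each visited node."""
    while node is not None:
        path.append(node)
        point, split, left, right = node
        node = left if target[split] <= point[split] else right

def trace_nearest(root, target):
    """Nearest neighbor search that also logs (node, best distance so far)."""
    path, log = [], []
    descend(root, target, path)
    nearest, dist = None, math.inf
    while path:
        point, split, left, right = path.pop()
        d = math.dist(point, target)
        if d < dist:
            nearest, dist = point, d
        log.append((point, round(dist, 3)))
        if abs(point[split] - target[split]) < dist:    # circle crosses the plane
            other = right if target[split] <= point[split] else left
            descend(other, target, path)
    return nearest, dist, log

nearest, dist, log = trace_nearest(TREE, (2, 4.5))
print(nearest, dist)   # (2, 3) 1.5
print(log)             # [((4, 7), 3.202), ((5, 4), 3.041), ((2, 3), 1.5), ((7, 2), 1.5)]
```

Running the same function on (2.1, 3.1) logs only three nodes, (2,3), (5,4) and (7,2), because no splitting plane is crossed.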

Both searches return the same nearest neighbor, but the search for (2,4.5) takes more steps because (2,4.5) lies closer to a splitting hyperplane. As the examples show, the amount of backtracking grows considerably when the neighborhood of the query point intersects the space on both sides of a splitting hyperplane. In the worst case, the time to search a k-dimensional k-d tree with N nodes is:

    t_worst = O(k * N^(1 - 1/k))

There are many extensions of the k-d tree. Because heavy backtracking degrades the performance of k-d tree nearest neighbor search, researchers have proposed improved search algorithms. One of the better known is Best-Bin-First (BBF), which finds an approximate nearest neighbor by using a priority queue and a running-time limit, effectively reducing the amount of backtracking.
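The idea behind Best-Bin-First can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm in full: unexplored branches are kept in a priority queue ordered by their distance to the query along the splitting dimension, and a cap on the number of examined nodes (`max_checks`, a hypothetical parameter) stands in for the running-time limit. The tree layout and the `build` helper are also assumptions.

```python
import heapq
import itertools
import math

def build(points, depth=0):
    """Assumed construction: cycle the split dimension, split at the median."""
    if not points:
        return None
    split = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[split])
    mid = len(pts) // 2
    return (pts[mid], split,
            build(pts[:mid], depth + 1), build(pts[mid + 1:], depth + 1))

def bbf_nearest(root, target, max_checks=200):
    """Approximate nearest neighbor; exact if max_checks is never reached."""
    nearest, dist, checks = None, math.inf, 0
    tie = itertools.count()            # tie-breaker so nodes are never compared
    heap = [(0.0, next(tie), root)]    # entries: (distance to plane, tie, subtree)
    while heap and checks < max_checks:
        gap, _, node = heapq.heappop(heap)
        if gap >= dist:
            break                      # no queued branch can hold a closer point
        while node is not None and checks < max_checks:
            point, split, left, right = node
            checks += 1
            d = math.dist(point, target)
            if d < dist:
                nearest, dist = point, d
            diff = target[split] - point[split]
            near, far = (left, right) if diff <= 0 else (right, left)
            if far is not None:        # remember the skipped branch for later
                heapq.heappush(heap, (abs(diff), next(tie), far))
            node = near
    return nearest, dist

root = build([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(bbf_nearest(root, (2, 4.5)))                 # ((2, 3), 1.5)
print(bbf_nearest(root, (2, 4.5), max_checks=1))   # only the root is examined
```

With a generous limit the result is exact; lowering `max_checks` trades accuracy for a bounded amount of work, which is the point of the approximation.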
