Geohash algorithm to quickly find the point __ algorithm in the vicinity

Source: Internet
Author: User

Http://www.cnblogs.com/LBSer/p/3310455.html


analysis of core principles of Geohash

Http://www.cnblogs.com/LBSer/p/3310455.html

Introduction

Machine is a good move and studious child, usually like to take the mobile phone map Click Click to check some fun things. One day the machine to the Beihai Park, Belly hungry, and then opened the mobile phone map, search the restaurant near Beihai Parks, and chose one of the meals.

After the meal machine began to reflect, map backstage how to query according to their location to inquire about the nearby restaurant. After a long thought, the machine came up with a way to calculate the distance between the location p and all the restaurants in Beijing, and then return to the restaurant <=1000. Small complacent for a while, the machine to find out how many restaurants in Beijing Ah, this calculation is extremely, so thought, since knows latitude and longitude degree, that it should know oneself in the Xicheng district, that should calculate the location p and Xicheng District all Restaurants distance Ah, machine use recursive thought, thought of Xicheng District also many restaurants Ah, You should calculate the distance between the location p and all the restaurants on the street so that the calculation is small and the efficiency is increased.

Machine calculation is very simple, that is, through the filtering method to reduce the number of participating in the calculation of restaurants, in some ways, the machine in the use of indexing technology.

As soon as the index was mentioned, a B-tree index was immediately emerging in our minds because a large number of databases (such as MySQL, Oracle, PostgreSQL, etc.) were using B-trees. A B-tree index is essentially a sort of indexed field, and then a quick lookup through a method like binary lookup, which requires that the indexed fields be sortable and, in general, a one-dimensional field, such as time, age, salary, and so on. But for a point in space (two-dimensional, including longitude and latitude), how to sort it. And how to index it. There are many ways to solve this problem, and a way to solve it is described below.

Idea: If you can convert two-dimensional point data into one-dimensional data in some way, then you can continue to use the B-tree index. This method really exists, the answer is yes. At present, very hot geohash algorithm is to use the above thought, below we start geohash journey.

First, perceptual Geohash

First of all, a little perceptual knowledge, http://openlocation.org/geohash/geohash-js/provides the ability to display Geohash encoding on a map.

1) Geohash converts two-dimensional latitude and longitude to a string, such as the following figure shows the Geohash strings of Beijing's 9 regions, WX4ER,WX4G2, WX4G3, and so on, each representing a rectangular region. That is, all the points in this rectangular area (latitude and longitude coordinates) share the same Geohash string, this can protect privacy (only the approximate area location rather than the specific point), but also easier to do caching, such as the upper left corner of the area of users to send location Information request restaurant data, Because these users of the Geohash strings are wx4er, so you can put wx4er as a key, the region's restaurant information as the value of the cache, and if not the use of Geohash, because the region's users from the latitude and longitude is different, it is difficult to do caching.

2 the longer the string, the more accurate the range of representations. As shown in the figure, 5-bit encoding can represent a rectangular region of 10 square kilometres, while 6-bit encodings can represent finer areas (about 0.34 square kilometers).

3 string similar representation distance (special case), so you can use string prefix matching to query the nearby POI information. As shown in the following two diagrams, one is similar to the Geohash strings in urban, suburban and urban areas, and the suburban strings are similar, while the Geohash strings in urban and suburban areas are similar in comparison.

City

Suburbs

We learned from the above that Geohash is a way to convert latitude and longitude into strings, and in most cases, the more the string prefix matches the closer the distance, back to our case, according to the location of the query to query nearby restaurants, Only the latitude and longitude of the location must be converted into geohash strings, and the geohash strings of each restaurant are prefixed to match, and the more distance is matched.

Second, the Geohash algorithm steps

Here take Beihai Park as an example to introduce the calculation steps of Geohash algorithm

2.1. Calculate Geohash binary code according to the latitude and longitude degree

The latitude zone of the earth is [ -90,90], the latitude of Beihai Park is 39.928167, the latitude 39.928167 can be encoded by the following algorithm:

1) the interval [ -90,90] is divided into [ -90,0], [0,90], known as the left and right interval, can be determined 39.928167 belongs to the right-hand interval [0,90], to mark 1;

2) then the interval [0,90] is divided into [0,45], [45,90], can be determined 39.928167 belong to the left interval [0,45), to mark 0;

3 recursion above process 39.928167 always belongs to an interval [a,b]. With each iteration interval [a,b] is always shrinking, and is approaching 39.928167 more and more;

4 if the given latitude X (39.928167) belongs to the left interval, the record 0, if the right interval is recorded 1, so as the algorithm will produce a sequence of 1011100, the length of the sequence with the given interval of the number of partitions.

According to the Latitude calculation code

Bit

Min

Mid

Max

1

-90.000

0.000

90.000

0

0.000

45.000

90.000

1

0.000

22.500

45.000

1

22.500

33.750

45.000

1

33.7500

39.375

45.000

0

39.375

42.188

45.000

0

39.375

40.7815

42.188

0

39.375

40.07825

40.7815

1

39.375

39.726625

40.07825

1

39.726625

39.9024375

40.07825

Similarly, the longitude interval of the earth is [-180,180], and longitude 116.389550 can be encoded.

According to the Longitude calculation code

123.75

td>

Bit

min

Mid

Max

1

-180

0.000

180

1

0.000

135

180

0

180

1

112.5

135

0

112.5

123.75

135

0

112.5

118.125

1

112.5

115.3125

118.125

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.