The principle, algorithm and application of Geohash-related techniques

Source: Internet
Author: User

Geohash is an address code that encodes two-dimensional latitude and longitude into a one-dimensional string. For example, Beihai Park's code is WX4G0EC1.

The principle and algorithm of Geohash

The following is an example of (39.92324, 116.3906) to introduce the Geohash coding algorithm.

First, the latitude range (-90, 90) is divided into two intervals (-90, 0), (0, 90), and if the target latitude is in the previous interval, the encoding is 0, otherwise the encoding is 1. Because 39.92324 belongs to (0, 90), the encoding is 1. The (0, 90) is then divided into (0, 45), (45, 90) Two, and 39.92324 is (0, 45), so the encoding is 0. And so on, until the precision meets the requirements, the latitude code is 1011 1000 1100 0111 1001.

Latitude Range Division interval 0 Division interval 1 39.92324 Owning Range
(-90, 90) (-90, 0.0) (0.0, 90) 1
(0.0, 90) (0.0, 45.0) (45.0, 90) 0
(0.0, 45.0) (0.0, 22.5) (22.5, 45.0) 1
(22.5, 45.0) (22.5, 33.75) (33.75, 45.0) 1
(33.75, 45.0) (33.75, 39.375) (39.375, 45.0) 1
(39.375, 45.0) (39.375, 42.1875) (42.1875, 45.0) 0
(39.375, 42.1875) (39.375, 40.7812) (40.7812, 42.1875) 0
(39.375, 40.7812) (39.375, 40.0781) (40.0781, 40.7812) 0
(39.375, 40.0781) (39.375, 39.7265) (39.7265, 40.0781) 1
(39.7265, 40.0781) (39.7265, 39.9023) (39.9023, 40.0781) 1
(39.9023, 40.0781) (39.9023, 39.9902) (39.9902, 40.0781) 0
(39.9023, 39.9902) (39.9023, 39.9462) (39.9462, 39.9902) 0
(39.9023, 39.9462) (39.9023, 39.9243) (39.9243, 39.9462) 0
(39.9023, 39.9243) (39.9023, 39.9133) (39.9133, 39.9243) 1
(39.9133, 39.9243) (39.9133, 39.9188) (39.9188, 39.9243) 1
(39.9188, 39.9243) (39.9188, 39.9215) (39.9215, 39.9243) 1

Longitude also uses the same algorithm, which is subdivided into (-180, 180) sequentially, and 116.3906 is encoded as 1101 0010 1100 0100 0100.

Longitude Range Division interval 0 Division interval 1 116.3906 Owning Range
(-180, 180) (-180, 0.0) (0.0, 180) 1
(0.0, 180) (0.0, 90.0) (90.0, 180) 1
(90.0, 180) (90.0, 135.0) (135.0, 180) 0
(90.0, 135.0) (90.0, 112.5) (112.5, 135.0) 1
(112.5, 135.0) (112.5, 123.75) (123.75, 135.0) 0
(112.5, 123.75) (112.5, 118.125) (118.125, 123.75) 0
(112.5, 118.125) (112.5, 115.312) (115.312, 118.125) 1
(115.312, 118.125) (115.312, 116.718) (116.718, 118.125) 0
(115.312, 116.718) (115.312, 116.015) (116.015, 116.718) 1
(116.015, 116.718) (116.015, 116.367) (116.367, 116.718) 1
(116.367, 116.718) (116.367, 116.542) (116.542, 116.718) 0
(116.367, 116.542) (116.367, 116.455) (116.455, 116.542) 0
(116.367, 116.455) (116.367, 116.411) (116.411, 116.455) 0
(116.367, 116.411) (116.367, 116.389) (116.389, 116.411) 1
(116.389, 116.411) (116.389, 116.400) (116.400, 116.411) 0
(116.389, 116.400) (116.389, 116.394) (116.394, 116.400) 0

The next step is to merge the longitude and latitude codes, the odd digits are latitude, and even digits are longitude, which is encoded 11100 11101 00100 01111 00000 01101 01011 00001.

Finally, the 32 letters of 0-9, b-z (remove A, I, L, O) are encoded BASE32, and the encoding of (39.92324, 116.3906) is WX4G0EC1.

Decimal 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Base32 0 1 2 3 4 5 6 7 8 9 B C D E F G
Decimal 16 17 18 19 20 21st 22 23 24 25 26 27 28 29 30 31
Base32 H J K M N P Q R S T U V W X Y Z

decoding algorithm and coding algorithm, in contrast to the first base32 decoding, and then the separation of latitude and longitude, and finally according to the binary code to the latitude range of subdivision can be, here no longer repeat. However, since Geohash represents an interval, the longer the encoding, the more accurate it is, but it is not possible to decode the exact same address.

Geohash application: Near Address search

The Geohash's best use is the nearby address search. However, it can be seen from the Geohash encoding algorithm: two points on both sides of the lattice boundary, although very close, the coding will be completely different. In practical application, we can search the 8 squares around the current lattice to solve this problem.

Finally, let's take a look at the two issues raised at the beginning of this article: slow speed, low cache hit rate. Use the Geohash query for a nearby location with a string prefix match:

Copy Code code as follows:
SELECT * from place WHERE geohash like ' wx4g0% ';

The

and prefix matching can take advantage of the index on the Geohash column, so the query speed is not too slow. In addition, even small changes in user coordinates can be encoded into the same geohash, which ensures that the cache hit rate is greatly improved by executing the same SQL statement every time.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.