Turn: http://www.wubiao.info/tag/geohash
With the popularization of mobile terminals, many applications are based on the LBS function, near XX (restaurants, banks, sister paper, etc ).
In basic data, the longitude and latitude of the target location are generally stored. the longitude and latitude provided by the user are used for comparison to determine whether the target location is nearby.
Objectives:
Search for the nearby XXX and return the result from near to far. The result contains the distance from the target point.
Two solutions are proposed for finding XXX nearby:
I. Solution:
========================================================== ========================================================== ========================
It is abstracted as the calculation of the distance between two points on the sphere, that is, the latitude and longitude of two points on the sphere have been known;
Point (latitude, longitude), A ($ radlat1, $ radlng1), B ($ radlat2, $ radlng2 );
Advantages: easy to understand and easy to deploy
Disadvantage: the database is queried every time, and the performance is worrying.
1. Derivation
Through cosine theorem and radian calculation method, the formula A is:
$s = acos(cos($radLat1)*cos($radLat2)*cos($radLng1-$radLng2)+sin($radLat1)*sin($radLat2))*$R;
Currently, most Distance Computing companies published on the internet use Google. The formula B is as follows:
$s = 2*asin(sqrt(pow(sin(($radLat1-$radLat2)/2),2)+cos($radLat1)*cos($radLat2)*pow(sin(($radLng1-$radLng2)/2),2)))*$R;
Where:
$ Radlat1, $ radlng1, $ radlat2, $ radlng2 is a radian
$ R indicates the Earth's radius.
2. Through testing the two algorithms, the results are the same and both are correct, but through the PHP code test, the distance between the two points, 10 million performance comparison, self-developed version of the computing duration formula B is better, as follows:
// Formula
0.56368780136108 float (431)
0.57460689544678 float (431)
0.59051203727722 float (431)
// Formula B
0.47404885292053 float (431)
0.47808718681335 float (431)
0.47946381568909 float (431)
3. The formula derived using mathematical methods is as follows:
<? PHP // calculate the distance between a ($ LAT1, $ lng1), B ($ LAT2, $ lng2), public static function getdistance ($ LAT1, $ lng1, $ LAT2, $ lng2) {// The Earth's radius $ r = 6378137; // convert the angle to $ radlat1 = deg 2rad ($ LAT1); $ radlat2 = deg 2rad ($ LAT2 ); $ radlng1 = deg 2rad ($ lng1); $ radlng2 = deg 2rad ($ lng2); // result $ S = ACOs (COS ($ radlat1) * Cos ($ radlat2) * Cos ($ radlng1-$ radlng2) + sin ($ radlat1) * sin ($ radlat2) * $ R; // precision $ S = round ($ S * 10000) /10000; return round ($ S );} ?>
4. In actual applications, operations such as matching conditions and sorting need to be retrieved from the database through traversal,
Retrieve all the data and use PHP for loop comparison to filter matching results. Obviously, the performance is low. So let's use the MySQL storage function to solve this problem.
4.1 create a MySQL storage function and index the longitude and latitude Fields
DELIMITER $$ CREATE DEFINER=`root`@`%` FUNCTION `GETDISTANCE`(lat1 DOUBLE, lng1 DOUBLE, lat2 DOUBLE, lng2 DOUBLE) RETURNS double READS SQL DATA DETERMINISTIC BEGIN DECLARE RAD DOUBLE; DECLARE EARTH_RADIUS DOUBLE DEFAULT 6378137; DECLARE radLat1 DOUBLE; DECLARE radLat2 DOUBLE; DECLARE radLng1 DOUBLE; DECLARE radLng2 DOUBLE; DECLARE s DOUBLE; SET RAD = PI() / 180.0; SET radLat1 = lat1 * RAD; SET radLat2 = lat2 * RAD; SET radLng1 = lng1 * RAD; SET radLng2 = lng2 * RAD; SET s = ACOS(COS(radLat1)*COS(radLat2)*COS(radLng1-radLng2)+SIN(radLat1)*SIN(radLat2))*EARTH_RADIUS; SET s = ROUND(s * 10000) / 10000; RETURN s; END$$ DELIMITER ;
4.2 query SQL
Through SQL, you can set the distance and sorting. You can search for information that meets the conditions and have a better sorting.
SELECT *,latitude,longitude,GETDISTANCE(latitude,longitude,30.663262,104.071619) AS distance FROM mb_shop_ext where 1 HAVING distance<1000 ORDER BY distance ASC LIMIT 0,10
Ii. Solution B
========================================================== ========================================================== ========================
Geohash algorithm; geohash is an address encoding algorithm that encodes two-dimensional longitude and latitude into one-dimensional strings.
For example, the Code for Chengdu Yongfeng interchange is wm3yr31d2524.
Advantages:
1. Use a field to store the longitude and latitude. Only one index is required for search, which is highly efficient.
2. the prefix of the encoding can represent a larger area. It is very convenient to find nearby areas. In SQL, like 'wm3yr3% 'to query all nearby locations.
3. Fuzzy coordinates and privacy protection can be achieved through encoding accuracy.
Disadvantage: the distance and sorting operations must be performed twice (the filtering results are running very fast)
1. geohash Encoding Algorithm
Longitude and latitude of Chengdu Yongfeng Interchange (30.63578, 104.031601)
1.1. The latitude range (-90, 90) is divided into two intervals (-90, 0) and (0, 90). If the target latitude is located in the first interval, It is encoded as 0, otherwise, the code is 1.
Because 30.625265 belongs to (0, 90), the encoding is 1.
Then divide (0, 90) into (0, 45), (45, 90) two intervals, and 39.92324 is located in (0, 45), so the encoding is 0,
Then divide (0, 45) into (0, 22.5), (22.5, 45) two intervals, and 39.92324 is located in (22.5, 45), so the code is 1,
Similarly, the Yongfeng interchange latitude code is 101010111001001000100101101010.
1.2. The longitude is subdivided by (-180,180) using the same algorithm. (-180,180) and (110010011111101001100000000000) are encoded.
1.3. Merge the latitude and longitude codes. from high to low, take one longitude first, and then one latitude. The result is 111001001100011111101011100011000010110000010001010001000100.
1.4. Use the 32 Letters 0-9 and B-z (remove a, I, L, O) for base32 encoding, and the code (30.63578, 104.031601) is wm3yr31d2524.
11100 10011 00011 11110 10111 00011 00001 01100 00010 00101 00010 => wm3yr31d2524 decimal 0 1 2 3 5 6 7 8 9 10 11 12 13 14 15base32 0 1 2 3 4 5 6 7 8 9 B c d E F G decimal 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31base32 H J K m N P Q R S T U v w x y Z
2. Policy
1. When the latitude and longitude are stored in the database, a new geohash field is added to the database to record the geohash value of this point.
2. Search nearby and use like 'wm3yr3% 'in the SQL statement. The result can be cached. in the residential area, the database query is not performed because the latitude and longitude are changed.
3. If you need to specify the distance or sort, you can use the distance formula and two-dimensional data to sort the results. In this case, a small amount of data will be generated quickly.
3. php base class
Geohash. Class. php
<?php /*** Encode and decode geohashes**/ class Geohash{ private $coding="0123456789bcdefghjkmnpqrstuvwxyz"; private $codingMap=array(); public function Geohash() { for($i=0; $i<32; $i++) { $this->codingMap[substr($this->coding,$i,1)]=str_pad(decbin($i), 5, "0", STR_PAD_LEFT); } } public function decode($hash) { $binary=""; $hl=strlen($hash); for($i=0; $i<$hl; $i++) { $binary.=$this->codingMap[substr($hash,$i,1)]; } $bl=strlen($binary); $blat=""; $blong=""; for ($i=0; $i<$bl; $i++) { if ($i%2) $blat=$blat.substr($binary,$i,1); else $blong=$blong.substr($binary,$i,1); } $lat=$this->binDecode($blat,-90,90); $long=$this->binDecode($blong,-180,180); $latErr=$this->calcError(strlen($blat),-90,90); $longErr=$this->calcError(strlen($blong),-180,180); $latPlaces=max(1, -round(log10($latErr))) - 1; $longPlaces=max(1, -round(log10($longErr))) - 1; $lat=round($lat, $latPlaces); $long=round($long, $longPlaces); return array($lat,$long); } public function encode($lat,$long) { $plat=$this->precision($lat); $latbits=1; $err=45; while($err>$plat) { $latbits++; $err/=2; } $plong=$this->precision($long); $longbits=1; $err=90; while($err>$plong) { $longbits++; $err/=2; } $bits=max($latbits,$longbits); $longbits=$bits; $latbits=$bits; $addlong=1; while (($longbits+$latbits)%5 != 0) { $longbits+=$addlong; $latbits+=!$addlong; $addlong=!$addlong; } $blat=$this->binEncode($lat,-90,90, $latbits); $blong=$this->binEncode($long,-180,180,$longbits); $binary=""; $uselong=1; while (strlen($blat)+strlen($blong)) { if ($uselong) { $binary=$binary.substr($blong,0,1); $blong=substr($blong,1); } else { $binary=$binary.substr($blat,0,1); $blat=substr($blat,1); } $uselong=!$uselong; } $hash=""; for ($i=0; $i<strlen($binary); $i+=5) { $n=bindec(substr($binary,$i,5)); $hash=$hash.$this->coding[$n]; } return $hash; } private function calcError($bits,$min,$max) { $err=($max-$min)/2; while ($bits--) $err/=2; return $err; } private function precision($number) { $precision=0; $pt=strpos($number,'.'); if ($pt!==false) { $precision=-(strlen($number)-$pt-1); } return pow(10,$precision)/2; } private function binEncode($number, $min, $max, $bitcount) { if ($bitcount==0) return ""; $mid=($min+$max)/2; if ($number>$mid) return "1".$this->binEncode($number, $mid, $max,$bitcount-1); else return "0".$this->binEncode($number, $min, $mid,$bitcount-1); } private function binDecode($binary, $min, $max) { $mid=($min+$max)/2; if (strlen($binary)==0) return $mid; $bit=substr($binary,0,1); $binary=substr($binary,1); if ($bit==1) return $this->binDecode($binary, $mid, $max); else return $this->binDecode($binary, $min, $mid); }} ?>
Iii. Test
<? PHP require_once ('mysql. class. PHP '); require_once ('geohash. class. PHP '); // MySQL $ conf = array ('host' =>' 127. 0.0.1 ', 'Port' => 3306, 'user' => 'root', 'Password' => '000000', 'database' => 'mocube ', 'charset' => 'utf8', 'persistent' => false); $ mysql = new db_mysql ($ conf); $ geohash = new geohash; // converts longitude and latitude to geohash/* $ SQL = 'select shop_id, latitude, longpolling from mb_shop_ext '; $ DATA = $ mysql-> queryall ($ SQ L); foreach ($ data as $ Val) {$ geohash_val = $ geohash-> encode ($ Val ['latitude '], $ Val ['longyun']); $ SQL = 'Update mb_shop_ext set geohash = "'. $ geohash_val. '"where shop_id = '. $ Val ['shop _ id']; echo $ SQL; $ Re = $ mysql-> query ($ SQL); var_dump ($ re );} * /// get nearby Information $ n_latitude =$ _ Get ['La ']; $ n_longpolling =$ _ Get ['lo']; // start $ B _time = microtime (true); // solution a, directly use the database storage function to traverse the Sorting/* $ SQL = 'select *, latitude, longitud E, getdistance (latitude, longdistance ,'. $ n_latitude. ','. $ n_longpolling. ') as distance from mb_shop_ext where 1 having distance <1000 order by distance ASC'; $ DATA = $ mysql-> queryall ($ SQL ); // end $ e_time = microtime (true); echo $ e_time-$ B _time; var_dump ($ data); exit; * // solution B geohash nearby, then sort // The current geohash value $ n_geohash = $ geohash-> encode ($ n_latitude, $ n_longpolling); // nearby $ n =$ _ Get ['n']; $ like_geohash = substr ($ N_G Eohash, 0, $ n); $ SQL = 'select * From mb_shop_ext where geohash like "'. $ like_geohash. '% "'; echo $ SQL; $ DATA = $ mysql-> queryall ($ SQL); // calculate the actual distance from foreach ($ data as $ key => $ Val) {$ distance = getdistance ($ n_latitude, $ n_longpolling, $ Val ['latitude '], $ Val ['longyun']); $ data [$ key] ['distance '] = $ distance; // sorting column $ sortdistance [$ key] = $ distance;} // array_multisort ($ sortdistance, sort_asc, $ data); // end $ e_time = Microtime (true); echo $ e_time-$ B _time; var_dump ($ data); // calculate the distance between a ($ LAT1, $ lng1), B ($ LAT2, $ lng2) function getdistance ($ LAT1, $ lng1, $ LAT2, $ lng2) {// Earth radius $ r = 6378137; // convert the angle to $ radlat1 = deg 2rad ($ LAT1); $ radlat2 = deg 2rad ($ LAT2); $ radlng1 = deg 2rad ($ lng1 ); $ radlng2 = deg 2rad ($ lng2); // result $ S = ACOs ($ radlat1) * Cos ($ radlat2) * Cos ($ radlng1-$ radlng2) + sin ($ radlat1) * sin ($ radlat2) * $ R; // accuracy $ S = r Ound ($ S * 10000)/10000; return round ($ s) ;}?>
Iv. Summary
Solution B highlights the following:
1. The search results can be cached and reused, so that the user will not directly penetrate the database query because of small-scale movement.
2. First narrowing down the result range and then performing operations and sorting can improve the performance.
254 records, performance comparison,
In practical application scenarios, solution B database search can be cached in memory. If the data volume is large, solution B has better results.
Solution:
0.016560077667236
0.032402992248535
0.040318012237549
Solution B
0.0079810619354248
0.0079669952392578
0.0064868927001953
V. Others
The two solutions can be reasonably selected based on application scenarios and load conditions. Of course, solution B is recommended;
In either case, you can remember to add an index to the column to facilitate database retrieval.