Find nearby points--geohash scenario discussion

Source: Internet
Author: User
Tags acos

Reprinted from: http://blog.csdn.net/wangliqiang1014/article/details/9143825

With the popularity of mobile terminals, many applications are based on the LBS function, the vicinity of xxx (restaurants, banks, sister paper, etc.).

In the basic data, the latitude and longitude of the target location is generally saved, and the latitude and longitude of the user provided is used to compare it to get it nearby.

Goal:
Find nearby xxx, returning results from near to far, and the result has a distance from the target point.

For the search of nearby xxx, proposed two scenarios, as follows:

I. Programme A:
=============================================================================

The abstract is the calculation of the distance of the spherical two points, that is, the latitude and longitude of two points on the spherical surface;

Point (latitude, longitude), A ($radLat 1, $radLng 1), B ($radLat 2, $radLng 2);

Advantages: Easy to understand, simple and convenient to deploy

Cons: Database is queried every time, performance is worrying

1. Derivation

Through the cosine theorem and the Radian calculation method, the final deduced formula A is:

$s = acos(cos($radLat1)*cos($radLat2)*cos($radLng1-$radLng2)+sin($radLat1)*sin($radLat2))*$R;

At present, most of the online use of Google public Distance computing company, the derivation Formula B is:

$s = 2*asin(sqrt(pow(sin(($radLat1-$radLat2)/2),2)+cos($radLat1)*cos($radLat2)*pow(sin(($radLng1-$radLng2)/2),2)))*$R;

which
$radLat 1, $radLng 1, $radLat 2, $radLng 2 radians

$R is the radius of the Earth

2, by testing the two algorithms, the results are the same and all correct, but through the PHP code test, two points between the distance, 10W performance comparison, self-deduction version calculation of the long formula B is better, as follows:

Formula A
0.56368780136108float (431)
0.57460689544678float (431)
0.59051203727722float (431)

Formula B
0.47404885292053float (431)
0.47808718681335float (431)
0.47946381568909float (431)

3, so the formula deduced by the mathematical method:

<?php//根据经纬度计算距离其中A($lat1,$lng1)、B($lat2,$lng2)public staticfunctiongetDistance($lat1,$lng1,$lat2,$lng2){//地球半径$R = 6378137;//将角度转为狐度$radLat1 = deg2rad($lat1);$radLat2 = deg2rad($lat2);$radLng1 = deg2rad($lng1);$radLng2 = deg2rad($lng2);//结果$s = acos(cos($radLat1)*cos($radLat2)*cos($radLng1-$radLng2)+sin($radLat1)*sin($radLat2))*$R;//精度$s = round($s* 10000)/10000;returnround($s);}?>

4, in the actual application, need to traverse from the database to check out the matching conditions, and sorting operations,

By taking all the data out and then comparing it with the PHP loop, the filter matches the conditional result and obviously the performance is low; so we use the next MySQL storage function to solve this problem.

4.1. Create a MySQL storage function and index the Latitude field

<?php//根据经纬度计算距离其中A($lat1,$lng1)、B($lat2,$lng2)public staticfunctiongetDistance($lat1,$lng1,$lat2,$lng2){//地球半径$R = 6378137;//将角度转为狐度$radLat1 = deg2rad($lat1);$radLat2 = deg2rad($lat2);$radLng1 = deg2rad($lng1);$radLng2 = deg2rad($lng2);//结果$s = acos(cos($radLat1)*cos($radLat2)*cos($radLng1-$radLng2)+sin($radLat1)*sin($radLat2))*$R;//精度$s = round($s* 10000)/10000;returnround($s);}?>

4.2. Query SQL

With SQL, you can set distance and sort, search for eligible information, and have a better sort

SELECT *,latitude,longitude,GETDISTANCE(latitude,longitude,30.663262,104.071619) AS distance FROM mb_shop_ext where 1 HAVING distance<1000 ORDER BY distance ASC LIMIT 0,10

Ii. Programme B (Geohash)
================================================================================

Geohash algorithm; Geohash is an address code that encodes two-dimensional latitude and longitude into one-dimensional strings.

For example: Chengdu Yongfeng Interchange Code is wm3yr31d2524

Advantages:

1, the use of a field, you can store latitude and longitude; search, only one index, high efficiency
2, the encoding prefix can represent a larger area, find nearby, very convenient. In SQL, like ' wm3yr3% ', you can query all nearby locations.
3, through the coding accuracy can be blurred coordinates, privacy protection and so on.

Cons: Distance and sort require two operations (filter results run, actually quite fast)

1, Geohash coding algorithm

Latitude and longitude of Chengdu Yongfeng Interchange (30.63578,104.031601)

1.1, the Latitude range (-90, 90) is divided into two intervals (-90, 0), (0, 90), if the target latitude is in the previous interval, the code is 0, otherwise the encoding is 1.
Since 30.625265 belongs to (0, 90), take the code to 1.
Then (0, 90) is divided into (0, 45), (45, 90) two intervals, and 39.92324 is (0, 45), so the code is 0,
Then (0, 45) is divided into (0, 22.5), (22.5, 45) two intervals, and 39.92324 is (22.5, 45), so the code is 1,
The latitude code of Yongfeng Interchange is 101010111001001000100101101010.

1.2, the longitude also uses the same algorithm, pairs (-180, 180) subdivide, ( -180,0), (0,180) obtains the code 110010011111101001100000000000

1.3, the combination of latitude and longitude code, from high to low, first take a longitude, then take a latitude; results 111001001100011111101011100011000010110000010001010001000100

1.4, with 0-9, b-z (remove A, I, L, O) These 32 letters are BASE32 encoded, get (30.63578,104.031601) encoded as wm3yr31d2524.

11100 10011 00011 11110 10111 00011 00001 01100 00010 00101 00010 00100 => wm3yr31d2524十进制 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15base32 0 1 2 3 4 5 6 7 8 9 b c d e f g 十进制 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31base32 h j k m n p q r s t u vw x y z

2. Strategy

1, in the latitude and longitude storage, the database new Add a field Geohash, record this point Geohash value

2. Find nearby, use like ' wm3yr3% ' in SQL, and this result can be cached; in small areas, the database query will not be re-queried because of the latitude and longitude changes.

3, find out the limited results, such as the need for distance or sorting, can use distance formula and two-dimensional data sorting; This is also a small amount of data, will be very fast.

3. PHP base class

geohash.class.php

<?phpclass Geohash{private $coding="0123456789bcdefghjkmnpqrstuvwxyz";private $codingMap=array();publicfunctionGeohash(){for($i=0; $i<32; $i++){$this->codingMap[substr($this->coding,$i,1)]=str_pad(decbin($i), 5,"0", STR_PAD_LEFT);}}publicfunctiondecode($hash){$binary="";$hl=strlen($hash);for($i=0; $i<$hl; $i++){$binary.=$this->codingMap[substr($hash,$i,1)];}$bl=strlen($binary);$blat="";$blong="";for($i=0; $i<$bl; $i++){if ($i%2)$blat=$blat.substr($binary,$i,1);else$blong=$blong.substr($binary,$i,1);}$lat=$this->binDecode($blat,-90,90);$long=$this->binDecode($blong,-180,180);$latErr=$this->calcError(strlen($blat),-90,90);$longErr=$this->calcError(strlen($blong),-180,180);$latPlaces=max(1, -round(log10($latErr))) - 1;$longPlaces=max(1, -round(log10($longErr))) - 1;$lat=round($lat, $latPlaces);$long=round($long, $longPlaces);returnarray($lat,$long);}publicfunctionencode($lat,$long){$plat=$this->precision($lat);$latbits=1;$err=45;while($err>$plat){$latbits++;$err/=2;}$plong=$this->precision($long);$longbits=1;$err=90;while($err>$plong){$longbits++;$err/=2;}$bits=max($latbits,$longbits);$longbits=$bits;$latbits=$bits;$addlong=1;while(($longbits+$latbits)%5 != 0){$longbits+=$addlong;$latbits+=!$addlong;$addlong=!$addlong;}$blat=$this->binEncode($lat,-90,90, $latbits);$blong=$this->binEncode($long,-180,180,$longbits); $binary="";$uselong=1;while(strlen($blat)+strlen($blong)){if($uselong){$binary=$binary.substr($blong,0,1);$blong=substr($blong,1);}else{$binary=$binary.substr($blat,0,1);$blat=substr($blat,1);}$uselong=!$uselong;}$hash="";for($i=0; $i<strlen($binary); $i+=5){$n=bindec(substr($binary,$i,5));$hash=$hash.$this->coding[$n];}return$hash;}privatefunctioncalcError($bits,$min,$max){$err=($max-$min)/2;while($bits--)$err/=2;return $err;}privatefunctionprecision($number){$precision=0;$pt=strpos($number,‘.‘);if($pt!==false){$precision=-(strlen($number)-$pt-1);}returnpow(10,$precision)/2;}privatefunctionbinEncode($number, $min, $max, $bitcount){if($bitcount==0)return"";$mid=($min+$max)/2;if ($number>$mid)return"1".$this->binEncode($number, $mid, $max,$bitcount-1);elsereturn"0".$this->binEncode($number, $min, $mid,$bitcount-1);}privatefunctionbinDecode($binary, $min, $max){$mid=($min+$max)/2;if(strlen($binary)==0)return$mid;$bit=substr($binary,0,1);$binary=substr($binary,1);if($bit==1)return$this->binDecode($binary, $mid, $max);elsereturn $this->binDecode($binary, $min, $mid);}}?>

Third, testing

<?phprequire_once(‘Mysql.class.php‘);require_once(‘geohash.class.php‘);//mysql$conf = array(‘host‘=>‘127.0.0.1‘,‘port‘=> 3306,‘user‘ =>‘root‘,‘password‘=>‘123456‘,‘database‘=>‘mocube‘,‘charset‘=>‘utf8‘,‘persistent‘=>false);$mysql = new Db_Mysql($conf);$geohash=new Geohash;//经纬度转换成Geohash//获取附近的信息$n_latitude = $_GET[‘la‘];$n_longitude = $_GET[‘lo‘];//开始$b_time = microtime(true);//方案A,直接利用数据库存储函数,遍历排序//方案B geohash求出附近,然后排序//当前 geohash值$n_geohash = $geohash->encode($n_latitude,$n_longitude); //附近$n = $_GET[‘n‘];$like_geohash = substr($n_geohash, 0, $n);$sql =‘select * from mb_shop_ext where geohash like "‘.$like_geohash.‘%"‘;echo$sql;$data = $mysql->queryAll($sql);//算出实际距离foreach($data as $key=>$val){$distance = getDistance($n_latitude,$n_longitude,$val[‘latitude‘],$val[‘longitude‘]);$data[$key][‘distance‘] = $distance;//排序列$sortdistance[$key] = $distance;}//距离排序array_multisort($sortdistance,SORT_ASC,$data);//结束$e_time = microtime(true);echo$e_time - $b_time;var_dump($data); //根据经纬度计算距离其中A($lat1,$lng1)、B($lat2,$lng2)functiongetDistance($lat1,$lng1,$lat2,$lng2){//地球半径$R = 6378137;//将角度转为狐度$radLat1 = deg2rad($lat1);$radLat2 = deg2rad($lat2);$radLng1 = deg2rad($lng1);$radLng2 = deg2rad($lng2);//结果$s = acos(cos($radLat1)*cos($radLat2)*cos($radLng1-$radLng2)+sin($radLat1)*sin($radLat2))*$R;//精度$s = round($s* 10000)/10000;returnround($s);}?>

Iv. Summary

The highlights of Plan B are:
1, the search results can be cached, re-use, not because the user has a small range of mobile, directly through the database query.
2, reduce the result range, and then calculate, sort, can improve performance.

254 Records, performance comparison,

In a real-world scenario, the scenario B database searches for a memory cache, and if the amount of data is larger, scenario B results better.

Scenario A:
0.016560077667236
0.032402992248535
0.040318012237549

Scenario B
0.0079810619354248
0.0079669952392578
0.0064868927001953

V. Other

The two schemes, according to the application scenario and the load situation reasonable choice, of course recommended program B;
Either way, remember to index the columns to facilitate database retrieval.

Find nearby points--geohash scenario discussion

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.