With the advent of LBS applications, it is particularly important to implement location-based search in databases.
First, design a simple data table to store latitude and longitude information:
CREATE TABLE `index` ( `id` int(11) NOT NULL AUTO_INCREMENT, `lat` double NOT NULL, `lng` double NOT NULL, PRIMARY KEY (`id`)) ENGINE=MyISAM DEFAULT CHARSET=utf8;
After the creation is complete, we can check whether it looks like this.
mysql> desc `index`;+-------+---------+------+-----+---------+----------------+| Field | Type | Null | Key | Default | Extra |+-------+---------+------+-----+---------+----------------+| id | int(11) | NO | PRI | NULL | auto_increment || lat | double | NO | | NULL | || lng | double | NO | | NULL | |+-------+---------+------+-----+---------+----------------+3 rows in set (0.00 sec)
Next, let's make some data, so we can easily wait for the test and write a python script to implement it:
import MySQLdbimport randomtry: conn=MySQLdb.connect(host='localhost',user='eslizn',passwd='123456',db='geo',port=3306) cur=conn.cursor() for i in range(2000000): lat = random.randint(-9000000,9000000)/100000.0 lng = random.randint(-18000000,18000000)/100000.0 sql = "insert into `index` (`lat`,`lng`) values (%f,%f)" % (lat,lng) cur.execute(sql) print "[%d]%s" % (i,sql) cur.close() conn.close()except MySQLdb.Error,e: print "Mysql Error %d: %s" % (e.args[0], e.args[1])
In order to easily test the effects of adding indexes and not adding indexes, you also need to copy a table for comparison:
mysql> create table unindex select * from `index`;Query OK, 2000838 rows affected (0.93 sec)Records: 2000838 Duplicates: 0 Warnings: 0
Set a B-tree index for the lat and lng fields of the index table:
mysql> ALTER TABLE `index` ADD INDEX `lat_lng` USING BTREE (`lat`, `lng`) ;Query OK, 2000838 rows affected (10.94 sec)Records: 2000838 Duplicates: 0 Warnings: 0
The distance is calculated based on the longitude and latitude of two points. However, after all, the figure is taken directly from the plane. this is a big mistake. first, although the latitude is converted to a distance multiplied by a constant, the Longitude Distance is calculated using the trigonometric function. the formula is as follows:
R = earth’s radiusΔlat = lat2 lat1Δlng = lng2 lng1a = sin(Δlat/2) + cos(lat1) * cos(lat2) * sin(Δlng/2)c = 2*atan2(√a, √(1a))dist = R*c
Compile an SQL query statement based on the formula:
mysql> set @er=6366.564864;#earth’s radius (km)Query OK, 0 rows affected (0.00 sec)mysql> set @lat=56.14262; #Search origin latQuery OK, 0 rows affected (0.00 sec)mysql> set @lng=37.605853; #Search origin lngQuery OK, 0 rows affected (0.00 sec)mysql> set @dist=20;#Search radius (km)Query OK, 0 rows affected (0.00 sec)mysql> SELECT id,lat,lng,@er*2*ASIN(SQRT(POWER(SIN((@lat - lat)*pi()/180 / 2), 2) + COS(@lat * pi()/180) * COS(lat * pi()/180) * POWER(SIN((@lng - lng) * pi()/180 / 2), 2) )) as dist FROM `unindex` having dist < @dist ORDER BY dist;+---------+----------+----------+------------------+| id | lat | lng | dist |+---------+----------+----------+------------------+| 1618442 | 56.12129 | 37.51233 | 6.25753112752837 || 53613 | 56.05718 | 37.70809 | 11.4140654631309 || 1485350 | 56.24562 | 37.68273 | 12.392725454166 || 757733 | 56.09484 | 37.418 | 12.7905134964855 || 1657748 | 56.15971 | 37.38095 | 14.0488218629237 || 481209 | 56.2635 | 37.40645 | 18.2296307623964 |+---------+----------+----------+------------------+6 rows in set (2.17 sec)
Although the query is implemented, the time is really painful (because no conditions are set, mysql performs a table scan, about 2 million records, you say it does not hurt ). therefore, you must modify the idea and enclose the approximate range for query.
First, we need to calculate the latitude and longitude range. due to the existence of the longitude bitch, we have to calculate the trigonometric function:
set @lat=56.14262;set @lng=37.605853;set @dist=20;#kmset @lat_length=20003.93/180;#lat lengthset @lat_left=@lat-(@dist/@lat_length);set @lat_right=@lat+(@dist/@lat_length);set @lng_left=@lng-@dist/abs(cos(radians(@lat))*@lat_length);set @lng_right=@lng+@dist/abs(cos(radians(@lat))*@lat_length);
Query:
mysql> set @er=6366.564864;#kmQuery OK, 0 rows affected (0.00 sec)mysql> set @lat=56.14262;Query OK, 0 rows affected (0.00 sec)mysql> set @lng=37.605853;Query OK, 0 rows affected (0.00 sec)mysql> set @dist=20;#kmQuery OK, 0 rows affected (0.00 sec)mysql> set @lat_length=20003.93/180;#lat lengthQuery OK, 0 rows affected (0.00 sec)mysql> set @lat_left=@lat-(@dist/@lat_length);Query OK, 0 rows affected (0.00 sec)mysql> set @lat_right=@lat+(@dist/@lat_length);Query OK, 0 rows affected (0.00 sec)mysql> set @lng_left=@lng-@dist/abs(cos(radians(@lat))*@lat_length);Query OK, 0 rows affected (0.00 sec)mysql> set @lng_right=@lng+@dist/abs(cos(radians(@lat))*@lat_length);Query OK, 0 rows affected (0.00 sec)mysql> SELECT id,lat,lng,@er*2*ASIN(SQRT(POWER(SIN((@lat - lat)*pi()/180 / 2), 2) + COS(@lat * pi()/180) * COS(lat * pi()/180) * POWER(SIN((@lng - lng) * pi()/180 / 2), 2) )) as dist FROM `unindex` WHERE lat BETWEEN @lat_left AND @lat_right AND lng BETWEEN @lng_left AND @lng_right having dist < @dist ORDER BY dist;+---------+----------+----------+------------------+| id | lat | lng | dist |+---------+----------+----------+------------------+| 1618442 | 56.12129 | 37.51233 | 6.25753112752837 || 53613 | 56.05718 | 37.70809 | 11.4140654631309 || 1485350 | 56.24562 | 37.68273 | 12.392725454166 || 757733 | 56.09484 | 37.418 | 12.7905134964855 || 1657748 | 56.15971 | 37.38095 | 14.0488218629237 || 481209 | 56.2635 | 37.40645 | 18.2296307623964 |+---------+----------+----------+------------------+6 rows in set (0.30 sec)
The results show that the query results are greatly improved, but in fact we can optimize them because we are currently operating on a data table without an index, next we will use the data table that has created an index to see the effect:
mysql> set @er=6366.564864;#kmQuery OK, 0 rows affected (0.00 sec)mysql> set @lat=56.14262;Query OK, 0 rows affected (0.00 sec)mysql> set @lng=37.605853;Query OK, 0 rows affected (0.00 sec)mysql> set @dist=20;#kmQuery OK, 0 rows affected (0.00 sec)mysql> set @lat_length=20003.93/180;#lat lengthQuery OK, 0 rows affected (0.00 sec)mysql> set @lat_left=@lat-(@dist/@lat_length);Query OK, 0 rows affected (0.00 sec)mysql> set @lat_right=@lat+(@dist/@lat_length);Query OK, 0 rows affected (0.00 sec)mysql> set @lng_left=@lng-@dist/abs(cos(radians(@lat))*@lat_length);Query OK, 0 rows affected (0.00 sec)mysql> set @lng_right=@lng+@dist/abs(cos(radians(@lat))*@lat_length);Query OK, 0 rows affected (0.00 sec)mysql>mysql> SELECT id,lat,lng,@er*2*ASIN(SQRT(POWER(SIN((@lat - lat)*pi()/180 / 2), 2) + COS(@lat * pi()/180) * COS(lat * pi()/180) * POWER(SIN((@lng - lng) * pi()/180 / 2), 2) )) as dist FROM `index` WHERE lat BETWEEN @lat_left AND @lat_right AND lng BETWEEN @lng_left AND @lng_right having dist < @dist ORDER BY dist;+---------+----------+----------+------------------+| id | lat | lng | dist |+---------+----------+----------+------------------+| 1618442 | 56.12129 | 37.51233 | 6.25753112752837 || 53613 | 56.05718 | 37.70809 | 11.4140654631309 || 1485350 | 56.24562 | 37.68273 | 12.392725454166 || 757733 | 56.09484 | 37.418 | 12.7905134964855 || 1657748 | 56.15971 | 37.38095 | 14.0488218629237 || 481209 | 56.2635 | 37.40645 | 18.2296307623964 |+---------+----------+----------+------------------+6 rows in set (0.04 sec)
So far, we have implemented a similar "view people nearby" function.