Before work, the boss gave a question, which is a recent lbs requirement.
You need to get the most recent restaurants.
Existing records: the database contains detailed records of about 1 million restaurants, including the ID, name, and latitude and longitude.
The mobile terminal returns the current latitude and longitude to you to find a nearby restaurant.
After trying it out, if you directly query the database, it is not slow. When both the longitude and latitude are indexed, fuzzy query (for example, the longitude and latitude returned by the client are 38.1024 and 62.2048 ), since it is nearby, you can query by 38.102 ------- 38.103 and 62.204 ------ 62.205, as long as the latitude and longitude match these two intervals, even the nearby restaurant.
SQL:
select id,name from tablename where jingdu>38.102 and jingdu<38.103 and weidu>62.204 and weidu<62.205;
Execution time. If no index is added at the longitude and latitude, the 1 million records will be about 0.3 seconds, and the index will be about 0.06 seconds.
However, our website has a daily PV of 3kw and more mobile terminals. If we run the database like this every time, but it is not suitable for storing the cache, after all, it is not scientific to store a group of caches in a restaurant nearby to everyone.
Therefore, cache is required. Here is my method:
The client sends the longitude and latitude. We first blur it. According to the data analysis I found on the Baidu map, the latitude and longitude are about 80 meters in the third decimal place, which is in line with "nearby ", rounding the fourth digit (for example, we can think that the fuzzy longitude of longitude 38.2356 is 38.236)
Therefore, when the client sends us a 38.1024 or 69.2048 latitude and longitude, we first round it and get the 38.102, 69.205 fuzzy latitude and longitude, and then hash the value according to the latitude and longitude, you can use MD5 or similar hash to obtain the key value and use the redis hget method.
$ Redis-> hget ('nearest ', $ key );
If not, go to MySQL for search
select id,name,jingdu,weidu from tablename where ROUND(jingdu,3) = 38.102 and ROUND(weidu,3)=69.205
The obtained data is stored in the redis hash table as an array and the data is returned.
In this way, someone will check the nearby restaurant next time, and then they will be able to retrieve it directly from the hash table of redis.
The above method does not need to consider the problem of raw data. Instead, you do not have to run all the data to put them in redis. Instead, you need to find the restaurant where they are used and find them for life. If you join the new restaurant later, blur the latitude and longitude of the restaurant, and then enter the hash table of redis for later reference.
There is another way, just talk about the train of thought. It is based on the geohash algorithm. Although geohash is an algorithm used to hash longitude and latitude, it is more suitable for maps than restaurants. The map is divided into several large rectangles to store keys. Each rectangle is divided into several small rectangles to store keys. Then, based on the longitude and latitude sent from the client, the large rectangle is matched first and then the small rectangle is matched, to find a restaurant. When we first split the map, the cost was too high and the efficiency was too low, making it useless to occupy resources (memory. Because our main data is the more than 1 million restaurants, not the more than 1 million cities, some cities may have a large number of restaurants in the southeast corner, but none of them in the northwest corner. If we follow the principle of geohash, it is necessary to maintain two keys in the Southeast and northwest corners (assuming they are far away ). The key in the northwest corner has nothing at all.
Therefore, we adopted the first method. The more people in a region, the more efficient the query. Of course, in order to improve accuracy and accuracy, we can add a layer of data in the outer or inner layer and add an outer layer, which means to first check the outer layer, to be precise to the two decimal places, first check the restaurants in the region, check the restaurants in the residential area again, with more layer keys, occupying more memory, but first check the large area to avoid having no restaurants in the residential area.
This is the general idea.
The client sends (38.1234, 68.5678) a fuzzy hash MD5 (38.123 _ 68.568). After the key is obtained, redis-> hget ('nearest ', $ key)
Check the database if no
Generate an array and insert it into the hash table. If yes, the restaurant name and ID are directly returned.
I am not deeply involved. If you have any shortcomings or errors in this article, please correct them.
How to get the nearest restaurant