Fast lookup test on tens of millions of map points of interest (POI)

Source: Internet
Author: User
Tags: redis

Recently I finally had some time to finish the map point-of-interest crawler I wrote earlier (http://blog.csdn.net/sparkexpert/article/details/51554813), and spent seven days crawling POI data of every category across the whole country.


Downloading the data is still a laborious process, but with the new method very little manual intervention is needed. Network restrictions still apply, but with 5 download channels open at the same time the speed was quite good.


After the download finished, the data was left unprocessed as raw JSON text, taking up more than 60 GB of disk space. The total number of POIs is in the tens of millions. (Some downloads are incomplete where the results go beyond a certain range, mainly because the site's server only allows the first few pages to be downloaded. There are ways around this, but I did not pursue them.)


With the data downloaded, the next question was how to search it quickly. Map sites such as Baidu and Google load their data very fast, so how can that be achieved? This article mainly addresses that problem.


A Redis hash is used to store each point of interest. The import filters out some duplicate keys, but the filter is conditional: for a name such as "newsstand" there may be many identical keys that refer to independent POIs, so a suffix 0, 1, 2, ... has to be appended to them.
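The import step can be sketched as follows. This is only a minimal illustration, not the author's actual code: the key layout `city:category:name`, the field names, and the rule "same name at the same coordinates is a true duplicate" are all assumptions, since the article does not spell them out. The `client` argument is assumed to behave like a redis-py `redis.Redis` instance; only its `hset` method is used.

```python
def import_pois(client, pois):
    """Write each POI to its own Redis hash and return the keys written.

    `client` is assumed to be a redis-py style client exposing
    hset(name, mapping=...). `pois` is an iterable of dicts with the
    hypothetical fields city, category, name, lng, lat.
    """
    seen = {}     # base key -> list of (lng, lat) already stored under it
    written = []
    for poi in pois:
        # Hypothetical key layout: city:category:name
        base = f"{poi['city']}:{poi['category']}:{poi['name']}"
        location = (poi["lng"], poi["lat"])
        locations = seen.setdefault(base, [])
        if location in locations:
            # Assumed filter rule: same name at the same spot is a duplicate.
            continue
        # Same name at a different spot is an independent POI: the first one
        # keeps the plain key, later ones get the suffix 0, 1, 2, ...
        key = base if not locations else f"{base}:{len(locations) - 1}"
        locations.append(location)
        client.hset(key, mapping={
            "name": poi["name"],
            "city": poi["city"],
            "category": poi["category"],
            "lng": poi["lng"],
            "lat": poi["lat"],
        })
        written.append(key)
    return written
```

With a real server this might be called as `import_pois(redis.Redis(), records)`, where `records` are the dicts parsed from the downloaded JSON files.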



Attached is the key count reported by the Redis client. The screenshot was taken at an arbitrary moment during the import; the actual total is already well into the tens of millions.



To check query efficiency, a lookup for one category of data in one city was run directly, as shown in the figure:


Using Redis as a cache for map POIs is very fast. The status bar at the bottom shows the time taken by the query in real time, and the results come back in very little time.
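The article does not say how the lookup was issued, so the following is only one possible sketch. It assumes the POI hashes are stored under keys of the hypothetical form `city:category:name` and walks the keyspace with redis-py's `scan_iter` and a glob pattern. Leaving out `city` or `category` turns that part of the pattern into `*`, which also covers a search with no restrictions.

```python
def find_pois(client, city=None, category=None):
    """Return {key: fields} for every POI hash whose key matches the filters.

    `client` is assumed to be a redis-py style client exposing
    scan_iter(match=..., count=...) and hgetall(name).
    Keys are assumed to look like city:category:name (hypothetical layout).
    """
    pattern = f"{city or '*'}:{category or '*'}:*"
    results = {}
    # SCAN iterates the keyspace in batches instead of blocking the
    # server the way a single KEYS call would on a large database.
    for key in client.scan_iter(match=pattern, count=1000):
        results[key] = client.hgetall(key)
    return results
```

Note that with a real redis-py client the keys and field values come back as `bytes` unless the client was created with `decode_responses=True`.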


To test a search over all the data, without restricting either the city or the category, the results are as follows:



The total time for this test was a little over 5 seconds, and that was on a very ordinary PC. For a real map server, a high-performance machine with a large amount of memory should be able to respond within milliseconds.


However, this kind of database has a drawback: there is no way to run a nearby-POI query directly on it, because it cannot be expressed as a lookup on the key values. That would probably still have to rely on the earlier grid-index approach. (If the query is limited to a single city, though, it can certainly be done with Redis, since queries on a small amount of data are still very efficient.)
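The grid-index idea can be illustrated with a small sketch. It is a generic version of the technique, not the author's earlier implementation: the cell size of 0.01 degrees, the function names, and the in-memory dictionary are all assumptions made for the example. In Redis, each cell could be kept as a set of POI keys named after the cell, so a nearby query would read at most nine sets.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # assumed cell size in degrees (about 1.1 km north-south)


def cell_of(lng, lat, cell=CELL_DEG):
    """Map a coordinate to the integer (column, row) of its grid cell."""
    return (math.floor(lng / cell), math.floor(lat / cell))


def build_grid(pois, cell=CELL_DEG):
    """Index (key, lng, lat) tuples by the grid cell they fall into."""
    grid = defaultdict(list)
    for key, lng, lat in pois:
        grid[cell_of(lng, lat, cell)].append(key)
    return grid


def nearby(grid, lng, lat, cell=CELL_DEG):
    """Return keys of POIs in the cell around (lng, lat) and its 8 neighbours."""
    cx, cy = cell_of(lng, lat, cell)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            # .get() avoids creating empty cells in the defaultdict
            found.extend(grid.get((cx + dx, cy + dy), []))
    return found
```

The result is a candidate list: POIs in the corner cells can be farther away than the cell size, so an exact radius query would still filter the candidates by real distance.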
