Wouldn't it be useful, and rather cool, to extract users' addresses from the web access log, place them on a map, and analyze the hot spots that are accessing a resource to see their geographic distribution? The following outlines how to geo-locate the sources of URL accesses using Giscript and GeoIP.
Although this looks simple, it breaks down into several steps, detailed below:
1. The first step is to obtain the IP address, which needs little explanation. It is available from the request headers on the web server, or it can be extracted from the access log. Log files can be processed in batches; with a high-throughput store, such as a message bus or a NoSQL database, the access records can instead be processed in real time.
2. Use GeoIP to resolve the domain name or IP address to a place name. GeoIP is an IP-to-location lookup tool developed by MaxMind, comprising software and IP databases. The free version locates addresses only coarsely; the paid version achieves higher precision.
3. Convert the place names to spatial coordinates or spatial geometries through geocoding, which requires both software and map data. This used to be a specialist GIS function; Google and Baidu now provide online service interfaces for it. Because online services come with many restrictions (network bandwidth, concurrency limits, account requirements, and so on), Giscript is the tool used here. Since GeoIP also has Python libraries, the two are easy to integrate. If a large volume of data must be processed, the work can be run in parallel by setting up Celery.
4. Compute the access frequency and other attributes of the data in the spatial database, then generate thematic maps or intermediate results. The maps can be exported, or the data can be passed to R for further advanced analysis and statistical charts.
5. The R analysis results can be written back to the spatial database through Giscript for further, more advanced thematic mapping.
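As a rough illustration of the first two steps and the frequency count, here is a minimal Python sketch. It assumes the access log puts the client IPv4 address at the start of each line (as in the common log format), and every function name in it is hypothetical. The place-name lookup is passed in as a callable; `make_geoip_lookup` shows one possible backing, assuming MaxMind's `geoip2` package and a city database file are available, which the original post does not specify. The geocoding and mapping steps with Giscript are not covered.

```python
import re
from collections import Counter

# Matches a leading IPv4 address, as found at the start of a
# common-log-format line (an assumption about the log layout).
IP_PATTERN = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})\s")


def extract_ips(lines):
    """Yield the client IP found at the start of each access-log line."""
    for line in lines:
        match = IP_PATTERN.match(line)
        if match:
            yield match.group(1)


def count_hits_by_place(lines, lookup):
    """Count requests per place name.

    `lookup` is a hypothetical callable that maps an IP string to a
    place name, or returns None when the address cannot be resolved.
    """
    ip_counts = Counter(extract_ips(lines))
    place_counts = Counter()
    # Resolve each distinct IP once, then add all of its hits.
    for ip, hits in ip_counts.items():
        place = lookup(ip)
        if place is not None:
            place_counts[place] += hits
    return place_counts


def make_geoip_lookup(db_path):
    """Build a lookup backed by the geoip2 package (assumed installed).

    `db_path` should point to a MaxMind city database file.
    """
    import geoip2.database
    import geoip2.errors

    reader = geoip2.database.Reader(db_path)

    def lookup(ip):
        try:
            return reader.city(ip).city.name
        except (geoip2.errors.AddressNotFoundError, ValueError):
            # Unknown or malformed address: skip it.
            return None

    return lookup
```

Resolving each distinct IP only once keeps the number of database lookups small when a few addresses account for most of the traffic. The resulting per-place counts are the kind of intermediate result that step 4 would store in the spatial database.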
This process can be used in many application scenarios, such as user analysis, anti-fraud, search analysis, market analysis, and so on.
The specific strategies will be studied later; to be continued.
Geo-location analysis of access URLs based on Giscript and GeoIP