What tools do you use to collect statistics?
Some people may think of a redirect URL: clicking an ad jumps to ad.php with some parameters attached, such as ad.php?id=123, and the hit is then written to the database. But that data is huge, and your main database should not be wasting its capacity on it ...
Let's see how Baidu does it:
This is a request made from a Baidu advertising page. It simply requests an image and attaches a pile of parameters that nobody can make sense of. Can that really be used for counting?
After testing and research in a project, I found that it does work. The idea is this:
Send a request (no matter how) -> IIS/nginx/etc. records the URL -> download/generate the log file -> the back end or JS processes this file -> generate the data to view (it can be drawn as a chart, a bar chart, whatever). Do you believe it?
The process of implementation
Note: this walkthrough uses an IIS 8.0 environment; the idea is the same for Nginx, Apache, and others.
1. Enable IIS logging for a new domain. The log format I chose here is not critical; it only depends on what data you want to see. Then pay attention to the schedule setting, because what you select there directly affects how your log files are written out; test the specifics yourself.
PS: Why use a new domain? Because the log file is generated per domain. If you log on the old domain, a lot of log data may be generated; for example, if your www.111cn.net has 1 million resources, the log file will be very large. It is better to enable something like Click.111cn.net/log.gif for this. Go and see how Baidu does it.
2. Access the URL on this domain, for example: http://127.0.0.1/log.gif?type=jserror&uid=&ref=http%3A%2F%2Fwww.111cn.net%2fhtml%2fxieliang.html&content=%e7%99%bb%e5%bd%95%e5%bc%82%e6%ad%a5%e6%8a%a5%e9%94%99,%e9%94%99%e8%af%af%e7%b1%bb%e5%9e%8b%e4%b8%bajson%e8%a7%a3%e6%9e%90%e5%a4%b1%e8%b4%a5&r=100000 (you can request it a few more times), and then wait for the log to be generated!
3. Open the log folder (don't tell me you don't know where it is). If nothing goes wrong, a file like this will have been generated:
4. Go on, open it and take a look ...
Figure: The red box is the request we made. The green box is not what we want, but it still takes up space, which is why a new domain was recommended above ...
Well, with all this data, are we still worried about how to use it?
5. Write a back end to read it. I tested with PHP; of course, you can also use JS.
The code is as follows:
<?php
// Name of the image used for statistics
define('urlname', 'log.gif');

// Simulate fetching the log file
$content = file_get_contents("Log.txt");

// Split the log into lines
$arr = preg_split('/\r?\n/', $content);

// Collect the lines that contain the target urlname
$result = array();
foreach ($arr as $key => $value) {
    // Only handle lines that contain the target image
    if (strpos($value, urlname) !== false) {
        $temp = array();
        // Keep the whole line as well
        $temp['all'] = $value;

        // Extract the time (the first two space-separated fields)
        preg_match('/^(?:[\w\-])+\s(?:[^\s]+?)\s/', $value, $temp2);
        if (!empty($temp2)) {
            $temp['time'] = $temp2[0];
        }

        // Extract the parameters (the query string that follows the image name)
        preg_match('/' . preg_quote(urlname, '/') . '\s+(\S+?)\s/', $value, $temp2);
        if (!empty($temp2)) {
            $temp['param'] = array();
            $temp2 = explode('&', $temp2[1]);
            for ($i = 0; $i < count($temp2); $i++) {
                $temp3 = explode('=', $temp2[$i], 2);
                // Skip pieces that have no "=" to avoid an undefined index
                if (count($temp3) === 2) {
                    $temp['param'][$temp3[0]] = urldecode($temp3[1]);
                }
            }
        }
        $result[] = $temp;
    }
}
echo json_encode(array('Data' => $result));
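Since JS is mentioned as an alternative, here is a rough Node.js sketch of the same idea. Instead of regular expressions, it reads the `#Fields:` header found in W3C extended log files and looks up each column by name. The function name `parseBeaconLog` is hypothetical, and the sketch assumes the `date`, `time`, `cs-uri-stem` and `cs-uri-query` fields are enabled in the log settings.

```javascript
// Sketch: extract statistics-image hits from a W3C extended log file.
// Assumptions: the log has a "#Fields:" header and logs date, time,
// cs-uri-stem and cs-uri-query; parseBeaconLog is a made-up name.
function parseBeaconLog(text, target = '/log.gif') {
  let fields = [];
  const hits = [];
  for (const line of text.split(/\r?\n/)) {
    if (line.startsWith('#Fields:')) {
      // Remember the column names declared by the server.
      fields = line.slice('#Fields:'.length).trim().split(/\s+/);
      continue;
    }
    // Skip blank lines and the other "#" directives.
    if (line === '' || line.startsWith('#')) continue;

    const cols = line.split(' ');
    const row = {};
    fields.forEach((name, i) => { row[name] = cols[i]; });

    // Ignore requests that are not for the statistics image.
    if (row['cs-uri-stem'] !== target) continue;

    const param = {};
    const query = row['cs-uri-query'];
    if (query && query !== '-') {
      // URLSearchParams splits on "&" and "=" and percent-decodes the values.
      for (const [key, value] of new URLSearchParams(query)) {
        param[key] = value;
      }
    }
    hits.push({ time: `${row.date} ${row.time}`, param });
  }
  return hits;
}
```

Looking columns up by name means the parser keeps working if the set of logged fields changes, which a position-based regular expression would not.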
6. Look at the results.
The advantages of doing statistics with an image
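To turn the parsed records into something that can be charted, one option is to count them by their `type` parameter. The sketch below assumes each record has the same shape as the entries the PHP code builds (an object with a `param` map); `countByType` is a hypothetical helper, not part of the original code.

```javascript
// Sketch: count parsed records per "type" parameter for charting.
// Assumption: each record looks like { param: { type: '...' } };
// records without a type are counted under "unknown".
function countByType(records) {
  const counts = new Map();
  for (const record of records) {
    const type = (record.param && record.param.type) || 'unknown';
    counts.set(type, (counts.get(type) || 0) + 1);
  }
  // Convert to a plain object so it can be passed to JSON.stringify.
  return Object.fromEntries(counts);
}
```

The resulting object (type name mapped to a count) can be fed to whatever charting code you prefer.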
Small and fast, because the image is tiny. It does have to exist, though: a 404 would also be recorded in the log, but a 404 is itself a bug.
No database needed: it relies entirely on the web server's own functionality, so it is robust.
Widely applicable: you can define your own parameters, such as type=ad, type=jserror, type=..., and sort the records by them.
Faster than a back-end redirect.
Simple to use on the front end ... you just need new Image() and its src, you know ...
Highly customizable: you define the parameters as needed.
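The front-end side can be sketched as below. The helper names `buildLogUrl` and `sendLog` are hypothetical, the parameter names simply follow the example URL earlier in the post, and treating `r` as a random cache-buster is an assumption about what that parameter is for.

```javascript
// Sketch: report an event by requesting the statistics image.
// Assumptions: buildLogUrl/sendLog are made-up names, and "r" is treated
// here as a random number that keeps the browser from reusing a cached image.
function buildLogUrl(endpoint, params) {
  const query = Object.entries(params)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&');
  return `${endpoint}?${query}`;
}

function sendLog(endpoint, params) {
  const url = buildLogUrl(endpoint, {
    ...params,
    r: Math.floor(Math.random() * 1e6),
  });
  // In a browser, assigning src triggers the GET request that the server logs.
  // Outside a browser (no Image constructor) nothing is sent.
  if (typeof Image !== 'undefined') {
    new Image().src = url;
  }
  return url;
}
```

A call might look like `sendLog('http://127.0.0.1/log.gif', { type: 'jserror', uid: '', content: 'JSON parse failed' })`, where the endpoint and values are placeholders.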
Todo
This is as far as I have thought for now. How best to fetch the IIS logs still needs study, as does how best to chart or read them ... Of course, I might think of a solution somewhere on the subway!
OK, that's it for now, I'm sleepy. Next time I'll try to analyze the "Baidu Ads" connection method: