Using PHP to collect the 404 URLs crawled by search engines from the nginx access log
I rotate the nginx logs on my server every day, so each day's log always records some 404 pages requested by the various search engine crawlers. I only analyze the logs occasionally, and filtering a large log by hand is not easy. So I wrote a small script that takes the 404 requests made by crawlers such as Google, Baidu, Soso, 360, Easou, Sogou, and Bing and writes them to a txt file. The code goes straight into test.php.
The code is as follows:
<?php
// Usage: test.php?s=google
$domain = 'http://www.jb51.net';
// Map each query key to the user-agent token that crawler sends.
$spiders = array('baidu' => 'Baiduspider', '360' => '360Spider',
    'google' => 'Googlebot', 'soso' => 'Sosospider', 'sogou' =>
    'Sogou web spider', 'easou' => 'EasouSpider', 'bing' => 'bingbot');
// Yesterday's rotated log. strtotime('-1 day') also works on the first of
// the month, where date('d') - 1 would give 0.
$yesterday = strtotime('-1 day');
$path = '/home/nginx/logs/' . date('Y/m/j', $yesterday) . '/access_www.txt';
$s = isset($_GET['s']) ? $_GET['s'] : '';
if (!array_key_exists($s, $spiders)) die();
$spider = $spiders[$s];
$file = $s . '_' . date('ymj', $yesterday) . '.txt';
// Build the report only once per crawler per day.
if (!file_exists($file)) {
    $in = file_get_contents($path);
    // Capture the request path of every 404 line that mentions this crawler.
    $pattern = '/GET (.*) HTTP\/1\.1" 404.*' . preg_quote($spider, '/') . '/';
    preg_match_all($pattern, $in, $matches);
    $out = '';
    foreach ($matches[1] as $v) {
        $out .= $domain . $v . "\r\n";
    }
    file_put_contents($file, $out);
}
// The report is written next to this script, which is assumed to be
// served from the site's /silian/ directory.
$url = $domain . '/silian/' . $file;
echo $url;
That's it. There is nothing advanced here, just a small script written by hand.
...