When I started building the China Blog Alliance, fellow bloggers warned me that maintaining a site directory of this kind is a lot of trouble: it takes a great deal of energy to track down member sites that have died, and they cited Pine's Blog Encyclopedia as an example. I fully agree. Some time ago I saw the maintenance records of Dream Xuan Beauty's boke123 site directory; the checks appear to be done entirely by hand. Zhangge really admires that kind of perseverance.
The Blog Alliance now lists more than 200 blogs, all submitted voluntarily; whether you are a grassroots blogger or an established one, Zhangge does not force anyone in or out. Since most of these sites have been up for less than half a year, some have probably been abandoned midway already, so I decided to take on the job of maintaining the listings.
This morning I wrote a checker in PHP and tried it on Beijing East Cloud. It was rather slow, and I had to wait a long time for results (my PHP is too weak to do better).
Afterwards I wrote a multi-threaded (strictly speaking, multi-process) site status detection script on my VPS. It loads the site addresses directly from the database and then uses curl to check the returned HTTP status code. It works very well: results are ready in about 1 minute.
Here is the script:
#!/bin/bash
#Author: Zhangge
#Date: 2014-08-21
#Desc: Check The site of Zgboke Alliance.

# Fetch the site URLs from the database
data=$(/usr/bin/mysql -uroot -p123456 -e "use zgboke;select web_url from dir_websites where web_status='3';" -N -B | awk '{print $1}')
if [ -z "$data" ]; then
    echo "Failed to connect to database!"
    exit 1
fi
test -f result.log && rm -f result.log

function delay {
    sleep 3
}

# Create a named pipe, bind it to file descriptor 6, then remove the file
tmp_fifofile=/tmp/$$.fifo
mkfifo "$tmp_fifofile"
exec 6<>"$tmp_fifofile"
rm "$tmp_fifofile"

# Number of concurrent threads; adjust according to the VPS configuration
thread=100
for ((i = 0; i < thread; i++)); do
    echo
done >&6

# Start the multi-threaded detection loop
for url in $data; do
    read -u6
    {
        # Use curl to fetch the site's HTTP status code
        code=$(curl -o /dev/null --retry 3 --retry-max-time 8 -s -w '%{http_code}' "$url")
        echo "$code--->$url" >>result.log
        # Check whether the child thread succeeded, and print the result
        delay && {
            echo "$code--->$url"
        } || {
            echo "Check thread error!"
        }
        echo >&6
    } &
done

# Wait for all threads to finish
wait
exec 6>&-

# List the sites whose return code is not 200
echo "List of exception website:"
grep -v '^200--->' result.log
exit 0
PS: A later article will cover multi-threaded shell scripts in detail; space here is limited, so I won't go into it.
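For readers who want the gist of the concurrency trick right away, here is a minimal sketch of the named-pipe "token pool" idea the script relies on. It is not the author's follow-up article; the function name `run_limited` and the pipe path are made up for illustration. The pipe is preloaded with a fixed number of empty lines (tokens), each job takes one before starting and puts it back when finished, so no more than that many jobs run at once.

```shell
#!/bin/bash
# Sketch only: run_limited and the pipe path are illustrative names.

# run_limited MAX CMD ARG...: run CMD once per ARG, at most MAX at a time.
run_limited() {
    local max=$1 cmd=$2 i=0 arg token
    shift 2
    local fifo=/tmp/$$.limiter.fifo
    mkfifo "$fifo"
    exec 6<>"$fifo"          # open the pipe read-write on descriptor 6
    rm -f "$fifo"            # the open descriptor keeps the pipe alive

    # Preload MAX tokens (empty lines) into the pipe.
    while [ "$i" -lt "$max" ]; do
        echo >&6
        i=$((i + 1))
    done

    for arg in "$@"; do
        read -r token <&6    # take a token; blocks while MAX jobs are running
        {
            "$cmd" "$arg"
            echo >&6         # return the token whatever the job's result
        } &
    done

    wait                     # wait for every background job
    exec 6>&-                # close the descriptor
}
```

For example, `run_limited 2 echo a b c d` prints the four arguments while never running more than two `echo` jobs at the same time; the order of the output lines is not guaranteed.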
The following are the results of the first survival test of China Blog Alliance member sites:
① Abnormal sites with a non-200 return code:
② Inaccessible sites found by the script:
Results of manually visiting the flagged sites:
wangyingxue.net (wangying Learning Blog): cannot be accessed, confirmed to be going through ICP filing √
www.tao0102.com (Changjiang blog): accessible √
blog.hack7d.com (McDull Technology blog): cannot be accessed X
www.1992621.com (Teacher's Diary): accessible √
www.3miaotu.com (Three Seconds Hare): cannot be accessed X
xiaoxiaomayi.com (Small ant blog): accessible √
www.awrui.com (Li Wendong blog): accessible √
PS: The script's detection mechanism is as follows: a site that cannot be reached within 8s is treated as abnormal and retried up to 3 times before the final result is output; if all three attempts fail, the code is 000. As the figure and the manual screening show, there are some false positives, which has something to do with the 8s setting. A longer timeout could be set to get more accurate results. In the end the results still have to be confirmed manually, so it does not matter much.
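One way to try longer limits is sketched below. The timeout values and the function names `check_url` and `classify_code` are assumptions for illustration, not settings from the script above. Note that curl's `--retry-max-time` bounds the total retry window rather than a single request, so this sketch uses `--connect-timeout` and `--max-time` to limit each attempt explicitly.

```shell
#!/bin/bash
# Sketch only: the values and function names below are assumptions.

CONNECT_TIMEOUT=15   # seconds allowed for establishing a connection
MAX_TIME=30          # seconds allowed for one whole request
RETRIES=3            # extra attempts after a transient failure

# Print the HTTP status code for a URL; curl prints 000 if no response arrived.
check_url() {
    curl -o /dev/null -s \
        --connect-timeout "$CONNECT_TIMEOUT" \
        --max-time "$MAX_TIME" \
        --retry "$RETRIES" \
        -w '%{http_code}' "$1"
}

# Turn a status code into a label for the report.
classify_code() {
    case "$1" in
        200) echo "ok" ;;
        000) echo "unreachable" ;;
        *)   echo "abnormal" ;;
    esac
}
```

A site would then be checked with something like `classify_code "$(check_url "$url")"`. Larger limits reduce false positives at the cost of a longer run, which the concurrent loop largely hides.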
Going forward, the China Blog Alliance will set a check cycle, at most once a week and at least once a month, to make sure every listed site can be accessed normally. I will also publish the results of each check in the webmaster information column of the China Blog Alliance, so that all members can view them.
Because the China Blog Alliance is currently deployed on Beijing East Cloud, where the database cannot be operated remotely, I have to use a semi-automatic mode for now. Once I have time to move it to Aliyun or another VPS, the script will be made fully automatic: when a site is repeatedly detected as unreachable, it will temporarily be set to a hidden state.
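The "repeatedly detected" rule could be approximated as in the sketch below. It assumes that one result log per run is kept (for example by renaming result.log after each check) and that each line has the `code--->url` form written by the script. The function name `failing_sites` and the threshold are hypothetical, and the database update that hides a site is left out because the post does not give the status value for the hidden state.

```shell
#!/bin/bash
# Sketch only: failing_sites and the one-log-per-run layout are assumptions.

# failing_sites THRESHOLD LOG...: print every URL that has a non-200 code
# in at least THRESHOLD of the given logs (lines look like "code--->url").
failing_sites() {
    local threshold=$1
    shift
    awk -v n="$threshold" '
        BEGIN { FS = "--->" }
        $1 != "200" { fails[$2]++ }
        END { for (url in fails) if (fails[url] >= n) print url }
    ' "$@" | sort
}
```

Calling `failing_sites 3 run1.log run2.log run3.log` would list the sites that failed in all three runs; those are the candidates for manual confirmation before being hidden.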