This article shows how to record crawler visits with PHP. It creates a crawler table in the database, uses robot.php to record each visiting crawler and insert its details into that table, and then reads all crawler records back from the database. If you need this, you can use it as a reference. The implementation is as follows:
Database Design
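create table crawler (
    crawler_ID bigint unsigned not null auto_increment primary key,
    crawler_category varchar(255) not null,  -- length assumed; the source omits it
    crawler_date datetime not null default '0000-00-00 00:00:00',
    crawler_url varchar(255) not null,       -- length assumed; the source omits it
    crawler_IP varchar(45) not null          -- length assumed; 45 also fits an IPv6 address
) default charset=utf8;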
The following file, robot.php, records each crawler visit and writes the information to the database:
<?php
// Rebuild the URL that was requested.
$ServerName  = $_SERVER["SERVER_NAME"];
$ServerPort  = $_SERVER["SERVER_PORT"];
$ScriptName  = $_SERVER["SCRIPT_NAME"];
$QueryString = $_SERVER["QUERY_STRING"];
$serverip    = $_SERVER["REMOTE_ADDR"];
$Url = "http://" . $ServerName;
if ($ServerPort != "80") {
    $Url = $Url . ":" . $ServerPort;
}
$Url = $Url . $ScriptName;
if ($QueryString != "") {
    $Url = $Url . "?" . $QueryString;
}
$GetLocationURL = $Url;

// Classify the visitor by its User-Agent string.
// The checks run in order, so a later, more specific match overrides "Other Crawler".
$agent = isset($_SERVER["HTTP_USER_AGENT"]) ? strtolower($_SERVER["HTTP_USER_AGENT"]) : "";
$Bot = "";
if (strpos($agent, "bot") !== false) { $Bot = "Other Crawler"; }
if (strpos($agent, "googlebot") !== false) { $Bot = "Google"; }
if (strpos($agent, "mediapartners-google") !== false) { $Bot = "Google Adsense"; }
if (strpos($agent, "baiduspider") !== false) { $Bot = "Baidu"; }
if (strpos($agent, "sogou spider") !== false) { $Bot = "Sogou"; }
if (strpos($agent, "yahoo") !== false) { $Bot = "Yahoo!"; }
if (strpos($agent, "msn") !== false) { $Bot = "MSN"; }
if (strpos($agent, "ia_archiver") !== false) { $Bot = "Alexa"; }
if (strpos($agent, "iaarchiver") !== false) { $Bot = "Alexa"; }
if (strpos($agent, "sohu") !== false) { $Bot = "Sohu"; }
if (strpos($agent, "sqworm") !== false) { $Bot = "AOL"; }
// The agent string is lowercased above, so the needle must be lowercase as well.
if (strpos($agent, "yodaobot") !== false) { $Bot = "Yodao"; }
if (strpos($agent, "iaskspider") !== false) { $Bot = "Iask"; }

require("./dbinfo.php");
date_default_timezone_set('PRC');
// "H" gives the 24-hour clock, which a datetime column expects.
$shijian = date("Y-m-d H:i:s", time());

// Connect to the MySQL server.
$connection = mysql_connect($host, $username, $password);
if (!$connection) {
    die('Not connected: ' . mysql_error());
}
// Set the active MySQL database.
$db_selected = mysql_select_db($database, $connection);
if (!$db_selected) {
    die('Can\'t use db: ' . mysql_error());
}
// The URL contains visitor-supplied text, so escape it before building the query.
$GetLocationURL = mysql_real_escape_string($GetLocationURL, $connection);
// Insert the record into the database.
$query = "insert into crawler (crawler_category, crawler_date, crawler_url, crawler_IP) values ('$Bot', '$shijian', '$GetLocationURL', '$serverip')";
$result = mysql_query($query);
if (!$result) {
    die('Invalid query: ' . mysql_error());
}
?>
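The mysql_* functions used in robot.php were removed in PHP 7.0, so the listing only runs on PHP 5. As a rough sketch of how the same two steps might look on a current PHP version, the block below keeps the article's signature list in a lookup table and writes the row through a mysqli prepared statement. The function names detectCrawler and recordCrawlerVisit are hypothetical and do not come from the article; the table and column names are the ones from the schema above.

```php
<?php
// Hypothetical helper: classify a User-Agent string with the article's signature list.
// As in robot.php, checks run in order and the last matching signature wins.
function detectCrawler(string $userAgent): string
{
    $signatures = [
        'bot'                  => 'Other Crawler',
        'googlebot'            => 'Google',
        'mediapartners-google' => 'Google Adsense',
        'baiduspider'          => 'Baidu',
        'sogou spider'         => 'Sogou',
        'yahoo'                => 'Yahoo!',
        'msn'                  => 'MSN',
        'ia_archiver'          => 'Alexa',
        'iaarchiver'           => 'Alexa',
        'sohu'                 => 'Sohu',
        'sqworm'               => 'AOL',
        'yodaobot'             => 'Yodao',
        'iaskspider'           => 'Iask',
    ];

    $agent = strtolower($userAgent);
    $category = '';
    foreach ($signatures as $needle => $label) {
        // strpos() returns false when the needle is absent, so compare strictly.
        if (strpos($agent, $needle) !== false) {
            $category = $label;
        }
    }
    return $category; // empty string when no signature matched
}

// Hypothetical helper: insert one visit using a prepared statement, so that
// visitor-supplied text in the URL cannot alter the SQL.
function recordCrawlerVisit(mysqli $db, string $category, string $url, string $ip): bool
{
    $stmt = $db->prepare(
        'insert into crawler (crawler_category, crawler_date, crawler_url, crawler_IP) values (?, ?, ?, ?)'
    );
    if ($stmt === false) {
        return false;
    }
    $now = date('Y-m-d H:i:s');
    $stmt->bind_param('ssss', $category, $now, $url, $ip);
    $ok = $stmt->execute();
    $stmt->close();
    return $ok;
}
```

Assuming a connection opened with new mysqli($host, $username, $password, $database), using the variables from dbinfo.php, a page could call recordCrawlerVisit($db, detectCrawler($agent), $GetLocationURL, $serverip). Skipping the insert when detectCrawler returns an empty string would keep ordinary browser visits out of the table; robot.php as written records every request.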
With this in place, you can query the database to see when each spider crawled your pages.
<?php
include './robot.php';
include '../library/page.class.php';
// Default to the first page when no page parameter is given.
$page = isset($_GET['page']) ? (int) $_GET['page'] : 1;
include '../library/conn_new.php';
$count = $mysql->num_rows($mysql->query("select * from crawler"));
// Assumed value: the page size argument is missing from the source.
$pageSize = 20;
$pages = new PageClass($count, $pageSize, $page, $_SERVER['PHP_SELF'] . '?page={page}');
$sql = "select * from crawler order by ";
$sql .= "crawler_date desc limit " . $pages->page_limit . "," . $pages->myde_size;
$result = $mysql->query($sql);
?>
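<table>
<tr>
<th>Crawler access time</th>
<th>Crawler category</th>
<th>Crawler IP</th>
<th>Crawler URL</th>
</tr>
<?php while ($myrow = $mysql->fetch_array($result)) { ?>
<tr>
<td><?php echo htmlspecialchars($myrow["crawler_date"]); ?></td>
<td><?php echo htmlspecialchars($myrow["crawler_category"]); ?></td>
<td><?php echo htmlspecialchars($myrow["crawler_IP"]); ?></td>
<td><?php echo htmlspecialchars($myrow["crawler_url"]); ?></td>
</tr>
<?php } ?>
</table>
<?php echo $pages->myde_write(); ?>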
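PageClass itself is not shown in the article, so how it computes page_limit and myde_size is not known. As a minimal sketch of the arithmetic a pager of this kind typically needs, the hypothetical helper below (pageWindow is an illustrative name, not part of PageClass) turns a row count, a page size, and a requested page number into the offset and size used in the LIMIT clause.

```php
<?php
// Hypothetical helper: compute the LIMIT window for one page of results.
// Out-of-range page numbers are clamped to the first or last page.
function pageWindow(int $total, int $pageSize, int $page): array
{
    $pageSize  = max(1, $pageSize);
    $pageCount = max(1, (int) ceil($total / $pageSize));
    $page      = min(max(1, $page), $pageCount);

    return [
        'offset' => ($page - 1) * $pageSize, // rows to skip
        'size'   => $pageSize,               // rows to fetch
        'pages'  => $pageCount,              // total number of pages
    ];
}

// Build the same kind of query as the listing page, using integer placeholders only.
$window = pageWindow(45, 20, 3);
$sql = sprintf(
    'select * from crawler order by crawler_date desc limit %d,%d',
    $window['offset'],
    $window['size']
);
```

Clamping the page number keeps a hand-edited ?page= value from producing a negative or out-of-range offset.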
That is all the PHP code needed to record crawler visits. I hope it helps.