Recording crawler visits with PHP
This article shows how to record crawler visits. We first create a crawler table in the database, then have robot.php identify each visiting crawler and insert its details into that table, so that all crawler information can later be read back from the database. The implementation is as follows:
Database Design
create table crawler (
    crawler_ID bigint unsigned not null auto_increment primary key,
    crawler_category varchar(255) not null,  -- length lost in the source listing; 255 is an assumed value
    crawler_date datetime not null default '0000-00-00 00:00:00',
    crawler_url varchar(255) not null,       -- assumed length
    crawler_IP varchar(255) not null         -- assumed length
) default charset=utf8;
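To try out the table layout without a MySQL server, the following minimal sketch recreates it in an in-memory SQLite database through PDO. This is an assumption made for illustration only: the column types are SQLite's rather than the ones above, and the helper names openDemoDatabase, recordVisit and visitsPerCrawler are hypothetical, not part of the original article.

```php
<?php
// Assumption: an in-memory SQLite database stands in for MySQL,
// so the column types below are SQLite's, not the article's.
function openDemoDatabase(): PDO
{
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec(
        'CREATE TABLE crawler (
            crawler_ID INTEGER PRIMARY KEY AUTOINCREMENT,
            crawler_category TEXT NOT NULL,
            crawler_date TEXT NOT NULL,
            crawler_url TEXT NOT NULL,
            crawler_IP TEXT NOT NULL
        )'
    );
    return $pdo;
}

// Hypothetical helper: insert one visit using bound parameters.
function recordVisit(PDO $pdo, string $category, string $date, string $url, string $ip): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO crawler (crawler_category, crawler_date, crawler_url, crawler_IP)
         VALUES (?, ?, ?, ?)'
    );
    $stmt->execute([$category, $date, $url, $ip]);
}

// Hypothetical helper: number of recorded visits per crawler category.
function visitsPerCrawler(PDO $pdo): array
{
    $rows = $pdo->query(
        'SELECT crawler_category, COUNT(*) FROM crawler GROUP BY crawler_category'
    )->fetchAll(PDO::FETCH_KEY_PAIR);
    // Normalise the counts to integers regardless of driver settings.
    return array_map('intval', $rows);
}
```

The same two queries should work against the MySQL table if the PDO connection string is changed, since they only use the four column names from the schema.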
The following file, robot.php, identifies the crawler and writes its information to the database:
<?php
// Rebuild the URL that the crawler requested.
$ServerName  = $_SERVER["SERVER_NAME"];
$ServerPort  = $_SERVER["SERVER_PORT"];
$ScriptName  = $_SERVER["SCRIPT_NAME"];
$QueryString = $_SERVER["QUERY_STRING"];
$serverip    = $_SERVER["REMOTE_ADDR"];
$Url = "http://" . $ServerName;
if ($ServerPort != "80") {
    $Url = $Url . ":" . $ServerPort;
}
$Url = $Url . $ScriptName;
if ($QueryString != "") {
    $Url = $Url . "?" . $QueryString;
}
$GetLocationURL = $Url;

// Identify the crawler from the lower-cased User-Agent.
// Later, more specific matches override the generic "bot" match.
$agent = strtolower(isset($_SERVER["HTTP_USER_AGENT"]) ? $_SERVER["HTTP_USER_AGENT"] : "");
$Bot = "";
if (strpos($agent, "bot") !== false) { $Bot = "Other Crawler"; }
if (strpos($agent, "googlebot") !== false) { $Bot = "Google"; }
if (strpos($agent, "mediapartners-google") !== false) { $Bot = "Google Adsense"; }
if (strpos($agent, "baiduspider") !== false) { $Bot = "Baidu"; }
if (strpos($agent, "sogou spider") !== false) { $Bot = "Sogou"; }
if (strpos($agent, "yahoo") !== false) { $Bot = "Yahoo!"; }
if (strpos($agent, "msn") !== false) { $Bot = "MSN"; }
if (strpos($agent, "ia_archiver") !== false) { $Bot = "Alexa"; }
if (strpos($agent, "iaarchiver") !== false) { $Bot = "Alexa"; }
if (strpos($agent, "sohu") !== false) { $Bot = "Sohu"; }
if (strpos($agent, "sqworm") !== false) { $Bot = "AOL"; }
// The agent string is already lower case, so the needle must be too.
if (strpos($agent, "yodaobot") !== false) { $Bot = "Yodao"; }
if (strpos($agent, "iaskspider") !== false) { $Bot = "Iask"; }

require("./dbinfo.php"); // defines $host, $username, $password and $database
date_default_timezone_set('PRC');
$shijian = date("Y-m-d H:i:s", time());

// connect to the MySQL server
// (mysqli is used because the mysql_* functions were removed in PHP 7)
$connection = mysqli_connect($host, $username, $password);
if (!$connection) {
    die('Not connected: ' . mysqli_connect_error());
}
// set the active MySQL database
$db_selected = mysqli_select_db($connection, $database);
if (!$db_selected) {
    die('Can\'t use db: ' . mysqli_error($connection));
}
// insert data into the database; bound parameters keep the requested URL from breaking the SQL
$query = "insert into crawler (crawler_category, crawler_date, crawler_url, crawler_IP) values (?, ?, ?, ?)";
$stmt = mysqli_prepare($connection, $query);
if (!$stmt) {
    die('Invalid query: ' . mysqli_error($connection));
}
mysqli_stmt_bind_param($stmt, "ssss", $Bot, $shijian, $GetLocationURL, $serverip);
if (!mysqli_stmt_execute($stmt)) {
    die('Invalid query: ' . mysqli_stmt_error($stmt));
}
?>
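The chain of strpos checks in robot.php can also be written as a lookup table, which makes the matching order explicit. The following is a minimal sketch rather than the original author's code: the function name detectCrawler and the constant CRAWLER_SIGNATURES are hypothetical, the first match wins (so specific signatures are listed before the generic "bot"), and it returns null for ordinary visitors so that a caller could skip the insert for them.

```php
<?php
// Hypothetical lookup table: lower-case User-Agent substring => crawler label.
// Specific signatures come first so that "googlebot" wins over the generic "bot".
const CRAWLER_SIGNATURES = [
    'mediapartners-google' => 'Google Adsense',
    'googlebot'            => 'Google',
    'baiduspider'          => 'Baidu',
    'sogou spider'         => 'Sogou',
    'yodaobot'             => 'Yodao',
    'iaskspider'           => 'Iask',
    'ia_archiver'          => 'Alexa',
    'iaarchiver'           => 'Alexa',
    'sqworm'               => 'AOL',
    'yahoo'                => 'Yahoo!',
    'sohu'                 => 'Sohu',
    'msn'                  => 'MSN',
    'bot'                  => 'Other Crawler',
];

// Returns the crawler label, or null when no signature matches.
function detectCrawler(string $userAgent): ?string
{
    $agent = strtolower($userAgent);
    foreach (CRAWLER_SIGNATURES as $needle => $label) {
        if (strpos($agent, $needle) !== false) {
            return $label;
        }
    }
    return null;
}
```

In robot.php this could be called as detectCrawler(isset($_SERVER["HTTP_USER_AGENT"]) ? $_SERVER["HTTP_USER_AGENT"] : ""), with the insert executed only when the result is not null.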
Now you can query the database to see when each spider crawled your pages. The following page lists the recorded visits, newest first, with pagination:
<?php
include './robot.php';
include '../library/page.class.php';
$page = isset($_GET['page']) ? $_GET['page'] : 1;
include '../library/conn_new.php';
$pageSize = 20; // the page size was lost from the source listing; 20 is an assumed value
$count = $mysql->num_rows($mysql->query("select * from crawler"));
$pages = new PageClass($count, $pageSize, $page, $_SERVER['PHP_SELF'] . '?page={page}');
$sql = "select * from crawler order by ";
$sql .= "crawler_date desc limit " . $pages->page_limit . "," . $pages->myde_size;
$result = $mysql->query($sql);
?>
<table>
<thead>
<tr>
<td bgcolor="#CCFFFF"></td>
<td bgcolor="#CCFFFF" align="center">Crawler access time</td>
<td bgcolor="#CCFFFF" align="center">Crawler category</td>
<td bgcolor="#CCFFFF" align="center">Crawler IP</td>
<td bgcolor="#CCFFFF" align="center">Crawler access URL</td>
</tr>
</thead>
<?php while ($myrow = $mysql->fetch_array($result)) { ?>
<tr>
<td></td>
<td style="font-family: Georgia"><?php echo $myrow["crawler_date"] ?></td>
<td><?php echo $myrow["crawler_category"] ?></td>
<td><?php echo $myrow["crawler_IP"] ?></td>
<?php // the URL comes from the request, so escape it before printing ?>
<td><?php echo htmlspecialchars($myrow["crawler_url"]) ?></td>
</tr>
<?php } ?>
</table>
<?php echo $pages->myde_write(); ?>
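The listing relies on PageClass from page.class.php, which the article does not show. As a rough illustration of what such a class has to compute for the limit clause, here is a minimal sketch; pageWindow is a hypothetical helper and not the actual PageClass code, and it assumes the page size is a positive integer.

```php
<?php
// Hypothetical helper: turn a total row count, a page size and a requested
// page number into the offset/limit pair used in "limit offset,limit".
function pageWindow(int $total, int $pageSize, int $page): array
{
    // At least one page exists, even when the table is empty.
    $pageCount = max(1, (int) ceil($total / $pageSize));
    // Clamp the requested page into the valid range 1..$pageCount.
    $page = min(max(1, $page), $pageCount);
    return [
        'page'      => $page,
        'pageCount' => $pageCount,
        'offset'    => ($page - 1) * $pageSize,
        'limit'     => $pageSize,
    ];
}
```

With 45 records and 20 per page, for example, page 3 starts at offset 40, and any request beyond the last page is clamped to page 3.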
That is all there is to recording crawler visits with PHP. I hope it helps.