Recording crawler visits with PHP

Source: Internet
Author: User


This article shows how to record crawler visits with PHP. It starts by creating a crawler table in the database. The script robot.php then inserts a record for each visiting crawler into that table, and a listing page reads all the recorded crawler information back from the database. The implementation is as follows:

Database Design

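    -- The column lengths were lost from the original listing; varchar(255) is an assumed value.
    -- The zero-date default is rejected by strict SQL modes (NO_ZERO_DATE); drop it there,
    -- since robot.php always supplies crawler_date.
    create table crawler (
        crawler_ID       bigint unsigned not null auto_increment primary key,
        crawler_category varchar(255) not null,
        crawler_date     datetime not null default '0000-00-00 00:00:00',
        crawler_url      varchar(255) not null,
        crawler_IP       varchar(255) not null
    ) default charset=utf8;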

The following file, robot.php, identifies the crawler and writes the visit to the database:

    <?php
    // robot.php: record the current request in the crawler table.

    // Rebuild the URL that was requested.
    $ServerName  = $_SERVER["SERVER_NAME"];
    $ServerPort  = $_SERVER["SERVER_PORT"];
    $ScriptName  = $_SERVER["SCRIPT_NAME"];
    $QueryString = $_SERVER["QUERY_STRING"] ?? "";
    $serverip    = $_SERVER["REMOTE_ADDR"]; // IP address of the visitor

    $Url = "http://" . $ServerName;
    if ($ServerPort != "") {
        $Url = $Url . ":" . $ServerPort;
    }
    $Url = $Url . $ScriptName;
    if ($QueryString != "") {
        $Url = $Url . "?" . $QueryString;
    }
    $GetLocationURL = $Url;

    // Classify the visitor by its User-Agent string. The string is lowercased
    // first, so every pattern must be lowercase. Later matches override earlier
    // ones, which is why the generic "bot" check comes first.
    $agent = strtolower($_SERVER["HTTP_USER_AGENT"] ?? "");
    $Bot = "";
    if (strpos($agent, "bot") !== false)                  { $Bot = "Other Crawler"; }
    if (strpos($agent, "googlebot") !== false)            { $Bot = "Google"; }
    if (strpos($agent, "mediapartners-google") !== false) { $Bot = "Google Adsense"; }
    if (strpos($agent, "baiduspider") !== false)          { $Bot = "Baidu"; }
    if (strpos($agent, "sogou spider") !== false)         { $Bot = "Sogou"; }
    if (strpos($agent, "yahoo") !== false)                { $Bot = "Yahoo!"; }
    if (strpos($agent, "msn") !== false)                  { $Bot = "MSN"; }
    if (strpos($agent, "ia_archiver") !== false)          { $Bot = "Alexa"; }
    if (strpos($agent, "iaarchiver") !== false)           { $Bot = "Alexa"; }
    if (strpos($agent, "sohu") !== false)                 { $Bot = "Sohu"; }
    if (strpos($agent, "sqworm") !== false)               { $Bot = "AOL"; }
    if (strpos($agent, "yodaobot") !== false)             { $Bot = "Yodao"; }
    if (strpos($agent, "iaskspider") !== false)           { $Bot = "Iask"; }

    // dbinfo.php is expected to define $host, $username, $password and $database.
    require("./dbinfo.php");
    date_default_timezone_set('PRC');
    $shijian = date("Y-m-d H:i:s"); // current time, 24-hour clock

    // Connect to the MySQL server and select the database. The original listing
    // used the mysql_* functions, which were removed in PHP 7; the mysqli
    // equivalents are used here.
    $connection = mysqli_connect($host, $username, $password, $database);
    if (!$connection) {
        die('Not connected: ' . mysqli_connect_error());
    }

    // Insert the visit with a prepared statement so the URL and User-Agent data
    // cannot break the query. As in the original, every request is recorded;
    // for ordinary browsers crawler_category is an empty string.
    $query = "insert into crawler (crawler_category, crawler_date, crawler_url, crawler_IP) values (?, ?, ?, ?)";
    $stmt = mysqli_prepare($connection, $query);
    if (!$stmt) {
        die('Invalid query: ' . mysqli_error($connection));
    }
    mysqli_stmt_bind_param($stmt, "ssss", $Bot, $shijian, $GetLocationURL, $serverip);
    if (!mysqli_stmt_execute($stmt)) {
        die('Invalid query: ' . mysqli_stmt_error($stmt));
    }
    ?>
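The chain of if statements in robot.php can also be expressed as a lookup table, which makes it easier to add a new crawler signature. The following is only a sketch: the function name classify_crawler is hypothetical and does not appear in the original. The signatures are listed in the reverse of the original order and the first match is returned, which gives the same result as the original rule that later checks override earlier ones.

```php
<?php
// Hypothetical helper (not part of the original robot.php): map a User-Agent
// string to the crawler label used in the crawler_category column.
function classify_crawler(string $userAgent): string
{
    // Reverse of the original check order; returning the first match here
    // is equivalent to "the last matching if statement wins" in robot.php.
    $signatures = [
        'iaskspider'           => 'Iask',
        'yodaobot'             => 'Yodao',
        'sqworm'               => 'AOL',
        'sohu'                 => 'Sohu',
        'iaarchiver'           => 'Alexa',
        'ia_archiver'          => 'Alexa',
        'msn'                  => 'MSN',
        'yahoo'                => 'Yahoo!',
        'sogou spider'         => 'Sogou',
        'baiduspider'          => 'Baidu',
        'mediapartners-google' => 'Google Adsense',
        'googlebot'            => 'Google',
        'bot'                  => 'Other Crawler', // generic fallback, checked last
    ];

    // Signatures are lowercase, so lowercase the input before matching.
    $agent = strtolower($userAgent);
    foreach ($signatures as $needle => $label) {
        // strpos() returns false when there is no match and 0 for a match
        // at the start of the string, so compare strictly against false.
        if (strpos($agent, $needle) !== false) {
            return $label;
        }
    }
    return ''; // not recognised as a crawler
}
```

With such a helper, robot.php could set $Bot = classify_crawler($_SERVER["HTTP_USER_AGENT"] ?? ""); and skip the insert when the result is an empty string, if only crawler visits should be stored.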

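Once robot.php is included in a page, you can query the database to see when each spider crawled that page. The following page lists the recorded visits, newest first, with pagination: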

    <?php
    include './robot.php';
    include '../library/page.class.php';
    $page = $_GET['page'] ?? 1;
    include '../library/conn_new.php';

    // Assumed value: the page-size argument is missing from the source listing.
    $pageSize = 20;

    $count = $mysql->num_rows($mysql->query("select * from crawler"));
    $pages = new PageClass($count, $pageSize, $page, $_SERVER['PHP_SELF'] . '?page={page}');

    // Newest visits first, limited to the current page.
    $sql  = "select * from crawler order by ";
    $sql .= "crawler_date desc limit " . $pages->page_limit . "," . $pages->myde_size;
    $result = $mysql->query($sql);
    ?>
    <table>
      <thead>
        <tr>
          <td bgcolor="#CCFFFF"></td>
          <td bgcolor="#CCFFFF" align="center">Crawler access time</td>
          <td bgcolor="#CCFFFF" align="center">Crawler category</td>
          <td bgcolor="#CCFFFF" align="center">Crawler IP</td>
          <td bgcolor="#CCFFFF" align="center">Crawler access URL</td>
        </tr>
      </thead>
    <?php while ($myrow = $mysql->fetch_array($result)) { ?>
      <tr>
        <td></td>
        <td style="font-family: Georgia"><?php echo htmlspecialchars($myrow["crawler_date"]); ?></td>
        <td><?php echo htmlspecialchars($myrow["crawler_category"]); ?></td>
        <td><?php echo htmlspecialchars($myrow["crawler_IP"]); ?></td>
        <td><?php echo htmlspecialchars($myrow["crawler_url"]); ?></td>
      </tr>
    <?php } ?>
    </table>
    <?php echo $pages->myde_write(); ?>
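The listing page relies on PageClass to supply the two numbers of the limit clause, but that class is not shown in the article. As an illustration only, and not as the implementation of PageClass, the arithmetic behind such a clause might look like the sketch below. The function name page_limit_clause and its parameters are hypothetical.

```php
<?php
// Hypothetical helper (PageClass itself is not shown in the article):
// build the "limit offset, count" clause for a 1-based page number.
function page_limit_clause(int $totalRows, int $pageSize, int $page): string
{
    $pageSize = max(1, $pageSize);                          // avoid division by zero
    $lastPage = max(1, (int) ceil($totalRows / $pageSize)); // at least one page
    $page     = min(max(1, $page), $lastPage);              // clamp into [1, lastPage]
    $offset   = ($page - 1) * $pageSize;                    // rows to skip

    // Both values are integers, so they are safe to place in the SQL text.
    return sprintf('limit %d, %d', $offset, $pageSize);
}
```

For example, with 95 rows and 20 rows per page, page_limit_clause(95, 20, 3) returns "limit 40, 20", and an out-of-range request such as page 99 is clamped to the last page, giving "limit 80, 20".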

That is all of the PHP code needed to record crawler visits. I hope it helps.
