Recording crawler visits with PHP code (very practical)

Source: Internet
Author: User



This article shows how to record crawler visits in PHP. It starts by creating a crawler table in the database. The script robot.php then identifies each visiting crawler and inserts its details into that table, after which all crawler visits can be read back from the database. The implementation is as follows:

Database design

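-- The column lengths were lost in the source; the values below are assumptions.
-- MySQL rejects the zero-date default when the NO_ZERO_DATE SQL mode is enabled under strict mode.
CREATE TABLE crawler (
  crawler_id       BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  crawler_category VARCHAR(50)  NOT NULL,
  crawler_date     DATETIME     NOT NULL DEFAULT '0000-00-00 00:00:00',
  crawler_url      VARCHAR(200) NOT NULL,
  crawler_ip       VARCHAR(45)  NOT NULL
) DEFAULT CHARSET=utf8;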

The following file, robot.php, identifies the visiting crawler and writes its information to the database:

<?php
// Rebuild the URL that was requested.
$ServerName  = $_SERVER["SERVER_NAME"];
$ServerPort  = $_SERVER["SERVER_PORT"];
$ScriptName  = $_SERVER["SCRIPT_NAME"];
$QueryString = isset($_SERVER["QUERY_STRING"]) ? $_SERVER["QUERY_STRING"] : "";
$serverip    = $_SERVER["REMOTE_ADDR"]; // IP address of the visitor

$Url = "http://" . $ServerName;
if ($ServerPort != "") { $Url = $Url . ":" . $ServerPort; }
$Url = $Url . $ScriptName;
if ($QueryString != "") { $Url = $Url . "?" . $QueryString; }
$GetLocationURL = $Url;

// Classify the visitor by its User-Agent. The agent is lowercased first,
// so every needle is lowercase. Later matches override earlier ones.
$agent = isset($_SERVER["HTTP_USER_AGENT"]) ? $_SERVER["HTTP_USER_AGENT"] : "";
$agent = strtolower($agent);
$Bot = "";
if (strpos($agent, "bot") !== false)                  { $Bot = "Other crawler"; }
if (strpos($agent, "googlebot") !== false)            { $Bot = "Google"; }
if (strpos($agent, "mediapartners-google") !== false) { $Bot = "Google Adsense"; }
if (strpos($agent, "baiduspider") !== false)          { $Bot = "Baidu"; }
if (strpos($agent, "sogou spider") !== false)         { $Bot = "Sogou"; }
if (strpos($agent, "yahoo") !== false)                { $Bot = "Yahoo!"; }
if (strpos($agent, "msn") !== false)                  { $Bot = "MSN"; }
if (strpos($agent, "ia_archiver") !== false)          { $Bot = "Alexa"; }
if (strpos($agent, "iaarchiver") !== false)           { $Bot = "Alexa"; }
if (strpos($agent, "sohu") !== false)                 { $Bot = "Sohu"; }
if (strpos($agent, "sqworm") !== false)               { $Bot = "AOL"; }
if (strpos($agent, "yodaobot") !== false)             { $Bot = "Yodao"; }
if (strpos($agent, "iaskspider") !== false)           { $Bot = "iask"; }

require "./dbinfo.php"; // expected to define $host, $username, $password, $database
date_default_timezone_set('PRC');
$shijian = date("Y-m-d H:i:s");

// Connect to the MySQL server and select the database.
// mysqli is used here because the mysql_* functions of the original
// listing were removed in PHP 7.
$connection = mysqli_connect($host, $username, $password, $database);
if (!$connection) { die('Not connected: ' . mysqli_connect_error()); }

// Insert the record. A prepared statement keeps the User-Agent-derived
// and URL values from being interpreted as SQL.
// Note: visits that match no crawler are stored with an empty category.
$stmt = mysqli_prepare(
    $connection,
    "INSERT INTO crawler (crawler_category, crawler_date, crawler_url, crawler_ip) VALUES (?, ?, ?, ?)"
);
if (!$stmt) { die('Invalid query: ' . mysqli_error($connection)); }
mysqli_stmt_bind_param($stmt, "ssss", $Bot, $shijian, $GetLocationURL, $serverip);
if (!mysqli_stmt_execute($stmt)) { die('Invalid query: ' . mysqli_stmt_error($stmt)); }
?>
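The User-Agent matching in robot.php can also be written as a lookup table, which makes it easier to add or reorder crawlers. The following is only a minimal sketch: the function name detect_crawler is hypothetical and does not appear in the original script, and the signature list simply mirrors the checks above.

```php
<?php
// Hypothetical helper, not part of the original robot.php.
// Maps a User-Agent string to a crawler label; returns "" when nothing matches.
function detect_crawler(string $userAgent): string
{
    // Needles are lowercase because the agent is lowercased before matching.
    // Later entries override earlier ones, mirroring the order of the if-chain.
    $signatures = [
        'bot'                  => 'Other crawler',
        'googlebot'            => 'Google',
        'mediapartners-google' => 'Google Adsense',
        'baiduspider'          => 'Baidu',
        'sogou spider'         => 'Sogou',
        'yahoo'                => 'Yahoo!',
        'msn'                  => 'MSN',
        'ia_archiver'          => 'Alexa',
        'iaarchiver'           => 'Alexa',
        'sohu'                 => 'Sohu',
        'sqworm'               => 'AOL',
        'yodaobot'             => 'Yodao',
        'iaskspider'           => 'iask',
    ];

    $agent = strtolower($userAgent);
    $label = '';
    foreach ($signatures as $needle => $name) {
        if (strpos($agent, $needle) !== false) {
            $label = $name;
        }
    }
    return $label;
}

// Usage: a Googlebot User-Agent matches both "bot" and "googlebot";
// the later, more specific entry wins.
echo detect_crawler('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'), "\n"; // prints "Google"
```

Because the generic "bot" entry comes first, any more specific crawler listed after it replaces the "Other crawler" label, which is the same behaviour as the sequence of if statements.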

That's it. You can now query the database to see when a spider crawled your pages and which pages it visited.

The following page reads the records back and displays them with pagination:

<?php
include './robot.php';
include '../library/page.Class.php';
$page = isset($_GET['page']) ? $_GET['page'] : 1;
include '../library/conn_new.php';
$count = $mysql->num_rows($mysql->query("select * from crawler"));
$pages = new PageClass($count, $page, $_SERVER['PHP_SELF'] . '?page={page}');
$sql  = "select * from crawler order by ";
$sql .= "crawler_date desc limit " . $pages->page_limit . "," . $pages->myde_size;
$result = $mysql->query($sql);
?>
<table>
 <tr>
  <th>Crawler access time</th>
  <th>Crawler category</th>
  <th>Crawler IP</th>
  <th>URL accessed by the crawler</th>
 </tr>
 <?php while ($myrow = $mysql->fetch_array($result)) { ?>
 <tr>
  <td><?php echo $myrow["crawler_date"] ?></td>
  <td><?php echo htmlspecialchars($myrow["crawler_category"]) ?></td>
  <td><?php echo htmlspecialchars($myrow["crawler_ip"]) ?></td>
  <td><?php echo htmlspecialchars($myrow["crawler_url"]) ?></td>
 </tr>
 <?php } ?>
</table>
<?php echo $pages->myde_write(); ?>
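The listing page relies on a pagination class from ../library/page.Class.php that the article does not show. As a rough illustration of the arithmetic behind the offset and page size used in the LIMIT clause, here is a minimal sketch. The function limit_clause and its default page size of 20 are assumptions made for this example and are not the real class.

```php
<?php
// Hypothetical helper; the pagination class used by the article is not shown,
// so this only illustrates how an offset and page size are typically derived.
function limit_clause(int $total, int $page, int $pageSize = 20): string
{
    $pageSize = max(1, $pageSize);
    // At least one page exists, even when the table is empty.
    $lastPage = max(1, (int) ceil($total / $pageSize));
    // Clamp the requested page into the valid range.
    $page   = min(max(1, $page), $lastPage);
    $offset = ($page - 1) * $pageSize;
    return "LIMIT $offset, $pageSize";
}

// Usage: 95 records, 20 per page, third page.
echo limit_clause(95, 3), "\n"; // prints "LIMIT 40, 20"
```

Clamping the page number means an out-of-range or negative page parameter falls back to the nearest valid page instead of producing a negative offset.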

The above is the complete PHP code for recording crawler visits. I hope it helps.

