Original PHP collector v1.02

Source: Internet
Author: User

This collector scrapes content from websites and is easy to use. It supports paginated collection, image download, content filtering, and more. Documentation is sparse, so it is aimed at PHP developers who want to build on it (secondary development). The code snippets posted previously have been removed; please download the attachment directly. If you need a collection service, contact me.
PHP
<?php
/**
 * Collector
 * @author Administrator
 * @example
 * $config = array(
 *     'host' => 'server address',
 *     'list' => array(
 *         'items'                => array(regular expression group),
 *         'page_url'             => 'regular expression for the paging address; $1 is the link and $2 is the displayed number',
 *         'page_size'            => 'page size',
 *         'page_url_rule'        => 'regular expression that extracts the page number; $1 must be a number',
 *         'page_limit'           => number, the maximum number of pages to scan; if omitted, only the page numbers visible in the pager are scanned,
 *         'this_detail_callback' => 'callback applied to the detail page data',
 *         'list_detail_url'      => 'name of the attribute in items that holds the detail page address'
 *     ),
 *     'details' => array(
 *         all rules for the detail page; see the items structure below
 *     ),
 *     'time_limit' => array('rule' => corresponding group name, 'start' => start time, 'end' => end time),
 *     'num_limit'  => number of records to fetch
 * )
 *
 * items structure:
 * array(
 *     'attribute name' => array(
 *         'rule'    => regular expression, or an array of them when there are several,
 *         'type'    => 1 = text, 2 = remote request, 3 = sub-rule (list items), 4 = sub-config,
 *         'replace' => how to rewrite the result: a callback function, or array('from' => 'regular expression', 'to' => replacement string),
 *         'multi'   => whether to collect multiple records
 *     ),
 * )
 */
set_time_limit(0);
define('in_web', true);
date_default_timezone_set('PRC');
include('collector/init.php');

// Note: "…" marks regex text (opening HTML tags) that is missing from this listing; see the attachment for the full patterns.
$htmlFilter = '/…]*\/>|(onclick|onmouseover|onmouseout|onblur)=\"[^\"]+\"|…|<p[^>]*>|<\/p>|<style[^>]*>(.+?)<\/style>|<embed[^>]*>(.+?)<\/embed>|<object[^>]*>(.+?)<\/object>|<script[^>]*>(.*?)<\/script>|<noscript[^>]*>(.+?)<\/noscript>|<a[^>]*>|<\/a>/is';

$config = array(
    'host' => 'http://news.wto168.net/zixun/',
    'list' => array(
        'items' => array(
            'time' => array(
                'rule'  => '/>.*([0-9]{4}-[0-9]{1,2}-[0-9]{1,2}\s*[0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2})<\/li>/i',
                'multi' => true,
            ),
            'link' => array(
                'rule'  => '/…/i',
                'multi' => true,
            ),
            'title' => array(
                'rule'    => '/…([^>]+)<\/a>/i',
                'multi'   => true,
                'replace' => array('from' => '/【.+】/', 'to' => ''),
            ),
        ),
        'list_detail_url' => 'link',
        'page_url'        => '/…]*>\d+<\/option>/i',
        'page_url_rule'   => '/_(\d+)\.html/',
        'page_limit'      => 10,
    ),
    'details' => array(
        'content' => array(
            'rule'      => '/…(.+?)…/is',
            'keep_html' => true,
            'replace'   => array('from' => $htmlFilter, 'to' => ''),
        ),
    ),
    'list_url'   => '/^http:\/\/news\.wto168\.net\/zixun\/list/',
    'detail_url' => '/^http:\/\/news\.wto168\.net\/zixun\/.*\.html/i',
    'time_limit' => array('rule' => 'time', 'start' => date('Y-m-d')),
);

$c = new collector($config);
$url = 'http://news.wto168.net/zixun/list_56_1.html';
$res = $c->collect($url);
print_r($res);
?>
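
The listing above does not show the collector's internals, so the following is only a minimal sketch of how regex-based entries such as page_url_rule, replace, and time_limit could be evaluated with PHP's standard preg_* functions. The helper names (extractPageNumber, applyReplace, withinTimeLimit) are hypothetical and are not part of the collector, and treating the start and end bounds as inclusive is an assumption. The sketch needs PHP 7.1 or later because of the nullable return type.

```php
<?php
// Hypothetical helpers for illustration only; they are NOT the collector's code.

/**
 * Extract the page number from a paging URL.
 * Follows the documented 'page_url_rule' contract: group $1 must be a number.
 */
function extractPageNumber(string $url, string $rule): ?int
{
    if (preg_match($rule, $url, $m) === 1 && isset($m[1]) && ctype_digit($m[1])) {
        return (int) $m[1];
    }
    return null; // no page number found in this URL
}

/**
 * Apply a 'replace' entry, which the docblock describes as either a callback
 * function or array('from' => regular expression, 'to' => replacement string).
 */
function applyReplace(string $value, $replace): string
{
    if (is_callable($replace)) {
        return (string) call_user_func($replace, $value);
    }
    return (string) preg_replace($replace['from'], $replace['to'], $value);
}

/**
 * Decide whether a collected timestamp falls inside a 'time_limit' window.
 * Assumption: both bounds are optional and inclusive.
 */
function withinTimeLimit(string $time, array $limit): bool
{
    $ts = strtotime($time);
    if ($ts === false) {
        return false; // unparseable date
    }
    if (isset($limit['start']) && $ts < strtotime($limit['start'])) {
        return false;
    }
    if (isset($limit['end']) && $ts > strtotime($limit['end'])) {
        return false;
    }
    return true;
}

// Usage with values taken from the sample configuration.
echo extractPageNumber('http://news.wto168.net/zixun/list_56_3.html', '/_(\d+)\.html/'), "\n"; // 3
echo applyReplace('【News】Steel prices rise', array('from' => '/【.+】/', 'to' => '')), "\n";    // Steel prices rise
var_dump(withinTimeLimit('2014-05-02 10:00:00', array('start' => '2014-05-01')));              // bool(true)
```

Note that in list_56_3.html the rule skips "_56" because it is not followed by ".html", so only the trailing page number is captured.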
