PHP scraping class Snoopy.class.php: introduction and download


Snoopy is a PHP class that simulates a web browser: it can fetch the content of web pages and submit forms. This article describes the features of Snoopy.class.php and some of its common uses.

Official website: http://snoopy.sourceforge.net/ (if the site cannot be opened from your network, try a proxy browser)

Download Address: http://sourceforge.net/projects/snoopy/

Here are some of the features of Snoopy:

1. Fetches the content of a web page: fetch()
2. Fetches the text of a web page with the HTML tags removed: fetchtext()
3. Fetches the links and forms of a web page: fetchlinks(), fetchform()
4. Supports proxy hosts
5. Supports basic user name/password authentication
6. Supports setting the user agent, referer, cookies, and other header content
7. Supports browser redirects and lets you control the redirect depth
8. Expands the links in a web page into fully qualified URLs (the default)
9. Submits data and gets the returned result
10. Supports following HTML frames (added in v0.92)
11. Supports passing cookies along on redirects

Note: Snoopy.class.php requires PHP 4 or later. Because it is a plain PHP class, it needs no extension support, which makes it a good choice when the server does not support curl.

The following are some of the commonly used class methods:

fetch($URI)

This method fetches the content of a web page. The $URI parameter is the URL of the page to fetch, and the result is stored in $this->results. If the page uses frames, Snoopy follows each frame and stores the results as an array in $this->results.
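
As a minimal sketch of how the result and error properties fit together, the helper below calls fetch() and then inspects the error and results properties. The function name fetchOrFail() is hypothetical and not part of Snoopy; it accepts any object that exposes fetch(), results, and error as described in this article.

```php
<?php
// Hypothetical helper, not part of Snoopy: works with any object that
// exposes fetch(), ->results and ->error as described in this article.
function fetchOrFail(object $client, string $uri): string
{
    $client->fetch($uri);

    // The error property holds a message when something went wrong.
    if (!empty($client->error)) {
        throw new RuntimeException('Fetch failed: ' . $client->error);
    }

    // For framed pages the results may be an array (one entry per frame),
    // so join the entries into a single string.
    $results = $client->results;
    return is_array($results) ? implode("\n", $results) : (string) $results;
}

// Usage, assuming Snoopy.class.php is available next to this script:
// require 'Snoopy.class.php';
// echo fetchOrFail(new Snoopy, 'http://example.com/');
```

Throwing an exception is only one possible design; a script could just as well print the error message and continue.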

fetchtext($URI)

This method is similar to fetch(), except that it removes the HTML tags and other extraneous data and returns only the text content of the page.
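
To illustrate what "text only" means here, the following is a rough approximation built from standard PHP functions. It is not Snoopy's internal routine, and the function name roughTextOnly() is made up for this example.

```php
<?php
// Rough approximation of "text only" output using PHP built-ins.
// This is NOT Snoopy's internal routine, only an illustration of the idea.
function roughTextOnly(string $html): string
{
    // Drop script and style blocks first, since their bodies are not page text.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $html);

    // Replace every remaining tag with a space so adjacent words stay apart.
    $text = preg_replace('/<[^>]+>/', ' ', $html);

    // Decode entities such as &amp; into plain characters.
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');

    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text));
}

echo roughTextOnly('<html><body><h1>Title</h1><p>Tom &amp; Jerry</p></body></html>'), "\n";
// prints: Title Tom & Jerry
```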

fetchform($URI)

This method is similar to fetch(), except that it discards everything else and returns only the form elements of the page.

fetchlinks($URI)

This method is similar to fetch(), except that it removes the HTML tags and other extraneous data and returns only the links in the page. By default, relative links are automatically expanded into full URLs.
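
The idea of expanding a relative link into a full URL can be sketched as below. This is a simplified illustration and not Snoopy's own code: the function name expandLink() is hypothetical, and "../" segments, query-only links, and protocol-relative links are deliberately not handled.

```php
<?php
// Simplified illustration of expanding a link against the URL of the page
// it was found on. Not Snoopy's own code; "../" segments, query-only links
// and protocol-relative links are not handled here.
function expandLink(string $link, string $pageUrl): string
{
    // Already absolute (has a scheme): leave unchanged.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $link)) {
        return $link;
    }

    $parts = parse_url($pageUrl);
    $root  = $parts['scheme'] . '://' . $parts['host']
           . (isset($parts['port']) ? ':' . $parts['port'] : '');

    // Root-relative link: attach it to the scheme and host.
    if (strpos($link, '/') === 0) {
        return $root . $link;
    }

    // Otherwise resolve against the directory part of the page path.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $link;
}

echo expandLink('news/1.html', 'http://example.com/docs/index.html'), "\n";
// prints: http://example.com/docs/news/1.html
```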

submit($URI, $formvars)

This method submits a form to the address given by $URI. $formvars is an array holding the form fields.
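
To make the shape of $formvars concrete, the sketch below shows how an array of fields is typically serialized for a URL-encoded form post. It uses PHP's built-in http_build_query() rather than Snoopy's own encoder, and the field names are made up. The assumption is that the form is sent as application/x-www-form-urlencoded, which, as far as I know, is what Snoopy does by default.

```php
<?php
// Hypothetical form fields: keys are field names, values are the data to send.
$formvars = array(
    'q'    => 'php snoopy',
    'page' => 2,
);

// URL-encode the fields the way a browser encodes a simple form post.
$body = http_build_query($formvars);

echo $body, "\n"; // prints: q=php+snoopy&page=2
```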

submittext($URI, $formvars)

This method is similar to submit(); the only difference is that it removes the HTML tags and other extraneous data and returns only the text content of the page that comes back after the form is submitted.

submitlinks($URI, $formvars)

This method is similar to submit(), except that it removes the HTML tags and other extraneous data and returns only the links in the page. By default, relative links are automatically expanded into full URLs.

Class properties (default values are in parentheses):

$host            host to connect to
$port            port to connect to
$proxy_host      proxy host to use, if any
$proxy_port      proxy port to use, if any
$agent           user agent to masquerade as (Snoopy v0.1)
$referer         referer information, if any
$cookies         cookies, if any
$rawheaders      other header information, if any
$maxredirs       maximum number of redirects, 0 = none allowed (5)
$offsiteok       whether to allow off-site redirects (true)
$expandlinks     whether to expand links into fully qualified URLs (true)
$user            user name for authentication, if any
$pass            password for authentication, if any
$accept          HTTP Accept types (image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*)
$error           the error message, if any
$response_code   response code returned from the server
$headers         header information returned from the server
$maxlength       maximum length of the returned data
$read_timeout    timeout for read operations, 0 = no timeout (requires PHP 4 Beta 4+)
$timed_out       true if a read operation timed out (requires PHP 4 Beta 4+)
$maxframes       maximum number of frames to follow
$status          HTTP status of the fetch
$temp_dir        temporary directory that the web server can write to (/tmp)
$curl_path       path to the curl binary; set to false if there is no curl binary
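
As a small sketch of how these properties might be set before a request, the helper below assigns a handful of them. The function name configureClient() is hypothetical, and every value (user agent string, referer, proxy host and port) is a placeholder chosen for illustration.

```php
<?php
// Hypothetical helper that applies a few of the properties listed above.
// All values are placeholders; the proxy host and port are assumptions.
function configureClient(object $client): object
{
    $client->agent        = 'Mozilla/5.0 (compatible; ExampleBot/1.0)';
    $client->referer      = 'http://example.com/';
    $client->maxredirs    = 2;   // follow at most two redirects
    $client->read_timeout = 10;  // read timeout; 0 would mean no timeout
    $client->proxy_host   = 'proxy.example.com';
    $client->proxy_port   = 8080;
    return $client;
}

// Usage, assuming Snoopy.class.php is available next to this script:
// require 'Snoopy.class.php';
// $snoopy = configureClient(new Snoopy);
// $snoopy->fetch('http://example.com/');
```

Properties that are not assigned keep the defaults shown in parentheses in the list.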

The following are some common usage examples:

(1) Fetch the text of the home page of the PHP Programmer's Notes site

<?php
include 'Snoopy.class.php';
$snoopy = new Snoopy;
// Fetch the page and keep only its text content
$snoopy->fetchtext("http://www.Alixixi.com");
echo $snoopy->results;

(2) Fetch all the links on the home page of the PHP Programmer's Notes site

<?php
include 'Snoopy.class.php';
$snoopy = new Snoopy;
// Fetch the page and keep only its links
$snoopy->fetchlinks("http://www.Alixixi.com");
print_r($snoopy->results);

(3) Find out which fields the Renren login form needs to send and what its target address is

<?php
include 'Snoopy.class.php';
$snoopy = new Snoopy;
// Fetch only the form elements of the login page
$snoopy->fetchform("http://www.renren.com/PLogin.do");
print_r($snoopy->results);

(4) Simulate logging in to Renren

<?php
set_time_limit(0);
require 'Snoopy.class.php';
$snoopy = new Snoopy();
$snoopy->referer = 'http://www.renren.com/';
$snoopy->agent = "Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20100101 Firefox/22.0";
$submit_vars = array();
$submit_vars['email'] = 'login account';
$submit_vars['password'] = 'login password';
$url = 'http://www.renren.com/PLogin.do'; // URL that the login data is submitted to
$snoopy->submit($url, $submit_vars);
$snoopy->fetch("http://www.renren.com/"); // the page whose data you want to get
echo $snoopy->results;
