A Detailed Introduction to Snoopy, the PHP Scraping Class (Snoopy Tutorial)


Snoopy is a PHP class that simulates a web browser: it can fetch web content and submit forms, which makes it useful for writing scrapers and content-collection programs. This article explains in detail how to use Snoopy.

Some features of Snoopy:
Fetch the content of a web page: fetch
Fetch the text content of a web page (HTML tags removed): fetchtext
Fetch a page's links and forms: fetchlinks, fetchform
Proxy host support
Basic user name/password authentication
Setting of user_agent, referer, cookies, and header content
Browser-style redirects, with a configurable redirect depth
Expansion of links found in a page into fully qualified URLs (the default)
Form data submission, with the response returned
Following of HTML frames
Passing of cookies across redirects
Requires only PHP 4 or later. Because Snoopy is a plain PHP class that needs no extensions, it is a good choice when the server does not support curl.

Snoopy class methods and examples:

fetch($URI)
This method fetches the content of a web page.
The $URI parameter is the URL of the page to fetch.
The fetched result is stored in $this->results.
If the page uses frames, Snoopy follows each frame and stores the contents as an array in $this->results.

fetchtext($URI)
This method is similar to fetch(), except that it strips the HTML tags and other irrelevant data and returns only the text content of the page.
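
To make "text content only" concrete, here is a minimal sketch of reducing HTML to text. The function html_to_text is a hypothetical helper written for this illustration; it is not Snoopy's own code, and Snoopy's actual stripping rules may differ.

```php
<?php
// Hypothetical helper, not part of Snoopy: a rough approximation of
// reducing an HTML page to its text content.
function html_to_text(string $html): string
{
    // Drop script and style blocks entirely, including their contents.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $html);
    // Replace every remaining tag with a space so adjacent words stay apart.
    $text = preg_replace('/<[^>]+>/', ' ', $html);
    // Decode entities such as &amp; back into plain characters.
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text));
}

echo html_to_text('<h1>Title</h1><script>var x = 1;</script><p>Tom &amp; Jerry</p>'), "\n";
// prints: Title Tom & Jerry
```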

fetchform($URI)
This method is similar to fetch(), except that it strips the HTML tags and other irrelevant data and returns only the form elements of the page.

fetchlinks($URI)
This method is similar to fetch(), except that it strips the HTML tags and other irrelevant data and returns only the links in the page.
By default, relative links are automatically expanded into full URLs.
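
As an illustration of what expanding a relative link means, the sketch below resolves a link against the URL of the page it was found on. The function expand_link is a hypothetical helper, not Snoopy's implementation, and it handles only the common cases (absolute, root-relative, and document-relative links); protocol-relative links and "../" segments are not handled.

```php
<?php
// Hypothetical helper, not Snoopy's own code: resolve a link found in a page
// against the URL of that page. Handles only the common cases.
function expand_link(string $link, string $base): string
{
    // Already absolute (starts with a scheme): return unchanged.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $link)) {
        return $link;
    }
    $parts = parse_url($base);
    $root = $parts['scheme'] . '://' . $parts['host']
          . (isset($parts['port']) ? ':' . $parts['port'] : '');
    // Root-relative link: attach it to the scheme and host.
    if (strpos($link, '/') === 0) {
        return $root . $link;
    }
    // Document-relative link: attach it to the directory of the base path.
    $path = $parts['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $link;
}

echo expand_link('/images/taoav.gif', 'http://www.jb51.net/'), "\n";
// prints: http://www.jb51.net/images/taoav.gif
```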

submit($URI, $formvars)
This method submits a form to the URL given by $URI. $formvars is an array holding the form fields.
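
For orientation, this is roughly what an array of form fields looks like once it is URL-encoded into a request body. The sketch uses PHP's built-in http_build_query as a stand-in; it is an assumption that this matches Snoopy's output closely enough for illustration, since Snoopy builds the body with its own internal code. The function name encode_form is hypothetical.

```php
<?php
// Hypothetical helper: encode form fields as application/x-www-form-urlencoded.
// Uses PHP's http_build_query, not Snoopy's internal encoder.
function encode_form(array $formvars): string
{
    // Force "&" as the separator regardless of the arg_separator.output setting.
    return http_build_query($formvars, '', '&');
}

$formvars = ["username" => "admin", "pwd" => "a b&c"];
echo encode_form($formvars), "\n";
// prints: username=admin&pwd=a+b%26c
```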

submittext($URI, $formvars)
This method is similar to submit(); the only difference is that it strips the HTML tags and other irrelevant data and returns only the text content of the page that results from the submission.

submitlinks($URI, $formvars)
This method is similar to submit(), except that it strips the HTML tags and other irrelevant data and returns only the links in the resulting page.
By default, relative links are automatically expanded into full URLs.


Snoopy class attributes (default values in parentheses):

$host: the host to connect to
$port: the port to connect to
$proxy_host: the proxy host to use, if any
$proxy_port: the proxy port to use, if any
$agent: the user agent to masquerade as (Snoopy v0.1)
$referer: referer information to pass, if any
$cookies: cookies to pass, if any
$rawheaders: other header information to pass, if any
$maxredirs: maximum number of redirects, 0 = not allowed (5)
$offsiteok: whether to allow redirects to other sites (true)
$expandlinks: whether to expand links into fully qualified URLs (true)
$user: user name for authentication, if any
$pass: password for authentication, if any
$accept: HTTP accept types (image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*)
$error: where the error occurred, if any
$response_code: response code returned by the server
$headers: headers returned by the server
$maxlength: maximum length of the returned data
$read_timeout: timeout for read operations (requires PHP 4 Beta 4+); set to 0 for no timeout
$timed_out: true if a read operation timed out (requires PHP 4 Beta 4+)
$maxframes: maximum number of frames to follow
$status: HTTP status of the fetch
$temp_dir: temporary directory that the web server can write to (/tmp)
$curl_path: path to the curl binary; set to false if there is no curl binary

Here is an example:

<?php
include "Snoopy.class.php";
$snoopy = new Snoopy;

// Route requests through a proxy (host name only, without a scheme)
$snoopy->proxy_host = "www.jb51.net";
$snoopy->proxy_port = "80";

$snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)";
$snoopy->referer = "http://www.jb51.net";

// Cookie values are sent as strings
$snoopy->cookies["SessionID"] = "238472834723489";
$snoopy->cookies["FavoriteColor"] = "RED";

$snoopy->rawheaders["Pragma"] = "no-cache";

$snoopy->maxredirs = 2;
$snoopy->offsiteok = false;
$snoopy->expandlinks = false;

$snoopy->user = "Joe";
$snoopy->pass = "Bloe";

if ($snoopy->fetchtext("http://www.jb51.net")) {
    echo "<PRE>" . htmlspecialchars($snoopy->results) . "</PRE>\n";
} else {
    echo "Error fetching document: " . $snoopy->error . "\n";
}
?>

Fetching the content of a given URL

<?php
$url = "http://www.jb51.net";
include("snoopy.php");
$snoopy = new Snoopy;
$snoopy->fetch($url); // fetch the whole page
echo $snoopy->results; // show the result
// Alternatives:
// $snoopy->fetchtext($url);  // fetch the text content (HTML removed)
// $snoopy->fetchlinks($url); // fetch the links
// $snoopy->fetchform($url);  // fetch the forms
?>

Form submission

<?php
include("snoopy.php");
$snoopy = new Snoopy;
$formvars["username"] = "admin";
$formvars["pwd"] = "admin";
$action = "http://www.jb51.net"; // form submission address
$snoopy->submit($action, $formvars); // $formvars is the array of fields to submit
echo $snoopy->results; // show the response returned after the form is submitted
// Alternatives:
// $snoopy->submittext($action, $formvars);  // return only the text, with HTML stripped
// $snoopy->submitlinks($action, $formvars); // return only the links
?>

Once you can submit a form, you can do a lot more. Next we will disguise the IP address and the browser.

Disguising the browser

<?php
$formvars["username"] = "Lanfengye";
$formvars["pwd"] = "Lanfengye";
$action = "http://www.jb51.net";
include "snoopy.php";
$snoopy = new Snoopy;
$snoopy->cookies["PHPSESSID"] = 'fc106b1918bd522cc863f36890e6fff7'; // forged session ID
$snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)"; // forged browser
$snoopy->referer = "http://www.jb51.net"; // forged source page (HTTP_REFERER)
$snoopy->rawheaders["Pragma"] = "no-cache"; // cache-related HTTP header
$snoopy->rawheaders["X-Forwarded-For"] = "127.0.0.101"; // forged IP (seen by PHP as HTTP_X_FORWARDED_FOR)
$snoopy->submit($action, $formvars);
echo $snoopy->results;
?>

So we can disguise the session, the browser, and the IP address, which makes a lot of things possible.
For example, a voting page that checks a verification code and the voter's IP can be voted on repeatedly.
Note: disguising the IP here really means forging an HTTP header, so an IP obtained through REMOTE_ADDR cannot be disguised this way.
Only sites that read the IP from HTTP headers (usually to see through proxies) can be fed an IP of your choosing.
As for getting past the verification code, briefly:
First open the page in a normal browser and find the corresponding session ID.
Note down both the session ID and the verification code value.
Then use Snoopy to forge the request.
Principle: because the session ID is the same, the verification code the server expects is the same one that was shown the first time.
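
The distinction in the note above can be sketched from the server's side. Both functions below are hypothetical helpers written for this illustration; they take an array shaped like PHP's $_SERVER. The first trusts a client-supplied header and can therefore be fooled; the second reads only the address of the actual connection.

```php
<?php
// Hypothetical server-side helpers, shown only to illustrate the note above.
function ip_from_headers(array $server): string
{
    // Trusts a client-supplied header, so this value can be forged.
    if (!empty($server['HTTP_X_FORWARDED_FOR'])) {
        // The header may hold a comma-separated chain; take the first entry.
        return trim(explode(',', $server['HTTP_X_FORWARDED_FOR'])[0]);
    }
    return $server['REMOTE_ADDR'] ?? '';
}

function ip_from_connection(array $server): string
{
    // REMOTE_ADDR comes from the TCP connection, not from request headers.
    return $server['REMOTE_ADDR'] ?? '';
}

// Example request: the real peer is 203.0.113.7, the header claims 127.0.0.101.
$request = ['REMOTE_ADDR' => '203.0.113.7', 'HTTP_X_FORWARDED_FOR' => '127.0.0.101'];
echo ip_from_headers($request), "\n";    // prints: 127.0.0.101
echo ip_from_connection($request), "\n"; // prints: 203.0.113.7
```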

Sometimes we need to forge even more. Snoopy has thought of that for us:

<?php
$snoopy->proxy_host = "www.jb51.net";
$snoopy->proxy_port = "8080"; // use a proxy
$snoopy->maxredirs = 2; // maximum number of redirects
$snoopy->expandlinks = true; // expand links into full URLs; often used when scraping
// For example, a link to /images/taoav.gif becomes http://www.jb51.net/images/taoav.gif
$snoopy->maxframes = 5; // maximum number of frames to follow
// Note: when frames are fetched, $snoopy->results is an array
echo $snoopy->error; // show the error message, if any
?>
