A powerful PHP scraping tool: Snoopy

Source: Internet
Author: User

Source Address: http://sourceforge.net/projects/snoopy/

 

What Is Snoopy?

Snoopy is a PHP class that imitates the functions of a web browser. It can fetch webpage content and submit forms.

Some features of Snoopy:

* Easily fetch the content of a webpage
* Easily fetch the text of a webpage (HTML tags removed)
* Easily fetch the links in a webpage
* Supports proxy hosts
* Supports basic user name/password authentication
* Supports setting user_agent, referer, cookies, and header content
* Supports browser redirects and limits the redirect depth
* Expands the links in a webpage to fully qualified URLs (by default)
* Easily submit data and obtain the returned values
* Supports following HTML frames (added in v0.92)
* Supports passing cookies on redirects (added in v0.92)

For more detail, search online. Here are a few simple examples:

1. Get the content of a specified URL

PHP code
$url = "http://www.taoav.com";
include("Snoopy.class.php");
$snoopy = new Snoopy;
$snoopy->fetch($url); // fetch the full content
echo $snoopy->results; // display the result
// Optional alternatives to fetch():
// $snoopy->fetchtext($url);  // fetch the text content (HTML removed)
// $snoopy->fetchlinks($url); // fetch the links
// $snoopy->fetchform($url);  // fetch the forms

2. Form submission

PHP code
$formvars["username"] = "admin";
$formvars["pwd"] = "admin";
$action = "http://www.taoav.com"; // form submission address
$snoopy->submit($action, $formvars); // $formvars is the array of fields to submit
echo $snoopy->results; // the result returned after the form is submitted
// Optional alternatives to submit():
// $snoopy->submittext($action, $formvars);  // after submission, return only the text (HTML removed)
// $snoopy->submitlinks($action, $formvars); // after submission, return only the links
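
For intuition about what a form submission sends over the wire, here is a minimal sketch that uses PHP's built-in http_build_query rather than Snoopy itself. It assumes the form is posted as application/x-www-form-urlencoded, the usual encoding for simple HTML forms; the field names are the ones from the example above.

```php
<?php
// The same fields as in the form example above.
$formvars = [
    "username" => "admin",
    "pwd"      => "admin",
];

// http_build_query URL-encodes each name and value and joins the pairs with "&".
$body = http_build_query($formvars);

echo $body, "\n"; // username=admin&pwd=admin
```

A string like this is what a server typically parses into its form variables.
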
Now that we can submit forms, a lot more becomes possible. Next we will spoof the IP address and the browser.

3. Spoofing

PHP code
$formvars["username"] = "admin";
$formvars["pwd"] = "admin";
$action = "http://www.taoav.com";
include "Snoopy.class.php";
$snoopy = new Snoopy;
$snoopy->cookies["PHPSESSID"] = 'fc0000b1918bd522cc863f000090e6fff7'; // spoof the session ID
$snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)"; // spoof the browser
$snoopy->referer = "http://www.only4.cn"; // spoof the referring page (HTTP_REFERER)
$snoopy->rawheaders["Pragma"] = "no-cache"; // cache-related HTTP header
$snoopy->rawheaders["X-Forwarded-For"] = "127.0.0.101"; // spoof the IP address (read by PHP as HTTP_X_FORWARDED_FOR)
$snoopy->submit($action, $formvars);
echo $snoopy->results;

As shown above, we can spoof the session, the browser, and the IP address, which makes a lot of things possible.
For example, you can vote in a poll that is protected by a verification code and an IP check.
PS: the spoofed IP address is really just an HTTP header, so an address the server reads from REMOTE_ADDR cannot be faked.
Only an address the server reads from HTTP headers (a method used to see through proxies) can be forged this way.
Here is a brief description of how to handle the verification code:
First, open the page in a normal browser and find the session ID that corresponds to the verification code.
Write down both the session ID and the verification code value.
Then use Snoopy to forge the request with that session ID.
Principle: because the session ID is the same, the verification code the server expects is the same as the one shown the first time.
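
To make the note about REMOTE_ADDR concrete, here is a minimal server-side sketch. Both function names are hypothetical and the request array is made-up sample data standing in for $_SERVER; it only illustrates why a header-based IP check can be fooled while a REMOTE_ADDR check cannot.

```php
<?php
// Hypothetical helper: trusts a client-supplied header, so it can be fooled.
function naiveClientIp(array $server): string
{
    if (!empty($server['HTTP_X_FORWARDED_FOR'])) {
        // The header may hold a comma-separated list; take the first entry.
        $parts = explode(',', $server['HTTP_X_FORWARDED_FOR']);
        return trim($parts[0]);
    }
    return $server['REMOTE_ADDR'] ?? '';
}

// Hypothetical helper: uses only the address of the TCP connection.
function socketClientIp(array $server): string
{
    return $server['REMOTE_ADDR'] ?? '';
}

// Sample data standing in for $_SERVER on a request with a forged header.
$request = [
    'REMOTE_ADDR'          => '203.0.113.7',
    'HTTP_X_FORWARDED_FOR' => '127.0.0.101',
];

echo naiveClientIp($request), "\n";  // 127.0.0.101 (the forged value)
echo socketClientIp($request), "\n"; // 203.0.113.7 (the real connection address)
```
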

4. Sometimes we need to spoof even more. Snoopy has thought of that for us.

PHP code
$snoopy->proxy_host = "www.only4.cn";
$snoopy->proxy_port = "8080"; // use a proxy

$snoopy->maxredirs = 2; // maximum number of redirects to follow

$snoopy->expandlinks = true; // whether to expand links to full URLs; often used when scraping
// For example, the link /images/taoav.gif is expanded to its full URL http://www.taoav.com/images/taoav.gif. You could also do this replacement yourself with ereg_replace when producing the final output (ereg_replace was removed in PHP 7; use str_replace or preg_replace there).

$snoopy->maxframes = 5; // maximum number of frames to follow

// When fetching frames, $snoopy->results is an array.

echo $snoopy->error; // error message, if any
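
To illustrate what expanding a link means, here is a simplified sketch in plain PHP. The function expandLink is hypothetical and is not Snoopy's own implementation; it handles only full URLs, root-relative paths, and simple relative paths, and the base URLs are assumed for the example.

```php
<?php
// Hypothetical helper, not part of Snoopy: resolves a link against a base URL.
function expandLink(string $link, string $base): string
{
    // Already a full URL (scheme://...): leave it alone.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $link)) {
        return $link;
    }

    $parts = parse_url($base);
    $root  = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $root .= ':' . $parts['port'];
    }

    // Root-relative link such as /images/taoav.gif
    if (substr($link, 0, 1) === '/') {
        return $root . $link;
    }

    // Relative link: resolve against the directory part of the base path.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $link;
}

echo expandLink('/images/taoav.gif', 'http://www.taoav.com/index.html'), "\n";
// http://www.taoav.com/images/taoav.gif
echo expandLink('logo.gif', 'http://www.taoav.com/news/list.html'), "\n";
// http://www.taoav.com/news/logo.gif
```

Unlike a blanket string replacement on href and src attributes, this leaves links that are already absolute untouched.
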

With the basics covered, here is a complete demonstration:

PHP code
<?php
// var_dump($_SERVER);
include("Snoopy.class.php");
$snoopy = new Snoopy;
$snoopy->agent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5 FirePHP/0.2.1"; // browser information; use the user agent of the browser you copied the cookie from (PS: $_SERVER shows it)
$snoopy->referer = "http://bbs.phpchina.com/index.php";
$snoopy->expandlinks = true;
// Redacted sample cookie; replace it with your own.
$snoopy->rawheaders["cookie"] = "_utmz=response=(referral)|utmcsr=phpchina.com|utmcct=/html/index.html|utmcmd=referral; cdbphpchina_smile=1d2d0d1; records=2592000; _utma=records; _utmz=records=(referral)|utmcsr=localhost:8080|utmcct=/test3.php|utmcmd=referral; _utma=bytes; uchome_loginuser=sinopf; xscdb_cookietime=2592000; _utmc=17229162; _utmb=17229162; cdbphpchina_sid=ex5w1v; _utmc=233700831; bytes=17; authorization%authorization; xscdb_auth=8127rayhkpql49ems%2fyhlbf3c6clz%2b2idsk4bexjwbqr%2bhszrvkgqpotthvr%authorization; cdbphpchina_onlineusernum=3721";
$snoopy->fetch("http://bbs.phpchina.com/forum-17-1.html");
// Prefix href and src values with the site address.
// str_replace stands in for ereg_replace, which was removed in PHP 7.
$n = str_replace("href=\"", "href=\"http://bbs.phpchina.com/", $snoopy->results);
echo str_replace("src=\"", "src=\"http://bbs.phpchina.com/", $n);
?>
This simulates the login process for the phpchina forum. First, find your browser information: var_dump($_SERVER); prints it, so copy the value of $_SERVER['HTTP_USER_AGENT'] and paste it into $snoopy->agent. Then find your cookie: after logging in to the forum with your own account, enter javascript:document.write(document.cookie) in your browser's address bar and press Enter to display your cookie information, then copy it and paste it after $snoopy->rawheaders["cookie"] =. (My own cookie information has been removed for security reasons.)
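
As an alternative to pasting the whole cookie string into a raw header, the string could be split into name/value pairs. The sketch below is plain PHP; parseCookieString is a hypothetical helper, not part of Snoopy, and the sample string reuses two cookie names from the listing above.

```php
<?php
// Hypothetical helper, not part of Snoopy: splits a document.cookie style
// string ("name=value; name2=value2") into an associative array.
function parseCookieString(string $cookieString): array
{
    $cookies = [];
    foreach (explode(';', $cookieString) as $pair) {
        $pair = trim($pair);
        if ($pair === '' || strpos($pair, '=') === false) {
            continue; // skip empty or malformed pieces
        }
        // Split on the first "=" only, since values may themselves contain "=".
        list($name, $value) = explode('=', $pair, 2);
        $cookies[trim($name)] = trim($value);
    }
    return $cookies;
}

$cookies = parseCookieString('uchome_loginuser=sinopf; xscdb_cookietime=2592000');
print_r($cookies);
```

The resulting array could then be assigned to the cookies property used in example 3. Whether values that are already URL-encoded need decoding first depends on how your Snoopy version encodes cookies when sending them, so check that before relying on this.
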
