A powerful PHP scraping tool: trying out Snoopy

Source: Internet
Author: User

What is Snoopy?

Snoopy is a PHP class that imitates a web browser. It can fetch web page content and submit forms.

Some features of Snoopy:

* fetches the content of a web page easily
* fetches the text of a web page (HTML tags stripped)
* fetches the links from a web page
* supports proxy hosts
* supports basic username/password authentication
* supports setting user_agent, referer, cookies, and header content
* supports browser redirects, with a controllable redirect depth
* expands fetched links to fully qualified URLs (default)
* submits form data and reads back the returned result
* supports following HTML frames (added in v0.92)
* supports passing cookies on redirects (added in v0.92)

For a deeper look, search online. Below are a few simple examples.

1. Fetch the content of a given URL (PHP code)
$url = "http://www.taoav.com";
include("snoopy.php");
$snoopy = new Snoopy;
$snoopy->fetch($url); // fetch the full page content
echo $snoopy->results; // display the result
// Alternatives to fetch():
// $snoopy->fetchtext($url);  // fetch only the text (HTML removed)
// $snoopy->fetchlinks($url); // fetch only the links
// $snoopy->fetchform($url);  // fetch only the forms
2. Submit a form (PHP code)
$formvars["username"] = "admin";
$formvars["pwd"] = "admin";
$action = "http://www.taoav.com"; // form submission address
$snoopy->submit($action, $formvars); // $formvars is the array of submitted fields
echo $snoopy->results; // the response returned after the form is submitted
// Alternatives to submit():
// $snoopy->submittext($action, $formvars);  // submit, keeping only the text (HTML removed)
// $snoopy->submitlinks($action, $formvars); // submit, keeping only the links
Once you can submit forms, a lot becomes possible. Next, we disguise the session, the browser, and the IP address.
3. Disguise the session, browser, and IP address (PHP code)
$formvars["username"] = "admin";
$formvars["pwd"] = "admin";
$action = "http://www.taoav.com";
include "snoopy.php";
$snoopy = new Snoopy;
$snoopy->cookies["PHPSESSID"] = 'fc0000b1918bd522cc863f000090e6fff7'; // disguise the session ID
$snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)"; // disguise the browser
$snoopy->referer = "http://www.only4.cn"; // disguise the referring page (HTTP_REFERER)
$snoopy->rawheaders["Pragma"] = "no-cache"; // cache-related HTTP header
$snoopy->rawheaders["X_FORWARDED_FOR"] = "127.0.0.101"; // disguise the IP address
$snoopy->submit($action, $formvars);
echo $snoopy->results;
With the session, browser, and IP address disguised like this, a lot becomes possible. For example, you can get through a vote that is limited per IP address and protected by a verification code.

Note: the disguised IP address here is really just an HTTP header. An IP address that the server reads from REMOTE_ADDR therefore cannot be disguised, but one that the server reads from an HTTP header (as code written to see through proxies does) can be set to anything you like.

Here is briefly how the verification code is handled. First, open the page in an ordinary browser, find the session ID that goes with the verification code, and write down both the session ID and the verification code value. Then submit with Snoopy using that same session ID and verification code. The principle: because the session ID is the same, the verification code the server expects is the same one you saw the first time.

4. Sometimes we need to forge more. Snoopy covers that too (PHP code)
$snoopy->proxy_host = "www.only4.cn";
$snoopy->proxy_port = "8080"; // use a proxy
$snoopy->maxredirs = 2; // maximum number of redirects to follow
$snoopy->expandlinks = true; // expand links to full URLs; often used when scraping
// for example, a link such as /images/taoav.gif is expanded to its full URL
$snoopy->maxframes = 5; // maximum frame depth to follow
// note: when frames are fetched, $snoopy->results is returned as an array
// $snoopy->error holds the error message, if any
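Going back to the note in section 3 about REMOTE_ADDR: the difference is easiest to see from the server's side. The following is a minimal sketch, not code from any particular site. The function name `client_ip`, its `$trustForwardedHeader` flag, and the sample addresses are assumptions made up for illustration; only the `$_SERVER` keys `REMOTE_ADDR` and `HTTP_X_FORWARDED_FOR` are standard PHP.

```php
<?php
// Hypothetical helper (not part of Snoopy): returns the address a
// server-side script would treat as the client's IP address.
function client_ip(array $server, bool $trustForwardedHeader): string
{
    if ($trustForwardedHeader && !empty($server['HTTP_X_FORWARDED_FOR'])) {
        // The header may hold a comma-separated chain of addresses; the
        // first entry is whatever the client (or first proxy) claimed.
        $parts = explode(',', $server['HTTP_X_FORWARDED_FOR']);
        return trim($parts[0]);
    }
    // REMOTE_ADDR is taken from the TCP connection, not from a header.
    return $server['REMOTE_ADDR'] ?? '';
}

// Simulated $_SERVER contents for a request that carries a forged header.
$request = [
    'REMOTE_ADDR' => '203.0.113.7',
    'HTTP_X_FORWARDED_FOR' => '127.0.0.101',
];

echo client_ip($request, true), "\n";  // prints 127.0.0.101
echo client_ip($request, false), "\n"; // prints 203.0.113.7
```

A script that trusts the header sees the forged value, while one that reads only REMOTE_ADDR sees the real connecting address. That is why the header trick only works against the first kind of check.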
With the basics covered, here is a complete example (PHP code):
<?php
include "snoopy.php";
$snoopy = new Snoopy;
// Browser information; copy it from your own browser (tip: $_SERVER shows it)
$snoopy->agent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5 FirePHP/0.2.1";
$snoopy->referer = "http://bbs.phpchina.com/index.php";
$snoopy->expandlinks = true;
$snoopy->rawheaders["COOKIE"] = "_utmz=17229162.1227682761.29.7.utmccn=(referral)|utmcsr=phpchina.com|utmcct=/html/index.html|utmcmd=referral; cdbphpchina_smile=1D2D0D1; records=2592000; _utma=records; _utmz=records=(referral)|utmcsr=localhost:8080|utmcct=/test3.php|utmcmd=referral; _utma=Hangzhou; uchome_loginuser=sinopf; xscdb_cookietime=2592000; _utmc=17229162; _utmb=17229162; cdbphpchina_sid=EX5w1V; _utmc=233700831; Fingerprint=17; %%auth; xscdb_auth=8w.rayhkpql49ems%signature%digest%2BHSZrVKgqPOttHVr%Digest; cdbphpchina_onlineusernum=3721";
$snoopy->fetch("http://bbs.phpchina.com/forum-17-1.html");
// ereg_replace() requires PHP older than 7.0
$n = ereg_replace("href=\"", "href=\"http://bbs.phpchina.com/", $snoopy->results);
echo ereg_replace("src=\"", "src=\"http://bbs.phpchina.com/", $n);
?>
This simulates being logged in to the PHPCHINA forum. First, check your browser information: var_dump($_SERVER); prints it.
Copy the value of $_SERVER['HTTP_USER_AGENT'] and paste it into $snoopy->agent. Then check your own cookie: after logging in to the forum with your own account, enter javascript:document.write(document.cookie) in the browser's address bar and press Enter to see your cookie information, then copy it and paste it after $snoopy->rawheaders["COOKIE"] =. (My cookie information has been deleted for security reasons.)
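As an alternative to pasting the raw string into rawheaders, the copied cookie text could be split into name/value pairs for the $snoopy->cookies array used in example 3. This is only a sketch: `parse_cookie_string` is a hypothetical helper, not part of Snoopy, and the sample cookie values are made up.

```php
<?php
// Hypothetical helper (not part of Snoopy): splits a string of the form
// "name=value; name2=value2", as shown by document.cookie, into an array.
function parse_cookie_string(string $cookieString): array
{
    $cookies = [];
    foreach (explode(';', $cookieString) as $pair) {
        $pair = trim($pair);
        if ($pair === '' || strpos($pair, '=') === false) {
            continue; // skip empty or malformed fragments
        }
        // Split on the first "=" only, since values may contain "=" too.
        [$name, $value] = explode('=', $pair, 2);
        $cookies[trim($name)] = trim($value);
    }
    return $cookies;
}

// Made-up sample values, for illustration only.
$cookies = parse_cookie_string('sid=abc123; theme=dark; q=a=b');
print_r($cookies);

// With a Snoopy instance in scope, the result could then be assigned:
// $snoopy->cookies = $cookies;
```

As far as I can tell, Snoopy URL-encodes the values in $cookies when it builds the request, so values that are already encoded may need urldecode() first. Check this against your copy of the class before relying on it.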


Then note these two lines:

$n = ereg_replace("href=\"", "href=\"http://bbs.phpchina.com/", $snoopy->results);
echo ereg_replace("src=\"", "src=\"http://bbs.phpchina.com/", $n);


Because every address in the scraped HTML source is a relative link, replace them with absolute links so that the forum's images and CSS styles still load. Reference: http://zzdboy1616.blog.163.com/blog/static/430670762009213111712876
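The ereg functions are no longer available from PHP 7.0 onward, so on a current PHP the same replacement needs another function. Below is a rough sketch using preg_replace_callback. The helper name `absolutize_links` and the sample HTML are assumptions for illustration. Unlike the two lines above, it leaves links that are already absolute untouched, and it naively treats every relative link as relative to the site root.

```php
<?php
// Hypothetical helper: prefixes relative href/src values with a base URL and
// leaves absolute URLs, protocol-relative URLs, and anchors untouched.
function absolutize_links(string $html, string $base): string
{
    $base = rtrim($base, '/') . '/';
    $result = preg_replace_callback(
        '/\b(href|src)="([^"]*)"/i',
        function (array $m) use ($base): string {
            $url = $m[2];
            // Skip empty values, values with a scheme (http:, mailto:, ...),
            // protocol-relative URLs (//host/...), and fragments (#top).
            if ($url === '' || preg_match('/^([a-z][a-z0-9+.\-]*:|\/\/|#)/i', $url)) {
                return $m[0];
            }
            return $m[1] . '="' . $base . ltrim($url, '/') . '"';
        },
        $html
    );
    // preg_replace_callback returns null on a regex error; fall back to input.
    return $result ?? $html;
}

// Made-up sample markup, for illustration only.
$html = '<a href="forum-17-1.html">x</a><img src="/images/a.gif"><a href="http://example.com/">y</a>';
echo absolutize_links($html, 'http://bbs.phpchina.com/'), "\n";
```

In the example from this article, the call would be absolutize_links($snoopy->results, "http://bbs.phpchina.com/"). A regular expression is a shortcut here; it only handles double-quoted attributes, which matches what the original two lines assumed.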
