PHP Scraping Tool: Trying Out Snoopy
What is Snoopy? (Download Snoopy)
Snoopy is a PHP class that mimics a web browser: it fetches web content and submits forms.
Some features of Snoopy:
* Easily fetches the content of a web page
* Easily fetches the text of a web page (HTML tags stripped)
* Easily fetches the links in a web page
* Supports proxy hosts
* Supports basic username/password authentication
* Supports setting the user agent, referer, cookies and header content
* Supports browser redirects, with a configurable redirect depth
* Expands the links in a page into fully qualified URLs (default)
* Easily submits data and retrieves the response
* Supports following HTML frames (added in v0.92)
* Supports passing cookies on redirects (added in v0.92)
If you want to dig deeper, search for it on Google. Here are a few simple examples:
1 Fetch the content of a specified URL
PHP code
$url = "http://www.taoav.com";
include("snoopy.php");
$snoopy = new Snoopy;
$snoopy->fetch($url);      // Get all content
echo $snoopy->results;     // Display results
$snoopy->fetchtext($url);  // Get text content (HTML code removed)
$snoopy->fetchlinks($url); // Get links
$snoopy->fetchform($url);  // Get forms
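To make concrete what fetchtext and fetchlinks return, here is a minimal sketch in plain PHP that does something similar to an HTML string already in memory. It is not Snoopy's implementation; the helper names `extractText` and `extractLinks` and the sample HTML are made up for illustration.

```php
<?php
// Hypothetical helpers, not part of Snoopy: they only illustrate the idea
// behind fetchtext (strip the markup) and fetchlinks (collect href values).

// Return the visible text of an HTML fragment with tags removed.
function extractText(string $html): string
{
    // strip_tags removes the tags; the regex collapses leftover whitespace.
    return trim(preg_replace('/\s+/', ' ', strip_tags($html)));
}

// Return every href value found in <a> tags, in document order.
function extractLinks(string $html): array
{
    preg_match_all('/<a\s[^>]*href\s*=\s*["\']([^"\']*)["\']/i', $html, $matches);
    return $matches[1];
}

// Sample input, invented for this example.
$html = '<p>Hello <a href="/images/taoav.gif">picture</a> and '
      . '<a href="http://www.taoav.com/">home</a></p>';

echo extractText($html), "\n";   // Hello picture and home
print_r(extractLinks($html));    // the two href values, in order
```

With Snoopy itself you would read the same kind of data from $snoopy->results after calling fetchtext or fetchlinks.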
2 Form submission
PHP code
$formvars["username"] = "admin";
$formvars["pwd"] = "admin";
$action = "http://www.taoav.com";         // Form submission address
$snoopy->submit($action, $formvars);      // $formvars is the array of submitted fields
echo $snoopy->results;                    // The response returned after the form is submitted
$snoopy->submittext($action, $formvars);  // Submit and return only the text, stripped of HTML
$snoopy->submitlinks($action, $formvars); // Submit and return only the links
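For orientation, the sketch below shows what the form fields look like once encoded as an ordinary URL-encoded form body. It assumes the usual application/x-www-form-urlencoded format of an HTML form POST and uses PHP's built-in http_build_query; Snoopy builds its request body itself, and its code is not reproduced here.

```php
<?php
// The same fields as in the form example above.
$formvars = [];
$formvars["username"] = "admin";
$formvars["pwd"] = "admin";

// http_build_query produces an application/x-www-form-urlencoded string.
// The separator is passed explicitly so php.ini settings cannot change it.
$body = http_build_query($formvars, '', '&');

echo $body, "\n"; // username=admin&pwd=admin
```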
Once you can submit forms, you can do a lot more. Next we'll spoof the IP and the browser.
3 Spoofing
PHP code
$formvars["username"] = "admin";
$formvars["pwd"] = "admin";
$action = "http://www.taoav.com";
include "snoopy.php";
$snoopy = new Snoopy;
$snoopy->cookies["PHPSESSID"] = 'fc106b1918bd522cc863f36890e6fff7'; // Spoof the session ID
$snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)"; // Spoof the browser
$snoopy->referer = "http://www.only4.cn"; // Spoof the referring page address (HTTP_REFERER)
$snoopy->rawheaders["Pragma"] = "no-cache"; // HTTP header controlling the cache
$snoopy->rawheaders["x_forwarded_for"] = "127.0.0.101"; // Spoof the IP
$snoopy->submit($action, $formvars);
So we can spoof the session, the browser and the IP, which opens up a lot of possibilities.
For example, a poll protected by a verification code and an IP check can be voted on over and over.
PS: The IP spoofing here only forges an HTTP header, so a site that reads the IP from REMOTE_ADDR is not fooled.
A site that reads the IP from HTTP headers (a technique meant to see through proxies) will accept whatever IP you supply.
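The difference between the two ways of reading the IP can be sketched from the server's side as follows. This is an illustration under assumptions: the function names are hypothetical, `$server` stands in for PHP's `$_SERVER` so the example runs anywhere, and the client address 203.0.113.7 is made up.

```php
<?php
// Two hypothetical server-side ways of deciding who the visitor is.

// Uses only the address of the network connection; request headers cannot change it.
function ipFromConnection(array $server): string
{
    return $server['REMOTE_ADDR'] ?? '';
}

// Prefers a client-supplied header, which any client can set to anything.
function ipFromHeaders(array $server): string
{
    foreach (['HTTP_X_FORWARDED_FOR', 'HTTP_CLIENT_IP'] as $key) {
        if (!empty($server[$key])) {
            // The header may hold a comma-separated chain; take the first entry.
            return trim(explode(',', $server[$key])[0]);
        }
    }
    return $server['REMOTE_ADDR'] ?? '';
}

// What the server might see for a request that carries the forged header.
$server = [
    'REMOTE_ADDR'          => '203.0.113.7', // made-up real address of the client
    'HTTP_X_FORWARDED_FOR' => '127.0.0.101', // value supplied by the client
];

echo ipFromConnection($server), "\n"; // 203.0.113.7
echo ipFromHeaders($server), "\n";    // 127.0.0.101
```

Only the second function is affected by the forged header, which is why an IP check built on REMOTE_ADDR still sees the real address.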
As for how to handle the verification code, briefly:
First view the page in a normal browser and find the session ID that corresponds to the verification code.
Note down both the session ID and the verification code value.
Then use Snoopy to forge the request.
Principle: because the session ID is the same, the verification code the server expects is the same one you entered the first time.
4 Sometimes we need to spoof even more, and Snoopy has thought of that for us too
PHP code
$snoopy->proxy_host = "www.only4.cn";
$snoopy->proxy_port = "8080";  // Use a proxy
$snoopy->maxredirs = 2;        // Maximum number of redirects to follow
$snoopy->expandlinks = true;   // Whether to expand links into full URLs (often used when scraping)
// e.g. the link /images/taoav.gif becomes the full link http://www.taoav.com/images/taoav.gif;
// you could also do this replacement yourself with ereg_replace on the final output
$snoopy->maxframes = 5;        // Maximum number of frames allowed
// Note: when fetching frames, $snoopy->results is an array
echo $snoopy->error;           // Error message, if any
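What expanding a link means can be sketched in a few lines of plain PHP. The helper `expandLink` below is hypothetical and is not Snoopy's code; it handles only absolute, root-relative and simple document-relative links, and leaves cases such as `../` segments or protocol-relative `//host/path` links aside.

```php
<?php
// Hypothetical helper: turn a link found in a page into a full URL,
// given the URL of the page it was found on.
function expandLink(string $link, string $pageUrl): string
{
    // Already absolute (has a scheme such as http:// or https://): leave as is.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $link)) {
        return $link;
    }

    $parts = parse_url($pageUrl);
    $root  = $parts['scheme'] . '://' . $parts['host']
           . (isset($parts['port']) ? ':' . $parts['port'] : '');

    // Root-relative link such as /images/taoav.gif.
    if (strpos($link, '/') === 0) {
        return $root . $link;
    }

    // Document-relative link: resolve against the directory of the page.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $link;
}

echo expandLink('/images/taoav.gif', 'http://www.taoav.com/index.php'), "\n";
// http://www.taoav.com/images/taoav.gif
echo expandLink('logo.gif', 'http://www.taoav.com/news/list.php'), "\n";
// http://www.taoav.com/news/logo.gif
```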
Now that the basic usage is clear, here is a fuller example:
PHP code
<?php
include "snoopy.php";
$snoopy = new Snoopy;
// Browser information: use the string of the same browser you used to view the cookie
// (PS: $_SERVER shows the browser information)
$snoopy->agent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5 FirePHP/0.2.1";
$snoopy->referer = "http://bbs.phpchina.com/index.php";
$snoopy->expandlinks = true;
$snoopy->rawheaders["COOKIE"] = "__utmz=17229162.1227682761.29.7.utmccn=(referral)|utmcsr=phpchina.com|utmcct=/html/index.html|utmcmd=referral; CDBPHPCHINA_SMILE=1D2D0D1; cdbphpchina_cookietime=2592000; __utma=233700831.1562900865.1227113506.1229613449.1231233266.16; __utmz=233700831.1231233266.16.8.utmccn=(referral)|utmcsr=localhost:8080|utmcct=/test3.php|utmcmd=referral; __utma=17229162.1877703507.1227113568.1231228465.1231233160.58; Uchome_loginuser=sinopf; xscdb_cookietime=2592000; __utmc=17229162; __utmb=17229162; CDBPHPCHINA_SID=EX5W1V; __utmc=233700831; cdbphpchina_visitedfid=17; CDBPHPCHINAO766UPYGK6OWZAYLVHSUZJIP22VPWEMGNPQAUWCFL9FD6CHP2E%2FKW0X4BKZ0N9LGK; Xscdb_auth=8106rayhkpql49ems%2fyhlbf3c6clz%2b2idSk4bexjwbqr%2bhszrvkgqpotthvr%2b6klpg3dtwptmui4ttqnnvpukuj6elm; cdbphpchina_onlineusernum=3721";
$snoopy->fetch("http://bbs.phpchina.com/forum-17-1.html");
$n = ereg_replace("href=\"", "href=\"http://bbs.phpchina.com/", $snoopy->results);
echo ereg_replace("src=\"", "src=\"http://bbs.phpchina.com/", $n);
?>
This simulates logging in to the PHPChina forum. First, look up your browser information:
echo var_dump($_SERVER); prints it. Copy the value of $_SERVER['HTTP_USER_AGENT']
and paste it in as the value of $snoopy->agent.
Next, look up your own cookie. After logging in to the forum with your own account, enter
javascript:document.write(document.cookie) in the browser address bar and press Enter.
Your cookie information is displayed; copy it and paste it after $snoopy->rawheaders["COOKIE"] =. (My own cookie information has been removed for security reasons.)
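As an alternative to pasting the whole cookie string into one header, the string could be split into name/value pairs first. The helper `parseCookieString` below is a hypothetical sketch, and the cookie values in the example are shortened, made-up samples.

```php
<?php
// Hypothetical helper: split a "name=value; name2=value2" string, as printed
// by document.cookie, into an array keyed by cookie name.
function parseCookieString(string $cookieString): array
{
    $cookies = [];
    foreach (explode(';', $cookieString) as $pair) {
        $pair = trim($pair);
        if ($pair === '' || strpos($pair, '=') === false) {
            continue; // skip empty or malformed pieces
        }
        // Split on the first "=" only, because values may contain "=" themselves.
        [$name, $value] = explode('=', $pair, 2);
        $cookies[$name] = $value;
    }
    return $cookies;
}

// Made-up cookie string for illustration.
$cookies = parseCookieString('cdbphpchina_cookietime=2592000; cdbphpchina_visitedfid=17');
print_r($cookies);
```

Each entry could then be assigned to $snoopy->cookies, the same array used for PHPSESSID in the spoofing example. Note that a name appearing twice in the string (as __utmz does above) keeps only its last value in this sketch.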
Then note these two lines:
# $n = ereg_replace("href=\"", "href=\"http://bbs.phpchina.com/", $snoopy->results);
# echo ereg_replace("src=\"", "src=\"http://bbs.phpchina.com/", $n);
All the addresses in the HTML source of the scraped content are relative links, so they must be replaced with absolute links; that way the forum's images and CSS styles can still be loaded.
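On current PHP versions the same rewrite needs a different function, because ereg_replace was removed in PHP 7. The sketch below uses preg_replace instead; the helper name `prefixRelativeUrls` and the sample HTML are made up, and unlike the two lines above it leaves values that already start with a scheme such as http: untouched.

```php
<?php
// Hypothetical helper: prefix href/src values with a base URL unless they
// already start with a scheme (http:, https:, mailto:, ...).
function prefixRelativeUrls(string $html, string $base): string
{
    // (?!...) is a negative lookahead that skips values beginning with "scheme:".
    $pattern = '~\b(href|src)="(?![a-z][a-z0-9+.-]*:)~i';
    return preg_replace($pattern, '$1="' . $base, $html);
}

// Sample input, invented for this example.
$html = '<a href="forum-17-2.html">next</a>'
      . '<img src="images/logo.gif">'
      . '<a href="http://example.com/">x</a>';

echo prefixRelativeUrls($html, 'http://bbs.phpchina.com/'), "\n";
```

Here the first two attributes gain the http://bbs.phpchina.com/ prefix while the third, already absolute, is left alone.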
Reprinted from: http://zzdboy1616.blog.163.com/blog/static/430670762009213111712876/