What is Snoopy?
Snoopy is a PHP class that mimics a web browser: it fetches web content and submits forms.
Some features of Snoopy:
* Easily fetch the content of a web page
* Easily fetch the text of a page (HTML tags stripped)
* Easily fetch the links on a page
* Supports proxy hosts
* Supports basic username/password authentication
* Supports setting the user agent, referer, cookies, and header content
* Supports browser redirects, with a controllable redirect depth
* Expands links found in a page into fully qualified URLs (default)
* Easily submit data and retrieve the response
* Supports following HTML frames (added in v0.92)
* Supports passing cookies on redirects (added in v0.92)
If you want to dig deeper, search for it yourself. Here are a few simple examples:
1 Get the content of a specified URL
$url = "http://www.jb51.net";
include("snoopy.php");
$snoopy = new Snoopy;
$snoopy->fetch($url); // Get all content
echo $snoopy->results; // Show results
// Alternatives to fetch():
// $snoopy->fetchtext($url);  // Get text content (HTML stripped)
// $snoopy->fetchlinks($url); // Get links
// $snoopy->fetchform($url);  // Get forms
2 Submit a form
$formvars["username"] = "admin";
$formvars["pwd"] = "admin";
$action = "http://www.jb51.net"; // Form submission address
$snoopy->submit($action, $formvars); // $formvars is the array of submitted fields
echo $snoopy->results; // The response returned after the form is submitted
// Alternatives to submit():
// $snoopy->submittext($action, $formvars);  // Returns only the text, with HTML stripped
// $snoopy->submitlinks($action, $formvars); // Returns only the links
Once you can submit a form, you can do a lot more. Next we will spoof the IP address and the browser.
3 Spoofing
$formvars["username"] = "admin";
$formvars["pwd"] = "admin";
$action = "http://www.jb51.net";
include "snoopy.php";
$snoopy = new Snoopy;
$snoopy->cookies["PHPSESSID"] = 'fc106b1918bd522cc863f36890e6fff7'; // Spoof the session ID
$snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)"; // Spoof the browser
$snoopy->referer = "http://s.jb51.net"; // Spoof the referring page (HTTP_REFERER)
$snoopy->rawheaders["Pragma"] = "no-cache"; // Cache-control HTTP header
$snoopy->rawheaders["X_FORWARDED_FOR"] = "127.0.0.101"; // Spoof the IP
$snoopy->submit($action, $formvars);
echo $snoopy->results;
So we can spoof the session, the browser, and the IP, which opens up a lot of possibilities.
For example, when a vote is protected by a verification code and an IP check, you can keep voting.
PS: Spoofing the IP here really means spoofing an HTTP header, so an IP that the server reads from REMOTE_ADDR cannot be spoofed this way.
Only sites that read the IP from HTTP headers (for instance, to see through proxies) can be given an IP of your choosing.
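The difference between the two ways of reading the client IP can be sketched from the server's side. This is only an illustration under stated assumptions, not code from any particular site: the function names `ip_from_remote_addr` and `ip_from_headers` are hypothetical, and the sample `$server` array stands in for PHP's `$_SERVER`.

```php
<?php
// Hypothetical helpers; $server stands in for PHP's $_SERVER array.

// Uses the address of the TCP peer. An HTTP header cannot change it.
function ip_from_remote_addr(array $server): string
{
    return $server['REMOTE_ADDR'] ?? '';
}

// Trusts the forwarded-for header first, which the client fully controls.
function ip_from_headers(array $server): string
{
    if (!empty($server['HTTP_X_FORWARDED_FOR'])) {
        // The header may hold a comma-separated list; take the first entry.
        $parts = explode(',', $server['HTTP_X_FORWARDED_FOR']);
        return trim($parts[0]);
    }
    return $server['REMOTE_ADDR'] ?? '';
}

// Sample request as a server might see it (both addresses are example values).
$server = [
    'REMOTE_ADDR'          => '203.0.113.7',  // real connection address
    'HTTP_X_FORWARDED_FOR' => '127.0.0.101',  // value supplied by the client
];

echo ip_from_remote_addr($server), "\n"; // unaffected by the header
echo ip_from_headers($server), "\n";     // reports the client-supplied value
```

A site that uses the first approach sees the real connection address no matter what headers are sent; a site that uses the second can be told any address.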
How to get past the verification code, in brief:
First, open the page in a normal browser and find its session ID.
Note down both the session ID and the verification code value.
Next, use Snoopy to forge the request.
Principle: because the session ID is the same, the verification code the server expects is the same as the one shown the first time.
4 Sometimes we may need to forge more; Snoopy has thought of that for us
$snoopy->proxy_host = "www.jb51.net";
$snoopy->proxy_port = "8080"; // Use a proxy
$snoopy->maxredirs = 2; // Maximum number of redirects to follow
$snoopy->expandlinks = true; // Whether to expand links into full URLs; often used when scraping
// For example, the link /images/taoav.gif becomes http://www.jb51.net/images/taoav.gif.
// This can also be done at final output time with the ereg_replace function.
$snoopy->maxframes = 5; // Maximum number of frames allowed
// Note that when frames are fetched, $snoopy->results is an array
echo $snoopy->error; // Error message, if any
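Because `$snoopy->results` may hold either a string or an array, code that prints it can guard against both cases. A minimal sketch, in which the helper name `results_to_string` is hypothetical and the sample values stand in for real Snoopy output:

```php
<?php
// Hypothetical helper: accepts whatever $snoopy->results holds and
// always returns a single string.
function results_to_string($results, string $separator = "\n"): string
{
    if (is_array($results)) {
        // One entry per fetched document; join them for output.
        return implode($separator, $results);
    }
    return (string) $results;
}

// Stand-in values instead of a live fetch.
echo results_to_string("<html>single page</html>"), "\n";
echo results_to_string(["<html>frame 1</html>", "<html>frame 2</html>"]), "\n";
```

In real use the call would be `echo results_to_string($snoopy->results);` after a fetch.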
Building on the basic usage above, here is a complete example:
<?php
var_dump($_SERVER);
include("Snoopy.class.php");
$snoopy = new Snoopy;
// Browser information: use that of the browser you viewed the cookies with
// (PS: $_SERVER shows the browser information)
$snoopy->agent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5 FirePHP/0.2.1";
$snoopy->referer = "http://bbs.jb51.net/index.php";
$snoopy->expandlinks = true;
$snoopy->rawheaders["COOKIE"]= "__utmz=17229162.1227682761.29.7.utmccn= (referral) utmcsr=jb51.netutmcct=/html /index.htmlutmcmd=referral; CDBPHPCHINA_SMILE=1D2D0D1; cdbphpchina_cookietime=2592000; __utma=233700831.1562900865.1227113506.1229613449.1231233266.16; __utmz=233700831.1231233266.16.8.utmccn= (referral) utmcsr=localhost:8080utmcct=/test3.phputmcmd=referral; __utma=17229162.1877703507.1227113568.1231228465.1231233160.58; Uchome_loginuser=sinopf; xscdb_cookietime=2592000; __utmc=17229162; __utmb=17229162; CDBPHPCHINA_SID=EX5W1V; __utmc=233700831; cdbphpchina_visitedfid=17; CDBPHPCHINAO766UPYGK6OWZAYLVHSUZJIP22VPWEMGNPQAUWCFL9FD6CHP2E%2FKW0X4BKZ0N9LGK; Xscdb_auth=8106rayhkpql49ems%2fyhlbf3c6clz%2b2idsk4bexjwbqr%2bhszrvkgqpotthvr%2b6klpg3dtwptmui4ttqnnvpukuj6elm ; cdbphpchina_onlineusernum=3721 ";
$snoopy->fetch("http://bbs.jb51.net");
$n = ereg_replace("href=\"", "href=\"http://bbs.jb51.net/", $snoopy->results);
echo ereg_replace("src=\"", "src=\"http://bbs.jb51.net/", $n);
?>
This simulates logging in to the PHPChina forum. First, look up your browser information: var_dump($_SERVER); shows it. Copy the value of $_SERVER['HTTP_USER_AGENT'] and paste it into $snoopy->agent. Then look up your cookies: log in to the forum with your own account, type javascript:document.write(document.cookie) into the browser address bar, and press Enter to see your cookie information. Copy it and paste it after $snoopy->rawheaders["COOKIE"] =. (My cookie information has been edited for security purposes.)
Then note these two lines:
$n = ereg_replace("href=\"", "href=\"http://bbs.jb51.net/", $snoopy->results);
echo ereg_replace("src=\"", "src=\"http://bbs.jb51.net/", $n);
All the addresses in the fetched HTML source are relative links, so these two lines replace them with absolute links. That way the forum's images and CSS styles load correctly.
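The `ereg_replace` function was removed in PHP 7.0, so on current PHP versions the same rewrite needs `str_replace` or the `preg_*` functions. The sketch below is one possible replacement, not part of Snoopy: the helper name `absolutize_links` is hypothetical, and it assumes every relative link should resolve against the site root, as the two lines above do. Unlike those lines, it leaves links that are already absolute untouched.

```php
<?php
// Hypothetical helper: prefix relative href/src values with a base URL.
function absolutize_links(string $html, string $base): string
{
    $base = rtrim($base, '/') . '/';
    // Match href="..." or src="..." whose value does not start with a
    // scheme (http:, https:, mailto:, ...), with "//", or with "#".
    $pattern = '~\b(href|src)="(?![a-z][a-z0-9+.-]*:|//|#)/?([^"]*)"~i';
    return preg_replace_callback($pattern, function (array $m) use ($base) {
        // $m[1] is the attribute name, $m[2] the path without a leading slash.
        return $m[1] . '="' . $base . $m[2] . '"';
    }, $html);
}

// Stand-in HTML instead of $snoopy->results.
$html = '<a href="/index.php">home</a>'
      . '<img src="images/logo.gif">'
      . '<a href="http://example.com/">external</a>';

echo absolutize_links($html, 'http://bbs.jb51.net'), "\n";
```

With a live fetch, the call would be `echo absolutize_links($snoopy->results, 'http://bbs.jb51.net');`.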