What is Snoopy?
Snoopy is a PHP class that mimics a web browser: it fetches web content and submits forms.
Some features of Snoopy:
* Easily fetch the content of a web page
* Easily fetch the text of a web page (with HTML tags stripped)
* Easily fetch the links in a web page
* Supports proxy hosts
* Supports basic username/password authentication
* Supports setting the user agent, referer, cookies, and header content
* Supports browser redirects, with control over the redirect depth
* Can expand the links in a web page into fully qualified URLs (default)
* Easily submit data and get the return value
* Supports following HTML frames (added in v0.92)
* Supports passing cookies along on redirects (added in v0.92)
For more detail, search online. Here are a few simple examples:
1. Fetch the content of a given URL
PHP code:
$url = "http://www.jb51.net";
include "snoopy.php";
$snoopy = new Snoopy;
$snoopy->fetch($url); // fetch all content
echo $snoopy->results; // show the result
// Alternatives to fetch():
// $snoopy->fetchtext($url);  // fetch the text content only (HTML stripped)
// $snoopy->fetchlinks($url); // fetch the links
// $snoopy->fetchform($url);  // fetch the forms
2. Submit a form
PHP code:
$formvars["username"] = "admin";
$formvars["pwd"] = "admin";
$action = "http://www.jb51.net"; // form submission address
$snoopy->submit($action, $formvars); // $formvars is the array of fields to submit
echo $snoopy->results; // the response returned after the form is submitted
// Alternatives to submit():
// $snoopy->submittext($action, $formvars);  // return only the text, with HTML stripped
// $snoopy->submitlinks($action, $formvars); // return only the links
Once you can submit forms, a lot becomes possible. Next, we'll spoof the IP address and the browser.
3. Spoofing
PHP code:
$formvars["username"] = "admin";
$formvars["pwd"] = "admin";
$action = "http://www.jb51.net";
include "snoopy.php";
$snoopy = new Snoopy;
$snoopy->cookies["PHPSESSID"] = 'fc106b1918bd522cc863f36890e6fff7'; // spoof the session ID
$snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)"; // spoof the browser
$snoopy->referer = "http://s.jb51.net"; // spoof the source page address (HTTP_REFERER)
$snoopy->rawheaders["Pragma"] = "no-cache"; // HTTP cache-control header
$snoopy->rawheaders["x_forwarded_for"] = "127.0.0.101"; // spoof the IP
$snoopy->submit($action, $formvars);
echo $snoopy->results;
So we can spoof the session, the browser, and the IP, which opens up a lot of possibilities.
For example, on a poll that is protected by a verification code and an IP check, you could vote over and over.
PS: Spoofing the IP here really just means spoofing an HTTP header. An IP read from REMOTE_ADDR is therefore not affected;
only code that reads the IP from HTTP headers (the kind meant to see through proxies) can be fed an IP of your choosing.
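To make the PS above concrete, here is a minimal server-side sketch of the two ways a script might read the client IP. The function names and the sample addresses are hypothetical, and `$server` simply stands in for PHP's `$_SERVER` array; it is an illustration under those assumptions, not code from any particular site.

```php
<?php
// Hypothetical helpers; $server stands in for PHP's $_SERVER array.

// Naive: trusts a client-supplied header, so the value can be forged.
function client_ip_from_header(array $server): string
{
    if (!empty($server['HTTP_X_FORWARDED_FOR'])) {
        // The header may hold a comma-separated chain; take the first entry.
        $parts = explode(',', $server['HTTP_X_FORWARDED_FOR']);
        return trim($parts[0]);
    }
    return $server['REMOTE_ADDR'] ?? '';
}

// Stricter: uses only the address of the connecting peer,
// which a request header cannot change.
function client_ip_from_socket(array $server): string
{
    return $server['REMOTE_ADDR'] ?? '';
}

$request = [
    'REMOTE_ADDR'          => '203.0.113.7',  // address of the real connection (sample value)
    'HTTP_X_FORWARDED_FOR' => '127.0.0.101',  // forged header value
];

echo client_ip_from_header($request), "\n"; // prints 127.0.0.101
echo client_ip_from_socket($request), "\n"; // prints 203.0.113.7
```

A check built on the first function sees whatever the client sends; one built on the second sees the address the connection actually came from.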
As for how to get past the verification code, briefly:
First, view the page in a normal browser and find the session ID that corresponds to the verification code,
noting down both the session ID and the verification code value.
Then use Snoopy to forge the request.
How it works: because the session ID is the same, the verification code the server expects is the same one entered the first time.
4. Sometimes we may need to forge even more; Snoopy has thought of that for us
PHP code:
$snoopy->proxy_host = "www.jb51.net";
$snoopy->proxy_port = "8080"; // use a proxy
$snoopy->maxredirs = 2; // number of redirects to follow
$snoopy->expandlinks = true; // whether to expand links into full URLs; often used when scraping
// For example, a link to /images/taoav.gif becomes the full URL http://www.jb51.net/images/taoav.gif. You could also do this replacement yourself on the final output with the ereg_replace function.
$snoopy->maxframes = 5; // maximum number of frames allowed
// Note: when frames are fetched, $snoopy->results is an array
// $snoopy->error holds the error message
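Because the results can be either a single string or, when frames are fetched, an array, code that prints them may want to handle both cases. The helper below is a small sketch of one way to do that; `results_to_string` is a hypothetical name, and `$results` stands in for `$snoopy->results`.

```php
<?php
// Hypothetical helper; $results stands in for $snoopy->results.
function results_to_string($results, string $separator = "\n"): string
{
    // With frames, the results are an array of documents;
    // otherwise they are a single string.
    return is_array($results) ? implode($separator, $results) : (string) $results;
}

echo results_to_string('<html>single page</html>'), "\n";
echo results_to_string(['<html>frame 1</html>', '<html>frame 2</html>']), "\n";
```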
With the basic usage above understood, here is a practical example:
PHP code:
<?php
echo var_dump($_SERVER);
include "Snoopy.class.php";
$snoopy = new Snoopy;
// Browser information: use the details of whichever browser you viewed the cookie with (PS: $_SERVER shows the browser's information)
$snoopy->agent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5 FirePHP/0.2.1";
$snoopy->referer = "http://bbs.jb51.net/index.php";
$snoopy->expandlinks = true;
$snoopy->rawheaders["COOKIE"]= "__utmz=17229162.1227682761.29.7.utmccn= (referral) |utmcsr=jb51.net|utmcct=/ Html/index.html|utmcmd=referral; CDBPHPCHINA_SMILE=1D2D0D1; cdbphpchina_cookietime=2592000; __utma=233700831.1562900865.1227113506.1229613449.1231233266.16; __utmz=233700831.1231233266.16.8.utmccn= (referral) |utmcsr=localhost:8080|utmcct=/test3.php|utmcmd=referral; __utma=17229162.1877703507.1227113568.1231228465.1231233160.58; Uchome_loginuser=sinopf; xscdb_cookietime=2592000; __utmc=17229162; __utmb=17229162; CDBPHPCHINA_SID=EX5W1V; __utmc=233700831; cdbphpchina_visitedfid=17; CDBPHPCHINAO766UPYGK6OWZAYLVHSUZJIP22VPWEMGNPQAUWCFL9FD6CHP2E%2FKW0X4BKZ0N9LGK; Xscdb_auth=8106rayhkpql49ems%2fyhlbf3c6clz%2b2idsk4bexjwbqr%2bhszrvkgqpotthvr%2b6klpg3dtwptmui4ttqnnvpukuj6elm ; cdbphpchina_onlineusernum=3721 ";
$snoopy->fetch("http://bbs.jb51.net");
$n = ereg_replace("href=\"", "href=\"http://bbs.jb51.net/", $snoopy->results);
echo ereg_replace("src=\"", "src=\"http://bbs.jb51.net/", $n);
?>
This is the process of simulating a login to the PHPChina forum. First, look at your own browser's information:
echo var_dump($_SERVER); shows it.
Copy the value of $_SERVER['HTTP_USER_AGENT'] and paste it into $snoopy->agent. Next, look at your own
cookie: after logging in to the forum with your own account, enter the following in the browser's address bar:
javascript:document.write(document.cookie)
Press Enter and you will see your cookie information. Copy it and paste it after $snoopy->rawheaders["COOKIE"]=. (My cookie information has been deleted for security reasons.)
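If you would rather work with individual cookies than one long header string, the copied text can be split into name/value pairs. This is a minimal sketch: `parse_cookie_string` is a hypothetical helper, not part of Snoopy, and the sample cookie values are made up.

```php
<?php
// Hypothetical helper: split a "name=value; name2=value2" string into an array.
function parse_cookie_string(string $cookieString): array
{
    $cookies = [];
    foreach (explode(';', $cookieString) as $pair) {
        $pair = trim($pair);
        if ($pair === '' || strpos($pair, '=') === false) {
            continue; // skip empty or malformed fragments
        }
        // Split on the first "=" only, since values may themselves contain "=".
        list($name, $value) = explode('=', $pair, 2);
        $cookies[trim($name)] = trim($value);
    }
    return $cookies;
}

// Sample input in the format produced by document.cookie (made-up values).
$cookies = parse_cookie_string('PHPSESSID=abc123; theme=dark; token=a=b');
print_r($cookies);
```

Each entry could then be assigned to $snoopy->cookies, the array used in section 3. How Snoopy encodes those values when sending them is worth checking before relying on this.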
Then note these two lines:
$n = ereg_replace("href=\"", "href=\"http://bbs.jb51.net/", $snoopy->results);
echo ereg_replace("src=\"", "src=\"http://bbs.jb51.net/", $n);
They are needed because every address in the fetched HTML source is a relative link, so each must be replaced with an absolute link; only then can the forum's images and CSS styles be loaded.
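Note that the ereg functions were removed in PHP 7, so on current PHP versions the same rewrite needs preg_replace or str_replace. The sketch below uses preg_replace. `absolutize_links` is a hypothetical helper name, and it assumes the base URL ends with a slash. Unlike the two lines above, it leaves alone any links that are already absolute.

```php
<?php
// Hypothetical helper; $base is assumed to end with a slash.
function absolutize_links(string $html, string $base): string
{
    // Prefix href="..." and src="..." values that are not already absolute:
    // skip values that start with a scheme (http:, mailto:, ...), with "//", or with "#".
    return preg_replace(
        '/\b(href|src)="(?![a-z][a-z0-9+.\-]*:|\/\/|#)\/?/i',
        '$1="' . $base,
        $html
    );
}

$html = '<a href="/index.php"><img src="images/logo.gif"></a> '
      . '<a href="http://example.com/">external</a>';

echo absolutize_links($html, 'http://bbs.jb51.net/'), "\n";
```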