Using Snoopy, a powerful PHP scraping class
What is Snoopy?
Snoopy is a PHP class that imitates the functions of a web browser. It can fetch web page content and submit forms.
Some features of Snoopy:
* Easily fetch the content of a web page
* Easily fetch the text of a web page (HTML tags removed)
* Easily fetch the links from a web page
* Supports proxy hosts
* Supports basic user name/password authentication
* Supports setting user_agent, referer, cookies, and header content
* Supports browser redirects, with a controlled redirect depth
* Expands fetched links to fully qualified URLs (default)
* Easily submit form data and retrieve the results
* Supports following HTML frames (added in v0.92)
* Supports passing cookies on redirects (added in v0.92)
For more detail, search for the Snoopy documentation. Here are a few simple examples:
1. Fetch the content of a given URL
The code is as follows:
$url = "http://www.jb51.net";
include("snoopy.php");
$snoopy = new Snoopy;
$snoopy->fetch($url); // fetch the whole page
echo $snoopy->results; // display the result
// The other fetch methods are called the same way:
$snoopy->fetchtext($url); // fetch the text content only (HTML removed)
$snoopy->fetchlinks($url); // fetch the links only
$snoopy->fetchform($url); // fetch the forms only
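To make "text content with HTML removed" concrete, here is a minimal sketch of that kind of extraction in plain PHP. The function name html_to_text is hypothetical, and this is only a rough approximation of the idea, not Snoopy's own implementation.

```php
<?php
// Hypothetical helper: approximates "text only" extraction. NOT Snoopy's code.
function html_to_text(string $html): string
{
    // Drop script and style blocks entirely, including their contents.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $html);
    // Remove the remaining tags, then decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text));
}

echo html_to_text('<p>Hello <b>world</b> &amp; all</p><script>x()</script>'), "\n";
// prints: Hello world & all
```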
2. Form submission
The code is as follows:
$formvars["username"] = "admin";
$formvars["pwd"] = "admin";
$action = "http://www.jb51.net"; // form submission address
$snoopy->submit($action, $formvars); // $formvars is the array of submitted fields
echo $snoopy->results; // the result returned after the form is submitted
$snoopy->submittext($action, $formvars); // submit, returning only the text (HTML removed)
$snoopy->submitlinks($action, $formvars); // submit, returning only the links
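It can help to see what a $formvars array looks like once it is turned into a request body. The sketch below assumes the ordinary URL-encoded form type that a browser uses for a plain HTML form; encode_form is a hypothetical helper built on PHP's http_build_query, not part of Snoopy.

```php
<?php
// Hypothetical helper: shows the URL-encoded body for a plain HTML form.
function encode_form(array $formvars): string
{
    // Pass the separator explicitly so the result does not depend on php.ini.
    return http_build_query($formvars, '', '&');
}

$formvars = ["username" => "admin", "pwd" => "admin"];
echo encode_form($formvars), "\n"; // prints: username=admin&pwd=admin
```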
Once a form can be submitted, a lot becomes possible. Next, we disguise the IP address and the browser.
3. Disguising the request
The code is as follows:
$formvars["username"] = "admin";
$formvars["pwd"] = "admin";
$action = "http://www.jb51.net";
include "snoopy.php";
$snoopy = new Snoopy;
$snoopy->cookies["PHPSESSID"] = 'fc0000b1918bd522cc863f000090e6fff7'; // disguise the session ID
$snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)"; // disguise the browser
$snoopy->referer = "http://s.jb51.net"; // disguise the source page address (HTTP_REFERER)
$snoopy->rawheaders["Pragma"] = "no-cache"; // cache-related HTTP header
$snoopy->rawheaders["X_FORWARDED_FOR"] = "127.0.0.101"; // disguise the IP address
$snoopy->submit($action, $formvars);
echo $snoopy->results;
So far we have disguised the session, the browser, and the IP address, which makes a lot of things possible.
For example, you can vote in a poll that checks the IP address and uses a verification code.
PS: The disguised IP address here is only an HTTP header, so an IP address that the server reads from REMOTE_ADDR cannot be disguised.
Only IP addresses that the server reads from HTTP headers (used to see through proxies) can be forged.
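The difference between the two sources of the IP address can be sketched from the server's point of view. The helper client_ip and the sample addresses below are hypothetical; the array stands in for $_SERVER, where PHP exposes a forwarded-for request header as HTTP_X_FORWARDED_FOR.

```php
<?php
// Hypothetical server-side helper, shown only to illustrate which value can be forged.
function client_ip(array $server, bool $trustForwardedHeader): string
{
    if ($trustForwardedHeader && !empty($server['HTTP_X_FORWARDED_FOR'])) {
        // The header may hold a comma-separated chain; the first entry is the claimed client.
        $parts = explode(',', $server['HTTP_X_FORWARDED_FOR']);
        return trim($parts[0]);
    }
    // REMOTE_ADDR comes from the TCP connection, not from anything the client sends.
    return $server['REMOTE_ADDR'] ?? '';
}

$request = ['REMOTE_ADDR' => '203.0.113.7', 'HTTP_X_FORWARDED_FOR' => '127.0.0.101'];
echo client_ip($request, true), "\n";  // prints: 127.0.0.101 (the forged header wins)
echo client_ip($request, false), "\n"; // prints: 203.0.113.7 (the connection address)
```

A server that only looks at REMOTE_ADDR is unaffected by the header trick; one that trusts the header sees whatever the client chose to send.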
Here is a brief description of how to handle the verification code:
First, open the page in a normal browser and find the session ID that corresponds to the verification code.
Write down both the session ID and the verification code value.
Then use Snoopy to forge the request with them.
Principle: because the session ID is the same, the verification code the server expects is the same as the one noted the first time.
4. Sometimes we need to forge even more. Snoopy has thought of that for us.
The code is as follows:
$snoopy->proxy_host = "www.jb51.net";
$snoopy->proxy_port = "8080"; // use a proxy
$snoopy->maxredirs = 2; // maximum number of redirects to follow
$snoopy->expandlinks = true; // whether to expand links to full URLs (often used when scraping)
// For example, a link such as /images/taoav.gif is expanded to its full URL.
$snoopy->maxframes = 5; // maximum number of frames to follow
// Note: when fetching frames, $snoopy->results is returned as an array.
$snoopy->error; // holds the error message, if any
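What "expanding a link" means can be sketched in a few lines. The function expand_link below is a hypothetical, simplified illustration and not Snoopy's implementation: it handles absolute, root-relative, and same-directory links, and deliberately ignores cases such as "../" segments or protocol-relative URLs.

```php
<?php
// Hypothetical, simplified sketch of link expansion. NOT Snoopy's code.
function expand_link(string $link, string $pageUrl): string
{
    // Anything that already starts with a scheme (http:, mailto:, ...) is left untouched.
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $link)) {
        return $link;
    }
    $parts = parse_url($pageUrl);
    $root = $parts['scheme'] . '://' . $parts['host']
          . (isset($parts['port']) ? ':' . $parts['port'] : '');
    // Root-relative link: attach it directly to the site root.
    if (substr($link, 0, 1) === '/') {
        return $root . $link;
    }
    // Otherwise resolve against the directory of the current page.
    $path = $parts['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $link;
}

echo expand_link('/images/taoav.gif', 'http://www.jb51.net/html/index.html'), "\n";
// prints: http://www.jb51.net/images/taoav.gif
```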
With the basic usage covered, here is a complete example:
The code is as follows:
<?php
// var_dump($_SERVER);
include("Snoopy.class.php");
$snoopy = new Snoopy;
// Browser information: use your own browser's value (var_dump($_SERVER) shows it)
$snoopy->agent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5 FirePHP/0.2.1";
$snoopy->referer = "http://bbs.jb51.net/index.php";
$snoopy->expandlinks = true;
// Cookie copied from the browser
$snoopy->rawheaders["COOKIE"] = "_utmz=response=(referral)|utmcsr=jb51.net|utmcct=/html/index.html|utmcmd=referral; cdbphpchina_smile=1D2D0D1; records=2592000; _utma=records; _utmz=records=(referral)|utmcsr=localhost:8080|utmcct=/test3.php|utmcmd=referral; _utma=bytes; uchome_loginuser=sinopf; xscdb_cookietime=2592000; _utmc=17229162; _utmb=17229162; cdbphpchina_sid=EX5w1V; _utmc=233700831; bytes=17; authorization%authorization; xscdb_auth=8127rayhkpql49ems%2FyhLBf3C6ClZ%2B2idSk4bExJwbQr%2BHSZrVKgqPOttHVr%authorization; cdbphpchina_onlineusernum=3721";
$snoopy->fetch("http://bbs.jb51.net");
// ereg_replace was removed in PHP 7; str_replace works here because the pattern is a literal string
$n = str_replace('href="', 'href="http://bbs.jb51.net/', $snoopy->results);
echo str_replace('src="', 'src="http://bbs.jb51.net/', $n);
?>
This simulates logging in to the PHPCHINA forum. First, look up your browser information: var_dump($_SERVER); shows it.
Copy the value of $_SERVER['HTTP_USER_AGENT'] and paste it into $snoopy->agent. Next, get your own cookie: after logging in to the forum with your own account, enter
javascript:document.write(document.cookie) in the address bar and press Enter to see your cookie information, then copy and paste it
after $snoopy->rawheaders["COOKIE"] =. (My own cookie information has been removed for security reasons.)
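As an alternative to pasting the whole string into a raw header, the copied cookie text can be split into name/value pairs, which is the shape the $snoopy->cookies array from the earlier example expects. The function parse_cookie_string is a hypothetical helper, and the sample values come from the cookie string above.

```php
<?php
// Hypothetical helper: turns "a=1; b=2" (as shown by document.cookie) into an array.
function parse_cookie_string(string $raw): array
{
    $cookies = [];
    foreach (explode(';', $raw) as $pair) {
        $pair = trim($pair);
        // Skip empty fragments and fragments without a name=value shape.
        if ($pair === '' || strpos($pair, '=') === false) {
            continue;
        }
        // Split on the first "=" only, since values may contain "=" themselves.
        [$name, $value] = explode('=', $pair, 2);
        $cookies[trim($name)] = trim($value);
    }
    return $cookies;
}

print_r(parse_cookie_string('uchome_loginuser=sinopf; xscdb_cookietime=2592000'));
// The result could then be assigned, e.g. $snoopy->cookies = parse_cookie_string($raw);
```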
Then note these two lines:
// ereg_replace was removed in PHP 7; str_replace works here because the pattern is a literal string
$n = str_replace('href="', 'href="http://bbs.jb51.net/', $snoopy->results);
echo str_replace('src="', 'src="http://bbs.jb51.net/', $n);
Because all the addresses in the fetched HTML source are relative links, they are replaced with absolute links so that the forum's images and CSS styles can be loaded.
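A plain string replacement also prefixes links that are already absolute. The sketch below is one possible refinement under that assumption, not the article's original method: absolutize_links is a hypothetical helper that uses a regular expression to skip any href or src value that already starts with a scheme, "//", or "#".

```php
<?php
// Hypothetical helper: prefixes only relative href/src values with $base.
// $base is assumed to end with "/", e.g. "http://bbs.jb51.net/".
function absolutize_links(string $html, string $base): string
{
    return preg_replace_callback(
        // Match href=" or src=" unless the value starts with a scheme, "//" or "#";
        // an optional leading "/" is consumed so the result has no double slash.
        '~\b(href|src)="(?![a-z][a-z0-9+.-]*:|//|#)/?~i',
        function (array $m) use ($base) {
            return $m[1] . '="' . $base;
        },
        $html
    );
}

echo absolutize_links('<img src="/images/a.gif">', 'http://bbs.jb51.net/'), "\n";
// prints: <img src="http://bbs.jb51.net/images/a.gif">
```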