Introduction to the PHP Snoopy scraping class

Source: Internet
Author: User
Tags: php, session

Snoopy is a PHP class that simulates some simple browser functions: it can fetch web page content and submit forms. To run Snoopy correctly, your server needs PHP 4 or later with PCRE (Perl Compatible Regular Expressions) support, which a basic LAMP setup provides. Because it is a plain PHP class that requires no extension, it is a good choice when the server does not support cURL.

Snoopy features:

1. Fetches the content of a web page (fetch)

2. Fetches the text of a web page with the HTML tags removed (fetchtext)

3. Fetches the links and forms of a web page (fetchlinks, fetchform)

4. Supports proxy hosts

5. Supports basic user name/password authentication

6. Lets you set the user agent, referer, cookies, and header content

7. Supports browser redirects and lets you control the redirect depth

8. Expands the links in a web page into fully qualified URLs (default)

9. Submits data and retrieves the returned result

10. Supports following HTML frames

11. Supports passing cookies on redirects

Snoopy class: http://sourceforge.net/projects/snoopy/

Snoopy class methods:

fetch($URI)

This method fetches the content of a web page. $URI is the URL of the page to fetch. The result is stored in $this->results. If the page uses frames, Snoopy follows each frame, stores its content in an array, and saves that array in $this->results.

fetchtext($URI)

This method is similar to fetch(). The only difference is that it removes HTML tags and other irrelevant data and returns only the text content of the web page.

fetchform($URI)

This method is similar to fetch(). The only difference is that it removes HTML tags and other irrelevant data and returns only the form elements of the web page.

fetchlinks($URI)

This method is similar to fetch(). The only difference is that it removes HTML tags and other irrelevant data and returns only the links in the web page. By default, relative links are automatically expanded into complete URLs.
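As a rough illustration of how the result of fetchlinks() might be consumed, the sketch below assumes that $snoopy->results holds a flat array of URL strings after a successful call on a page without frames, and that the method returns false on failure. The helper name collectLinks is hypothetical and not part of Snoopy; the file name snoopy.php follows the examples in this article and may differ in your installation.

```php
<?php
// Hypothetical helper (not part of Snoopy): returns the unique links found
// at $url, or an empty array if the request fails.
function collectLinks($snoopy, $url)
{
    // Assumption: fetchlinks() returns false when the request fails.
    if (!$snoopy->fetchlinks($url)) {
        return array();
    }
    // Assumption: for a page without frames, results is a flat array of URLs.
    $links = is_array($snoopy->results) ? $snoopy->results : array();
    // Drop duplicates and reindex the array from zero.
    return array_values(array_unique($links));
}

// Usage, assuming the class file is saved as snoopy.php:
// include 'snoopy.php';
// foreach (collectLinks(new Snoopy, 'http://www.phpernote.com') as $link) {
//     echo $link, "\n";
// }
```

Passing the Snoopy object in as a parameter keeps the helper independent of how the class is loaded, which also makes it easy to try out with a stand-in object.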

submit($URI, $formvars)

This method submits a form to the URL specified by $URI. $formvars is an array that holds the form fields to submit.

submittext($URI, $formvars)

This method is similar to submit(). The only difference is that it removes HTML tags and other irrelevant data and returns only the text content of the page that comes back after the submission (for example, after logging in).

submitlinks($URI)

This method is similar to submit(). The only difference is that it removes HTML tags and other irrelevant data and returns only the links in the returned page. By default, relative links are automatically expanded into complete URLs.

Snoopy properties (default values in parentheses):

$host: the host to connect to
$port: the port to connect to
$proxy_host: the proxy host to use, if any
$proxy_port: the proxy port to use, if any
$agent: the user agent to masquerade as (Snoopy v0.1)
$referer: the referer information, if any
$cookies: the cookies to send, if any
$rawheaders: other header information, if any
$maxredirs: maximum number of redirects, 0 = not allowed (5)
$offsiteok: whether to allow redirects off-site (true)
$expandlinks: whether to expand all links to fully qualified URLs (true)
$user: authentication user name, if any
$pass: authentication password, if any
$accept: HTTP accept types (image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*)
$error: the error message, if an error occurred
$response_code: response code returned from the server
$headers: header information returned from the server
$maxlength: maximum length of the returned data
$read_timeout: timeout for read operations (requires PHP 4 Beta 4+); set to 0 for no timeout
$timed_out: true if a read operation timed out (requires PHP 4 Beta 4+)
$maxframes: maximum number of frames that can be followed
$status: HTTP status of the fetch
$temp_dir: temporary directory that the web server can write to (/tmp)
$curl_path: path to the cURL binary; set to false if no cURL binary is available
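
To show how several of these properties might be combined, here is a minimal sketch that copies a settings array onto a Snoopy object and checks for failure after fetching. The helper names applySettings and fetchOrFail are hypothetical, the proxy host and timeout in the usage comment are placeholders, and the code assumes PHP 5.1 or later because it uses exceptions. It also assumes that fetch() returns false on failure and that $error holds a message in that case.

```php
<?php
// Hypothetical helper: copy selected settings onto a Snoopy object.
function applySettings($snoopy, array $settings)
{
    // Only property names taken from the list above are accepted.
    $allowed = array('proxy_host', 'proxy_port', 'user', 'pass', 'agent',
                     'referer', 'maxredirs', 'offsiteok', 'read_timeout');
    foreach ($settings as $name => $value) {
        if (in_array($name, $allowed, true)) {
            $snoopy->$name = $value;
        }
    }
    return $snoopy;
}

// Hypothetical helper: fetch a page and throw an exception on failure.
function fetchOrFail($snoopy, $url)
{
    // Assumption: fetch() returns false on failure and fills in $error.
    if (!$snoopy->fetch($url)) {
        throw new RuntimeException('Fetch failed: ' . $snoopy->error);
    }
    // A read timeout is reported separately through $timed_out.
    if ($snoopy->timed_out) {
        throw new RuntimeException('Read timed out');
    }
    return $snoopy->results;
}

// Usage, assuming the class file is saved as snoopy.php:
// include 'snoopy.php';
// $snoopy = applySettings(new Snoopy, array(
//     'proxy_host'   => 'proxy.example.com', // placeholder
//     'proxy_port'   => 8080,                // placeholder
//     'read_timeout' => 10,
// ));
// echo fetchOrFail($snoopy, 'http://www.phpernote.com');
```

Restricting the settings to a known list is a design choice for this sketch: a misspelled key is ignored rather than silently creating a new property on the object.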

Snoopy examples:

(1) Get the content of a specified URL

$url = 'http://www.phpernote.com';
include('snoopy.php');
$snoopy = new Snoopy;
$snoopy->fetch($url); // fetch the full page content
echo $snoopy->results; // display the result
// Alternatives to fetch():
// $snoopy->fetchtext($url);  // get the text content (HTML removed)
// $snoopy->fetchlinks($url); // get all links on the page
// $snoopy->fetchform($url);  // get the form information on the page

(2) Submit a form

include 'snoopy.php';
$snoopy = new Snoopy;
$formvars['username'] = 'admin';
$formvars['pwd'] = 'admin';
$action = 'http://www.phpernote.com'; // form submission address
$snoopy->submit($action, $formvars); // $formvars is the array of submitted fields
echo $snoopy->results; // result returned after the form is submitted
// Alternatives to submit():
// $snoopy->submittext($action, $formvars);  // return only the text, with HTML removed
// $snoopy->submitlinks($action, $formvars); // return only the links

(3) Use Snoopy to disguise a request

$formvars['username'] = 'admin';
$formvars['pwd'] = 'admin';
$action = 'http://www.phpernote.com';
include 'snoopy.php';
$snoopy = new Snoopy;
$snoopy->cookies['PHPSESSID'] = 'fc206b1918bd522cc863p000090e6notef7'; // disguise the session ID
$snoopy->agent = '(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)'; // disguise the browser
$snoopy->referer = 'http://www.phpernote.com'; // disguise the referring page (HTTP_REFERER)
$snoopy->rawheaders['Pragma'] = 'no-cache'; // cache-related HTTP header
$snoopy->rawheaders['X_FORWARDED_FOR'] = '127.0.0.1'; // disguise the IP address
$snoopy->submit($action, $formvars);
echo $snoopy->results;