Http://hi.baidu.com/jackywdx/blog/item/35bb33daedb414dfb7fd4867.html
LWP: Simple Function
1. How to use this module in Perl?
2. How to get a page content?
My $ content = get ('HTTP: // www.yahoo.com.cn '); |
The get function assigns all the page content obtained from www.yahoo.com.cn to the $ Content Variable. If the get fails, an UNDEF value is returned.
3. How to obtain the header )?
My (B, d, $ e) = header ('HTTP: // www.yahoo.com.cn '); |
If the header function is obtained successfully, five variables are returned. $ A-E indicates the content type, document length, Last Update Time, expiration time, and server name.
4. How to output the specified page content?
My $ code = getprint ('HTTP: // www.yahoo.com.cn '); |
Getprint will try to print the response.
5. How to save the obtained content to a file?
My $ code = getstore ('HTTP: // www.yahoo.com.cn ','/path/file.html '); |
Getstore will try to save the obtained content to the file specified by the second parameter, and return a status number in the form of the above.
6. How to synchronize remote and local files?
My $ code = mirror ('HTTP: // www.yahoo.com.cn ','/path/file.html '); |
The mirror function compares the consistency between the remote and local files, and returns a status number. For example, if the file is the same, 304 is returned. If the local file is synchronized successfully, 200 is returned.
7. How to test whether the returned status is correct?
Is_success ($ code) Is_error ($ code) |
The is_success and is_error functions can pass a status number as parameters, and the program will determine whether the returned result is successful. For example, is_success (403) returns false.
Ii. LWP: UserAgent Function
1. Get the page header information
#! /Usr/bin/perluse strict; Use warnings; Use lwp: useragent; my $ UA = lwp: useragent-> new; $ UA-> agent ('spider name '); My $ res = $ UA-> head ('HTTP: // www.yahoo.com.cn '); Foreach ($ res-> header_field_names) {print "$ _:", $ res-> header ($ _), "\ n ";} |
2. Get the specified page
My $ UA = lwp: useragent-> new; $ UA-> agent ('spider name '); My $ response = $ UA-> get ('HTTP: // www.yahoo.com.cn '); |
3. submit data in post Mode
Use strict; Use warnings; Use lwp 5.64; My $ browser = lwp: useragent-> New; my $ word = 'tarragon'; my $ url = 'HTTP: // www.altavista.com/sites/search/web '; My $ response = $ browser-> post ($ url, ['Q' => $ word, # the Altavista query string 'Pg '=> 'Q', 'avkw' => 'tgz', 'kl '=> 'XX ', ] ); Die "$ url error:", $ response-> status_line Unless $ response-> is_success; Die "Weird content type at $ url-", $ response-> content_type Unless $ response-> content_type eq 'text/html '; |
4. submit data through get
Use URI; My $ url = URI-> new ('HTTP: // us.imdb.com/Tsearch '); # Makes an object representing the URL $ url-> query_form (# And here the form data pairs: 'Title' => 'Blade Runner ', 'Restrict '=> 'movies and TV ', ); My $ response = $ browser-> get ($ url ); |
Iii. HTTPS support
The crypt: ssleay protocol must be installed to support encrypted transmission.
Installation in command line ppm:
Ppm> install http://theoryx5.uwinnipeg.ca/ppms/Crypt-SSLeay.ppd
Graphical installation:
Click Edit-> Preferences, Add Repository, and Add http://theoryx5.uwinnipeg.ca/ppms/as the installation source. Then select Crypt-SSLeay.
Test code:
Use strict;
Use warnings;
Use LWP: UserAgent; my $ url = 'https: // www. helsinki. fi/'; my $ ua = LWP: UserAgent-> new;
My $ response = $ ua-> get ($ url); $ response-> is_success or
Die "Failed to GET '$ url':", $ response-> status_line;
Print $ response-> as_string;