Summary of the methods for simulating HttpRequest in Python, pythonhttprequest

Source: Internet
Author: User

Summary of the methods for simulating HttpRequest in Python, pythonhttprequest

Python can be said to be a powerful tool for network crawling. This article mainly introduces some methods and techniques for simulating http requests using python.

Python processes two request class libraries: urllib and urllib2. These two libraries are not two different versions of a class library. urllib is mainly used to process url-related content. When sending a request, the request object can only be a url. Urllib2 can use a request object to implement a request, such as forging a header, setting a proxy, http get, http post, and other methods.

Read this article to learn about the basic knowledge of http requests, such:

  • What is httpwebrequest and httpwebresponse?
  • What is get and post?
  • What is cookie?

This article describes the methods used to simulate requests:

  • Set proxy
  • Counterfeit Header or Header information
  • Enable cookie
  • Processing url parameters
Use urllib2.urlopen to send messages directly
Import urllib2url = 'HTTP: // www.baidu.com/'response = urllib2.urlopen (url) # urlopen accepts the input parameter either string or requestresponse_text = response. read ()

 

Use urllib. build_opener to send requests directly
import urllib2url = 'http://www.baidu.com/'opener = urllib2.build_opener()response = opener.open(url)response_text = response.read()
Access the site through proxy
proxy_handler = urllib2.ProxyHandler({"http" : 'http://localhost:8888'})opener = urllib2.build_opener(proxy_handler)response = opener.open(url)response_text = response.read()
Request body (http post) attached to the request)
opener = urllib2.build_opener()response = opener.open(url,'request body')response_text = response.read()

If the body is in the key-value format, you can refer to the url Processing Section below for processing.

Enable Cookie
cookie = cookielib.CookieJar()opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))response = opener.open(url)response_text = response.read()
Use urllib2.Request to add custom Header information
request = urllib2.Request(url)request.add_data('1234567')request.add_header('User-Agent', 'fake-client')response = urllib2.urlopen(request)
Process parameter information in a url

Whether using the get or post method, parameters are often used. You can use the following class library to process parameters.

Convert the parameter set to string
para = {'111':'222','aaa':'bbb'}encodeurl = urllib.urlencode(para)
Output AAAS = bbb & 111 = 222
Convert url parameters to dictionary
url = 'https://www.baidu.com/s?wd=python%20url%20querystring&pn=10&oq=python%20url%20querystring&tn=baiduhome_pg&ie=utf-8&usm=1&rsv_idx=2&rsv_pq=d09af93600035cb8&rsv_t=d151qRmNNdybGINHcKbyO360E2%2Fg%2FUs2t0MiKqRQXwhHZuNF3IlKyyStzYuofVZczQA3'splitresult_instance = urlparse.urlsplit(url)

 

Output object:

SplitResult (scheme = 'https', netloc = 'www .baidu.com ', path ='/s ', query = 'wd = python % 20url % 20 querystring & pn = 10 & oq = python % 20url % 20 querystring & tn = baiduhome_pg & ie = UTF-8 & usm = 1 & rsv_idx = 2 & rsv_pq = d09af93600035cb8 & rsv_t = d151qRmNNdybGINHcKbyO360E2 % 2Fg % 2FUs2t0MiKqRQXwhHZuNF3IlKyyStzYuofVZczQA3 ', fragment = '')

If you want to convert to a set

result_dic=urlparse.parse_qs(splitresult.query)

In this way, put the data information in the url to implement http get and put it in the body to implement http post.

 

This article is also hosted on a http://simmon.club/blog/Python-HttpRequest/

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.