This article explains, with examples, how to use HTTP and HTTPS proxies in Python. It is shared here as a reference for anyone who needs it.
When crawling data from the Internet with Python from inside the country, some websites or API endpoints are rate-limited or blocked. Routing requests through a proxy can speed up crawling and reduce the number of failed requests. A Python program can use a proxy in the following main ways:
(1) If your code uses a network library or crawler framework to fetch data, the library generally supports setting a proxy directly, for example:
With urllib:

    import urllib.request as urlreq

    # Set the HTTPS proxy
    ph = urlreq.ProxyHandler({'https': 'https://127.0.0.1:1080'})
    oper = urlreq.build_opener(ph)
    # Install the opener globally so that all urlopen() requests automatically use the proxy
    urlreq.install_opener(oper)
    res = oper.open("https://www.google.com")
    print(res.read())

With requests:

    import requests as req

    # The dict key is the scheme of the target URL; the value is the proxy address.
    # If the proxy itself speaks plain HTTP, use an http:// address as the value.
    print(req.get("https://www.google.com", proxies={'https': 'https://127.0.0.1:1080'}).content)
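If only some requests should go through the proxy, the opener can be kept local instead of being installed globally. The following is a minimal sketch using only the standard library; the helper names `make_proxy_opener` and `fetch_via_proxy`, the 10-second timeout, and the default proxy mapping (copied from the example above) are assumptions for illustration.

```python
import urllib.request

# Assumed proxy mapping, copied from the example above; adjust to your own proxy.
DEFAULT_PROXIES = {"https": "https://127.0.0.1:1080"}


def make_proxy_opener(proxies=None):
    """Build an opener that sends requests through the given proxies."""
    if proxies is None:
        proxies = dict(DEFAULT_PROXIES)
    handler = urllib.request.ProxyHandler(proxies)
    # build_opener() does not touch the global opener, so plain
    # urllib.request.urlopen() calls elsewhere are unaffected.
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url, opener=None, timeout=10):
    """Return the response body as bytes, or None on a network error."""
    if opener is None:
        opener = make_proxy_opener()
    try:
        with opener.open(url, timeout=timeout) as res:
            return res.read()
    except OSError as exc:
        # urllib.error.URLError and socket timeouts are both OSError subclasses.
        print("request failed:", exc)
        return None
```

With a proxy actually listening on the configured address, `fetch_via_proxy("https://www.google.com")` would return the page body; without one it prints the error and returns `None`.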
(2) If the library you use does not provide an interface for setting a proxy but is built on urllib, requests, or a similar library underneath, you can try setting the http_proxy and https_proxy environment variables. Common network libraries recognize these variables automatically and send their requests through the proxy the variables specify. Set them as follows:
    import os

    os.environ['http_proxy'] = 'http://127.0.0.1:1080'
    os.environ['https_proxy'] = 'https://127.0.0.1:1080'
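To check which proxies the standard library would pick up from the environment, `urllib.request.getproxies()` can be called after setting the variables. The sketch below is illustrative only; the helper name `set_proxy_env` is an assumption, and the addresses are the ones used above. Libraries may read these variables at different times (urllib's default opener, for instance, reads them when it is first built), so it is probably safest to set them before any request is made.

```python
import os
import urllib.request


def set_proxy_env(http_proxy, https_proxy):
    """Set the proxy environment variables for the current process.

    Hypothetical helper for illustration; child processes started
    afterwards inherit the values as well.
    """
    os.environ["http_proxy"] = http_proxy
    os.environ["https_proxy"] = https_proxy


set_proxy_env("http://127.0.0.1:1080", "https://127.0.0.1:1080")

# getproxies() returns a dict mapping URL scheme to proxy URL, built from
# the *_proxy environment variables (or system settings on some platforms).
proxies = urllib.request.getproxies()
print(proxies.get("http"), proxies.get("https"))
```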
(3) If neither of the above methods works, you can also use tools and libraries that can listen to, intercept, and modify network packets (such as Fiddler or mitmproxy) to intercept the HTTP request packets and modify their address, which achieves the same effect as using a proxy.