1. Proxy Server:
A proxy server sits between the client and the Internet. When we browse through a proxy server, we first send our request to the proxy server; the proxy server then fetches the information from the Internet and returns it to us.
2. Code:
import urllib.request

# proxy_addr, e.g. "117.36.103.170:8118", is the IP address and port of the proxy server
# url is the address to crawl data from
def use_proxy(url, proxy_addr):
    # Use ProxyHandler to set the proxy server. Its argument is a dictionary:
    # the key is "http" and the value is the proxy server's IP address and port.
    # IP addresses and ports can be found at www.xicidaili.com.
    proxy = urllib.request.ProxyHandler({"http": proxy_addr})
    # Build the opener. The first argument to build_opener is the proxy handler;
    # the second is always urllib.request.HTTPHandler.
    opener = urllib.request.build_opener(proxy, urllib.request.HTTPHandler)
    # Install the opener globally so that the urlopen call below goes through it.
    urllib.request.install_opener(opener)
    data = urllib.request.urlopen(url).read().decode("utf-8", "ignore")
    return data

proxy_addr = "125.118.79.44:6666"
url = "http://www.baidu.com"
data = use_proxy(url, proxy_addr)
print(len(data))
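The code above installs the opener globally and relies on a single proxy address. Free proxies tend to go offline without notice, so a possible variation is to keep the opener local and try several addresses in turn. The following is only a minimal sketch under that assumption: the helper names build_proxy_opener and fetch_via_proxies and the 5-second timeout are hypothetical choices for illustration, not part of the original code.

```python
import http.client
import urllib.request


def build_proxy_opener(proxy_addr):
    """Return an opener that sends HTTP requests through proxy_addr ("ip:port")."""
    proxy = urllib.request.ProxyHandler({"http": proxy_addr})
    # build_opener adds the default HTTP handler itself, so only the
    # proxy handler needs to be passed in.
    return urllib.request.build_opener(proxy)


def fetch_via_proxies(url, proxy_addrs, timeout=5):
    """Try each proxy in order; return the decoded page, or None if all fail."""
    for proxy_addr in proxy_addrs:
        opener = build_proxy_opener(proxy_addr)
        try:
            # opener.open uses this opener only; nothing is installed globally.
            with opener.open(url, timeout=timeout) as response:
                return response.read().decode("utf-8", "ignore")
        except (OSError, http.client.HTTPException) as exc:
            # urllib.error.URLError and socket timeouts are subclasses of OSError.
            print(f"proxy {proxy_addr} failed: {exc}")
    return None
```

For example, fetch_via_proxies("http://www.baidu.com", ["125.118.79.44:6666", "117.36.103.170:8118"]) would try the two addresses mentioned above in order and return the first page that downloads successfully. Whether either address still works depends on the proxy being online.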
Python crawler 2: using a proxy server in practice to get around crawler blocking