Getting an IP banned because the crawler hit a site too frequently, often just because you got careless for a moment and forgot to use a proxy, is probably the most common problem crawler beginners run into. And some of those sites are not only visited by the crawler; you also need to open them in a browser yourself. That is when a proxy server is needed. I don't have a usable proxy pool at hand, but I do have a cloud server, so I decided to set up a proxy on it.
Writing one in Golang is very convenient. In a word: cool.
package main

import (
	"io"
	"io/ioutil"
	"net/http"
	"os"

	log "github.com/sirupsen/logrus"
)

func handler(w http.ResponseWriter, r *http.Request) {
	// The incoming proxy request carries an absolute URL, so it can be
	// replayed directly; RequestURI must be cleared before reusing it as
	// a client request, otherwise the client rejects it.
	r.RequestURI = ""
	res, err := http.DefaultClient.Do(r)
	if err != nil {
		log.Panicln(err.Error())
	}
	defer res.Body.Close()

	// Copy the upstream response headers and cookies back to the caller.
	for k, v := range res.Header {
		for _, vv := range v {
			w.Header().Add(k, vv)
		}
	}
	for _, c := range res.Cookies() {
		w.Header().Add("Set-Cookie", c.Raw)
	}
	w.WriteHeader(res.StatusCode)

	// Copy the upstream response body back to the caller.
	result, err := ioutil.ReadAll(res.Body)
	if err != nil && err != io.EOF {
		log.Panicln(err.Error())
	}
	w.Write(result)
}

func main() {
	http.HandleFunc("/", handler)
	// The listening port is taken from the first command-line argument.
	log.Infoln("Starting proxy:", os.Args[1])
	http.ListenAndServe(":"+os.Args[1], nil)
}
Compile the program, upload it to the server, and run it with the port as its argument, then configure the browser to use the proxy for the sites in question. I am using Chrome, so I can set the rules in the SwitchySharp extension.
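The crawler itself can use the proxy the same way: the Go HTTP client only needs to route its requests through it. Below is a minimal sketch, assuming the proxy was started on port 8080 of a server at 1.2.3.4 (a placeholder address). Note that this simple proxy only handles plain HTTP requests, since it does not implement the CONNECT method needed to tunnel HTTPS.

package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"net/url"
)

func main() {
	// Placeholder address: replace with your cloud server and port.
	proxyURL, err := url.Parse("http://1.2.3.4:8080")
	if err != nil {
		panic(err)
	}

	client := &http.Client{
		Transport: &http.Transport{
			// Route every request through the proxy on the cloud server.
			Proxy: http.ProxyURL(proxyURL),
		},
	}

	// Plain HTTP target, since the proxy does not support CONNECT/HTTPS.
	resp, err := client.Get("http://example.com/")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Status, len(body))
}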