That wall is really hateful! In IT circles, you often need Google (GG) to look things up (and, ahem, to reach certain 1024 sites...). Of course, you can also use Baidu. It's not that I dislike Baidu for no reason; let me explain. Once, out of idle curiosity, I wanted to see whether anyone had copied my blog posts (not that the blog is anything special), so I searched on Baidu. The results were astonishing: even searching with a post's full title, I often could not find my own articles; what came back was a pile of scraper-site copies. I won't go into details here; anyone can try this with their own blog.

Previously I would manually collect a handful of proxy IPs, which would stop working after a while, forcing me to collect more; repeating this cycle was annoying. So I wanted to write a crawler to scrape proxy IPs, and then just pull a few from the database whenever I needed them. However, many of the scraped IPs turned out to be dead, which put me back to testing them by hand; wasn't that just making more trouble for myself? So I wrote a program to check whether a proxy IP is usable and let it do the testing for me. Now I can always get working proxies. Since the crawler is written with Scrapy, for ease of maintenance the IP check is also part of the Scrapy project. The checking program is as follows:
1. Create the file checkproxy.py:
# coding=utf-8
import urllib2
import urllib
import time
import socket

ip_check_url = 'http://www.google.com.hk/'
user_agent = 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0'
socket_timeout = 30  # seconds; the original value was garbled

# Check whether the given proxy can fetch ip_check_url
def check_proxy(protocol, pip):
    try:
        proxy_handler = urllib2.ProxyHandler({protocol: pip})
        opener = urllib2.build_opener(proxy_handler)
        # opener.addheaders = [('User-agent', user_agent)]
        # With the line above enabled, the check no longer works; I don't know why.
        urllib2.install_opener(opener)
        req = urllib2.Request(ip_check_url)
        time_start = time.time()
        conn = urllib2.urlopen(req)
        # conn = urllib2.urlopen(ip_check_url)
        time_end = time.time()
        detected_pip = conn.read()
        proxy_detected = True
    except urllib2.HTTPError, e:
        print "ERROR: Code", e.code
        return False
    except Exception, detail:
        print "ERROR:", detail
        return False
    return proxy_detected

def main():
    socket.setdefaulttimeout(socket_timeout)
    print
    protocol = "http"
    current_proxy = "212.82.126.32:80"
    proxy_detected = check_proxy(protocol, current_proxy)
    if proxy_detected:
        print "WORKING: " + current_proxy
    else:
        print "FAILED: %s" % (current_proxy,)

if __name__ == '__main__':
    main()
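The script above is Python 2 (urllib2, print statements). On Python 3, where urllib2 was folded into urllib.request, an equivalent check might look like the following sketch; the check URL, timeout, and sample proxy are carried over from the original script:

```python
# Python 3 port of the check in checkproxy.py (a sketch, not the author's code).
import urllib.request

IP_CHECK_URL = 'http://www.google.com.hk/'
SOCKET_TIMEOUT = 30  # seconds

def check_proxy(protocol, pip, timeout=SOCKET_TIMEOUT):
    """Return True if the proxy pip (e.g. '212.82.126.32:80') can fetch IP_CHECK_URL."""
    proxy_handler = urllib.request.ProxyHandler({protocol: pip})
    opener = urllib.request.build_opener(proxy_handler)
    try:
        # Route the request through the proxy; any network or HTTP error
        # means the proxy is treated as dead.
        with opener.open(IP_CHECK_URL, timeout=timeout) as conn:
            conn.read()
        return True
    except Exception as detail:
        print('ERROR:', detail)
        return False

# Usage: check_proxy('http', '212.82.126.32:80')
```

Passing the opener per call (instead of urllib2.install_opener) avoids mutating global state, which also sidesteps the addheaders oddity noted in the original.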
2. Test:
[root@bogon proxyipspider]# python checkproxy.py
WORKING: 212.82.126.32:80
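Since the check is meant to live inside a Scrapy project, one common way to wire validated proxies into the crawler is a downloader middleware. This is a sketch of how that could look, not the author's actual setup; the proxy list here is illustrative:

```python
import random

# Hypothetical Scrapy downloader middleware: picks a random proxy from a
# pre-validated list for each outgoing request. Scrapy calls
# process_request(request, spider) for every request, and setting
# request.meta['proxy'] routes that request through the proxy. The class
# needs no base class, so it is shown standalone here.
class RandomProxyMiddleware(object):
    def __init__(self, proxies):
        # proxies: list like ['212.82.126.32:80', ...], e.g. loaded from
        # the database of checked IPs
        self.proxies = proxies

    def process_request(self, request, spider):
        if self.proxies:
            request.meta['proxy'] = 'http://' + random.choice(self.proxies)
        return None  # let Scrapy continue processing the request
```

The middleware would then be enabled in the project's settings.py via DOWNLOADER_MIDDLEWARES.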
Of course, this is just a prototype; a real checker needs to be combined with database or file operations to be complete. Once the proxy IPs are verified, the rest is just configuration. After setting things up, enjoy GG. As for 1024, watch as much as you like, though moderation is better, you understand. Whether you go on to Facebook, YouTube, and Twitter is up to you; this post only covers GG.
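The database-or-file integration mentioned above can be sketched as a simple batch filter: read candidate proxies from one file, keep only the ones that pass the check, and write the survivors back out. The file names and module name here are assumptions for illustration:

```python
# Hypothetical batch filter around check_proxy from checkproxy.py.

def filter_proxies(candidates, check=None):
    """Return the subset of candidate proxies that pass the check."""
    if check is None:
        from checkproxy import check_proxy  # assumed module name
        check = lambda pip: check_proxy('http', pip)
    return [pip for pip in candidates if check(pip)]

def refresh_proxy_file():
    # Illustrative file names: one proxy per line in candidates.txt,
    # working proxies written to working.txt.
    with open('candidates.txt') as f:
        candidates = [line.strip() for line in f if line.strip()]
    working = filter_proxies(candidates)
    with open('working.txt', 'w') as f:
        f.write('\n'.join(working) + '\n')
    print('%d/%d proxies working' % (len(working), len(candidates)))
```

The injectable check argument keeps the filtering logic testable without real network traffic.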
Programmers always want to solve problems with their own hands. The world changes, but that instinct doesn't, just like the Cnblogs slogan: "Code changes the world." If something bothers you, build your own replacement. There are plenty of examples of this, tools we use every day like vi, GitHub, and so on. All right, that's it. Go, 1024.
That wall is really hateful!