Checking in Python whether a proxy IP can get over the wall

Source: Internet
Author: User
That wall is really hateful! Working in IT, I often need GG to look things up (you can also use it to visit 1024, ^_^ ...). Of course, you could use Baidu instead. It is not that I don't love Baidu; there is a reason, so bear with me. Once, out of boredom, I wanted to see whether anyone had copied my blog (not that the blog is especially well written), so I searched on Baidu, and the results were startling. Even when I searched for the full title of a post I had written myself, the post often did not show up; what came back was a pile of pages scraped by crawlers. I won't go into details here; you can try it with your own blog.

I used to collect a few proxy IPs by hand, use them for a while, and collect a few more once they stopped working, over and over. Annoying! So I wanted to write a crawler that collects proxy IPs, so that each time I could simply pick a few out of the database. However, many of the IPs the crawler brings back are already dead, which puts me right back at testing them by hand. Isn't that just making more trouble for myself?

So I wrote a program that checks whether a proxy IP is usable and lets the machine do the testing for me. That way I always get proxy IPs that work. Since the crawler is written in Scrapy, it makes sense for maintenance to run the IP check as part of the Scrapy crawler. The test program looks like this:

1. Create File: checkproxy.py

# coding=utf-8
# Python 2 script (it relies on urllib2)
import urllib2
import urllib
import time
import socket

ip_check_url = 'http://www.google.com.hk/'
user_agent = 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0'
socket_timeout = 30  # the value is missing in the source; 30 seconds is an assumed placeholder


# Check proxy: try to fetch ip_check_url through the given proxy
def check_proxy(protocol, pip):
    try:
        proxy_handler = urllib2.ProxyHandler({protocol: pip})
        opener = urllib2.build_opener(proxy_handler)
        # opener.addheaders = [('User-agent', user_agent)]  # with this line enabled the check stops working; I don't know why
        urllib2.install_opener(opener)

        req = urllib2.Request(ip_check_url)
        time_start = time.time()
        conn = urllib2.urlopen(req)
        # conn = urllib2.urlopen(ip_check_url)
        time_end = time.time()
        detected_pip = conn.read()

        proxy_detected = True
    except urllib2.HTTPError as e:
        print("ERROR: Code %s" % e.code)
        return False
    except Exception as detail:
        print("ERROR: %s" % detail)
        return False

    return proxy_detected


def main():
    socket.setdefaulttimeout(socket_timeout)

    # The key must be the lowercase scheme name; urllib2 ignores a key such as "HTTP"
    protocol = "http"
    current_proxy = "212.82.126.32:80"
    proxy_detected = check_proxy(protocol, current_proxy)
    if proxy_detected:
        print("Working: " + current_proxy)
    else:
        print("FAILED: %s" % (current_proxy,))


if __name__ == '__main__':
    main()

2. Test:

[root@bogon proxyipspider]# python checkproxy.py
Working: 212.82.126.32:80
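
checkproxy.py targets Python 2 (urllib2). For readers on Python 3, the same check could be sketched roughly as below, using urllib.request in place of urllib2. This is only a sketch under assumptions: the check_url and timeout parameters and the return value (elapsed seconds, or None on failure) are additions for illustration and are not part of the original script. It also uses a private opener instead of install_opener, so the proxy setting does not leak into other requests.

```python
import http.client
import time
import urllib.error
import urllib.request


def check_proxy(protocol, pip, check_url="http://www.google.com.hk/", timeout=10):
    """Fetch check_url through the proxy `pip` ("host:port").

    Returns the elapsed time in seconds on success, or None on any failure.
    `check_url` and `timeout` are illustrative parameters, not from the original.
    """
    # The key must be the lowercase URL scheme, e.g. "http".
    proxy_handler = urllib.request.ProxyHandler({protocol: pip})
    opener = urllib.request.build_opener(proxy_handler)
    try:
        time_start = time.time()
        with opener.open(check_url, timeout=timeout) as conn:
            conn.read()
        return time.time() - time_start
    except urllib.error.HTTPError as e:
        # The proxy answered, but with an HTTP error status.
        print("ERROR: Code %s" % e.code)
        return None
    except (OSError, http.client.HTTPException) as detail:
        # Refused connections, timeouts, DNS failures, malformed replies.
        print("ERROR: %s" % detail)
        return None
```

Calling check_proxy("http", "212.82.126.32:80") would then return a float if the proxy is alive and None otherwise, so the elapsed time can double as a rough speed measure.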


Of course, this is only a prototype; a real checker needs to be combined with database or file operations. Once a working proxy IP has been found, all that remains is to configure it. After that, enjoy GG. You can browse 1024 for as long as you like, though it is better not to overdo it, you know what I mean. Whether you also want to get on Facebook, YouTube, and Twitter is up to you; this check only covers GG.
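
As an illustration of the database part mentioned above, here is a minimal Python 3 sketch using the standard sqlite3 module. Everything in it is an assumption made for the example: the proxies table, its columns, and the function names add_proxies, recheck_proxies and working_proxies do not come from the original crawler. The checker argument is any callable that takes a "host:port" string and returns a truthy value when the proxy works.

```python
import sqlite3
import time

# Hypothetical table layout, invented for this sketch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS proxies (
    address    TEXT PRIMARY KEY,
    alive      INTEGER NOT NULL DEFAULT 0,
    checked_at REAL
)
"""


def add_proxies(conn, addresses):
    """Store crawled "host:port" strings, skipping ones already known."""
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT OR IGNORE INTO proxies (address) VALUES (?)",
        [(address,) for address in addresses],
    )
    conn.commit()


def recheck_proxies(conn, checker):
    """Run checker on every stored proxy and record whether it is alive."""
    conn.execute(SCHEMA)
    rows = conn.execute("SELECT address FROM proxies").fetchall()
    for (address,) in rows:
        alive = 1 if checker(address) else 0
        conn.execute(
            "UPDATE proxies SET alive = ?, checked_at = ? WHERE address = ?",
            (alive, time.time(), address),
        )
    conn.commit()


def working_proxies(conn):
    """Return the addresses that passed the most recent check."""
    rows = conn.execute(
        "SELECT address FROM proxies WHERE alive = 1 ORDER BY address"
    )
    return [address for (address,) in rows]
```

With a check_proxy(protocol, pip) function like the one in checkproxy.py, the checker could be something like lambda address: check_proxy("http", address); the crawler would call add_proxies after each crawl and recheck_proxies on a schedule.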

Programmers always want to solve problems with their own hands. The urge to change the world has not changed, just like the slogan of cnblogs (Blog Park): "Code changes the world." If something bothers you, build your own. There are plenty of examples, such as vi and GitHub, which we use every day. All right, that's all for now. 1024.
