Generating web page clicks through a proxy with Python

Source: Internet
Author: User


Update: revised the exception handling conditions.

@time 2013-08-03 Update: fixed the loop counting problem and the random wait time issue.

#!/usr/bin/python
# -*- coding: utf-8 -*-
'''
This script repeatedly requests a set of pages. Besides that main function, it illustrates three points:
1. Pick a proxy IP at random and access the target site through it, so that a single IP does not get blocked.
2. After visiting a page, rest for a random few seconds before the next visit, so that the layer 4-7 filtering devices in front of the site are less likely to intercept the requests.
3. Modify the HTTP User-Agent header, because some websites and layer 4-7 devices check it.
Created on 2013-7-14
@author: QQ136354553
'''

import random
import time
import urllib2

import proxyip
import user_agents


def gethtml(url, proxy_ip):
    # Fetch url through the given proxy with a randomly chosen User-Agent.
    proxy_support = urllib2.ProxyHandler(proxy_ip)
    opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)
    request = urllib2.Request(url)
    user_agent = random.choice(user_agents.user_agents)  # pick a random entry from user_agents
    request.add_header('User-Agent', user_agent)  # override the User-Agent header
    print user_agent
    return opener.open(request).read()


urls = [
    'http://www.25shiyan.com/?fromuid=16',
    'http://www.25shiyan.com/forum.php?mod=viewthread&tid=37840&extra=page%3d1',
    'http://www.25shiyan.com/forum.php?mod=viewthread&tid=36786&extra=page%3D1',
]
count_false, count = 0, 0
while True:
    for url in urls:
        count += 1
        # Choose the proxy here so the except branches always report the proxy that failed.
        proxy_ip = random.choice(proxyip.proxy_list)  # pick a random entry from proxy_list
        print proxy_ip
        try:
            gethtml(url, proxy_ip)
        except urllib2.HTTPError:  # subclass of URLError, so it has to be caught first
            print 'HTTPError! The bad proxy is %s' % proxy_ip
            count_false += 1
        except urllib2.URLError:
            print 'URLError! The bad proxy is %s' % proxy_ip
            count_false += 1
        except Exception:
            print 'Unknown error! The bad proxy is %s' % proxy_ip
            count_false += 1
        randomtime = random.uniform(1, 3)  # random float between 1 and 3 seconds
        time.sleep(randomtime)  # wait a random amount of time
        print '%d errors, %d OK, total %d' % (count_false, count - count_false, count)
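A note on the order of the except clauses: HTTPError is a subclass of URLError (in Python 2's urllib2 as well as in Python 3's urllib.error), so an `except URLError` clause placed first also catches HTTP errors. The following is a minimal sketch in Python 3 that shows the effect without touching the network. The helper name `classify` and the example URL are assumptions made for illustration and are not part of the original script.

```python
import urllib.error


def classify(exc):
    """Return a short label for an exception, testing the subclass first."""
    # HTTPError must be tested before URLError, because every HTTPError
    # is also a URLError.
    if isinstance(exc, urllib.error.HTTPError):
        return "http"
    if isinstance(exc, urllib.error.URLError):
        return "url"
    return "unknown"


# Build the exceptions by hand instead of making real requests.
http_exc = urllib.error.HTTPError("http://example.invalid/", 404, "Not Found", None, None)
url_exc = urllib.error.URLError("connection refused")

print(classify(http_exc))
print(classify(url_exc))
print(classify(ValueError("something else")))
```

If the two isinstance checks were swapped, the first call would report "url" as well, which is the same thing that happens when `except URLError` comes before `except HTTPError`.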



######################

The script above imports two modules, proxyip and user_agents. Their contents are as follows:

######################

proxyip.py

#!/usr/bin/python
# -*- coding: utf-8 -*-

# Each entry maps a protocol name to a proxy URL, the format urllib2.ProxyHandler expects.
proxy_list = [
    {'http': 'http://59.53.67.215:80'},
    {'http': 'http://60.161.14.77:8001'},
    {'http': 'http://61.144.14.68:80'},
    {'http': 'http://61.144.68.180:9999'},
    {'http': 'http://61.164.108.84:8844'},
    {'http': 'http://61.166.55.153:11808'},
]
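Because the proxy list is maintained by hand, a malformed entry only shows up at request time. The sketch below, written for Python 3, checks the shape of each entry before it is used. The function name `is_valid_proxy` is hypothetical, and the addresses come from the 192.0.2.0/24 documentation range rather than from the list above.

```python
from urllib.parse import urlsplit


def is_valid_proxy(entry):
    """Return True if entry looks like {'http': 'http://host:port'}."""
    if not isinstance(entry, dict) or not entry:
        return False
    for address in entry.values():
        try:
            parts = urlsplit(address)
            port = parts.port  # raises ValueError for a non-numeric or out-of-range port
        except (ValueError, TypeError, AttributeError):
            return False
        if parts.scheme not in ("http", "https"):
            return False
        if not parts.hostname or port is None:
            return False
    return True


candidates = [
    {"http": "http://192.0.2.10:8080"},   # well formed
    {"http": "http://192.0.2.10"},        # missing port
    {"http": "http://192.0.2.10:99999"},  # port out of range
    {"http": "192.0.2.10:8080"},          # missing scheme
]
usable = [entry for entry in candidates if is_valid_proxy(entry)]
print(usable)
```

This only validates the format. It does not test whether a proxy is reachable.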

###########################

user_agents.py

#!/usr/bin/python
# -*- coding: utf-8 -*-

user_agents = [
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11',
    'Opera/9.25 (Windows NT 5.1; U; en)',
    'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)',
    'Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (like Gecko) (Kubuntu)',
    'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.12) Gecko/20070731 Ubuntu/dapper-security Firefox/1.5.0.12',
    'Lynx/2.8.5rel.1 libwww-FM/2.14 SSL-MM/1.4.1 GNUTLS/1.2.9',
]

####################################

1. The proxy IPs are currently a static list. I would like to fetch them dynamically, but that is not implemented yet and is left for later.

2. The URLs are not handled well. The original idea was to start from one main site and crawl its sub-links, but that is not implemented yet either.
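For the second item, one possible way to collect sub-links from a page is the standard library's HTML parser. This is only a sketch in Python 3 and not the author's implementation: `LinkCollector`, `same_site_links`, and the example.com pages are all made up for illustration, and the HTML is an inline string rather than a fetched page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def same_site_links(html, base_url):
    """Return the unique links in html that point at the same host as base_url."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlsplit(base_url).netloc
    unique = []
    for link in collector.links:
        if urlsplit(link).netloc == host and link not in unique:
            unique.append(link)
    return unique


sample_html = (
    '<a href="/thread/1">first</a>'
    '<a href="thread/2">second</a>'
    '<a href="http://other.example/x">external</a>'
    '<a href="/thread/1">duplicate</a>'
)
links = same_site_links(sample_html, "http://example.com/forum/")
print(links)
```

The returned list could then take the place of the hard-coded URL list in the main script.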

