Python example code for automatically increasing blog page views

Source: Internet
Author: User

Where the idea came from

Today I happened to overhear people talking about "click farming", which made me curious. It occurred to me that one of Python's HTTP request modules could do the job (the code below uses urllib2), so I wrote a simple test case. To my surprise, the trick actually works. So what are you waiting for? Let's get clicking.

Prelude

The idea is simple: send an HTTP request to the page. The code is as follows:

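import urllib2

# Request headers that make the request look like it comes from a browser
headers = {
    'referer': 'http://jb51.net/',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36'
}

def getHtml(url, headers):
    # Send a GET request with the given headers and return the page body
    req = urllib2.Request(url, headers=headers)
    page = urllib2.urlopen(req)
    html = page.read()
    return html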

We supply the target URL and the headers by hand. Naturally, I tested against my own blog.

Each time the code runs, the page view count goes up.
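
For readers on Python 3, where urllib2 was folded into urllib.request, the same request could be sketched roughly as below. This is only an illustration under that assumption, not the author's code: build_request and fetch_html are hypothetical names, and the header values are copied from the snippet above.

```python
import urllib.request

# Header values copied from the article's snippet.
HEADERS = {
    "referer": "http://jb51.net/",
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36"
    ),
}


def build_request(url, headers=HEADERS):
    """Create the request object without sending anything yet."""
    return urllib.request.Request(url, headers=headers)


def fetch_html(url, headers=HEADERS, timeout=10):
    """Send the request and return the raw response body as bytes."""
    request = build_request(url, headers)
    with urllib.request.urlopen(request, timeout=timeout) as page:
        return page.read()
```

Splitting the request construction from the actual network call makes it possible to inspect the headers before anything is sent.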

Slow growth

Since this approach works, the idea is sound. The natural next step is to put the request in a loop. Would that add a large number of views?
That is exactly what I tried. The code is as follows:

i = 0
while i < 10:
    url = 'http://jb51.net/marksinoberg/article/details/51501377'
    getHtml(url, headers)
    i = i + 1  # advance the counter so the loop stops after 10 requests

At first the view count went up: an initial success. But it did not last. After about 10 more page views, the count stopped rising.

After that I could not add any more. Presumably the server imposes some limit on my access; otherwise the loop should have kept working.

Find a solution

As the saying goes, "for every policy there is a countermeasure." Naturally, I was not going to accept this limit. My guess was that the server recorded my IP address and then capped the number of visits it would count from that address.

My solution:

  1. Access through a proxy IP address: I had no proxy server available, so this was not an option.
  2. Change my IP address: make the requests from a different IP address. But how do I change it? (Thinking about it now, I regret not paying attention in my computer networking course and never learning IP spoofing properly; otherwise I would not be stuck now.) Still, all roads lead to Rome. Here is another way:

C:\Users\Administrator> ipconfig /release

Windows IP Configuration

No operation can be performed on Local Area Connection while it has its media disconnected.

Wireless LAN adapter Wireless Network Connection:

   Connection-specific DNS Suffix  . :
   Link-local IPv6 Address . . . . . : fe80::1d9f:d97b:fd16:1f6f%
   Default Gateway . . . . . . . . . :

Ethernet adapter Local Area Connection:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . : OurEDA.cn

Ethernet adapter VMware Network Adapter VMnet1:

   Connection-specific DNS Suffix  . :
   Link-local IPv6 Address . . . . . : fe80::359d:e81d:741:f257%1
   IPv4 Address. . . . . . . . . . . : 192.168.229.1
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . :

Ethernet adapter VMware Network Adapter VMnet8:

   Connection-specific DNS Suffix  . :
   Link-local IPv6 Address . . . . . : fe80::94b1:d10f:b68:101d%1
   IPv4 Address. . . . . . . . . . . : 192.168.244.1
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . :

Ethernet adapter VirtualBox Host-Only Network:

   Connection-specific DNS Suffix  . :
   Link-local IPv6 Address . . . . . : fe80::a5eb:545c:7d89:9451%
   IPv4 Address. . . . . . . . . . . : 192.168.56.1
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . :

Tunnel adapter isatap.{4F399971-B739-4B71-BD79-E48233EEC9BE}:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :

Tunnel adapter isatap.{1860C94E-1007-4418-9A26-7D8AA8F06E15}:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :

Tunnel adapter isatap.OurEDA.cn:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :

Tunnel adapter isatap.dlut.edu.cn:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :

Tunnel adapter isatap.{6F7F27ED-942E-4EFB-ACF2-A4E8793B161D}:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :

C:\Users\Administrator> ipconfig /renew

Windows IP Configuration

No operation can be performed on Local Area Connection while it has its media disconnected.

Wireless LAN adapter Wireless Network Connection:

   Connection-specific DNS Suffix  . :
   Link-local IPv6 Address . . . . . : fe80::1d9f:d97b:fd16:1f6f%12
   IPv4 Address. . . . . . . . . . . : 192.168.58.70
   Subnet Mask . . . . . . . . . . . : 255.255.252.0
   Default Gateway . . . . . . . . . : 192.168.56.1

Ethernet adapter Local Area Connection:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . : OurEDA.cn

Ethernet adapter VMware Network Adapter VMnet1:

   Connection-specific DNS Suffix  . :
   Link-local IPv6 Address . . . . . : fe80::359d:e81d:741:f257%14
   IPv4 Address. . . . . . . . . . . : 192.168.229.1
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . :

Ethernet adapter VMware Network Adapter VMnet8:

   Connection-specific DNS Suffix  . :
   Link-local IPv6 Address . . . . . : fe80::94b1:d10f:b68:101d%15
   IPv4 Address. . . . . . . . . . . : 192.168.244.1
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . :

Ethernet adapter VirtualBox Host-Only Network:

   Connection-specific DNS Suffix  . :
   Link-local IPv6 Address . . . . . : fe80::a5eb:545c:7d89:9451%16
   IPv4 Address. . . . . . . . . . . : 192.168.56.1
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . :

Tunnel adapter isatap.{4F399971-B739-4B71-BD79-E48233EEC9BE}:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :

Tunnel adapter isatap.{1860C94E-1007-4418-9A26-7D8AA8F06E15}:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :

Tunnel adapter isatap.OurEDA.cn:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :

Tunnel adapter isatap.dlut.edu.cn:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :

Tunnel adapter isatap.{6F7F27ED-942E-4EFB-ACF2-A4E8793B161D}:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :

As you have probably noticed, the two core commands are:

// Update the network configuration
ipconfig /release   // release the current IP address
ipconfig /renew     // request an IP address again

This can change the machine's IP address, especially for users on a LAN where addresses are assigned dynamically.

So all I need to do is call these system cmd commands from the Python code to change my IP address dynamically, and my need is met.
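
A minimal sketch of that idea, assuming Windows and Python's standard subprocess module. The function name renew_ip and the injectable run parameter are assumptions made for illustration (the parameter lets the logic be exercised without touching the real network configuration). Only the two ipconfig commands come from the text above.

```python
import subprocess

# The two commands from the text above (Windows only).
IP_COMMANDS = (
    ["ipconfig", "/release"],
    ["ipconfig", "/renew"],
)


def renew_ip(run=subprocess.call):
    """Run ipconfig /release, then ipconfig /renew.

    `run` is called once per command and must return its exit code.
    Returns True only if every command exited with code 0.
    """
    for command in IP_COMMANDS:
        if run(command) != 0:
            return False  # stop at the first failing command
    return True
```

Calling renew_ip() with no arguments would run the real commands. Whether the address actually changes afterwards depends on how the network hands out addresses.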

Difficulties

The IP address problem is solved, but this approach is still far too slow, because renewing the network configuration takes time. Compared with how fast the code runs, it is painfully slow. On top of that, only about 10 page views can be added each time. Going to all that effort for ten page views is rather embarrassing. How can this be solved?

I did not actually solve this problem, but I did find that the restriction is not particularly strict. I went for a meal partway through, and when I came back the original IP address could be used again. That took about 45 minutes. This was a breakthrough.

Source code

The idea really is simple: keep looking for a way around the problem. No matter how strong the other side's system is, it cannot be airtight, and there will always be a way through. The code is below.

# coding: UTF-8
# __author__ = 'Mark sinoberg'
# __date__ = '2017/26'
# __Desc__ = Test refreshing the page views of my own blog

import urllib2, re
from bs4 import BeautifulSoup

def getHtml(url, headers):
    req = urllib2.Request(url, headers=headers)
    page = urllib2.urlopen(req)
    html = page.read()
    return html

def parse(data):
    content = BeautifulSoup(data, 'lxml')
    return content

def getReadNums(data, st):
    reg = re.compile(st)
    return re.findall(reg, data)

url = 'http://jb51.net/marksinoberg/article/details/51493318'
headers = {
    'referer': 'http://jb51.net/',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36'
}

i = 0
while i < 24:
    html = getHtml(url, headers)
    content = parse(html)
    result = content.find_all('span', class_='link_view')
    print result[0].get_text()
    i = i + 1

Code running result:

D:\Software\Python2\python.exe E:/Code/Python/MyTestSet/ulib2/AddWatcher.py
94 people reading
95 people reading
96 people reading
97 people reading
98 people reading
99 people reading
100 people reading
101 people reading
102 people reading
103 people reading
104 people reading
105 people reading
106 people reading
107 people reading
108 people reading
109 people reading
110 people reading
111 people reading
112 people reading
113 people reading
114 people reading
115 people reading
115 people reading
115 people reading

Process finished with exit code 0

A nice touch here is using BeautifulSoup to extract the data at a specific location in the page, in this case the page view count. The output above also shows that there is a limit on how many views are counted per IP address, generally somewhere between 10 and 30. In this run it appears to be 22 visits.
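
To illustrate what is being extracted, here is a small sketch that pulls the number out of such a span using only the standard re module, in the spirit of the getReadNums helper above. The sample HTML is a made-up fragment modeled on the link_view class and the "people reading" output. The real page markup may differ, and extract_view_count is a hypothetical name.

```python
import re

# Matches e.g. <span class="link_view">115 people reading</span>
# and captures the leading integer inside the span.
VIEW_PATTERN = re.compile(
    r'<span[^>]*class="link_view"[^>]*>\s*(\d+)', re.IGNORECASE
)


def extract_view_count(html):
    """Return the page view count as an int, or None if no span matches."""
    match = VIEW_PATTERN.search(html)
    return int(match.group(1)) if match else None


# Made-up sample fragment, not copied from the real site.
sample = '<div><span class="link_view" title="views">115 people reading</span></div>'
print(extract_view_count(sample))
```

Returning an integer rather than the raw text makes it easy to compare two consecutive readings.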

Outlook

So far I can add a batch of views in a single run, but the result is not especially satisfying, so here are some ideas of my own for taking it further.

  1. Check the result (the page view count). When two consecutive results are identical, have Python run the cmd commands to renew the IP address. This operation is time-consuming, so it can be put in a separate thread.
  2. Crawl your blog list page to get all of your posts (this obviously requires a simulated login), then refresh each post. This does not solve the underlying problem, but it should still work reasonably well.
  3. Run a thread that refreshes the views on a schedule, at fixed times. That way an article might gain several hundred views a day. (I have not tried this, so I do not know.)
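
The first idea can be sketched as a small loop that stops as soon as two consecutive readings are identical. This is an illustration under stated assumptions rather than the author's implementation: fetch_count stands for any callable that performs one request and returns the current view count, and refresh_until_stalled is a hypothetical name. What to do once the loop stops (for example, renewing the IP address in a separate thread) is left to the caller.

```python
def refresh_until_stalled(fetch_count, max_requests=24):
    """Call fetch_count() repeatedly and collect the view counts.

    Stops early when two consecutive counts are equal, which the article
    interprets as the server no longer counting visits from this IP.
    Returns (counts, stalled).
    """
    counts = []
    for _ in range(max_requests):
        current = fetch_count()
        if counts and current == counts[-1]:
            counts.append(current)
            return counts, True  # the count did not move: stop here
        counts.append(current)
    return counts, False


# Simulated readings standing in for real requests.
readings = iter([113, 114, 115, 115, 115])
counts, stalled = refresh_until_stalled(lambda: next(readings))
print(counts, stalled)
```

Passing the fetch step in as a callable keeps the stall check independent of how the page is actually requested and parsed.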

That is all for this article. I hope it helps with your learning.
