Getting a website's PR and Baidu weight with Python

Source: Internet
Author: User
Last time I used the requests library to write a simple script that crawls the links on a page. As an extension, we can also use requests to get a site's PR and Baidu weight. The principle is similar. In the end we can even write a loop to query this information for many sites in bulk.

First, Google PR, short for PageRank. It is Google's own rating of a website, and anyone who does SEO will be familiar with it. Since the rating comes from Google, there is of course an official interface for retrieving it, and that interface is what we use here.

The code is as follows:


import re
import requests

# The seed is part of the hash input, so it has to be reproduced exactly.
GPR_HASH_SEED = ("Mining PageRank is AGAINST GOOGLE'S TERMS OF SERVICE. "
                 "Yes, I'm talking to you, scammer.")

def google_hash(value):
    magic = 0x1020345
    for i in range(len(value)):
        magic ^= ord(GPR_HASH_SEED[i % len(GPR_HASH_SEED)]) ^ ord(value[i])
        # Rotate left by 9 bits within a 32-bit word.
        magic = (magic >> 23 | magic << 9) & 0xFFFFFFFF
    return "8%08x" % magic

def getPR(www):
    try:
        url = ('http://toolbarqueries.google.com/tbr?'
               'client=navclient-auto&ch=%s&features=rank&q=info:%s'
               % (google_hash(www), www))
        response = requests.get(url)
        # The PR is the run of digits after the second colon.
        rex = re.search(r'(.*?:.*?:)(\d+)', response.text)
        return rex.group(2)
    except Exception:
        # Network errors and unexpected response bodies both yield None.
        return None

Usage: pass in a domain name and the function returns its PR value.

google_hash is just an algorithm that computes a hash-like value from a domain name and returns it. We don't need to care how it is implemented; the function to look at is getPR. The official Google interface looks like this: http://toolbarqueries.google.com/tbr?client=navclient-auto&ch={hash}&features=rank&q=info:{domain}

For {hash} we call google_hash() with the domain name, which returns the corresponding hash value. For example, for my own site, www.leavesongs.com, the Google hash is 8b1e6ad00, so the query URL is: http://toolbarqueries.google.com/tbr?client=navclient-auto&ch=8b1e6ad00&features=rank&q=info:www.leavesongs.com
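As a minimal sketch of how the two placeholders are filled in, the snippet below assembles the query URL from a domain and an already computed hash. The helper name build_pr_url is hypothetical and not part of the original code; the endpoint and parameters are the ones given in the text.

```python
# Endpoint taken from the URL template in the article.
TBR_ENDPOINT = "http://toolbarqueries.google.com/tbr"

def build_pr_url(domain, hash_value):
    """Fill the {hash} and {domain} slots of the toolbar query URL.

    build_pr_url is a hypothetical helper used only for illustration.
    """
    return (
        f"{TBR_ENDPOINT}?client=navclient-auto"
        f"&ch={hash_value}&features=rank&q=info:{domain}"
    )

# Values from the article's own example.
url = build_pr_url("www.leavesongs.com", "8b1e6ad00")
print(url)
```

Keeping URL construction separate from the HTTP request makes it easy to check the generated address by eye before sending anything.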

Visiting it returns Rank_1:1:0. The number after the second colon is the PR; my site has no PR, so the value is 0.

So we call requests.get() on the constructed URL, get back a result like Rank_1:1:0, and then extract the PR value, 0, with a regular expression or some other method.
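The extraction step can be tried without any network access. The sketch below applies the same regular expression as the PR function to a sample response body; parse_pr is a hypothetical helper name, and it returns None explicitly when nothing matches rather than relying on an exception.

```python
import re

def parse_pr(body):
    """Return the digits after the second colon, or None if absent.

    parse_pr is a hypothetical helper used only for illustration.
    """
    match = re.search(r'(.*?:.*?:)(\d+)', body)
    return match.group(2) if match else None

print(parse_pr("Rank_1:1:0"))  # prints 0
print(parse_pr(""))            # prints None
```

Note that the value comes back as a string, so convert it with int() if you want to compare or sort PR values.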

That is how the getPR function works. Next, let's look at how to get the Baidu weight.

Baidu weight is not an official Baidu metric. It is a value calculated by a number of third-party websites, so there is no interface like the one for PR, and we have to scrape it from one of those third-party sites. Here is the function that gets the Baidu weight:

The code is as follows:


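def getBR(www):
    try:
        url = 'http://mytool.chinaz.com/baidusort.aspx?host=%s&sortType=0' % (www,)
        response = requests.get(url)
        data = response.text
        # Groups 1 and 3 should hold the HTML that surrounds the weight on the
        # result page (see its source); group 2 is the weight itself.
        rex = re.search(r'(.+?)(\d*?)()', data, re.I)
        return rex.group(2)
    except Exception:
        return None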

Usage is the same: pass in a domain name and the function returns the weight value.

The page I scrape is the weight query page of Webmaster Tools (chinaz.com): http://mytool.chinaz.com/baidusort.aspx?host={domain}&sortType=0

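My regular expression is: (.+?)(\d*?)(). Look at the source of the result page and you will see how to write it: the first and last groups match the markup around the number, and the middle group captures the weight.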
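To illustrate how a three-group pattern of this shape is anchored on the page source, here is a sketch run against made-up markup. The HTML in SAMPLE_HTML and the tags inside PATTERN are assumptions chosen for the example; the real result page's markup is not shown in this article and has to be read from its source.

```python
import re

# Hypothetical markup, NOT the real chinaz.com page source.
SAMPLE_HTML = '<div class="info">Baidu weight: <font color="blue">3</font></div>'

# Group 1: markup leading up to the number (assumed tags).
# Group 2: the weight digits.
# Group 3: the closing tag right after the number (assumed tag).
PATTERN = r'(<div class="info">.+?<font[^>]*>)(\d*?)(</font>)'

match = re.search(PATTERN, SAMPLE_HTML, re.I)
weight = match.group(2) if match else None
print(weight)
```

The lazy quantifiers keep each group as short as possible, so the pattern stops at the first number that is wrapped in the expected tags.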

OK, now let's fetch the PR and weight for a batch of sites by looping over a list of domains and calling both functions for each one.
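A batch query of this kind could be sketched as follows. The name query_sites and the stand-in lookups fake_pr and fake_br are hypothetical; they exist only so the example runs offline, and in practice you would pass in the PR and Baidu weight functions defined earlier.

```python
def query_sites(domains, get_pr, get_br):
    """Return one (domain, pr, weight) tuple per domain, in input order.

    query_sites is a hypothetical helper; get_pr and get_br are any
    callables that take a domain name and return a value or None.
    """
    results = []
    for domain in domains:
        results.append((domain, get_pr(domain), get_br(domain)))
    return results

# Stand-in lookups (assumptions) so the sketch needs no network access.
def fake_pr(domain):
    return "0"

def fake_br(domain):
    return "1"

rows = query_sites(["www.leavesongs.com", "example.com"], fake_pr, fake_br)
for domain, pr, weight in rows:
    print(f"{domain}\tPR={pr}\tweight={weight}")
```

Passing the lookup functions in as arguments keeps the loop itself independent of how each value is fetched.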


A single-threaded scan is somewhat slow; running the queries in 10 to 20 threads should be considerably faster.
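One way to run the lookups on several threads is the standard library's concurrent.futures module. This is a sketch under assumptions: query_sites_threaded and fake_lookup are hypothetical names, and fake_lookup stands in for a real network-bound function so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor

def query_sites_threaded(domains, lookup, workers=10):
    """Call lookup(domain) for every domain on a pool of worker threads.

    query_sites_threaded is a hypothetical helper. Results are returned
    as (domain, value) pairs in the same order as the input list.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, not completion order.
        return list(zip(domains, pool.map(lookup, domains)))

# Stand-in lookup (assumption): replace with a real PR or weight function.
def fake_lookup(domain):
    return len(domain)

pairs = query_sites_threaded(["a.com", "bb.com", "ccc.com"], fake_lookup, workers=3)
print(pairs)
```

Threads suit this job because each lookup spends most of its time waiting on the network rather than computing.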
