A Python gadget for automatic URL submission to Baidu Webmaster

Source: Internet
Author: User

URL submission is a tool that Baidu Webmaster provides so that site owners can manually submit URLs for indexing. The submission interface is protected by a verification code (CAPTCHA), which makes it hard to automate. The following program was therefore written to recognize the verification code automatically:

Main ideas

Fetch the verification code image several times and submit each copy to http://lab.ocrking.com/ for recognition. Then, for each character position, count how often each letter or digit was recognized. The character with the highest count at each position is taken as part of the verification code.
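The voting step described above can be sketched in isolation. This is a minimal illustration rather than the author's exact code: the function name `vote_captcha`, its `length` parameter, and the sample guesses are assumptions made for the example.

```python
from collections import Counter


def vote_captcha(guesses, length=4):
    """Return the most common character at each position across guesses.

    `vote_captcha` and `length` are illustrative names, not part of the
    original script. Guesses of the wrong length are ignored.
    """
    valid = [g for g in guesses if len(g) == length]
    if not valid:
        return None
    result = []
    for pos in range(length):
        counts = Counter(g[pos] for g in valid)
        # most_common(1) returns a one-element list of (char, count)
        result.append(counts.most_common(1)[0][0])
    return ''.join(result)


if __name__ == "__main__":
    # Hypothetical OCR outputs for the same verification code image
    print(vote_captcha(['a3k9', 'a8k9', 'a3k9', 'a3x9', 'a3k']))
```

With these made-up guesses, the three-character result is discarded and the remaining four are combined position by position, so a single misread character in one guess is outvoted by the others.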

The code is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import requests
import time
import json
import re


if __name__ == "__main__":
    i = 1
    # Session for Baidu Webmaster; the Referer and User-Agent mimic a browser
    s = requests.Session()
    s.headers.update({'Referer': 'http://zhanzhang.baidu.com/sitesubmit/index', 'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.154 Safari/537.36'})
    r = s.get('http://zhanzhang.baidu.com/sitesubmit/index')
    # Separate session for the OCR service
    s2 = requests.Session()
    r = s.post('http://zhanzhang.baidu.com/captcha', data={'async': 'false', 'n': time.time()})
    url = json.loads(r.text)['url']
    temp = []
    while 1:
        try:
            # Download the verification code image
            r = s.get(url)
            img_data = r.content
            r = s2.get('http://lab.ocrking.com/')
            try:
                # Strip all whitespace, then pull the upload tokens out of the page
                content = ''.join(r.text.split())
                sid = re.findall(r'"sid":"(.+?)"', content)[0]
                hash_1 = re.findall(r'"hash":"(.+?)"', content)[0]
                timestamp = re.findall(r'"timestamp":"(.+?)"', content)[0]
            except IndexError:
                print('Error on get ocrking info!')
                continue
            files = {'Filedata': ('icode.jpeg', img_data)}
            data = {'Filename': 'icode.jpeg', 'sid': sid, 'hash': hash_1, 'timestamp': timestamp}
            r = s2.post('http://lab.ocrking.com/upload.html', files=files, data=data)
            r = s2.post('http://lab.ocrking.com/ocrking.html', data={'upfile': r.text, 'type': 'captcha', 'charset': '7'})
            icode = re.findall(r'<OcrResult>(.+?)</OcrResult>', r.text)[0]
            # Only four-character results are usable
            if len(icode) != 4:
                continue
            temp.append(icode)
            i = i + 1
            # i starts at 1, so this stops after two successful recognitions
            if i == 3:
                break
        except Exception as e:
            print(e)

    # Count, for each of the four positions, how often each character was recognized
    a = {'0': {}, '1': {}, '2': {}, '3': {}}
    for aa in temp:
        i = 0
        while i <= 3:
            try:
                a[str(i)][aa[i]] = a[str(i)][aa[i]] + 1
            except KeyError:
                a[str(i)][aa[i]] = 1
            i = i + 1
    # Keep the most frequent character at each position
    icode = ['', '', '', '']
    for index in a:
        temp_times = 0
        for index_1 in a[index]:
            if a[index][index_1] >= temp_times:
                temp_times = a[index][index_1]
                icode[int(index)] = index_1

    icode = ''.join(icode)

    # Save the last image under the recognized code (the temp directory must exist)
    img_name = 'temp\\' + icode + '.png'
    file_object = open(img_name, 'wb')  # binary mode: img_data is raw image bytes
    file_object.write(img_data)
    file_object.close()

    # r = s.post('http://zhanzhang.baidu.com/sitesubmit/sitepost', data={'url': 'http://lab.ocrking.com/', 'captcha': icode})
    # print(r.text)
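The script pulls the upload tokens and the OCR result out of response bodies with regular expressions, and falls back on exception handling when a match is missing. The sketch below shows the same extraction as small helpers that return None on failure. The helper names are hypothetical, the lowercase key names `sid`, `hash` and `timestamp` are an assumption, and the sample inputs are made up for illustration; the real responses from lab.ocrking.com may look different.

```python
import re


def first_match(pattern, text):
    """Return the first captured group for pattern in text, or None."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def parse_upload_tokens(page):
    """Extract the sid, hash and timestamp values from the page source."""
    compact = ''.join(page.split())  # drop all whitespace, as the script does
    tokens = {}
    for key in ('sid', 'hash', 'timestamp'):
        tokens[key] = first_match(r'"%s":"(.+?)"' % key, compact)
    # Report failure as None if any field is missing
    return tokens if all(tokens.values()) else None


def parse_ocr_result(body):
    """Return the text inside <OcrResult>...</OcrResult>, or None."""
    return first_match(r'<OcrResult>(.+?)</OcrResult>', body)


if __name__ == "__main__":
    # Made-up sample inputs; the real responses may look different
    sample_page = '{ "sid" : "abc123", "hash" : "deadbeef", "timestamp" : "1400000000" }'
    print(parse_upload_tokens(sample_page))
    print(parse_ocr_result('<Results><OcrResult>a3k9</OcrResult></Results>'))
```

Returning None instead of raising lets the retry loop test the result explicitly and move on to the next attempt without a broad except clause.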
