Python: Solving CAPTCHAs with the Dama2 (Code Rabbit) Coding Platform

1. Background

CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart". It is a challenge-response test for telling human users from automated programs: it sets a task that is easy for a person to complete but hard for a program.
CAPTCHAs are often used to stop automated programs from posting blog spam to influence search engine rankings, registering email accounts in bulk, sending spam, or taking part in online votes.
Typically, a CAPTCHA is an image of slightly distorted letters and digits that a person can usually read without difficulty. An automated program can tell that the page contains an image, but not what the image says. For visually impaired users, some CAPTCHAs use audio instead: the user listens to a letter or phrase and types what they hear to prove they are not a program. In general, there are two ways to handle CAPTCHAs automatically:
(1) Optical character recognition (OCR): extract the text from the image. The main libraries are pytesser and tesseract. For complex CAPTCHA images, the image usually has to be cleaned up first (remove the background noise and keep only the text) before it is passed to the library. Even then the recognition rate is not very high; improving it requires lengthy machine-learning training, which is costly. For extremely complex CAPTCHAs, OCR may not work at all.

(2) A CAPTCHA-solving service (a professional coding platform): pay to call the API the platform provides. When the CAPTCHA image is sent to the API, a person views it and the recognized text comes back in the HTTP response. On the Dama2 platform the whole round trip generally takes about 10 s and at most 60 s, and the price is low: 1 yuan buys roughly 50 to 100 CAPTCHAs, depending on the CAPTCHA type. If the number of CAPTCHAs is small, this approach improves recognition efficiency and simplifies the work.

2. Environment

Python 3.6.1
System: Win7
IDE: PyCharm
Chrome browser installed, with chromedriver configured
Selenium 3.7.0

3. Using the Dama2 platform

3.1. Principle

Call the API (request a URL) in the specified format and read the returned result.
Dama2 platform address: http://www.dama2.com/
Dama2 developer documentation: http://wiki.dama2.com/index.php

3.2. Process

Step one: register a developer account.

Step two: generate the ID and key of the software that will consume CAPTCHA solutions.
Log in to the developer account, open "My Software", click "Create Software", and fill in and submit the form; a software key is then generated.

Note: put your software key into the corresponding parameter in the code, then send the software key to Dama2's online customer service. They will lift the restrictions so that you can test with the account "test" and the password "test". The software ID, software name, and software key are the parameters the program uses. When someone consumes CAPTCHA solutions through your software ID and software key, the developer of that software receives a share of the fee. This is one of the ways ticket-grabbing software makes money. In other words, the developer account is for developing software and earning revenue. The software does not need to be made public, and the developer account cannot itself be used to pay for solving.

Step three: register an ordinary user account and top it up; this is the account that pays for solving.
That is, the developer account earns revenue and the ordinary user account pays for solving.

4. Custom methods

For the original documentation, see the official site: http://wiki.dama2.com/index.php?n=ApiDoc.Http

# Some changes were made to the Dama2 platform's sample code to better suit this software
import base64
import hashlib
import json
import urllib.parse

import requests


# MD5 hex digest of a string
def md5str(text):
    m = hashlib.md5(text.encode(encoding="utf-8"))
    return m.hexdigest()


# MD5 hex digest of bytes
def md5(byte):
    m = hashlib.md5(byte)
    return m.hexdigest()


# Dama2 API
class DamatuApi():
    # Associated developer account
    # By default this account handles CAPTCHAs of 1~8 random letters and digits
    ID = '51773'  # developer account: software AppID
    KEY = 'fbfb9022a1499b0c6436f223f98b714e'  # developer account: software key generated under "My Software"
    # Coding platform host
    HOST = 'http://api.dama2.com:7766/app/'

    def __init__(self, username, password, limitCount):
        self.username = username
        self.password = password
        # Limit the number of CAPTCHA requests per instance, so that an accident
        # cannot trigger excessive requests and uncontrolled spending
        self.limitCount = limitCount
        # Counts the CAPTCHA requests made so far
        self.count = 0

    # Compute the user signature from the key, the username and the payload
    def getSign(self, param=b''):
        return (md5(bytes(self.KEY, encoding="utf8") + bytes(self.username, encoding="utf8") + param))[:8]

    # Compute the encrypted password from the key, username and password
    def getPwd(self):
        return md5str(self.KEY + md5str(md5str(self.username) + md5str(self.password)))

    # Submit a request to the coding platform
    def post(self, urlPath, formData=None):
        '''
        :param urlPath: appended to HOST to build the URL; different paths request different data
        :param formData: the form data to submit
        :return: the response body, a JSON string
        '''
        url = self.HOST + urlPath
        try:
            response = requests.request(method='post', url=url, data=formData or {}, timeout=60)
            print(f"text = {response.text}")
            # Balance: text = {"ret":0,"balance":"9957","sign":"2428e5f9"}
            # CAPTCHA: text = {"ret":0,"id":558899326,"result":"230876","sign":"8d55bdb1"}
            return response.text
        except Exception as e:
            print(f"post request error. exception = {e}, urlPath = {urlPath}, formData = {formData}")
            # Return a JSON string so that callers can always json.loads() the result
            return json.dumps({"ret": -1})

    # Query the balance: on success the value is the balance, otherwise an error code
    def getBalance(self):
        data = {'appID': self.ID,
                'user': self.username,
                'pwd': self.getPwd(),
                'sign': self.getSign()}
        res = self.post('d2Balance', data)
        jres = json.loads(res)
        if jres['ret'] == 0:
            return (True, jres['balance'])
        else:
            return (False, jres['ret'])

    # Upload a CAPTCHA image
    def decode(self, filePath, type):
        '''
        :param filePath: path of the CAPTCHA image, e.g. D:/1.jpg
        :param type: CAPTCHA type, see http://wiki.dama2.com/index.php?n=ApiDoc.Pricedesc
        :return: tuple; result[0] is True on success (result[1] is the text), False on failure (result[1] is the error code)
        '''
        if self.count >= self.limitCount:
            print("decode: the number of CAPTCHA requests exceeds the custom limit.")
            return (False, False)
        # Read the CAPTCHA image data
        with open(filePath, 'rb') as f:
            fdata = f.read()
        filedata = base64.b64encode(fdata)
        data = {'appID': self.ID,
                'user': self.username,
                'pwd': self.getPwd(),
                'type': type,
                'fileDataBase64': filedata,
                'sign': self.getSign(fdata)}
        res = self.post('d2File', data)
        jres = json.loads(res)
        self.count += 1
        if jres['ret'] == 0:
            # The JSON contains ret, id, result and cookie; take what you need
            return (True, jres['result'])
        else:
            return (False, jres['ret'])

    # Decode by URL: pass a link to the CAPTCHA image
    def decodeUrl(self, url, type):
        '''
        :param url: address of the CAPTCHA image
        :param type: CAPTCHA type, see http://wiki.dama2.com/index.php?n=ApiDoc.Pricedesc
        :return: tuple; result[0] is True on success (result[1] is the text), False on failure (result[1] is the error code)
        '''
        if self.count >= self.limitCount:
            print("decodeUrl: the number of CAPTCHA requests exceeds the custom limit.")
            return (False, False)
        data = {'appID': self.ID,
                'user': self.username,
                'pwd': self.getPwd(),
                'type': type,
                'url': urllib.parse.quote(url),
                'sign': self.getSign(url.encode(encoding="utf-8"))}
        res = self.post('d2Url', data)
        jres = json.loads(res)
        self.count += 1
        if jres['ret'] == 0:
            # The JSON contains ret, id, result and cookie; take what you need
            return (True, jres['result'])
        else:
            return (False, jres['ret'])

    # Report a wrong result (not used for now).
    # Parameter id (string) is the id returned by the decode call.
    # Returns 0 on success; otherwise see the error codes.
    def reportError(self, id):
        data = {'appID': self.ID,
                'user': self.username,
                'pwd': self.getPwd(),
                'id': id,
                'sign': self.getSign(id.encode(encoding="utf-8"))}
        res = self.post('d2ReportError', data)
        # post() already returns a str, so no bytes-to-str conversion is needed
        jres = json.loads(res)
        return jres['ret']
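The signing rules used by getSign and getPwd in the class above can be tried offline, without contacting the platform. The following is a minimal sketch of the same two formulas as standalone functions; the names md5_hex, make_sign and make_pwd, and the demo key and credentials, are hypothetical and exist only for illustration.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 hex digest of raw bytes."""
    return hashlib.md5(data).hexdigest()


def make_sign(key: str, username: str, param: bytes = b"") -> str:
    """First 8 hex characters of MD5(key + username + param)."""
    raw = key.encode("utf-8") + username.encode("utf-8") + param
    return md5_hex(raw)[:8]


def make_pwd(key: str, username: str, password: str) -> str:
    """MD5(key + MD5(MD5(username) + MD5(password))), chained as hex strings."""
    user_hash = md5_hex(username.encode("utf-8"))
    pass_hash = md5_hex(password.encode("utf-8"))
    inner = md5_hex((user_hash + pass_hash).encode("utf-8"))
    return md5_hex((key + inner).encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical values; a real software key comes from the developer account.
    demo_key = "demo-software-key"
    print("balance sign:", make_sign(demo_key, "demo-user"))
    print("file sign:   ", make_sign(demo_key, "demo-user", b"image-bytes"))
    print("pwd:         ", make_pwd(demo_key, "demo-user", "demo-pass"))
```

Note that the payload that is signed differs per request in the class above: nothing for a balance query, the raw image bytes for a file upload, and the UTF-8 encoded URL or id for the other two calls.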
5. Using it in your own code

For taking a screenshot of the CAPTCHA, see: http://blog.csdn.net/zwq912318834/article/details/78605486

Step one: make sure the balance is sufficient.

# Instantiate the Dama2 API class with the Dama2 user account and password.
# The last argument limits the number of CAPTCHA requests, so that a failure
# cannot request CAPTCHAs endlessly and run up a large bill
# dmt = DamatuApi("test", "test", 100)
# The CAPTCHA type for this software is currently: 1~8 mixed digits and letters, 21 points per CAPTCHA
dmt = DamatuApi("Ancode", "ancode2017", 100)  # the limit of 100 is illustrative; the original call omits it
# First check that the balance is sufficient
# balanceRes = (True, '9931')
balanceRes = dmt.getBalance()  # query the balance
if balanceRes[0] and int(balanceRes[1]) > 0:
    # The balance is sufficient; it is safe to crawl
    print(f"main: balanceRes = {balanceRes}")
    # Start crawling data
    # ...
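The check above only confirms that the balance is positive. A slightly stricter check could estimate how many requests the balance covers. The sketch below assumes that the 21-point figure mentioned in the comments is the cost of one CAPTCHA of this type; the function name affordable_requests is hypothetical, and the input has the same shape as the tuple returned by the balance query.

```python
def affordable_requests(balance_res, cost_per_request=21):
    """Estimate how many CAPTCHA requests a balance tuple can pay for.

    balance_res is a tuple such as (True, '9931') on success or
    (False, error_code) on failure. cost_per_request is an assumed
    per-CAPTCHA price in points.
    """
    if cost_per_request <= 0:
        raise ValueError("cost_per_request must be positive")
    ok, value = balance_res
    if not ok:
        return 0
    try:
        balance = int(value)
    except (TypeError, ValueError):
        # The platform returned something that is not a number
        return 0
    return max(balance, 0) // cost_per_request


if __name__ == "__main__":
    print(affordable_requests((True, "9931")))  # 9931 // 21 = 472
    print(affordable_requests((False, -1)))     # failed query: 0
```

Comparing this estimate with the request limit passed to the API class gives an early warning when the limit is larger than the balance can cover.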
Step two: handle the CAPTCHA.

(1) Detect whether a CAPTCHA has appeared.
(2) Take a screenshot of the CAPTCHA.
(3) Send the CAPTCHA image to be decoded.
(4) Submit the decoded result to the website.
(5) Check whether the CAPTCHA was passed.

Note: every time the page changes, elements obtained earlier may no longer be valid and have to be located again.
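The note about lost elements can be handled by locating the element again each time an action fails because the element went stale. The following is a minimal, library-independent sketch: StaleElementError is a stand-in defined here for illustration, and with Selenium one would presumably pass selenium.common.exceptions.StaleElementReferenceException as stale_exc. The helper name with_fresh_element and its parameters are hypothetical.

```python
import time


class StaleElementError(Exception):
    """Stand-in for a 'this element is no longer attached to the page' error."""


def with_fresh_element(locate, action, retries=3, delay=0.0, stale_exc=StaleElementError):
    """Locate an element and act on it, locating it again if it went stale.

    locate:  callable that returns a fresh element reference
    action:  callable that takes the element and returns a result
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for _ in range(retries):
        element = locate()  # always look the element up again
        try:
            return action(element)
        except stale_exc as exc:
            last_error = exc
            time.sleep(delay)  # give the page a moment to settle
    raise last_error


if __name__ == "__main__":
    lookups = []

    def locate():
        lookups.append(len(lookups) + 1)
        return lookups[-1]

    def click(element):
        if element == 1:
            raise StaleElementError("page changed after the first lookup")
        return f"clicked element #{element}"

    print(with_fresh_element(locate, click))
```

Passing the lookup as a callable, rather than the element itself, is what makes the retry meaningful: a stored element reference would simply fail again.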
# Partial code
import time

# Crop the CAPTCHA out of the screenshot with Image and save it
imgCaptcha.save('clawerimgs/captcha.png')
# Send the CAPTCHA to Dama2 for decoding
# codeRes = (True, 'Fmae')
# 56 is the CAPTCHA type: 1~8 mixed digits and letters
# Reference: http://wiki.dama2.com/index.php?n=ApiDoc.Pricedesc
damatuRes = damatuInstance.decode('clawerimgs/captcha.png', 56)
while not damatuRes[0]:
    if damatuRes[1] is False:
        raise Exception(f"Dama2 requests exceeded the custom maximum; terminating. damatuRes = {damatuRes}")
    else:
        # The CAPTCHA request failed; query the balance to see whether it is sufficient
        balanceRes = damatuInstance.getBalance()
        if balanceRes[0] and int(balanceRes[1]) > 0:
            damatuRes = damatuInstance.decode('clawerimgs/captcha.png', 56)
            time.sleep(2)
        else:
            # The Dama2 balance is insufficient; raise an exception and terminate
            raise Exception(f"Dama2 balance insufficient, or the custom maximum number of requests was exceeded; terminating. balanceRes = {balanceRes}")
# Fill in the CAPTCHA text
codeInput = browser.find_element_by_xpath("//div[@class='input']/input[@id='j_Codeinput']")
codeInput.clear()
codeInput.send_keys(damatuRes[1])
print(f"Sent CAPTCHA text: damatuRes = {damatuRes}")
time.sleep(3)
# Click the submit button
submit = browser.find_element_by_xpath("//button[@id='j_submit']")
submit.click()
time.sleep(5)
################################################
# After the CAPTCHA is handled, return to the main page and continue crawling
browser.switch_to.default_content()
time.sleep(2)
browser.refresh()
# Wait for the page to finish refreshing
time.sleep(5)
# Then detect again whether a CAPTCHA is present: if so, repeat the process; if not, nothing further happens
captchaHandler(browser, damatuInstance)
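The excerpt ends by calling captchaHandler again, so a page that keeps presenting CAPTCHAs leads to repeated recursive calls. As an alternative, the same detect, solve, submit, and re-check cycle could be written as a bounded loop. The sketch below is not the author's handler: handle_captcha and its three callables are hypothetical placeholders for the browser and platform operations, and max_rounds is an assumed safety limit.

```python
def handle_captcha(captcha_present, solve, submit, max_rounds=5):
    """Repeat the solve-and-submit cycle until no CAPTCHA is shown.

    captcha_present: callable, returns True while a CAPTCHA is on the page
    solve:           callable, returns (ok, text_or_error_code)
    submit:          callable, sends the decoded text to the website
    Returns the number of CAPTCHAs submitted.
    """
    for solved in range(max_rounds):
        if not captcha_present():
            return solved
        ok, text = solve()
        if not ok:
            raise RuntimeError(f"CAPTCHA could not be decoded: {text}")
        submit(text)
    if captcha_present():
        raise RuntimeError(f"CAPTCHA still present after {max_rounds} rounds")
    return max_rounds


if __name__ == "__main__":
    # Simulated page: the CAPTCHA shows up twice, then disappears
    pending = [True, True, False]
    submitted = []
    rounds = handle_captcha(
        captcha_present=lambda: pending.pop(0),
        solve=lambda: (True, "Fmae"),
        submit=submitted.append,
    )
    print(rounds, submitted)
```

The upper bound plays the same protective role as the request limit in the API class: a site that never accepts the answer cannot keep the loop, and the spending, going forever.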