Python code for automatically logging in to a website that requires a verification code

Source: Internet
Author: User
Tags: image processing library
I had heard that Python is well suited to writing web crawlers. My workplace recently had exactly that need: visiting the XX website to download some documents. So I tried it myself, and it worked well.

In this example, the website requires a username, a password, and a verification code to log in. The script uses Python's urllib2 to log in to the site directly and to handle the site's cookies.

How cookies work:
A cookie is generated by the server and sent to the browser, which stores it in a text file in a local directory. The next time the browser requests the same website, it sends the cookie back to the server, so the server can tell whether the user is already authenticated or needs to log in again.
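The round trip described above can be illustrated without a network connection. The following is a minimal sketch in Python 3 using the standard http.cookies module; the function name and the sample header value are hypothetical and are not taken from the target site.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_value):
    """Turn a Set-Cookie header value into the Cookie header a client sends back."""
    parsed = SimpleCookie()
    parsed.load(set_cookie_value)
    # Only name=value pairs are returned to the server; attributes such as
    # Path stay on the client side.
    return "; ".join("%s=%s" % (name, morsel.value) for name, morsel in parsed.items())


# Hypothetical header value a server might send after a successful login
server_header = "JSESSIONID=abc123; Path=/"
print(cookie_header_from_set_cookie(server_header))  # JSESSIONID=abc123
```

In practice a cookie-aware HTTP client does this bookkeeping automatically; the sketch only makes the mechanism visible.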

Python provides the basic cookielib library. With it enabled, cookies are saved automatically when a page is first accessed, and later requests to other pages carry the cookies of the logged-in session.
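For readers on Python 3, where cookielib was renamed http.cookiejar and urllib2 was folded into urllib.request, enabling cookie handling might look like the sketch below. The helper name build_cookie_opener is hypothetical.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return an opener that stores and resends cookies, plus its cookie jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


opener, jar = build_cookie_opener()
# The jar stays empty until a response that sets cookies passes through the opener.
print(len(jar))
```

Every request made with opener.open(...) then shares the same jar, which is what keeps the session alive between the login request and the later downloads.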

Approach:

(1) Enable cookie handling.
(2) Get around anti-hotlinking checks by disguising the request as coming from a browser.
(3) Request the verification code link and download the captcha image to a local file.
(4) Many verification code recognition schemes are available online, and Python also has its own image processing libraries. This example calls the OCR recognition interface of the Locomotive collector.
(5) To handle the form, use a packet capture tool such as Fiddler to find the parameters that must be submitted.
(6) Build the data to submit, then create and send the HTTP request.
(7) Check the returned JS page to determine whether the login succeeded.
(8) After a successful login, download the other pages.
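Steps (2), (6), and (7) can be sketched without touching the network. The example below is written for Python 3 (urllib.request and urllib.parse rather than urllib2 and urllib). The form field names follow the ones used in this article's code; the helper names and the example.invalid URL are placeholders, and the success check assumes, as the article's code does, that a failed login returns a page mentioning login.jsp.

```python
import urllib.parse
import urllib.request

# Headers that make the request look like it comes from a browser (step 2)
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)",
    "Accept-Language": "zh-CN",
}


def build_login_request(url, username, password, authcode):
    """Build the POST request that submits the login form (step 6)."""
    fields = {
        "login_id": username,
        "opl": "op_login",
        "login_passwd": password,
        "login_check": authcode,
    }
    body = urllib.parse.urlencode(fields).encode("ascii")
    # Passing data makes urllib send the request as a POST.
    return urllib.request.Request(url, data=body, headers=BROWSER_HEADERS)


def login_succeeded(page_text):
    """Treat a page that points back to login.jsp as a failed login (step 7)."""
    return "login.jsp" not in page_text


request = build_login_request("http://example.invalid/login", "alice", "secret", "7k2p")
print(request.get_method())  # POST
print(request.data)
```

To actually send it, the request would be passed to a cookie-enabled opener, for example opener.open(request), so that the session cookie from the response is kept for the later downloads.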

This example polls through multiple accounts, logging in with each one in turn; each login downloads 3 pages.

For various reasons, the download URL is not disclosed.

Here is the code:
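The polling scheme can be sketched on its own, separately from any network code. The generator below is a hypothetical helper, not part of the original script; it assumes every login succeeds and simply shows which record ids each account would fetch.

```python
def assign_pages(usernames, start_id, end_id, pages_per_login=3):
    """Yield (username, [record ids]) in round-robin order until the ids run out."""
    now_id = start_id
    while now_id <= end_id and usernames:
        for name in usernames:
            if now_id > end_id:
                return
            # The last batch may be shorter than pages_per_login.
            batch = list(range(now_id, min(now_id + pages_per_login, end_id + 1)))
            yield name, batch
            now_id += len(batch)


for name, ids in assign_pages(["alice", "bob"], 1, 8):
    print(name, ids)
# alice [1, 2, 3]
# bob [4, 5, 6]
# alice [7, 8]
```

Separating the schedule from the downloading makes it easy to check that no record id is fetched twice or skipped.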

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import os
import urllib2
import urllib
import cookielib
import xml.etree.ElementTree as et

#-----------------------------------------------------------------------------
# Log in to www.***.com.cn
def ChinaBiddingLogin(url, username, password):
    # Enable cookies for urllib2
    cookiejar = cookielib.CookieJar()
    urlopener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookiejar))
    urllib2.install_opener(urlopener)

    # Disguise the request as coming from a browser; each header is a (name, value) tuple
    urlopener.addheaders.append(('Referer', 'http://www.chinabidding.com.cn/zbw/login/login.jsp'))
    urlopener.addheaders.append(('Accept-Language', 'zh-CN'))
    urlopener.addheaders.append(('Host', 'www.chinabidding.com.cn'))
    urlopener.addheaders.append(('User-Agent', 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)'))
    urlopener.addheaders.append(('Connection', 'Keep-Alive'))

    print 'XXX Login......'

    # Download the captcha image, then ask the user to type in the code
    imgurl = r'http://www.*****.com.cn/zbw/login/image.jsp'
    DownloadFile(imgurl, urlopener)
    authcode = raw_input('Please enter the authcode:')
    #authcode = VerifyingCodeRecognization(r"http://192.168.0.106/images/code.jpg")

    # Send login/password to the site and get the session cookie
    values = {'login_id': username, 'opl': 'op_login', 'login_passwd': password, 'login_check': authcode}
    urlcontent = urlopener.open(urllib2.Request(url, urllib.urlencode(values)))
    page = urlcontent.read(500000)

    # Make sure we are logged in: check the returned page content
    if page.find('login.jsp') != -1:
        print 'Login failed with username=%s, password=%s and authcode=%s' \
              % (username, password, authcode)
        return False
    else:
        print 'Login succeeded!'
        return True

#-----------------------------------------------------------------------------
# Download from fileUrl, then save to a fixed local path
# Note: fileUrl must point to a valid file
def DownloadFile(fileUrl, urlopener):
    isDownOk = False
    try:
        if fileUrl:
            # Binary mode, because the payload is an image
            outfile = open(r'/var/www/images/code.jpg', 'wb')
            outfile.write(urlopener.open(urllib2.Request(fileUrl)).read())
            outfile.close()
            isDownOk = True
        else:
            print 'ERROR: fileUrl is NULL!'
    except:
        isDownOk = False
    return isDownOk

#------------------------------------------------------------------------------
# Verifying code recognition
def VerifyingCodeRecognization(imgurl):
    url = r'http://192.168.0.119:800/api?'
    user = 'admin'
    pwd = 'admin'
    model = 'ocr'  # value is garbled in the source; 'ocr' is assumed
    ocrfile = 'cbi'
    values = {'user': user, 'pwd': pwd, 'model': model, 'ocrfile': ocrfile, 'imgurl': imgurl}
    data = urllib.urlencode(values)
    try:
        url += data
        urlcontent = urllib2.urlopen(url)
    except IOError:
        print '***ERROR: invalid URL (%s)' % url
        return None  # nothing to parse if the request failed
    page = urlcontent.read(500000)

    # Parse the XML data and get the verifying code
    root = et.fromstring(page)
    node_find = root.find('AddField')
    authcode = node_find.attrib['data']
    return authcode

#------------------------------------------------------------------------------
# Read users from the configuration file (one "username password" pair per line)
def ReadUsersFromFile(filename):
    users = {}
    for eachLine in open(filename, 'r'):
        info = [w for w in eachLine.strip().split()]
        if len(info) == 2:
            users[info[0]] = info[1]
    return users

#------------------------------------------------------------------------------
def main():
    login_page = r'http://www.***.com.cnlogin/login.jsp'
    download_page = r'http://www.***.com.cn***/***?record_id='

    start_id = 8593330
    end_id = 8595000

    now_id = start_id
    users = ReadUsersFromFile('users.conf')
    # Stop once the id range is used up (end_id was otherwise never checked)
    while now_id <= end_id:
        for key in users:
            if ChinaBiddingLogin(login_page, key, users[key]):
                # Each successful login downloads 3 pages
                for i in range(3):
                    pageUrl = download_page + '%d' % now_id
                    urlcontent = urllib2.urlopen(pageUrl)
                    filepath = './download/%s.html' % now_id
                    f = open(filepath, 'w')
                    f.write(urlcontent.read(500000))
                    f.close()
                    now_id += 1
            else:
                continue

#------------------------------------------------------------------------------
if __name__ == '__main__':
    main()
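The OCR helper in the listing above reads the recognized code from the data attribute of an AddField element in the XML reply. The sketch below shows that parsing step in isolation, in Python 3. The sample XML, including the root element name Result, is an assumption made for illustration; the real response format of the OCR service is not shown in this article.

```python
import xml.etree.ElementTree as ET


def extract_authcode(xml_text):
    """Return the 'data' attribute of the first AddField child, or None if absent."""
    root = ET.fromstring(xml_text)
    node = root.find("AddField")
    if node is None:
        return None
    return node.attrib.get("data")


# Hypothetical reply from the OCR service
sample = '<Result><AddField data="7k2p"/></Result>'
print(extract_authcode(sample))  # 7k2p
```

Returning None when the element is missing lets the caller fall back to manual entry instead of failing with an exception.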