Python Notes: File Verification

Source: Internet
Author: User

I haven't written an article in a long time. I have the past two days off, so I would like to take this opportunity to summarize my recent experience with Python.

In my personal experience, file verification is often used when downloading files. The simplest implementation method in Linux is:

 
    $ md5sum filename
    21c7ee192e64569ce43cfb869bdb2755  filename

Of course, Python has modules that implement this function. Before Python 2.5 the md5 module was used; from Python 2.5 onward, hashlib is recommended as its replacement. The simplest implementation is as follows:

    #!/usr/bin/env python
    # coding: utf-8

    import sys
    import hashlib

    def md5sum(filename):
        # Open in binary mode so the bytes are hashed exactly as stored on disk.
        file_object = open(filename, 'rb')
        file_content = file_object.read()
        file_object.close()
        file_md5 = hashlib.md5(file_content)
        # Returns the hash object, not the hex string.
        return file_md5

    if __name__ == "__main__":
        file_md5 = md5sum(sys.argv[1])
        print(file_md5.hexdigest())

Zhu Feng thinks two points are worth noting:
First, the argument passed to hashlib.md5() should be file_object.read(), so that the MD5 checksum is computed over the file content. At first, Zhu Feng did not call read() and passed filename instead. That produces the MD5 of the file name, which is the wrong checksum.
Second, hashlib.md5() returns a hash object. To get the same result as md5sum on Linux, you must call its hexdigest() method.
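
The first point can be illustrated with a minimal sketch. It assumes Python 3 (where the file name has to be encoded to bytes before hashing), and the helper names md5_of_name, md5_of_content and demo are made up for this illustration:

```python
import hashlib
import os
import tempfile


def md5_of_name(filename):
    """The mistake described above: hashes the characters of the file name."""
    return hashlib.md5(filename.encode("utf-8")).hexdigest()


def md5_of_content(filename):
    """What md5sum does: hashes the bytes stored in the file."""
    with open(filename, "rb") as fh:
        return hashlib.md5(fh.read()).hexdigest()


def demo():
    # Write a small throwaway file so the example is self-contained.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(b"hello\n")
        return md5_of_name(path), md5_of_content(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    name_digest, content_digest = demo()
    print("hash of file name   :", name_digest)
    print("hash of file content:", content_digest)
```

The two digests differ, and only the second one should match what md5sum reports for the same file.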

Of course, the simple implementation above is not well thought out. When verifying a large file, it reads the entire file content into memory at once, which wastes memory and hurts performance. I personally recommend the following code instead, from http://ryan-liu.iteye.com/blog/1530029.pdf:

    #!/usr/bin/env python
    # coding: utf-8
    # Python 2 code: unicode, basestring and file do not exist in Python 3.
    import hashlib
    import os

    def md5hex(word):
        """MD5 hash; returns a 32-character lowercase hexadecimal string.
        """
        if isinstance(word, unicode):
            word = word.encode("utf-8")
        elif not isinstance(word, str):
            word = str(word)
        m = hashlib.md5()
        m.update(word)
        return m.hexdigest()

    def md5sum(fname):
        """Calculate the MD5 value of a file
        """
        def read_chunks(fh):
            fh.seek(0)
            chunk = fh.read(8096)
            while chunk:
                yield chunk
                chunk = fh.read(8096)
            else:  # Put the cursor back at the beginning of the file.
                fh.seek(0)
        m = hashlib.md5()
        if isinstance(fname, basestring) \
                and os.path.exists(fname):
            with open(fname, "rb") as fh:
                for chunk in read_chunks(fh):
                    m.update(chunk)
        # Uploaded file cache or opened file stream
        elif fname.__class__.__name__ in ["StringIO", "StringO"] \
                or isinstance(fname, file):
            for chunk in read_chunks(fname):
                m.update(chunk)
        else:
            return ""
        return m.hexdigest()

This code is more robust: it reads about 8 KB (8096 bytes in the code) at a time and calls update() on each chunk to update the MD5 state, so the whole file never has to be held in memory.
PS: Why 8 K? It has to do with the I/O block size. If you are interested, here is an article on the subject: http://blog.sina.com.cn/s/blog_6200c1440100vt4z.html
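
For readers on Python 3, the same chunked approach might look like the sketch below. It is not the code from the linked post: the names md5sum_stream, md5sum_path and CHUNK_SIZE are assumptions made for this example, and the chunk size is set to 8192 bytes (exactly 8 KiB) rather than the 8096 used above.

```python
import hashlib
import io

CHUNK_SIZE = 8192  # assumption: exactly 8 KiB; the code above uses 8096


def md5sum_stream(fh, chunk_size=CHUNK_SIZE):
    """Return the MD5 hex digest of a binary file object, read in chunks."""
    m = hashlib.md5()
    # iter() with a sentinel calls fh.read(chunk_size) until it returns b"".
    for chunk in iter(lambda: fh.read(chunk_size), b""):
        m.update(chunk)
    return m.hexdigest()


def md5sum_path(path, chunk_size=CHUNK_SIZE):
    """Return the MD5 hex digest of the file at path."""
    with open(path, "rb") as fh:
        return md5sum_stream(fh, chunk_size)


if __name__ == "__main__":
    data = b"x" * 100000
    chunked = md5sum_stream(io.BytesIO(data))
    one_shot = hashlib.md5(data).hexdigest()
    # Feeding the data in pieces gives the same digest as hashing it at once.
    print(chunked == one_shot)
```

Because update() can be called repeatedly, the digest does not depend on the chunk size. The size only affects how many read calls are made and how much memory is held at a time.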

 
