[Repost] Determining file type from the file header in Python

Source: Internet
Author: User

I just came across a fun program and am reposting it here. Original address: https://www.ttlsa.com/python/determine-file-type-by-the-file-header/ (will be deleted on request in case of infringement).

============================== Divider Line ==============================

A server that accepts uploads needs to filter the uploaded files; otherwise it is exposed to webshell uploads, database dumps, and the like.

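import struct

# Labels for the supported types. These constants are not defined in the
# original listing; the string values used here are assumed.
EXT_RAR = "rar"
EXT_ZIP = "zip"


# Supported file types.
# Hex strings are used as keys so that the number of bytes in each file header
# can be derived from the key. Header lengths differ between formats, from as
# few as 2 characters to as many as 8.
def typeList():
    return {
        "52617221": EXT_RAR,
        "504B0304": EXT_ZIP,  # keys must be uppercase to match bytes2hex()
    }


# Convert a sequence of byte values to an uppercase hex string.
def bytes2hex(byte_values):
    num = len(byte_values)
    hexstr = ""
    for i in range(num):
        t = "%x" % byte_values[i]
        if len(t) % 2:
            hexstr += "0"  # pad single-digit values to two characters
        hexstr += t
    return hexstr.upper()


# Get the file type.
def filetype(filename):
    tl = typeList()
    ftype = "unknown"
    with open(filename, "rb") as binfile:  # the file must be read in binary mode
        for hcode in tl.keys():
            numOfBytes = len(hcode) // 2  # how many bytes to read
            # Go back to the start of the file before every read; otherwise
            # each read continues from where the previous one stopped.
            binfile.seek(0)
            data = binfile.read(numOfBytes)
            if len(data) < numOfBytes:
                continue  # the file is shorter than this header
            hbytes = struct.unpack_from("B" * numOfBytes, data)  # one "B" means one byte
            f_hcode = bytes2hex(hbytes)
            if f_hcode == hcode:
                ftype = tl[hcode]
                break
    return ftype


if __name__ == "__main__":
    print(filetype("your-file-path"))  # replace with the path of the file to check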
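As a possible simplification on Python 3, the header can be read once and compared as a hex string without `struct`. The following is a minimal sketch and not the original author's code; the names `SIGNATURES` and `detect_type` and the returned labels are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical signature table: uppercase hex header -> type label.
SIGNATURES = {
    "52617221": "rar",
    "504B0304": "zip",
}


def detect_type(filename):
    """Return the label of the first signature that matches the file header."""
    # Read only as many bytes as the longest signature needs.
    longest = max(len(hcode) for hcode in SIGNATURES) // 2
    with open(filename, "rb") as binfile:
        header = binfile.read(longest).hex().upper()
    for hcode, label in SIGNATURES.items():
        if header.startswith(hcode):
            return label
    return "unknown"


if __name__ == "__main__":
    # Write a throwaway file that starts with the ZIP header and check it.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes.fromhex("504B0304") + b"rest of file")
    try:
        print(detect_type(tmp.name))
    finally:
        os.remove(tmp.name)
```

Run as a script, this prints `zip`. Because the comparison uses `startswith`, signatures of different lengths can share one read.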

File headers for common file formats

File format (extension)            File header (hex)
JPEG (jpg)                         FFD8FF
PNG (png)                          89504E47
GIF (gif)                          47494638
TIFF (tif)                         49492A00
Windows Bitmap (bmp)               424D
CAD (dwg)                          41433130
Adobe Photoshop (psd)              38425053
Rich Text Format (rtf)             7B5C727466
XML (xml)                          3C3F786D6C
HTML (html)                        68746D6C3E
Email [thorough only] (eml)        44656C69766572792D646174653A
Outlook Express (dbx)              CFAD12FEC5FD746F
Outlook (pst)                      2142444E
MS Word/Excel (xls.or.doc)         D0CF11E0
MS Access (mdb)                    5374616E64617264204A
WordPerfect (wpd)                  FF575043
Postscript (eps.or.ps)             252150532D41646F6265
Adobe Acrobat (pdf)                255044462D312E
Quicken (qdf)                      AC9EBD8F
Windows Password (pwl)             E3828596
ZIP Archive (zip)                  504B0304
RAR Archive (rar)                  52617221
Wave (wav)                         57415645
AVI (avi)                          41564920
Real Audio (ram)                   2E7261FD
Real Media (rm)                    2E524D46
MPEG (mpg)                         000001BA
MPEG (mpg)                         000001B3
Quicktime (mov)                    6D6F6F76
Windows Media (asf)                3026B2758E66CF11
MIDI (mid)                         4D546864
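Several of these headers are simply ASCII text written in hex, which gives a quick way to sanity-check an entry. The sketch below decodes a few values from the table with `bytes.fromhex`; the dictionary name `HEADERS` is an assumption for illustration.

```python
# A few entries copied from the table above (format -> hex header).
HEADERS = {
    "png": "89504E47",
    "gif": "47494638",
    "pdf": "255044462D312E",
    "zip": "504B0304",
    "rar": "52617221",
}

for name, hexstr in HEADERS.items():
    raw = bytes.fromhex(hexstr)
    # len(raw) is the number of bytes to read from the start of the file.
    print(name, len(raw), raw)
```

For example, the GIF entry decodes to `b'GIF8'` and the RAR entry to `b'Rar!'`.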

============================== Divider Line ==============================

In other words, some of the files uploaded to a server may be malicious files whose extension has been changed to disguise them. In that case you can check the file header to see whether the file really is the type its extension claims, and let it through only if it is.
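The check described above could be sketched as follows: look up the header expected for the claimed extension and compare it with the first bytes of the upload. This is a minimal sketch under stated assumptions; `EXPECTED_HEADERS`, `header_matches_extension`, and the policy of rejecting extensions that are not listed are hypothetical choices, not part of the original program.

```python
import os

# Hypothetical allow-list: extension -> headers from the list of common formats.
EXPECTED_HEADERS = {
    ".zip": [bytes.fromhex("504B0304")],
    ".rar": [bytes.fromhex("52617221")],
    ".png": [bytes.fromhex("89504E47")],
    ".gif": [bytes.fromhex("47494638")],
    ".jpg": [bytes.fromhex("FFD8FF")],
}


def header_matches_extension(filename, data):
    """Return True if data starts with a header expected for filename's extension."""
    ext = os.path.splitext(filename)[1].lower()
    headers = EXPECTED_HEADERS.get(ext)
    if headers is None:
        # Assumption: extensions that are not on the list are rejected outright.
        return False
    return any(data.startswith(h) for h in headers)


if __name__ == "__main__":
    # A PHP script renamed to .jpg does not carry the JPEG header.
    print(header_matches_extension("shell.jpg", b"<?php echo 1; ?>"))
    # A GIF file starts with the bytes "GIF8".
    print(header_matches_extension("logo.GIF", b"GIF89a..."))
```

The first call prints `False` and the second prints `True`. A matching header only shows that the file begins like the claimed type, so this works as a first filter alongside other upload checks.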
