Implementing PDF File Integrity Verification in C#

Source: Internet
Author: User

Techniques for verifying file integrity and preventing tampering are now fairly mature; digital signatures and digital watermarks are the usual choices. I recently ran into a tamper-proofing requirement in a project. Users scan an original invoice into a PDF file with a dedicated scanner and upload that PDF to the server, and the upload must verify that the PDF has not been manually modified. My first thoughts were digital watermarking or the digital signature feature built into PDF, but both are fairly complex, and much of the documentation is in English, which I did not have the patience to study. After half a day of thinking it over, I wrote a simplified version of a digital watermark program that verifies the integrity of a PDF file.

The basic idea of verification is:

Compute an MD5 value over the entire contents of the file. Wherever the user modifies the file, the MD5 value computed afterwards will be different. We can then write this MD5 value into a hidden area of the file. Binary file formats generally consist of a file header and a file body. The header is not visible to the user, and it usually reserves some bytes for later expansion or has room for special tag data.

I then studied the PDF file format and tried inserting the MD5 value at the 10th byte. The resulting file could still be used, but every time it was opened the viewer reported that the file was being repaired. Writing into the header had changed the number of bytes in the PDF file and therefore the addresses of the objects inside it, which caused the file error.

Once the cause was found, the solution followed: to leave the object addresses in the PDF file unchanged, write the MD5 value at the end of the file instead. The client (the scanner) computes the MD5 value of the scanned PDF file stream, then writes the file stream followed by the MD5 value to the hard disk, producing a PDF file with the MD5 value appended. The file can be opened and used normally, and the user will not see the MD5 value we added.
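The difference between inserting near the header and appending at the end can be illustrated with plain byte arrays. The following is only a minimal sketch: the class `OffsetDemo`, its methods, and the tiny fake PDF fragment used with it are hypothetical names and data chosen for illustration, not part of the original program.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not part of the original program.
public static class OffsetDemo
{
    // Returns a copy of data with extra inserted at position index.
    public static byte[] InsertAt(byte[] data, int index, byte[] extra)
    {
        byte[] result = new byte[data.Length + extra.Length];
        Buffer.BlockCopy(data, 0, result, 0, index);
        Buffer.BlockCopy(extra, 0, result, index, extra.Length);
        Buffer.BlockCopy(data, index, result, index + extra.Length, data.Length - index);
        return result;
    }

    // Returns a copy of data with extra appended after the last byte.
    public static byte[] Append(byte[] data, byte[] extra)
    {
        return InsertAt(data, data.Length, extra);
    }

    // Returns the offset of the first occurrence of marker in data, or -1.
    public static int IndexOf(byte[] data, byte[] marker)
    {
        for (int i = 0; i + marker.Length <= data.Length; i++)
        {
            int j = 0;
            while (j < marker.Length && data[i + j] == marker[j]) j++;
            if (j == marker.Length) return i;
        }
        return -1;
    }
}
```

Given a buffer such as `Encoding.ASCII.GetBytes("%PDF-1.4\n1 0 obj\n<<>>\nendobj\n")` and a 32-byte tag, `IndexOf` reports the same offset for `endobj` after `Append`, but an offset shifted by the tag length after `InsertAt(data, 10, tag)`. Any byte offsets recorded inside the file would no longer match in the second case, which is consistent with the repair prompt described above.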

On the server side, we compute the MD5 value of the uploaded file stream excluding its last 32 bytes (32 because the MD5 value we appended, written as hexadecimal characters, occupies the last 32 bytes) and compare it with the MD5 value stored in those last 32 bytes. If the two match, the file has not been tampered with since the scanner generated it; otherwise the file either was not generated by our scanner or has been modified. Only after the validation passes do we write the file stream to the server's hard disk.
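The server-side step can be sketched directly on the uploaded stream, before anything is written to disk. This is a minimal sketch under assumptions: the class `UploadVerifier` and its method names are hypothetical, and the label is assumed to be computed the way the program code in this article does it, as the MD5 of the content's hexadecimal MD5 string concatenated with a secret key.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical server-side helper; names are illustrative assumptions.
public static class UploadVerifier
{
    private const int LabelLength = 32; // an MD5 digest written as 32 hex characters

    private static string HexMd5(byte[] buffer, int offset, int count)
    {
        using (var md5 = System.Security.Cryptography.MD5.Create())
        {
            return BitConverter.ToString(md5.ComputeHash(buffer, offset, count)).Replace("-", "");
        }
    }

    // Label = MD5(hex(MD5(first count bytes of content)) + key), as 32 uppercase hex characters.
    public static string ComputeLabel(byte[] content, int count, string key)
    {
        string contentHash = HexMd5(content, 0, count);
        byte[] keyed = Encoding.ASCII.GetBytes(contentHash + key);
        return HexMd5(keyed, 0, keyed.Length);
    }

    // True when the last 32 bytes equal the label recomputed from the rest.
    public static bool IsUntampered(byte[] upload, string key)
    {
        if (upload == null || upload.Length < LabelLength) return false;
        int bodyLength = upload.Length - LabelLength;
        string expected = ComputeLabel(upload, bodyLength, key);
        string stored = Encoding.ASCII.GetString(upload, bodyLength, LabelLength);
        return string.Equals(expected, stored, StringComparison.Ordinal);
    }

    // Buffers the upload in memory and writes it to disk only if it validates.
    public static bool SaveIfValid(Stream upload, string destinationPath, string key)
    {
        using (var buffer = new MemoryStream())
        {
            upload.CopyTo(buffer);
            byte[] bytes = buffer.ToArray();
            if (!IsUntampered(bytes, key)) return false;
            File.WriteAllBytes(destinationPath, bytes);
            return true;
        }
    }
}
```

Buffering the whole upload in memory keeps the sketch short and is reasonable for scanned invoices; very large files would call for a streaming approach instead.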

Related program code

using System;
using System.IO;
using System.Text;

public class MD5
{
    /// <summary>
    /// Labels the file at the given path by appending a keyed MD5 value to it.
    /// </summary>
    /// <param name="path">Path of the file to be labeled</param>
    /// <param name="key">Secret string mixed into the label</param>
    /// <returns>The label value, or the exception text if an error occurred</returns>
    public static string Md5pdf(string path, string key)
    {
        try
        {
            // Read the whole file into a buffer. ReadAllBytes is used because a
            // single FileStream.Read call is not guaranteed to fill the buffer.
            byte[] pdffile = File.ReadAllBytes(path);

            // MD5 of the byte content in the buffer.
            string result = Md5buffer(pdffile, 0, pdffile.Length);
            // The key acts as a secret: even someone who knows that MD5 is used
            // cannot compute the correct label without knowing this string.
            result = md5string(result + key);

            // Convert the string to a byte array for writing to the file.
            byte[] md5 = Encoding.ASCII.GetBytes(result);

            // Rewrite the PDF content, followed by the MD5 value, to the file.
            using (FileStream fswrite = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
            {
                fswrite.Write(pdffile, 0, pdffile.Length);
                fswrite.Write(md5, 0, md5.Length);
            }

            return result;
        }
        catch (Exception e)
        {
            return e.ToString();
        }
    }

    /// <summary>
    /// Verifies the file at the given path.
    /// </summary>
    /// <param name="path">Path of the file to be verified</param>
    /// <param name="key">Secret string that was used when the file was labeled</param>
    /// <returns>True if the file carries a label and the label matches the content</returns>
    public static bool Check(string path, string key)
    {
        try
        {
            byte[] pdffile = File.ReadAllBytes(path);

            // MD5 of the file except its last 32 bytes; the label is 32 bytes long.
            string result = Md5buffer(pdffile, 0, pdffile.Length - 32);
            result = md5string(result + key);

            // Read the last 32 bytes of the file, which hold the stored MD5 value.
            string md5 = Encoding.ASCII.GetString(pdffile, pdffile.Length - 32, 32);
            return result == md5;
        }
        catch
        {
            // Includes files shorter than 32 bytes, which cannot carry a label.
            return false;
        }
    }

    private static string Md5buffer(byte[] pdffile, int index, int count)
    {
        using (System.Security.Cryptography.MD5CryptoServiceProvider get_md5 =
            new System.Security.Cryptography.MD5CryptoServiceProvider())
        {
            byte[] hash_byte = get_md5.ComputeHash(pdffile, index, count);

            // BitConverter yields "AB-CD-..."; removing the dashes leaves 32 hex characters.
            string result = BitConverter.ToString(hash_byte);
            result = result.Replace("-", "");
            return result;
        }
    }

    private static string md5string(string str)
    {
        byte[] md5source = Encoding.ASCII.GetBytes(str);
        return Md5buffer(md5source, 0, md5source.Length);
    }
}
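The listing above rewrites the entire file just to add 32 bytes at the end. As a possible variation, the label can be appended without touching the existing content. The sketch below is an assumption-laden alternative, not the author's code: the class `AppendOnlyLabeler` and its method names are hypothetical, and it uses `MD5.Create()` and `FileMode.Append` in place of the calls in the listing while computing the same keyed label.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical variation on the labeling step; names are illustrative assumptions.
public static class AppendOnlyLabeler
{
    private static string ToHex(byte[] hash)
    {
        return BitConverter.ToString(hash).Replace("-", "");
    }

    // Hashes the file as a stream, derives the keyed label, and appends
    // only the 32 label bytes to the end of the file.
    public static string AppendLabel(string path, string key)
    {
        using (var md5 = System.Security.Cryptography.MD5.Create())
        {
            string contentHash;
            using (FileStream input = File.OpenRead(path))
            {
                contentHash = ToHex(md5.ComputeHash(input));
            }

            string label = ToHex(md5.ComputeHash(Encoding.ASCII.GetBytes(contentHash + key)));
            byte[] labelBytes = Encoding.ASCII.GetBytes(label);

            using (var output = new FileStream(path, FileMode.Append, FileAccess.Write))
            {
                output.Write(labelBytes, 0, labelBytes.Length);
            }
            return label;
        }
    }

    // Recomputes the label from everything except the last 32 bytes
    // and compares it with the stored label.
    public static bool HasValidLabel(string path, string key)
    {
        byte[] all = File.ReadAllBytes(path);
        if (all.Length < 32) return false;

        using (var md5 = System.Security.Cryptography.MD5.Create())
        {
            string contentHash = ToHex(md5.ComputeHash(all, 0, all.Length - 32));
            string expected = ToHex(md5.ComputeHash(Encoding.ASCII.GetBytes(contentHash + key)));
            string stored = Encoding.ASCII.GetString(all, all.Length - 32, 32);
            return expected == stored;
        }
    }
}
```

A round trip would call `AppendLabel(path, key)` on the scanner side and `HasValidLabel(path, key)` on the server side with the same key. Changing any byte of the labeled file, or using a different key, should make the check return false.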

The approach above is not limited to PDF files. It can be applied to other formats as well, but where the label can safely be written depends on the format specification of the file.
