Office File Encryption Detection

Source: Internet
Author: User

Knowledge Overview

The binary formats used by Kingsoft's WPS suite (dps, wps, et) are the same as those of Microsoft's Office applications PowerPoint, Word, and Excel. OpenOffice should also be compatible with Microsoft's binary format. This is a consequence of Microsoft's dominant position in the market; under antitrust pressure, however, Microsoft has had to disclose its binary formats to other vendors. With that background, on to the topic.

Before attempting detection at the binary level, you need to understand the following concepts:

1. Storage (including the all-important root storage)

2. Standard stream

3. Short stream and the stream that holds the short streams

4. Sector

5. Short sector

6. Master sector allocation table (MSAT), sector allocation table (SAT), and short-sector allocation table (SSAT)

The relationships between them can be summarized as follows:

1. A storage contains streams (standard or short), just as drive D contains files.

2. A storage can contain other storages, just as drive D can contain folders.

3. The smallest unit of storage is the sector. A stream is made up of several sectors, and the sector allocation table records how those sectors are chained together.

4. A short stream is a stream smaller than the minimum standard stream size. It is also stored in sectors, but those sectors are subdivided into short sectors.

5. The master sector allocation table lists the sectors that store the sector allocation table.

6. The sector allocation table and the short-sector allocation table both specify the sector chain that belongs to a stream.

If the above relationship is unclear, see explain.
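
To make the idea of a sector chain concrete, here is a minimal sketch of walking an allocation table that has already been loaded into memory. The function name `follow_chain` and the in-memory `table` vector are assumptions made for illustration and are not part of the original code; the end-of-chain marker -2 is the value the compound document format uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Value stored in the SAT/SSAT for the last sector of a chain.
constexpr std::int32_t END_OF_CHAIN = -2;

// Collect the sector IDs (SIDs) of one stream by following the table.
// `table` is an already-loaded SAT or SSAT; `first_sid` comes from the
// stream's directory entry. Returns an empty vector if the chain is broken.
std::vector<std::int32_t> follow_chain(const std::vector<std::int32_t>& table,
                                       std::int32_t first_sid)
{
    std::vector<std::int32_t> chain;
    std::int32_t sid = first_sid;
    while (sid != END_OF_CHAIN) {
        // A valid SID must index into the table.
        const bool out_of_range =
            sid < 0 || static_cast<std::size_t>(sid) >= table.size();
        // A chain can never be longer than the table unless it loops.
        const bool looping = chain.size() >= table.size();
        if (out_of_range || looping) {
            return {};
        }
        chain.push_back(sid);
        sid = table[static_cast<std::size_t>(sid)];
    }
    return chain;
}
```

The same routine works for both tables; only the unit differs (sectors for the SAT, short sectors for the SSAT).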

 

Encryption Detection

 

The first 512 bytes of a compound document form a header with a fixed layout. From it we can read the sector size, the entry address of the root storage, and the master sector allocation table.
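
As a rough illustration of reading that header, the sketch below pulls a few fields out of a 512-byte buffer. The struct `CfbHeader` and the functions `parse_header` and `sector_offset` are hypothetical names introduced here; the byte offsets follow the published compound file layout, and `sector_offset` assumes the header occupies exactly one sector-sized block at the start of the file.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct CfbHeader {
    std::uint32_t sector_size;        // usually 512
    std::uint32_t short_sector_size;  // usually 64
    std::int32_t  dir_first_sid;      // first sector of the directory
    std::uint32_t min_standard_size;  // smaller streams use short sectors
    std::int32_t  ssat_first_sid;     // first sector of the SSAT
    std::int32_t  msat_first_sid;     // first extra MSAT sector, if any
};

static std::uint16_t read_u16(const unsigned char* p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t read_u32(const unsigned char* p)
{
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Parse the fixed header. Returns false if the buffer is too short, the
// signature is wrong, or the size exponents are implausible.
bool parse_header(const unsigned char* buf, std::size_t len, CfbHeader& out)
{
    static const unsigned char magic[8] =
        {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    if (len < 512 || std::memcmp(buf, magic, sizeof magic) != 0) return false;

    const std::uint16_t sector_shift = read_u16(buf + 30);
    const std::uint16_t short_shift  = read_u16(buf + 32);
    if (sector_shift > 20 || short_shift > sector_shift) return false;

    out.sector_size       = 1u << sector_shift;
    out.short_sector_size = 1u << short_shift;
    out.dir_first_sid     = static_cast<std::int32_t>(read_u32(buf + 48));
    out.min_standard_size = read_u32(buf + 56);
    out.ssat_first_sid    = static_cast<std::int32_t>(read_u32(buf + 60));
    out.msat_first_sid    = static_cast<std::int32_t>(read_u32(buf + 68));
    return true;
}

// File offset of a sector; `sid` must be a valid, non-negative sector ID.
std::uint64_t sector_offset(const CfbHeader& h, std::int32_t sid)
{
    return (static_cast<std::uint64_t>(sid) + 1) * h.sector_size;
}
```

The first 109 MSAT entries start at offset 76 of the same header; loading them is omitted to keep the sketch short.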

Before starting the analysis, we first need to locate the entry address of the root storage (that is, the entry address of the directory).

Each directory entry is 128 bytes long, and its first 64 bytes hold the entry name.
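
The following sketch shows one way to match a 128-byte directory entry by name and read where its stream starts. `DirEntryInfo`, `entry_name_equals`, and `read_stream_entry` are hypothetical helpers, not the author's classes; the field offsets (name length at 64, first sector at 116, stream size at 120) follow the published compound file layout. The comparison here is case-sensitive and handles ASCII names only, which is a simplification.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct DirEntryInfo {
    std::int32_t  first_sid;  // first sector (or short sector) of the stream
    std::uint32_t size;       // stream size in bytes
};

static std::uint32_t read_le32(const unsigned char* p)
{
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Compare the UTF-16LE name stored in a 128-byte entry with an ASCII name.
bool entry_name_equals(const unsigned char* entry, const char* name)
{
    const std::size_t n = std::strlen(name);
    // Stored length is in bytes and includes the terminating null character.
    const std::size_t stored = entry[64] | (entry[65] << 8);
    if (n > 31 || stored != (n + 1) * 2) return false;
    for (std::size_t i = 0; i < n; ++i) {
        // Low byte first, high byte zero for ASCII characters.
        if (entry[2 * i] != static_cast<unsigned char>(name[i]) ||
            entry[2 * i + 1] != 0) {
            return false;
        }
    }
    return true;
}

// Fill `out` if the entry is the one named `name`.
bool read_stream_entry(const unsigned char* entry, const char* name,
                       DirEntryInfo& out)
{
    if (!entry_name_equals(entry, name)) return false;
    out.first_sid = static_cast<std::int32_t>(read_le32(entry + 116));
    out.size      = read_le32(entry + 120);
    return true;
}
```

Comparing `size` against the minimum standard stream size from the header then tells you whether `first_sid` refers to the SAT or to the SSAT.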

 

1. Word files

First, find the "WordDocument" entry in the directory. The corresponding bytes are:

  u_bits_8 word[23] = {0x57, 0x00, 0x6f, 0x00, 0x72, 0x00, 0x64, 0x00, 0x44, 0x00, 0x6f, 0x00, 0x63, 0x00, 0x75, 0x00, 0x6d, 0x00, 0x65, 0x00, 0x6e, 0x00, 0x74};

Note: Pay attention to byte order here. The name is stored as two-byte characters with the low byte first.

Next, read the stream's entry sector from the directory entry. If the stream size is greater than or equal to the standard stream size, look up its sector chain in the sector allocation table (SAT); if it is smaller than the standard stream size, look up the chain in the short-sector allocation table (SSAT). Once the sector has been located, check the byte at a fixed offset within it. The simple code is as follows:

  // Determine whether a .doc file is encrypted.
  // Header and DirectoryEntry are helper classes defined elsewhere (not shown here).
  int is_encrypted_doc(char* file_path)
  {
      ifstream ifs(file_path, ios_base::binary);
      if (ifs)
      {
          unsigned int stream_address;
          sid_32 stream_sector;
          int stream_length;
          bits_8 tmp[20];
          Header header(ifs); // read the MSAT and SAT chains
          DirectoryEntry d_entry(&header);
          if (!d_entry.get_stream_address("WordDocument", stream_address, stream_sector, stream_length))
          {
              ifs.close();
              return file_error;
          }
          // Read the first 20 bytes of the stream.
          ifs.seekg(stream_address);
          ifs.read(&tmp[0], 20);
          ifs.close();
          // Bit 0 of byte 11 is the encryption flag.
          if (tmp[11] & 0x01) return file_encrypted;
          return file_common;
      }
      else
          return file_no_found;
  }
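
The test on byte 11 may look arbitrary, so here is a small sketch of the same check written against the 16-bit flags field that starts at offset 10 of the stream. The function name `fib_is_encrypted` is an assumption for illustration; the bit position (0x0100 of the little-endian flags word) corresponds to the `tmp[11] & 0x01` test above.

```cpp
#include <cstddef>

// Check the encryption bit in the first bytes of a WordDocument stream.
// `stream` points at the start of the stream; `len` is how many bytes
// are available. Returns false if there is not enough data.
bool fib_is_encrypted(const unsigned char* stream, std::size_t len)
{
    if (len < 12) return false;
    // Bytes 10 and 11 form a little-endian 16-bit flags field.
    const unsigned flags = stream[10] | (stream[11] << 8);
    // Bit 8 of that field is the lowest bit of byte 11.
    return (flags & 0x0100u) != 0;
}
```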

2. Excel files

Excel files work on essentially the same principle as Word files, except that the entry to find is "Workbook". The stream is a sequence of records in the form "record ID, length, content". To find the encryption record, you therefore have to read records from the beginning until you reach it. The simple code is as follows:

 

  // Determine whether an .xls file is encrypted.
  // Header, DirectoryEntry and convert_chars_to_bits are helpers defined elsewhere (not shown here).
  int is_encrypted_xls(char* file_path)
  {
      ifstream ifs(file_path, ios_base::binary);
      if (ifs)
      {
          unsigned int stream_address;
          sid_32 stream_sector;
          int stream_length;
          bits_8 tmp[64];
          Header header(ifs); // read the MSAT and SAT chains
          DirectoryEntry d_entry(&header);
          if (!d_entry.get_stream_address("Workbook", stream_address, stream_sector, stream_length))
          {
              ifs.close();
              return file_error;
          }
          ifs.seekg(stream_address);
          ifs.read(tmp, 64);
          unsigned int count = 0;
          // Walk the records in the first 64 bytes: 2-byte ID, 2-byte length, then the content.
          while (count + 4 < 64)
          {
              bits_16 flag = convert_chars_to_bits(tmp[count], tmp[count + 1]);
              if (flag != FILEPASS)
              {
                  // Not the encryption record: read its length and skip to the next record.
                  flag = convert_chars_to_bits(tmp[count + 2], tmp[count + 3]);
                  count += flag + 4;
              }
              else
                  return file_encrypted;
          }
          return file_common;
      }
      else
          return file_no_found;
  }
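
For readers who want to try the record walk without the helper classes, the sketch below runs the same loop over a buffer that already holds the start of the Workbook stream. The names `has_filepass` and `BIFF_FILEPASS` are assumptions for illustration; 0x002F is the record ID of the FILEPASS record in the Excel binary format.

```cpp
#include <cstddef>
#include <cstdint>

// Record ID of the FILEPASS record, which is present in encrypted workbooks.
constexpr std::uint16_t BIFF_FILEPASS = 0x002F;

// Walk "ID, length, content" records in `data` and report whether a
// FILEPASS record appears before the buffer runs out.
bool has_filepass(const unsigned char* data, std::size_t len)
{
    std::size_t pos = 0;
    while (pos + 4 <= len) {
        const std::uint16_t id =
            static_cast<std::uint16_t>(data[pos] | (data[pos + 1] << 8));
        const std::uint16_t size =
            static_cast<std::uint16_t>(data[pos + 2] | (data[pos + 3] << 8));
        if (id == BIFF_FILEPASS) return true;
        // Skip the 4-byte record header plus the record content.
        pos += 4u + size;
    }
    return false;
}
```

Passing only the first 64 bytes of the stream, as the code above does, keeps the scan short; how many bytes are enough in every case is not something this sketch establishes.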

3. PPT files

PPT files are more troublesome. In PowerPoint 2003 the encryption field has the value 0xf3d1c4df when the file is encrypted, but in PowerPoint 2002 the field indicates that the file is not encrypted, so it cannot be relied on. I later found a workaround: the value that follows the RT_UserEditAtom identifier (0x0ff50000) differs in an encrypted document, mainly because the encrypted document carries extra encryption information. Checking this field is enough to detect encryption. Because the identifier occupies four bytes, you can search the binary file for it directly. Once it is found, check the next byte: 0x1c means the file is normal, and 0x20 means it is encrypted.
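
A minimal sketch of that search over a file already loaded into memory follows. The names `PptState` and `check_ppt` are assumptions for illustration, and the byte pattern assumes the four-byte identifier 0x0ff50000 is stored low byte first (00 00 F5 0F). Matches followed by any other byte are skipped, since the pattern could also occur by chance inside unrelated data.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

enum class PptState { unknown, normal, encrypted };

// Search for the four identifier bytes and inspect the byte that follows.
PptState check_ppt(const std::vector<unsigned char>& file)
{
    static const unsigned char pattern[4] = {0x00, 0x00, 0xF5, 0x0F};
    auto it = file.begin();
    while (true) {
        it = std::search(it, file.end(), pattern, pattern + 4);
        // No match, or a match with no byte after it.
        if (it == file.end() || file.end() - it < 5) return PptState::unknown;
        const unsigned char next = *(it + 4);
        if (next == 0x1C) return PptState::normal;
        if (next == 0x20) return PptState::encrypted;
        ++it;  // not a plausible match, keep looking
    }
}
```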

Note: All rights reserved. If you repost this article, please credit the source.
