Windows 10 Security Features: Execution Flow Protection
0x00 Background
Microsoft released the Windows 10 Technical Preview, build 9926, on January 22, 2015. The Computer Manager Anti-Virus Lab immediately analyzed the new security features it introduced.
As is well known, an attacker who wants to execute malicious code during an exploit has to break the program's normal instruction flow. The purpose of execution flow protection is to check, while the program is running, that the instruction stream is behaving normally, and to handle the anomaly promptly when execution departs from what is expected. The industry already has some mature technical approaches to execution flow protection, and in the latest Windows 10 build released by Microsoft we see this protection concept applied widely.
0x01 CFI
CFI stands for Control-Flow Integrity. It works mainly by rewriting binary executables so that they carry additional security checks.
This is the example Mihai Budiu used to introduce CFI. The binary executable is rewritten so that a verification ID, agreed on at rewrite time, is inserted before the jmp destination address. At the jmp, the code checks whether the data in front of the target address is the agreed verification ID. If it is not, execution enters the error handling routine.
Similarly, you can rewrite call and ret:
The left part shows the call rewrite and the right part shows the ret rewrite. A verification ID is inserted before the call destination address and before the ret return address, and a check for that ID is added to the rewritten call and ret. If the verification ID does not match what is expected, execution goes to the error handling routine. The approach is exactly the same as for jmp.
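The ID check described above can be modeled in a few lines. This is only a toy sketch, not Budiu's actual rewriting: the ID value, the assumption that the ID occupies the 4 bytes immediately before the target, and the names `checked_jump` and `CfiViolation` are all hypothetical.

```python
# Toy model of a CFI-rewritten indirect jump. All names are hypothetical.
VERIFICATION_ID = 0x12345678  # ID agreed on at rewrite time (made-up value)
ID_SIZE = 4                   # assumption: the ID sits in the 4 bytes before a target


class CfiViolation(Exception):
    """Stands in for the error handling routine of the rewritten binary."""


def checked_jump(memory, target):
    """Return target if the word just before it holds the agreed ID.

    memory maps an address to the 32-bit word stored at that address.
    """
    if memory.get(target - ID_SIZE) != VERIFICATION_ID:
        raise CfiViolation(hex(target))
    return target


# A legitimate destination at 0x1000 carries the ID in front of it.
memory = {0x1000 - ID_SIZE: VERIFICATION_ID}
print(hex(checked_jump(memory, 0x1000)))  # 0x1000
```

A jump to any address that does not carry the ID, such as `checked_jump(memory, 0x2000)`, raises `CfiViolation` instead of transferring control.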
0x02 CFG
CFI has to instrument every jmp and call that takes a register operand (or uses register-indirect addressing), where the destination address is only known at run time, and the overhead of the rewriting is high. These problems have made CFI difficult to apply in practice.
In its latest operating system, Windows 10, Microsoft has adopted CFG as its practical implementation of execution flow protection. CFG stands for Control Flow Guard. It combines compiler and operating system support to prevent indirect calls to untrusted addresses.
A common exploitation technique is to overwrite or directly tamper with a register value so as to change the address of an indirect call and thereby take control of program execution. CFG collects information about all indirect calls at compile and link time, records it in the final executable, and inserts an extra check before every indirect call. When the address of an indirect call has been tampered with, an exception is triggered and the operating system steps in to handle it.
Taking the Spartan HTML parsing module of IE11 in Windows 10 preview build 9926 as an example, let's look at CFG in detail:
Next, let's look at it through dynamic debugging.
We can see that the address at run time differs from the one seen statically in IDA. This is where the operating system's part of CFG comes in. When the loader of an operating system that supports CFG loads a module that supports CFG, it replaces this address with the address of a function in ntdll. On an operating system that does not support CFG, the check is ignored: the program simply executes a retn at this point.
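The cooperation between compiler and loader can be sketched roughly as follows. This is a simplified model under stated assumptions, not the real loader: `Module`, `load_module`, `retn_stub`, and `guarded_indirect_call` are hypothetical names, and the check function is passed in as a plain callable.

```python
def retn_stub(target):
    """Default check built into the module: does nothing and returns."""
    return None


class Module:
    def __init__(self, name, cfg_enabled):
        self.name = name
        self.cfg_enabled = cfg_enabled
        self.guard_check = retn_stub  # what static analysis of the file shows


def load_module(module, os_supports_cfg, ntdll_check):
    """Patch the check pointer only when both the OS and the module support CFG."""
    if os_supports_cfg and module.cfg_enabled:
        module.guard_check = ntdll_check
    return module


def guarded_indirect_call(module, target, *args):
    """Model of a compiled indirect call with the inserted check in front."""
    module.guard_check(target)
    return target(*args)
```

On a system without CFG support, `guard_check` stays the stub, so the same binary still runs and every indirect call goes through unchecked.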
This is the detection function in ntdll.
The principle is as follows: before the detection function is entered, the register value used by the call (or the value obtained by indirect addressing with an offset) is assigned to ecx, and the detection function then checks it against the data recorded at compile time to verify that the value is valid.
The detection process is as follows:
First, a bitmap is read from LdrSystemDllInitBlock+0x60. The bitmap indicates which function addresses are valid. The upper three bytes of the indirectly called function address are used as an index to fetch the DWORD of the bitmap that covers that address. The DWORD's 32 bits cover 256 bytes of address space, so each bit represents 8 bytes. However, indirectly called function addresses are generally 0x10-aligned, so the odd-numbered bits are generally not used.
After the bitmap DWORD has been fetched using the upper 3 bytes of the function address as the index, the check tests whether bits 0-3 of the lowest byte are 0. If they are, the function is 0x10-aligned, and bits 3-7 of that byte (5 bits in total) are used as the bit index within the DWORD. In this way a function address maps to one specific bit in the bitmap. If that bit is set, the function address is valid; otherwise an exception is triggered.
There is an interesting detail here. Although test cl, 0Fh is used to check for 0x10 alignment, bits 3-7 are what is actually used as the index when the address is aligned, which means bit 3 must be 0. If the function address is not 0x10-aligned, bits 3-7 are ORed with 1 and the result is used as the index. This has a drawback: if a valid indirect call target is only 8-byte aligned, a call that is misaligned within those 8 bytes is also allowed. The check then passes even though the address actually called is not the function address that was originally recorded.
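The lookup can be written out as a short sketch. It models the bitmap as a dictionary from DWORD index to 32-bit value, which is an assumption made for readability; the function names are hypothetical, and the handling of unaligned addresses follows the description in the next paragraph.

```python
def cfg_bit_position(addr):
    """Split a 32-bit call target into (DWORD index, bit index in that DWORD)."""
    dword_index = addr >> 8           # upper three bytes select the DWORD
    bit_index = (addr >> 3) & 0x1F    # bits 3-7 of the lowest byte
    if addr & 0x0F:                   # test cl, 0Fh: not 0x10-aligned
        bit_index |= 1                # unaligned targets use the odd bit
    return dword_index, bit_index


def is_valid_call_target(bitmap, addr):
    """bitmap maps a DWORD index to its 32-bit value (missing entries are 0)."""
    dword_index, bit_index = cfg_bit_position(addr)
    return (bitmap.get(dword_index, 0) >> bit_index) & 1 == 1


# Hypothetical bitmap in which only 0x10001230 is a valid target.
bitmap = {0x100012: 1 << 6}
print(is_valid_call_target(bitmap, 0x10001230))  # True
print(is_valid_call_target(bitmap, 0x10001240))  # False
```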
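The effect of this detail can be shown with a small experiment. It is a sketch under one assumption, marked in the code: that registering a function sets the same bit the check itself computes. The address 0x00401008 and all names are made up for illustration.

```python
def bit_index(addr):
    """Bit position inside the DWORD, as computed by the check."""
    index = (addr >> 3) & 0x1F
    return index | 1 if addr & 0x0F else index


def passes(valid_bits, addr):
    """valid_bits is the set of (DWORD index, bit index) pairs that are set."""
    return (addr >> 8, bit_index(addr)) in valid_bits


# Assumption: registering a target sets the bit that the check computes for it.
function = 0x00401008  # 8-byte aligned, but not 0x10-aligned
valid_bits = {(function >> 8, bit_index(function))}

accepted = [a for a in range(0x00401000, 0x00401018) if passes(valid_bits, a)]
print([hex(a) for a in accepted])
```

Under this model every address from 0x00401001 to 0x0040100f passes the check, although only 0x00401008 was recorded.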
In addition, if a vulnerability has been triggered successfully, the register value used for the indirect call has already been modified by the attacker. In that case, reading the bitmap may cause an invalid memory access. See the LdrpValidateUserCallTargetBitMapCheck symbol.
The instruction at this symbol is mov edx, dword ptr [edx+eax*4], where edx is the bitmap address and eax is the index. If eax is untrusted, which is quite likely, the instruction causes a memory access exception, and this function does not handle exceptions itself. Microsoft made this choice for efficiency (after all, this verification function is called very frequently, and a module with CFG enabled may contain tens of thousands of calls to it). Instead, Microsoft handles exceptions at this address in ntdll!RtlDispatchException:
If the faulting address is LdrpValidateUserCallTargetBitMapCheck, the exception is handled separately. RtlpHandleInvalidUserCallTarget checks the DEP status of the current process and the memory attributes of the address to be called indirectly (ecx). If the current process has DEP disabled and the address to be called has the executable attribute, a CFG exception is triggered. Otherwise, EIP is corrected to the ret return location by modifying pContext, and the exception is reported as handled.
Finally, let's talk about the original bitmap. During system initialization, the memory manager creates a Section (MiCfgBitMapSection32). On Win8.1, the size of this Section is calculated from MmSystemRangeStart (0x80000000 on 32-bit systems). As mentioned above, one bit in the bitmap represents 8 bytes, so the calculated size is exactly 32 MB.
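The 32 MB figure follows directly from the numbers above, as this short calculation shows (the variable names are only for illustration):

```python
USER_ADDRESS_SPACE = 0x80000000  # MmSystemRangeStart on 32-bit systems
BYTES_PER_BIT = 8                # one bitmap bit represents 8 bytes
BITS_PER_BYTE = 8

bitmap_bits = USER_ADDRESS_SPACE // BYTES_PER_BIT
bitmap_bytes = bitmap_bits // BITS_PER_BYTE
print(bitmap_bytes // (1024 * 1024), "MB")  # 32 MB
```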
After the Section is created, it is mapped into each process when the process starts
(NtCreateUserProcess -> PspAllocateProcess -> MmInitializeProcessAddressSpace -> MiMapProcessExecutable -> MiCfgInitializeProcess).
It is mapped as a shared view, unless a process modifies this memory.
When a CFG-enabled module is mapped in, the Guard Function Table in the LOADCONFIG data of the PE file is re-parsed during relocation to recalculate the module's bitmap (MiParseImageCfgBits), which is finally written into MiCfgBitMapSection32 (MiUpdateCfgSystemWideBitmap).
0x03 Execution Flow Protection in Computer Manager XP Protection
In earlier years, exploit code could execute instructions directly in stack or heap memory. In recent years, however, Microsoft has steadily strengthened the security of its operating systems. With DEP, ASLR, and other protections in place, attackers have had to resort to techniques such as ROP to exploit vulnerabilities. A stack-switching instruction is essential to ROP.
Defending against ROP has long been a hard problem in vulnerability defense. Viewed statically, ROP instructions are no different from the program's normal instruction stream, and they execute inside legitimate modules, so the attack is extremely difficult to defend against.
Based on the characteristics of how ROP is used, the Computer Manager vulnerability defense team analyzes program behavior at the execution flow level and determines at run time whether an instruction stream is legitimate or abnormal. The idea coincides with CFI.
A commonly used stack-switching instruction is one formed from misaligned assembly, that is, by decoding from the middle of a legitimate instruction.
In this case, if the attacker locates the stack-switching instruction for the ROP attack through static analysis, the execution flow protection logic detects the anomaly, and the rest of the attack cannot proceed.
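To make "formed by misaligned assembly" concrete, the sketch below shows how the two bytes 94 C3 (xchg eax, esp followed by ret) can hide inside the immediate operand of an ordinary mov. It only illustrates how such a gadget arises; it is not the product's detection logic, which the text does not describe. The byte sequence, the hand-supplied set of instruction start offsets, and the function name are all assumptions made for the example.

```python
STACK_PIVOT = bytes([0x94, 0xC3])  # xchg eax, esp ; ret


def find_pivots(code, instruction_starts):
    """Return (offset, is_misaligned) for every stack pivot byte pattern.

    instruction_starts holds the offsets where the intended instructions begin;
    here it is supplied by hand instead of coming from a disassembler.
    """
    hits = []
    pos = code.find(STACK_PIVOT)
    while pos != -1:
        hits.append((pos, pos not in instruction_starts))
        pos = code.find(STACK_PIVOT, pos + 1)
    return hits


code = bytes([
    0xB8, 0x94, 0xC3, 0x00, 0x00,  # mov eax, 0x0000C394
    0xC3,                          # ret
])
instruction_starts = {0, 5}
print(find_pivots(code, instruction_starts))  # [(1, True)]
```

The pivot starts at offset 1, in the middle of the mov, so the program never executes it during normal operation; control can only arrive there through a hijacked transfer.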
0x04 Summary
CFG requires support at the compile and link stage as well as from the operating system. CFI needs no help at compile time, and it protects not only call instructions but all control transfers. However, CFI has to insert a large number of check points, and they are executed extremely often, which inevitably affects execution efficiency.
Compared with these two, the defense method in Computer Manager XP protection has less impact on performance. However, it was designed for an older version of the operating system, so its generality is limited. Windows users are therefore advised to upgrade to the latest operating system to enjoy comprehensive security protection. Users who cannot upgrade for some reason need not worry: Computer Manager XP protection will continue to provide the highest level of security protection.