Shellcode Detection: An Analysis of How libemu Works

Source: Internet
Author: User

I. Introduction

libemu is a library, written in C, for detecting x86 shellcode through emulation.

It supports:

1. Parsing x86 instructions, with register and FPU emulation

2. Static analysis, dynamic analysis, and Win32 API hooking

You can use libemu to:

1. Determine whether a string is shellcode

2. Obtain the instruction execution flow graph (similar to what IDA and other debugging tools produce)

Libemu can be used in IDS, honeypot, and other security products.

II. Usage

The following is an example of using libemu.

/* libemu test */
#include <stdio.h>
#include <stdint.h>
#include <emu/emu.h>
#include <emu/emu_shellcode.h>
#include <emu/emu_memory.h>

struct emu *emu;

char shellcode[] =
    "\xbe\x1f\x5e\x89\x76\x09\x31\xc0\x88\x46\x08\x89\x46\x0d\xb0\x0b"
    "\x89\xf3\x8d\x4e\x09\x8d\x56\x0d\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
    "\x80\xe8\xdc\xff\xff\xff\x2f\x62\x69\x6e\x2f\x6c\x73\x00\xc9\xc3";

int main()
{
    emu = emu_new();

    /* A return value >= 0 means the 48-byte buffer is judged to be shellcode. */
    if (emu_shellcode_test(emu, (uint8_t *)shellcode, 48) >= 0) {
        fprintf(stderr, "suspecting shellcode\n");
    }

    emu_free(emu);
    return 0;
}

When run, the preceding example prints "suspecting shellcode", indicating that the string is judged to be usable shellcode. In fact, the string is Linux shellcode whose function is to execute "/bin/ls" in the current path.

III. Implementation Principle

libemu works by parsing and emulating x86 instructions. Unlike Bochs and QEMU, libemu is only an emulator, not a virtual machine: it performs a simple emulation of memory and the CPU, not a full system emulation.

3.1 A basic assumption

A basic assumption of libemu is that if a string is shellcode, it must contain a "call" (0xe8) or "fnstenv" (0xd9) instruction, that is, GetPC code.
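The assumption above implies a cheap pre-filter: a buffer that contains neither byte can be skipped before any emulation. The following is a minimal sketch of that idea only. The function name find_getpc_candidate is hypothetical and is not part of the libemu API, and a hit only marks a candidate, since 0xe8 and 0xd9 also occur in ordinary data and 0xd9 is the first byte of other FPU instructions as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a libemu function.
 * Returns the offset of the first byte that could start a GetPC
 * sequence (0xe8 = call rel32, 0xd9 = first byte of fnstenv),
 * or -1 if the buffer contains neither byte. */
static long find_getpc_candidate(const uint8_t *data, size_t size)
{
    assert(data != NULL || size == 0);

    for (size_t i = 0; i < size; i++) {
        if (data[i] == 0xe8 || data[i] == 0xd9)
            return (long)i;
    }
    return -1;
}
```

A caller would run the more expensive analysis only when the result is not -1, starting near the returned offset.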

This follows from how shellcode is written. Shellcode generally has to locate its own address, and doing so is hard without call/ret or a floating-point instruction such as fnstenv. For example:

        jmp    0x2a
        popl   %esi
        movl   %esi,0x9(%esi)
        movb   $0x0,0x8(%esi)
        movl   $0x0,0xd(%esi)
        movl   $0xb,%eax
        movl   %esi,%ebx
        leal   0x9(%esi),%ecx
        leal   0xd(%esi),%edx
        int    $0x80
        movl   $0x1, %eax
        movl   $0x0, %ebx
        int    $0x80
        call   -0x2f
        .string "/bin/ksh"

The assembly above makes two system calls, exec and exit. The address of the string "/bin/ksh" has to be passed to exec, but that address is unknown when the shellcode is written.

This is where the behavior of the "call" instruction is exploited: call pushes EIP onto the stack (more precisely, the address of the instruction that follows the call, which in this shellcode is the address of the string "/bin/ksh") and then jumps to the "popl %esi" instruction. When that instruction executes, the address that was just pushed is popped into the ESI register.
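The jmp/call/pop sequence can be traced with a toy model. The sketch below is not libemu code: the struct toy_cpu and the function toy_run_until_pop are hypothetical, only three opcodes are handled (jmp rel8, call rel32, pop %esi), and addresses are simply offsets into the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy CPU state: just enough for the jmp/call/pop trick. */
struct toy_cpu {
    uint32_t eip;        /* offset of the next instruction in the buffer */
    uint32_t esi;
    uint32_t stack[8];
    int      sp;         /* number of values currently on the stack */
};

/* Runs until pop %esi executes. Returns 0 on success and -1 on an
 * unsupported opcode, an out-of-range fetch, or a stack error. */
static int toy_run_until_pop(struct toy_cpu *cpu, const uint8_t *buf, size_t size)
{
    assert(cpu != NULL && buf != NULL);

    for (int steps = 0; steps < 16; steps++) {
        if (cpu->eip >= size)
            return -1;
        uint8_t op = buf[cpu->eip];

        if (op == 0xeb) {                          /* jmp rel8 */
            if (size - cpu->eip < 2)
                return -1;
            int8_t rel = (int8_t)buf[cpu->eip + 1];
            cpu->eip = cpu->eip + 2 + (uint32_t)(int32_t)rel;
        } else if (op == 0xe8) {                   /* call rel32 */
            if (size - cpu->eip < 5 || cpu->sp >= 8)
                return -1;
            const uint8_t *p = buf + cpu->eip + 1;
            uint32_t rel = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
            uint32_t next = cpu->eip + 5;
            cpu->stack[cpu->sp++] = next;          /* push return address */
            cpu->eip = next + rel;                 /* wraps for negative rel */
        } else if (op == 0x5e) {                   /* pop %esi */
            if (cpu->sp == 0)
                return -1;
            cpu->esi = cpu->stack[--cpu->sp];
            cpu->eip += 1;
            return 0;
        } else {
            return -1;
        }
    }
    return -1;
}
```

For a buffer laid out as jmp, pop, call, string, the value left in esi is the offset of the byte that follows the call, which is where the string begins.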

Another shellcode technique is the delta offset. Once shellcode has been written, its instructions and data sit at fixed positions relative to one another. What differs between the development machine and the attacked machine is where the shellcode is loaded (a different EIP). If the addresses used by the shellcode are hard-coded as the actual addresses on the development machine, the corresponding addresses on the attacked machine can be calculated once the delta offset is known.

The problem is thus reduced to obtaining the EIP value on the attacked machine. This can be done as follows:
       call delta
delta:
       pop ebp

Or:

fpu_addr:
       fnop
       call GetPhAddr
       sub ebp,fpu_addr
GetPhAddr:
       sub esp,16
       fnstenv [esp-12]
       pop ebp
       add esp,12
       ret
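The arithmetic behind the delta offset is small enough to write out. This is a sketch under the assumption that the delta is the run-time address of a label minus the address the assembler gave that label; the function names and the addresses in delta_example are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* delta = where a label really is at run time
 *         minus where it was assumed to be when the code was assembled */
static uint32_t delta_offset(uint32_t runtime_label_addr,
                             uint32_t assembled_label_addr)
{
    return runtime_label_addr - assembled_label_addr;
}

/* Turns any other assembled (hard-coded) address into its run-time address. */
static uint32_t relocate(uint32_t assembled_addr, uint32_t delta)
{
    return assembled_addr + delta;
}

/* Worked example with made-up addresses: the label was assembled at
 * 0x0040100a but is found at 0xbffff10a on the target, so data assembled
 * at 0x00401030 actually lives at 0xbffff130. */
static void delta_example(void)
{
    uint32_t delta = delta_offset(0xbffff10au, 0x0040100au);
    assert(relocate(0x00401030u, delta) == 0xbffff130u);
}
```

Unsigned arithmetic is used on purpose so that a load address lower than the assembled one still yields the right result through wraparound.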

3.2 Three basic actions

1. Emulate memory and CPU

Memory: emulated with a two-level page table (copy-on-write).

CPU: emulated registers, segments, and the current instruction and its description.

2. Static analysis

Check whether the data is validly encoded x86 instructions.

Sort out the instruction flow: follow sequential execution, jumps, and conditional jumps to obtain the execution flow graph.

Determine whether the data addresses used by an instruction fall outside the range controlled by the shellcode.

3. Dynamic execution

Determine whether the register values and memory addresses used by each instruction during execution are controlled by the shellcode. This step relies on the execution flow graph obtained from static analysis.
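The two-level page table named in the first action can be sketched as follows. This is an illustration, not libemu's implementation: all names are hypothetical, the 10/10/12 split of a 32-bit address is an assumption, and pages are allocated on first write, which is simpler than the copy-on-write behavior mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_DIR_ENTRIES   1024u  /* indexed by the top 10 address bits */
#define TOY_TABLE_ENTRIES 1024u  /* indexed by the middle 10 address bits */
#define TOY_PAGE_SIZE     4096u  /* low 12 bits are the offset in the page */

struct toy_memory {
    uint8_t **dir[TOY_DIR_ENTRIES];  /* dir[d] is a table of page pointers */
};

/* Returns a pointer to the byte at addr. When alloc is nonzero, a missing
 * table or page is created (zero-filled). Returns NULL if the page does
 * not exist and alloc is zero, or if allocation fails. */
static uint8_t *toy_mem_byte(struct toy_memory *m, uint32_t addr, int alloc)
{
    assert(m != NULL);
    uint32_t d = addr >> 22;
    uint32_t t = (addr >> 12) & (TOY_TABLE_ENTRIES - 1);
    uint32_t o = addr & (TOY_PAGE_SIZE - 1);

    if (m->dir[d] == NULL) {
        if (!alloc)
            return NULL;
        m->dir[d] = calloc(TOY_TABLE_ENTRIES, sizeof(uint8_t *));
        if (m->dir[d] == NULL)
            return NULL;
    }
    if (m->dir[d][t] == NULL) {
        if (!alloc)
            return NULL;
        m->dir[d][t] = calloc(TOY_PAGE_SIZE, 1);
        if (m->dir[d][t] == NULL)
            return NULL;
    }
    return &m->dir[d][t][o];
}

static int toy_mem_write8(struct toy_memory *m, uint32_t addr, uint8_t value)
{
    uint8_t *p = toy_mem_byte(m, addr, 1);
    if (p == NULL)
        return -1;
    *p = value;
    return 0;
}

/* Reads fail with -1 for pages that were never written. */
static int toy_mem_read8(struct toy_memory *m, uint32_t addr, uint8_t *out)
{
    assert(out != NULL);
    uint8_t *p = toy_mem_byte(m, addr, 0);
    if (p == NULL)
        return -1;
    *out = *p;
    return 0;
}

static void toy_mem_free(struct toy_memory *m)
{
    assert(m != NULL);
    for (uint32_t d = 0; d < TOY_DIR_ENTRIES; d++) {
        if (m->dir[d] == NULL)
            continue;
        for (uint32_t t = 0; t < TOY_TABLE_ENTRIES; t++)
            free(m->dir[d][t]);
        free(m->dir[d]);
        m->dir[d] = NULL;
    }
}
```

The point of the two levels is that a sparse 4 GiB address space costs only the pages that are actually touched, which suits an emulator that runs a few dozen bytes of suspect data. A failed read on an unmapped page is also a natural signal that an instruction has left the range the shellcode controls.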

3.3 Criterion for deciding whether data is shellcode

Can the string be executed dynamically as x86 instructions?

IV. Pseudocode

emu_shellcode_test(XXX, uint8_t *data, uint16_t size)
{
    if (data contains neither 0xe8 nor 0xd9) {
        this is safe;
        return;
    }

    find the first legal instruction in data and perform static analysis;
    if (static analysis fails) {    // syntax error or data uncontrollable
        this is safe;
        return;
    }

    perform dynamic analysis;
    if (the instructions in data can execute n consecutive steps dynamically) {
        this is suspected shellcode;
        return;
    }
}

V. Limitations

Platform: limited to x86

Performance: relatively low. The NIDS demo provided with libemu has a throughput of 2 Mbps.

False positives and false negatives: for example, shellcode does not necessarily contain "call".

Detection method: libemu does not actually detect attacks.

Encoding/encryption: libemu is powerless against encoded or encrypted data.

 
