Interpretation of "Smashing the Stack for Fun and Profit"

Last Update:2018-07-26 Source: Internet

Author: User

Tags printf


Stack overflow attacks are among the most commonly used attack techniques. The infamous Morris worm of 1988, which affected roughly 6,000 hosts, was the first worm known to exploit a stack overflow. Eight years later, in 1996, Aleph One published the paper "Smashing the Stack for Fun and Profit" in Phrack magazine, making the idea behind stack overflow attacks public. Since then this attack has become "entry level" knowledge for hackers.

Because the course requires reading this paper, I interpret it section by section below:

I. Introduction:

This section gives the reason for writing the paper: to clarify what a buffer overflow is and how buffer overflow attacks work. The experimental environment in the paper is x86 running Linux.

II. How the process is organized in memory:

This section is about how a process is organized in memory. A process is divided into three regions in memory: the code segment (text), the data segment, and the stack. From the figure above you can see that they are laid out from low addresses to high addresses. The paper gives only a simple introduction here. The actual memory layout of a program is more complex, as the following entries detail (from "Advanced Programming in the UNIX Environment"):

The sections contain the following:

Code segment (text): global constants (const), string constants, functions, and other things that can be determined at compile time

Data segment (initialized): initialized global variables and initialized static variables (global and local)

Data segment (uninitialized, BSS): uninitialized global variables and uninitialized static variables (global and local)

Heap: dynamically allocated memory (malloc, new, etc.)

Stack: function stack frames of the user program (including parameters and initialized and uninitialized local variables, but not static variables or local constants (const)), used to carry out function calls

Command-line arguments and environment variables: as the name implies, this area stores the command-line arguments and environment variables

Note: 1. The code segment has only read and execute permissions, and the data segment has only read and write permissions. An illegal operation on either of them causes a segmentation fault.

2. The code segment and data segments are laid out when the program is compiled, whereas the heap and stack are allocated while the program runs.
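As a rough illustration of the regions listed above, the following C sketch records one sample address from each region. The names sample_regions, print_regions, and struct region_samples are hypothetical helpers introduced here and do not come from the paper. The actual addresses, and even their relative order, depend on the compiler, the operating system, and address space layout randomization.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 42;  /* initialized data segment */
int uninitialized_global;     /* BSS (zero-filled when the program is loaded) */

/* Hypothetical helper type: one sample address per region. */
struct region_samples {
    uintptr_t text, data, bss, heap, stack;
};

/* Fills *out with sample addresses; returns 0 on success, -1 if malloc fails. */
int sample_regions(struct region_samples *out)
{
    int local = 0;   /* lives in this function's stack frame */
    void *block;     /* will point to heap memory */

    assert(out != NULL);
    block = malloc(16);
    if (block == NULL)
        return -1;

    out->text  = (uintptr_t)&sample_regions;        /* code segment */
    out->data  = (uintptr_t)&initialized_global;
    out->bss   = (uintptr_t)&uninitialized_global;
    out->heap  = (uintptr_t)block;
    out->stack = (uintptr_t)&local;

    free(block);
    return 0;
}

/* Prints the samples so the layout can be inspected by eye. */
void print_regions(const struct region_samples *r)
{
    printf("text  0x%" PRIxPTR "\n", r->text);
    printf("data  0x%" PRIxPTR "\n", r->data);
    printf("bss   0x%" PRIxPTR "\n", r->bss);
    printf("heap  0x%" PRIxPTR "\n", r->heap);
    printf("stack 0x%" PRIxPTR "\n", r->stack);
}
```

Calling sample_regions() and then print_regions() from main() on a typical Linux system usually shows the stack sample far above the others, but that ordering is an observation about common systems rather than a guarantee.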

2.1 What is a stack and why do we use a stack

This section is about the nature of the stack, which is last in, first out (LIFO). Why do we use stacks? Modern computers are designed with high-level languages in mind, and the most important structuring technique in high-level languages is the procedure or function. A function call transfers control like a jump instruction, but unlike a jump, when the called function finishes, control returns to the statement following the call and execution continues from there. This relies on the last-in, first-out nature of the stack.
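The last-in, first-out behavior that makes nested calls return in the right order can be sketched with a toy stack of return addresses. This is only a model: struct toy_stack, toy_init, toy_push, and toy_pop are hypothetical names introduced here, and unlike the real x86 stack this toy grows toward higher indexes.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STACK_WORDS 16

/* Toy stack of return addresses (grows upward, unlike the x86 stack). */
struct toy_stack {
    unsigned long words[TOY_STACK_WORDS];
    size_t sp;   /* number of words in use; words[sp - 1] is the top */
};

void toy_init(struct toy_stack *s)
{
    assert(s != NULL);
    s->sp = 0;
}

/* Models "call": save where to resume. Returns 0 on success, -1 if full. */
int toy_push(struct toy_stack *s, unsigned long return_address)
{
    assert(s != NULL);
    if (s->sp == TOY_STACK_WORDS)
        return -1;
    s->words[s->sp++] = return_address;
    return 0;
}

/* Models "ret": fetch the most recently saved address.
   Returns 0 on success, -1 if the stack is empty. */
int toy_pop(struct toy_stack *s, unsigned long *return_address)
{
    assert(s != NULL && return_address != NULL);
    if (s->sp == 0)
        return -1;
    *return_address = s->words[--s->sp];
    return 0;
}
```

If main calls f (pushing, say, 0x100) and f calls g (pushing 0x200), the first pop yields 0x200 and the second yields 0x100, so g returns into f before f returns into main.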

2.2 The Stack Region

A stack is a contiguous block of memory containing data. The register SP points to the top of the stack, and the bottom of the stack is at a fixed address. Its size changes automatically as the program runs. The CPU provides push and pop instructions to add data to and remove data from the stack. The stack consists of logical stack frames, which are pushed when a function is called and popped when it returns. A stack frame consists of the following parts:

1) Return address and parameters of the function

2) Temporary variables: includes non-static local variables of the function and other temporary variables generated automatically by the compiler

3) Saved context: Includes registers that need to remain unchanged before and after a function call

This part is explained in great detail in the book "Self-Cultivation of the Programmer".

Here is a direct example:

example1.c

void function(int a, int b, int c)
{
    char buffer1[5];
    char buffer2[10];
}

void main()
{
    function(1, 2, 3);
}

To see how the function is called, we pass the -S option to gcc to generate assembly code:

gcc -S -o example1.s example1.c

In example1.s we see how the call to function() is translated (below is the result of my own run, which differs slightly from the paper, but the meaning is the same):

movl $3, 8(%esp)
movl $2, 4(%esp)
movl $1, (%esp)
call function

That is, the 3 arguments are first placed on the stack and then function() is called. When the call instruction executes, it pushes the address of the next instruction onto the stack as the return address (ret). The operations at the start of function() are as follows:

pushl %ebp
movl %esp, %ebp
subl $20, %esp

The old EBP is pushed onto the stack, then the current ESP is copied into EBP, and then space for the two arrays is allocated.

Note that because of memory alignment, 20 bytes are reserved here instead of 15: memory is allocated in multiples of the 4-byte word size, so the 5-byte buffer1 takes 8 bytes and the 10-byte buffer2 takes 12 bytes.
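The arithmetic behind the 20 bytes can be checked with a small sketch that rounds each array up to a multiple of the 4-byte word size. The helpers round_up and example1_locals_size are hypothetical names introduced here. This models the explanation given in the paper, not what every compiler does: modern compilers may choose different padding.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds n up to the next multiple of word (word must be nonzero). */
size_t round_up(size_t n, size_t word)
{
    assert(word != 0);
    return (n + word - 1) / word * word;
}

/* Space for buffer1[5] and buffer2[10] when each is padded to 4-byte words. */
size_t example1_locals_size(void)
{
    size_t buffer1 = round_up(5, 4);    /* 5 bytes padded to 8 */
    size_t buffer2 = round_up(10, 4);   /* 10 bytes padded to 12 */
    return buffer1 + buffer2;           /* 8 + 12 = 20 */
}
```

The result matches the constant in the subl $20, %esp instruction, whereas the unpadded sum 5 + 10 would give 15.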

So the layout of this stack frame is as follows:

III. Buffer overflow

A buffer overflow results from loading more data into a buffer than it can hold. How can this common programming error be used to execute arbitrary code? Look at the following example:

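example2.c

#include <string.h>

void function(char *str)
{
    char buffer[16];

    strcpy(buffer, str);    /* no bounds check: overflows buffer */
}

void main()
{
    char large_string[256];
    int i;

    for (i = 0; i < 255; i++)
        large_string[i] = 'A';
    large_string[255] = '\0';    /* terminate the string so strcpy() stops */

    function(large_string);
}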

The function() above contains a buffer overflow error: it uses strcpy() directly, which performs no bounds checking, instead of strncpy(), which limits the number of bytes copied. If you run the program above, a segmentation fault occurs. When we call the function, the stack looks similar to the following:

Why is there a segmentation fault here? Very simply, because we copy 256 bytes of 'A' into a 16-byte buffer, the memory beyond the buffer is overwritten. As the figure above shows, SFP, ret, and *str are all overwritten with 'A'. The ASCII code of 'A' is 0x41, so ret becomes 0x41414141. This address lies outside the program's address space, so the access is illegal and a segmentation fault occurs. A buffer overflow therefore lets us modify the return address of a function, and in this way change the program's flow of execution. Let's look back at the stack in Example 1 at the time the function is called.
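Two points from the discussion of example2.c can be checked safely, without actually overflowing anything. The sketch below uses two hypothetical helpers introduced here: word_after_fill shows what a 4-byte slot such as ret contains once every byte has been set to the same character, and bounded_copy is one possible strncpy()-based replacement for the unchecked strcpy() call. Neither appears in the paper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of a 4-byte word whose bytes have all been overwritten with fill. */
uint32_t word_after_fill(unsigned char fill)
{
    unsigned char bytes[4];
    uint32_t word;

    memset(bytes, fill, sizeof bytes);
    memcpy(&word, bytes, sizeof word);   /* same result on any byte order */
    return word;
}

/* Copies at most dst_size - 1 bytes of src into dst and always terminates
   dst. Returns the length of the (possibly truncated) result. */
size_t bounded_copy(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    if (dst_size == 0)
        return 0;
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';            /* strncpy does not guarantee this */
    return strlen(dst);
}
```

With a 16-byte destination and the 255-character string of 'A' from example2.c, bounded_copy keeps 15 characters plus the terminator, so nothing beyond the buffer is touched. Truncation is a design choice here; a real program might prefer to reject oversized input instead.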

We now try to modify the ret entry above so that the program executes different code. The modified code is as follows:

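example3.c:

#include <stdio.h>

void function(int a, int b, int c)
{
    char buffer1[5];
    char buffer2[10];
    int *ret;

    ret = (int *)(buffer1 + 12);    /* where the return address is stored */
    (*ret) += 8;                    /* skip past the x = 1 assignment */
}

void main()
{
    int x;

    x = 0;
    function(1, 2, 3);
    x = 1;
    printf("%d\n", x);
}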

In this example we add 12 to the address of buffer1 to obtain ret, the location where the function's return address is stored (buffer1 occupies 8 bytes and the saved frame pointer 4 bytes). We then add 8 to the value stored there so that the x = 1 assignment is skipped and printf is executed directly. So how do we know to add 8? With the help of GDB:

You can see that the original return address is 0x80004a8, and to skip the assignment statement we want to execute the instruction at 0x80004b2 directly. The paper gives the difference between the two as 8, but it should be 10.
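The offset can be verified with a one-line subtraction. The helper return_offset is a hypothetical name introduced here, and the two addresses are the ones quoted from the GDB session above.

```c
#include <assert.h>

/* Distance in bytes from the saved return address to the desired target. */
unsigned long return_offset(unsigned long ret, unsigned long target)
{
    assert(target >= ret);
    return target - ret;
}
```

For ret = 0x80004a8 and target = 0x80004b2 the function returns 0xa, that is 10 bytes, which supports the correction noted above.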

IV. Shell code

So now we can modify the return address of a function and change the program's flow of execution. What program do we want to execute? In most cases we just want to spawn a shell, from which we can then run whatever commands we like. But what if the program contains no such code? Then we supply it ourselves, which raises the question of where to put it. The answer is the buffer we are overflowing: we overwrite ret so that it points back into the buffer that holds our code. In the following case, the stack starts at address 0xFF, and we want to execute the code S in the buffer:

The code that spawns the shell is called shellcode. In C it looks like the following:

shellcode.c:

#include <stdio.h>
#include <unistd.h>

void main()
{
    char *name[2];

    name[0] = "/bin/sh";
    name[1] = NULL;
    execve(name[0], name, NULL);
}

To see how this is implemented in assembly, we compile it and trace it with GDB. Note that the -static option must be used. Otherwise the code for the execve system call is not included in the executable and is instead linked in from the dynamic C library when the program is loaded.

After summing up, we find that we just need to do the following:

a) Place the null-terminated string "/bin/sh" somewhere in memory.

b) Place the address of the string "/bin/sh" somewhere in memory, followed by a null long word.

c) Place 0xb in the register EAX.

d) Place the address of the string "/bin/sh" in the register EBX.

e) Place the address of the address of the string "/bin/sh" in the register ECX. (Note: the original paper has EBX and ECX reversed in steps d and e.)

f) Place the address of the null long word in the register EDX.

g) Execute the instruction int $0x80.
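The register setup can be modeled in plain C without issuing any system call. The sketch below assumes the i386 Linux convention, in which execve is system call number 0xb, EBX carries the path, ECX the argument array, and EDX the environment pointer. struct toy_regs and prepare_execve_regs are hypothetical names introduced here, and the final int $0x80 is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of the four registers used by the system call. */
struct toy_regs {
    uintptr_t eax, ebx, ecx, edx;
};

/* name[0] points to "/bin/sh"; name[1] is the null long word. */
static char *name[2];

/* Fills *r as the steps describe; nothing is actually executed. */
void prepare_execve_regs(struct toy_regs *r)
{
    assert(r != NULL);

    name[0] = "/bin/sh";             /* a) the string, somewhere in memory */
    name[1] = NULL;                  /* b) its address, then a null word   */

    r->eax = 0xb;                    /* c) system call number of execve    */
    r->ebx = (uintptr_t)name[0];     /* d) address of the string           */
    r->ecx = (uintptr_t)name;        /* e) address of that address (argv)  */
    r->edx = (uintptr_t)&name[1];    /* f) address of the null long word   */
    /* g) int $0x80 would be executed here; omitted in this model */
}
```

Pointing EDX at the null word inside name gives the kernel an empty environment list, which is slightly different from passing NULL as shellcode.c does but avoids having to store a second null pointer.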

However, if execve() fails, the program continues executing whatever follows and will most likely dump core, so we also need to add the exit() system call so that the program exits cleanly.

After looking at the assembly code, exit() should do the following things:
