A bug caused by a missing C function declaration on x86_64

Source: Internet
Author: User

My blog: http://blog.striveforfreedom.net

Table of Contents
  • 1 Overview
  • 2 Code that causes the crash and the solution
    • 2.1 Code that causes the crash
    • 2.2 Solution
  • 3 Summary
1 Overview

Recently I was modifying an open-source program written in C and needed to add several functions. Out of laziness I did not write declarations for them, and the program crashed. It took me a long time to track down the cause, which turned out to be a missing function declaration. I think this bug is quite typical on x86_64, so I am recording it here.

2 Code that causes the crash and the solution

2.1 Code that causes the crash

Simplified, the code that causes the crash looks like this:

    // foo.c
    #include <stdlib.h>
    #include "bar.h"

    static const char *value = NULL;

    void set_value(const char *p) { value = p; }

    const char *get_value() { return value; }

    int main(int argc, char *argv[])
    {
        char p[] = "abcd";
        set_value(p);
        failed_func();
        return 0;
    }

    // bar.c
    #include "bar.h"
    // For simplicity, bar.h is not shown; it contains the declaration of failed_func.

    char failed_func(void)
    {
        const char *p = get_value();
        return *p; // the process crashes here
    }

Every time the program runs failed_func, it crashes on the line marked with the comment.

2.2 Solution

At first glance these functions are trivial and nothing looks wrong, so why the crash? I used gdb to set a breakpoint on failed_func and stepped into get_value. The value it was about to return was exactly the one set with set_value. Yet after get_value returned and the result was assigned to p, p strangely no longer held that value.

With such a simple call producing such an odd result, and nothing visibly wrong at the C level, I looked at the assembly. I ran set disassemble-next-line on and stepped into get_value again. The function simply returns the value in the rax register, and that value is the one set with set_value, so get_value itself is clearly fine (on x86_64 the return value is passed back in rax). Back in failed_func, the instruction immediately after the callq to get_value is cltq. cltq sign-extends eax into rax, so the upper 32 bits of rax become all 1s or all 0s, depending on the most significant bit of eax. The next instruction reads the memory that rax points to, and that is where the process crashes, because rax no longer holds the value returned by get_value (in this example the upper 32 bits of rax were all set to 1).

The key is the cltq instruction. Why does gcc generate it? C89 has an implicit declaration rule: when a function is called and no declaration is visible, the compiler supplies an implicit one that assumes the function returns int. C99 removed this rule and requires a declaration before a call, but the gcc I used, presumably for compatibility with old code, does not enforce this and only issues a warning.

In our example gcc cannot see a declaration of get_value, so it assumes the return type is int. On x86_64 an int is 32 bits and a pointer is 64 bits, so assigning the return value of get_value to the pointer p amounts to assigning a 32-bit signed number to a 64-bit unsigned one (pointer values are unsigned). In C, when the two sides of an assignment have different types, the right-hand value is converted to the type of the left-hand side (provided the conversion is possible). The 32-bit signed int is therefore converted to a 64-bit unsigned value, and the compiler emits the sign-extension instruction cltq. The same code does not crash on 32-bit x86, because there int and pointers are both 32 bits wide and no sign extension is generated.

There was also a small trick in designing the sample code above. In my first version, the argument passed to set_value in main was defined like this:

const char* p = "abcd";

With that definition, however, the program does not crash. String constants are usually placed alongside the code segment, and the code segment is generally loaded at a low address (usually below 0x10000000). Before cltq executes, the upper 32 bits of rax are therefore already 0 and the sign bit of eax is 0; after it executes, the upper 32 bits are still 0, so rax is unchanged and the program does not crash. Then I remembered that the stack generally sits at a high address, so I changed the code to:

char p[] = "abcd";

A stack address on x86_64 is usually far too large to fit in 32 bits, and in this example bit 31 of its lower half is set. After cltq executes, the upper 32 bits of rax are therefore all 1s, rax denotes a very large virtual address, and accessing it causes a segmentation fault. See my other article: Analysis of the Cause of Segmentation fault on Linux & X86.

3 Summary

In fact this bug was entirely avoidable. During compilation gcc gave a very clear warning: initialization makes pointer from integer without a cast. This warning states the essence of the problem: an integer is being used to initialize a pointer. With the -Wall option, gcc also warns that the function has not been declared: warning: implicit declaration of function 'get_value'. Had I seen either warning, I could have fixed the problem immediately. I normally compile my own programs with -Wall and -Werror. This time I was modifying an open-source program and was too lazy to write the declaration; on top of that, the program already produced so many warnings that the one about the missing declaration was buried among them and went unnoticed, and it took me a long time to find the cause. The lessons for me: always write function declarations, and never ignore warnings. Eliminate warnings from the very beginning; otherwise they accumulate and become hard to clear.
