(Repost) Using Clang Address Sanitizer directly in Xcode 7

Source: Internet
Author: User
Tags: valgrind

Original address: http://www.cocoachina.com/ios/20150730/12830.html

At WWDC 2015, alongside Swift 2.0, there was another exciting announcement: Clang's Address Sanitizer can now be used directly in Xcode 7. This article looks at the feature in detail: how it works and how to use it. The topic was suggested by Konstantin Gonikman.

The ever-present danger of C

In many ways, C is a great programming language. More than 40 years after its invention it is still going strong, which says a lot about how good it is. It was not the first (or even the second) programming language I learned, but it was the first one that really demystified how computers work for me, and it is the only one of those early languages I still use today.

However, C is also a very dangerous language, and a great deal of pain in the software world stems from it. It makes possible whole classes of strange bugs that other languages cannot even express.

Memory safety is a major problem. C has no memory safety at all. The following code, for example, compiles without complaint and may even appear to run normally:

    char *ptr = malloc(5);
    ptr[12] = 0;

This code allocates only 5 bytes but then writes to the 13th byte through the pointer. At that address, silent data corruption may occur, or nothing bad may happen at all. (For example, malloc on Apple platforms always allocates at least 16 bytes, even when fewer are requested, so this code happens to run fine there, but you should not rely on that behavior.) The consequences of such a bug may be negligible, or they may be severe.

"Smarter" languages track the size of each array and validate every index used to access it. The equivalent Java code would reliably throw an exception. With that guarantee, debugging "mysterious" problems becomes much easier. For example, if a variable should be 4 but actually holds 5, we know the fault lies in some code that modifies that variable (or at least we can concentrate on debugging the program rather than suspecting the compiler, which is rarely wrong). In C we cannot make that assumption: the bug may be in code that deliberately modifies the variable, or in code that accidentally modified it through a bad pointer.

The whole industry has begun to tackle the problem. For example, the Clang static analyzer can find certain kinds of memory safety problems in your code, and programs such as Valgrind can detect unsafe memory accesses at run time.

Address Sanitizer is another solution. It takes a new approach with its own advantages and disadvantages, but it is a powerful tool for finding problems in code.

Memory Access Validation

Many of these tools find problems by validating memory accesses at run time. The idea is to compare each access against the memory the program has actually allocated, so that a bug is detected the moment it occurs rather than later, when its side effects show up.

Ideally, every pointer would carry the size and location of the memory it points to, so that each access could be validated against that information. There is no fundamental reason a C compiler could not be built to do this, but the metadata attached to each pointer would make the program incompatible with code compiled by a standard C compiler. That means the system libraries could not simply be used, which would severely limit what such a checking system could test.
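To make the idea concrete, here is a minimal sketch of a pointer that carries its own bounds. The bounded_ptr type and the bounded_alloc, bounded_store and bounded_free helpers are hypothetical names invented for this illustration; they are not part of any compiler or library. The sketch also hints at the compatibility problem: a library function that expects a plain char * knows nothing about the extra size field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical "fat pointer": the address plus the size of its allocation. */
typedef struct {
    char  *base;   /* start of the allocation */
    size_t size;   /* number of bytes that may be accessed */
} bounded_ptr;

/* Allocate size bytes and remember the size alongside the address. */
bounded_ptr bounded_alloc(size_t size) {
    bounded_ptr p = { malloc(size), size };
    if (p.base == NULL) {
        p.size = 0;   /* a failed allocation has no valid bytes */
    }
    return p;
}

/* Store value at p[index] only if index lies inside the allocation.
   Returns false instead of corrupting memory when it does not. */
bool bounded_store(bounded_ptr p, size_t index, char value) {
    if (index >= p.size) {
        return false;
    }
    assert(p.base != NULL);   /* size > 0 implies a real allocation */
    p.base[index] = value;
    return true;
}

/* Release the allocation; the metadata simply goes out of scope. */
void bounded_free(bounded_ptr p) {
    free(p.base);
}
```

With a 5-byte allocation from bounded_alloc(5), a store at index 4 succeeds, while a store at index 12 is rejected rather than silently written.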

Valgrind solves this problem by running the entire program on an emulator. Binaries produced by a standard C compiler can then be run directly, without modification, while Valgrind analyzes and checks every memory access the program makes. This lets it work on everything, including the system libraries, without any changes to them. The cost is speed: the program becomes so slow that the approach is impractical for performance-sensitive software. The approach also requires a deep understanding of what each system call on the platform does, because only then can changes to memory state be tracked properly. Valgrind therefore has to be deeply integrated with each specific host system. Its Mac support has been inconsistent over the years, and at the time this article was written it did not yet support Mac OS X 10.10.

Guarded memory allocation (Guard Malloc, for example) takes advantage of the memory protection hardware built into the CPU. It replaces the standard malloc function and places each allocation so that the memory immediately after it is marked inaccessible. The program faults as soon as it touches memory past the end of an allocation. The drawback is that hardware memory protection is coarse: memory can only be marked readable or unreadable a whole page at a time, and on modern operating systems a page is at least 4kB. Every allocation therefore consumes at least 8kB, one page to hold the data and one page to catch out-of-bounds accesses, even if only a few bytes were requested. Small overruns can also go undetected. To honor the alignment guarantee of standard malloc, allocations must be aligned to 16 bytes, so if the requested size is not a multiple of 16, the leftover bytes are not protected.
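The arithmetic behind those two drawbacks can be worked through in a few lines. This is only a sketch of the cost model described above, assuming a 4096-byte page and 16-byte alignment as in the text; the guard_cost type and the function names are hypothetical and do not correspond to any real allocator API.

```c
#include <assert.h>
#include <stddef.h>

/* Figures taken from the text: 4kB pages, 16-byte malloc alignment. */
enum { GUARD_PAGE_BYTES = 4096, ALIGN_BYTES = 16 };

typedef struct {
    size_t total_bytes;        /* data pages plus one guard page */
    size_t unprotected_bytes;  /* slack between the request and the guard page */
} guard_cost;

/* Round n up to the next multiple of step. */
size_t round_up(size_t n, size_t step) {
    return (n + step - 1) / step * step;
}

/* Cost of one guarded allocation of `requested` bytes. */
guard_cost guarded_allocation_cost(size_t requested) {
    assert(requested > 0);
    /* The block must end on a 16-byte boundary to sit against the guard page. */
    size_t aligned = round_up(requested, ALIGN_BYTES);
    /* Whole pages are needed for the data, then one more page as the guard. */
    size_t data_bytes = round_up(aligned, GUARD_PAGE_BYTES);

    guard_cost cost;
    cost.total_bytes = data_bytes + GUARD_PAGE_BYTES;
    cost.unprotected_bytes = aligned - requested;
    return cost;
}
```

Under these assumptions, a 5-byte request costs 8192 bytes in total, and an overrun into the 11 bytes that follow the request never reaches the guard page.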

Address Sanitizer tries to provide the same kind of protection at a much finer granularity. In essence it is guarded allocation implemented in software: slower than hardware page protection, but more practical.

Tracking restricted memory

Since hardware memory protection cannot be used, the tracking has to be done in software. And because extra data cannot travel with the pointer, memory has to be tracked in some kind of global table, one that can be read and updated quickly.

Address Sanitizer uses a simple but ingenious approach: it reserves a fixed region of the process's address space, called shadow memory. In Address Sanitizer's terminology, memory marked as off-limits is "poisoned", and shadow memory records which bytes are poisoned. A simple formula maps any address in the process to its shadow location: each 8-byte block of normal memory maps to one byte of shadow memory, which tracks the poison state of those 8 bytes.

With 8 bytes of memory mapped to 8 bits (1 byte) of shadow memory, you would naturally expect the poison state of each byte to be tracked by one bit of shadow memory. In reality, Address Sanitizer stores an integer in each shadow byte. It assumes that the poisoned bytes within a block are always contiguous and sit at the end of the block, so a single number is enough to describe the whole block. A value of 0 means all 8 bytes are accessible. A value k from 1 to 7 means that only the first k bytes are accessible and the remaining 8 - k bytes are poisoned. If all 8 bytes are poisoned, the value is negative. Each memory access can then be checked against this value. Because allocations start at aligned addresses, the assumption that poisoned bytes are contiguous and come last does not cause any problems.
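The mapping and the check can be illustrated with a toy model. The sketch below is not Address Sanitizer's implementation: instead of real addresses it uses offsets into an imaginary 64-byte arena, and every name in it is hypothetical. It follows the encoding documented for Address Sanitizer, in which a shadow value k from 1 to 7 means that only the first k bytes of the 8-byte block are accessible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { ARENA_BYTES = 64, GRANULE = 8, SHADOW_BYTES = ARENA_BYTES / GRANULE };

/* One signed shadow byte describes one 8-byte granule of the toy arena:
     0      all 8 bytes may be accessed
     1..7   only the first k bytes may be accessed
     < 0    the whole granule is poisoned */
static signed char shadow[SHADOW_BYTES];

/* Poison the whole arena, as if nothing had been allocated yet. */
void shadow_reset(void) {
    memset(shadow, -1, sizeof shadow);
}

/* Mark [offset, offset + size) as accessible. The offset must be 8-byte
   aligned, which is what lets one count describe a partly valid granule. */
void shadow_unpoison(size_t offset, size_t size) {
    assert(offset % GRANULE == 0 && offset + size <= ARENA_BYTES);
    size_t g = offset / GRANULE;
    for (; size >= GRANULE; size -= GRANULE) {
        shadow[g++] = 0;                 /* fully accessible granule */
    }
    if (size > 0) {
        shadow[g] = (signed char)size;   /* first `size` bytes valid */
    }
}

/* The kind of check a compiler would insert before a one-byte access. */
bool shadow_access_ok(size_t offset) {
    assert(offset < ARENA_BYTES);
    signed char k = shadow[offset / GRANULE];   /* same as offset >> 3 */
    if (k == 0) {
        return true;
    }
    if (k < 0) {
        return false;
    }
    return offset % GRANULE < (size_t)k;
}
```

After shadow_reset() and shadow_unpoison(0, 5), which models the malloc(5) example from earlier, shadow_access_ok(4) is true while shadow_access_ok(5) and shadow_access_ok(12) are both false.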

With this table in place, Address Sanitizer generates extra code in the program that checks every read and write through a pointer and reports an error if the memory is poisoned. Because the feature is integrated into the compiler rather than living only in an external library or runtime environment, every pointer access can be reliably identified and the appropriate check added to the machine code.

Compiler integration also enables some neat tricks, such as protecting local and global variables in addition to memory allocated on the heap. Local and global variables are laid out with gaps between them, and those gaps are poisoned so that overflows are caught. Guarded allocation cannot help here at all, and Valgrind struggles with it too.

Compiler integration also has its drawbacks. Specifically, Address Sanitizer cannot catch bad memory accesses inside system libraries. It is of course compatible with system libraries, and you can turn it on in a program that uses them. For example, you can build a program that links against Cocoa and it will run normally. However, it will not catch bad memory accesses made by Cocoa itself, nor does it check memory that Cocoa allocates when your code calls into it.

Address Sanitizer can also catch use-after-free errors. Memory is marked as poisoned when it is freed, so it cannot be accessed again. Use-after-free errors are most harmful once the memory has been reused, because at that point the stray access corrupts unrelated data. To avoid this, Address Sanitizer places newly freed memory in a recycling queue, so that it cannot be handed out again for some time.

Naturally, adding a check to every pointer access has a cost. How large depends on what the code does, because different kinds of code access memory through pointers with different frequency. On average, the checks slow the program down by roughly 2x to 5x, which is expensive but does not make the program unusable.
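The recycling queue for freed memory can be pictured as a small first-in, first-out buffer. The sketch below is a simplified assumption about how such a queue might behave, not Address Sanitizer's code: the quarantine type, quarantine_push and the four-slot capacity are all hypothetical, and the poisoning of the parked blocks is left out.

```c
#include <assert.h>
#include <stddef.h>

enum { QUARANTINE_SLOTS = 4 };  /* tiny on purpose; a real queue would be far larger */

typedef struct {
    void  *slots[QUARANTINE_SLOTS];
    size_t head;    /* index of the oldest entry */
    size_t count;   /* number of blocks currently held */
} quarantine;

/* Park a freed block in the queue instead of handing it straight back to
   the allocator. Once the queue is full, the block that has waited longest
   is returned so the caller can really release it; otherwise NULL. */
void *quarantine_push(quarantine *q, void *block) {
    assert(q != NULL && block != NULL);
    void *evicted = NULL;
    if (q->count == QUARANTINE_SLOTS) {
        evicted = q->slots[q->head];
        q->head = (q->head + 1) % QUARANTINE_SLOTS;
        q->count--;
    }
    q->slots[(q->head + q->count) % QUARANTINE_SLOTS] = block;
    q->count++;
    return evicted;
}
```

In this model a freed block only becomes eligible for reuse after four further frees, so a stale pointer used shortly after the free still points at memory that nothing else owns.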

How do I use it?

Using Address Sanitizer with Xcode 7 is simple. When compiling from the command line, add the -fsanitize=address parameter to the clang invocation. Here is a test program:

Compile it and run it with Address Sanitizer:

The program crashes immediately and prints a lot of output:

This output contains a lot of information that is a great help in tracking down problems in real-world scenarios. It shows not only where the bad write happened but also where the memory was originally allocated, along with other details.

Using Address Sanitizer in Xcode is even simpler: edit the scheme, click the Diagnostics tab, and select the "Enable Address Sanitizer" option. Then build and run, and the diagnostic information appears when a problem is detected.

Bonus feature: the undefined behavior sanitizer

Bad memory accesses are just one of the many "interesting" kinds of undefined behavior in C. Clang provides other sanitizers that can catch many of them. Here is an example program:

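    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        if (argc < 2) {
            return 1;  /* an iteration count is required as the first argument */
        }
        int value = 1;
        /* Multiply by 10 on every pass; enough passes overflow the int. */
        for (int x = 0; x < atoi(argv[1]); x++) {
            value *= 10;
            printf("%d\n", value);
        }
    }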

To run the code:

The last part of the output is rather bizarre. Signed integer overflow is undefined behavior in C. It would be nice to catch the error instead of producing wrong data. The undefined behavior sanitizer can help; turn it on by passing the -fsanitize=undefined-trap -fsanitize-undefined-trap-on-error parameters:

This does not print extra information the way Address Sanitizer does, but the program stops executing the moment the error occurs, and the problem is then easy to find with a debugger.
The undefined behavior sanitizer is not yet integrated into Xcode, but you can use it by adding the compiler flags to the project's build settings.
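For comparison, the same overflow can be caught by hand by testing whether the multiplication would fit before performing it. The sketch below is a manual alternative and says nothing about how the sanitizer works internally; checked_mul is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Multiply *value by a positive factor without ever executing an
   overflowing multiplication. Returns false and leaves *value unchanged
   if the result would not fit in an int. */
bool checked_mul(int *value, int factor) {
    assert(value != NULL && factor > 0);
    /* Compare against the limits divided by factor, so the test itself
       cannot overflow. */
    if (*value > INT_MAX / factor || *value < INT_MIN / factor) {
        return false;
    }
    *value *= factor;
    return true;
}
```

Replacing value *= 10 in the loop above with a call to checked_mul(&value, 10) would let the program stop and report the problem at the first multiplication that does not fit, instead of printing garbage.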

Conclusion

Address Sanitizer is a great technology that helps find many problems in C code. It is not perfect and cannot find every error, but it provides very useful diagnostic information. I strongly recommend trying it on your own code; the results may surprise you.
