Author: Chris Lattner
Original: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
People occasionally ask why code compiled by LLVM sometimes generates SIGTRAP signals when the optimizer is turned on. After digging in, they find that Clang generated a "ud2" instruction (assuming x86 code), the same instruction that __builtin_trap() generates. Several issues are at work here, all centering on undefined behavior in C code and how LLVM handles it.
This blog post (the first in a series of three) tries to explain some of these issues so that you can better understand the tradeoffs and complexities involved, and perhaps learn a bit more about the dark sides of C. It turns out that C is not the "high-level assembler" that many experienced C programmers (particularly those with a low-level focus) like to think it is, and that C++ and Objective-C have directly inherited plenty of issues from it.
Introduction to Undefined Behavior
Both LLVM IR and the C programming language have the concept of "undefined behavior". Undefined behavior is a broad topic with a lot of nuances. The best introduction I have found is a post on John Regehr's blog. The short version of that excellent article is that many seemingly reasonable things in C actually have undefined behavior, and this is a common source of bugs in programs. Beyond that, any undefined behavior in C gives the implementation (the compiler and runtime) license to produce code that formats your hard drive, does completely unexpected things, or worse. Again, I highly recommend reading John's article.
Undefined behavior exists in C-based languages because the designers of C wanted it to be an extremely efficient low-level programming language. In contrast, languages like Java (and many other "safe" languages) have avoided undefined behavior because they want safe and reproducible behavior across implementations, and are willing to sacrifice performance to get it. While neither is "the right goal to aim for", if you are a C programmer you really should understand what undefined behavior is.
Before getting into the details, it is worth briefly mentioning what it takes for a compiler to get good performance out of a broad range of C applications, because there is no magic bullet. At a very high level, compilers produce high-performance applications by: a) doing a good job at bread-and-butter algorithms such as register allocation and scheduling; b) knowing lots and lots of "tricks" (such as peephole optimizations and loop transformations) and applying them whenever they are profitable; c) being good at eliminating unnecessary abstractions (such as redundancy due to macros in C, inlining functions, and eliminating temporary objects in C++); and d) not screwing anything up. While any one of the optimizations below may sound trivial, it turns out that saving just one cycle out of a critical loop can make some codec run 10% faster or use 10% less power.
Advantages of Undefined Behavior in C, with Examples
Before getting into the dark side of undefined behavior and LLVM's policy and behavior when used as a C compiler, I thought it would be helpful to consider a few specific cases of undefined behavior and talk about how each one enables better performance than a safe language like Java. You can look at each case either as an "optimization enabled" by that class of undefined behavior, or as the "overhead avoided" that would be required to make the case defined. While the optimizer could eliminate some of these overheads some of the time, doing so in general (for every case) would require solving the halting problem and many other "interesting challenges".
It is also worth pointing out that both Clang and GCC nail down a few behaviors that the C standard leaves undefined. The things I describe below are both undefined according to the standard and treated as undefined behavior by both of these compilers in their default modes.
Use of an uninitialized variable: This is commonly known as a source of problems in C programs, and there are many tools to catch it, from compiler warnings to static and dynamic analyzers. Leaving this undefined improves performance by not requiring that all variables be zero-initialized when they come into scope (as Java does). For most scalar variables this would cause little overhead, but stack arrays and malloc'd memory would incur a memset of the storage, which could be quite costly, particularly since the storage is usually completely overwritten anyway.
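To make the avoided overhead concrete, here is a minimal sketch (not from the original post; the function names and BUF_LEN are hypothetical). Both functions compute the same result, but the second one pays for a memset that is immediately overwritten, which is roughly what mandatory zero-initialization would impose on a stack array.

```c
#include <stddef.h>
#include <string.h>

enum { BUF_LEN = 4096 };  /* hypothetical buffer size for illustration */

/* What C permits: the array starts uninitialized, and every element is
   written before it is read, so no work is wasted. */
unsigned sum_squares_uninit(void) {
    unsigned buf[BUF_LEN];                 /* deliberately not initialized */
    for (size_t i = 0; i < BUF_LEN; ++i)
        buf[i] = (unsigned)(i * i);        /* every element is written here */
    unsigned sum = 0;
    for (size_t i = 0; i < BUF_LEN; ++i)
        sum += buf[i];                     /* unsigned sum wraps, which is defined */
    return sum;
}

/* Roughly what mandatory zero-initialization would cost: an extra pass
   over the whole array whose result is thrown away by the next loop. */
unsigned sum_squares_zeroed(void) {
    unsigned buf[BUF_LEN];
    memset(buf, 0, sizeof buf);            /* redundant: overwritten below */
    for (size_t i = 0; i < BUF_LEN; ++i)
        buf[i] = (unsigned)(i * i);
    unsigned sum = 0;
    for (size_t i = 0; i < BUF_LEN; ++i)
        sum += buf[i];
    return sum;
}
```

An optimizer may be able to remove the redundant memset in a case this simple, but it cannot do so in general.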
Signed integer overflow: If arithmetic on an 'int' type (for example) overflows, the result is undefined. One example is that "INT_MAX+1" is not guaranteed to be INT_MIN. This behavior enables certain classes of optimizations that are important for some code. For example, knowing that INT_MAX+1 is undefined allows optimizing "X+1 > X" to "true". Knowing that the multiplication "cannot" overflow (because doing so would be undefined) allows optimizing "X*2/2" to "X". While these may seem trivial, these sorts of things are commonly exposed by inlining and macro expansion. A more important optimization that this allows is for "<=" loops like this:
for (i = 0; i <= N; ++i) { ... }
In this loop, the compiler can assume that the loop will iterate exactly N+1 times if "i" is undefined on overflow, which allows a broad range of loop optimizations to kick in. On the other hand, if the variable is defined to wrap around on overflow, then the compiler must assume that the loop is possibly infinite (which happens if N is INT_MAX), and that disables these important loop optimizations. This particularly affects 64-bit platforms, since so much code uses "int" as an induction variable.
It is worth noting that unsigned overflow is guaranteed to be defined as 2's complement (wrapping) overflow, so you can always use unsigned types. The cost of making signed integer overflow defined is that these sorts of optimizations are simply lost (for example, a common symptom is a ton of sign extensions inside loops on 64-bit targets). Both Clang and GCC accept the "-fwrapv" flag, which forces the compiler to treat signed integer overflow as defined (other than dividing INT_MIN by -1).
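For code that must detect overflow rather than rely on it, one common approach is to test the operands before adding, so the overflowing addition never executes. The sketch below is illustrative only; checked_add and wrapping_increment are hypothetical helper names, not something the original post defines.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true and stores a + b in *out when the sum fits in an int.
   Returns false and leaves *out untouched when the sum would overflow.
   The range test itself never overflows. */
bool checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Unsigned arithmetic is defined to wrap modulo 2^N, so this is always
   well defined, even when x is UINT_MAX. */
unsigned wrapping_increment(unsigned x) {
    return x + 1u;
}
```

With these helpers, checked_add(INT_MAX, 1, &r) reports failure instead of invoking undefined behavior, and wrapping_increment(UINT_MAX) yields 0.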
Oversized shift amounts: Shifting a uint32_t by 32 or more bits is undefined. My guess is that this originated because the underlying shift operations on various CPUs do different things here: for example, x86 truncates a 32-bit shift amount to 5 bits (so a shift by 32 bits is the same as a shift by 0 bits), but PowerPC truncates 32-bit shift amounts to 6 bits (so a shift by 32 produces zero). Because of these hardware differences, the behavior is completely undefined by C (so shifting by 32 bits on PowerPC could format your hard drive; it is *not* guaranteed to produce zero). The cost of eliminating this undefined behavior is that the compiler would have to emit an extra operation (like an 'and') for variable shifts, which would make them twice as expensive on common CPUs.
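If a program needs one specific behavior for large shift amounts, it can ask for it explicitly. The sketch below shows two hypothetical helpers (the names are mine, not the post's): one mimics the "produces zero" behavior described for PowerPC, the other the "truncate to 5 bits" behavior described for x86. It assumes the usual case where uint32_t is the same width as unsigned int.

```c
#include <stdint.h>

/* Shift left, yielding 0 when the amount is 32 or more.
   The comparison is the extra work the hardware would not do for us. */
uint32_t shl32_zero_on_overflow(uint32_t value, unsigned amount) {
    return amount < 32u ? (uint32_t)(value << amount) : 0u;
}

/* Shift left with the amount masked to its low 5 bits, so a shift by 32
   behaves like a shift by 0. The 'and' is the extra operation. */
uint32_t shl32_masked(uint32_t value, unsigned amount) {
    return (uint32_t)(value << (amount & 31u));
}
```

Both helpers are well defined for every amount, at the price of one extra operation per variable shift.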
Dereferences of wild pointers and out-of-bounds array accesses: Dereferencing random pointers (such as NULL or pointers to freed memory) and the special case of accessing an array out of bounds are common bugs in C applications, which hopefully need no explanation. To eliminate this source of undefined behavior, every array access would have to be range checked, and the ABI would have to be changed to make sure that range information follows any pointer that could be subject to pointer arithmetic. This would have an extremely high cost for many numerical and other applications, and would break binary compatibility with every existing C library.
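As a rough illustration of what "range information following the pointer" would look like, here is a sketch of a hypothetical fat-pointer type with a checked accessor. The names int_span and span_get are invented for this example; no real ABI is being described.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the element count travels with the pointer,
   which doubles the size of every pointer passed around. */
typedef struct {
    const int *data;
    size_t     len;
} int_span;

/* Range-checked read. Returns true and stores the element in *out when
   index is in bounds; returns false and leaves *out untouched otherwise.
   The comparison and branch are paid on every single access. */
bool span_get(int_span s, size_t index, int *out) {
    if (s.data == NULL || index >= s.len)
        return false;
    *out = s.data[index];
    return true;
}
```

A plain C array access has neither the extra struct field nor the per-access branch, which is the overhead the paragraph above refers to.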
Dereferencing a NULL pointer: Contrary to popular belief, dereferencing a null pointer in C is undefined. It is not defined to trap, and if you mmap a page at address 0, it is not defined to access that page. This falls out of the rules that forbid dereferencing wild pointers and the use of NULL as a sentinel. NULL pointer dereferences being undefined enables a broad range of optimizations. In contrast, Java makes it invalid for the compiler to move a side-effecting operation across any object pointer dereference that the optimizer cannot prove to be non-null. This significantly punishes scheduling and other optimizations. In C-based languages, NULL being undefined enables a large number of simple scalar optimizations that are exposed as a result of macro expansion and inlining.
If you are using an LLVM-based compiler, you can dereference a "volatile" null pointer to get a crash if that is what you are looking for, since volatile loads and stores are generally not touched by the optimizer. There is currently no flag that enables random NULL pointer loads to be treated as valid accesses, or that lets random loads know that their pointer is "allowed to be null".
Violating type rules: It is undefined behavior to cast an int* to a float* and dereference it (accessing the "int" as if it were a "float"). C requires that these sorts of type conversions happen through memcpy: using pointer casts is not correct, and undefined behavior results. The rules for this are quite nuanced and I don't want to go into the details here (there is an exception for char*, vectors have special properties, unions change things, and so on). This behavior enables an analysis known as "Type-Based Alias Analysis" (TBAA), which is used by a broad range of memory access optimizations in the compiler and can significantly improve the performance of the generated code.
For example, this rule allows Clang to optimize this function:
float *P;
void zero_array() {
  int i;
  for (i = 0; i < 10000; ++i)
    P[i] = 0.0f;
}
into "memset(P, 0, 40000)". This optimization also allows many loads to be hoisted out of loops, common subexpressions to be eliminated, and so on. This class of undefined behavior can be disabled by passing the -fno-strict-aliasing flag, which disallows this analysis. When this flag is passed, Clang is required to compile this loop into 10000 4-byte stores (which is several times slower), because it has to assume that any of the stores could change the value of P, as in something like this:
int main() {
  P = (float*)&P;  // cast causes TBAA violation in zero_array.
  zero_array();
}
This sort of type abuse is pretty uncommon, which is why the standards committee decided that the significant performance wins were worth the unexpected result for "reasonable" type casts. It is worth pointing out that Java gets the benefits of type-based optimizations without these drawbacks, because it does not have unsafe pointer casting in the language at all.
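The memcpy route mentioned above for type conversions can be sketched as follows. The helper names are hypothetical, and the example assumes a platform where float is a 32-bit IEEE 754 value (the _Static_assert, which needs C11, checks only the size).

```c
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float and uint32_t have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Reinterpret the bytes of a float as a 32-bit integer without a
   pointer cast, so the type-based aliasing rules are not violated. */
uint32_t float_to_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* The reverse direction: rebuild a float from its bit pattern. */
float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Compilers typically turn a fixed-size memcpy like this into a plain register move, so the well-defined version usually costs nothing extra.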
Anyway, I hope this gives you an idea of some of the classes of optimizations enabled by undefined behavior in C. There are many other kinds of course, including sequence point violations like "foo(i, ++i)", race conditions in multithreaded programs, violating 'restrict', and division by zero.
In our next post, we will discuss why undefined behavior in C is a pretty scary thing if performance is not your only goal. In the final post of the series, we will talk about how LLVM and Clang handle it.