Operator precedence and associativity in C are often confused. This article briefly explains the difference between them, using a few especially common operators as examples.
The first thing to understand is that precedence determines which of the different operators in an expression binds to its operands first, while associativity determines the grouping direction when adjacent operators have the same precedence.
[Assignment operator "="]
Assignments are often chained in a single expression, such as "a=b=c".
The variable b has an assignment operator on each side, and the two naturally have the same precedence, so how is the expression to be understood? Assignment is "right-associative", which means that the semantic structure of the expression is "a=(b=c)" rather than "(a=b)=c". That is, c is assigned to b first, and then the value of the expression "b=c" is assigned to a. This distinction matters: implicit type conversion may be involved, so the value that a receives is the value of b after conversion to b's type, which is not necessarily the original value of c. A different reading of the grouping gives a different answer.
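The effect of the conversion can be seen in a minimal sketch. The function name assign_through_int and the value 3.7 are chosen only for illustration and are not part of the original example.

```c
#include <assert.h>

/* Hypothetical helper: performs a = b = c with an int in the middle. */
double assign_through_int(double c)
{
    int b;
    double a;

    a = b = c;      /* grouped as a = (b = c) */
    return a;       /* a holds the value of b after conversion to int */
}

/* Small self-check; call it from main() to run the example. */
void check_assign_through_int(void)
{
    assert(assign_through_int(3.7) == 3.0);   /* a is 3.0, not 3.7 */
}
```

Because "b=c" is evaluated first and its value has the type of b, the fractional part is already gone by the time a is assigned.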
Now consider a general binary operator, which for convenience we write as @. If it is "left-associative", the expression "x@y@z" means "(x@y)@z"; if it is "right-associative", it means "x@(y@z)". Note that the two binary operators need not be the same operator: as long as they have equal precedence, the conclusion above applies. For example, "a*b/c" means "(a*b)/c".
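With integer operands the two groupings of "a*b/c" can give different results, which makes the left-to-right rule easy to observe. The following is only a sketch; the function names and the sample values 7, 2, 4 are assumptions made for the illustration.

```c
#include <assert.h>

/* Written without parentheses: the compiler groups it as (a * b) / c. */
int mul_div_default(int a, int b, int c)
{
    return a * b / c;
}

/* The grouping that left-associativity rules out. */
int mul_div_right(int a, int b, int c)
{
    return a * (b / c);
}

/* Small self-check; call it from main() to run the example. */
void check_mul_div(void)
{
    assert(mul_div_default(7, 2, 4) == (7 * 2) / 4);  /* 14 / 4 == 3 */
    assert(mul_div_default(7, 2, 4) == 3);
    assert(mul_div_right(7, 2, 4) == 0);              /* 2 / 4 == 0  */
}
```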
[Increment operator "++" and dereference operator "*"]
In this section we take "*p++" as the example. Here is an implementation of the strcpy function that has been shown everywhere:
char *strcpy(char *dest, const char *src)
{
    char *p = dest;
    while (*p++ = *src++);
    return dest;
}
It quickly becomes clear that the key to understanding this small function is the meaning of "*p++" in the loop condition.
First, the dereference operator "*" has lower precedence than the postfix increment operator "++", so the expression is semantically equivalent to "*(p++)" and not "(*p)++". The parentheses in "*(p++)" are therefore redundant, although writing them is recommended for readability.
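The difference between the two groupings can be checked directly. This is a minimal sketch; the array contents and the function name are made up for the illustration.

```c
#include <assert.h>

/* Small self-check; call it from main() to run the example. */
void check_deref_postinc(void)
{
    int arr[2] = {10, 20};
    int *p = arr;
    int v;

    v = *p++;                 /* same as *(p++): read arr[0], advance p */
    assert(v == 10);
    assert(p == arr + 1);     /* the pointer moved                      */
    assert(arr[0] == 10);     /* the pointed-to object is unchanged     */

    p = arr;
    v = (*p)++;               /* read arr[0], then increment arr[0]     */
    assert(v == 10);
    assert(p == arr);         /* the pointer did not move               */
    assert(arr[0] == 11);     /* the pointed-to object changed          */
}
```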
Another frequent source of confusion is the semantics of the increment operator "++". Many books say that "post-increment takes the value first, then adds 1". There is nothing wrong with this, but in some contexts it is easy to misread, as in the while statement above.
A beginner is bound to wonder: when an expression contains increment, dereference, and assignment, and ultimately serves as the controlling condition of a loop, how far does the "first" in "takes the value first" extend? At this point we need to consult the C language standard. The following is excerpted from the C99 standard, ISO/IEC 9899:1999:
6.5.2.4-2: The result of the postfix ++ operator is the value of the operand. After the result is obtained, the value of the operand is incremented. ... The side effect of updating the stored value of the operand shall occur between the previous and the next sequence point.
That is, the result of a post-increment expression is the value the operand had before the increment, and the operand is incremented after that result has been determined. The side effect of the increment takes place somewhere between the previous sequence point and the next one.
This article does not discuss sequence points in detail; interested readers can consult the standard. Note that assignment is not a sequence point in C, so in the while statement above the increment of src does not have to be completed before the assignment. The end of the whole controlling expression of the while, however, is a sequence point.
So "while (*p++ = *src++);" can be read as follows. The controlling expression of the while is an assignment expression whose left operand is "*p++" and whose right operand is "*src++"; the value of the whole expression is the value of the left operand after the assignment. Both operands are dereferences of a post-increment expression. As explained earlier, the dereference applies to the whole post-increment expression rather than to p or src alone, and according to the standard text quoted above, the value of each post-increment expression is the current value of the pointer p or src. The increment side effects only need to be completed before the next sequence point.
Put simply, the compiler obtains the current values of the pointers p and src, uses those values to assign "*src" to "*p", and uses the result of that assignment, as the value of the whole assignment expression, to decide whether to leave the loop. Then, at some point before the end of the whole expression (and without affecting the steps just described), p and src are each incremented by 1.
That is, the assignment and the loop test are carried out with the old values of p and src, and then the increments of p and src are completed.
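The reading above can be spelled out one step per line. The sketch below is only one ordering that is consistent with the description; the standard leaves the exact moment of the increments open, and the names copy_expanded, old_p, and old_src are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical name; behaves like the strcpy example, one step per line. */
char *copy_expanded(char *dest, const char *src)
{
    char *p = dest;

    for (;;) {
        char *old_p = p;              /* value of the expression p++   */
        const char *old_src = src;    /* value of the expression src++ */

        *old_p = *old_src;            /* the assignment uses the old values */
        p++;                          /* side effects, completed before the */
        src++;                        /* end of the controlling expression  */

        if (*old_p == '\0')           /* value of the assignment is tested  */
            break;
    }
    return dest;
}

/* Small self-check; call it from main() to run the example. */
void check_copy_expanded(void)
{
    char buf[8];

    copy_expanded(buf, "abc");
    assert(strcmp(buf, "abc") == 0);
}
```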
In addition, there are two other ways to describe the post-increment (post-decrement) operation. They do not match the wording of the C standard exactly, but they have the same final semantic effect:
(1) The post-increment "x++" is equivalent to the comma expression "tmp=x, ++x, tmp";
(2) Post-increment adds 1 to the operand and then returns the value from before the addition as the value of the whole expression.
It is worth mentioning that in C++, when the increment operator has to be overloaded, the implementation is usually based on these two descriptions.
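The two descriptions can be sketched in C as an ordinary function. The name post_increment is hypothetical, and the function only imitates the built-in operator for an int operand.

```c
#include <assert.h>

/* Hypothetical helper that imitates x++ for an int reached through x. */
int post_increment(int *x)
{
    int tmp = *x;   /* remember the old value                      */
    ++*x;           /* perform the increment                       */
    return tmp;     /* the value of the whole expression is the old one */
}

/* Small self-check; call it from main() to run the example. */
void check_post_increment(void)
{
    int x = 5;
    int r = post_increment(&x);

    assert(r == 5);   /* value from before the increment */
    assert(x == 6);   /* the operand itself has changed  */
}
```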
Another example is an equally well-worn implementation of strlen:
size_t strlen(const char *str)
{
    const char *p = str;
    while (*p++);
    return p - str - 1;
}
Notice that the function ends by subtracting 1. When the loop condition fails and the loop exits, the side effect of the increment operator "++" in that final test still takes place, so p ends up one position past the terminating null character. "Exiting the loop" means only that the loop body is no longer executed; the controlling expression is not part of the loop body, and all of its side effects take effect by the end of the whole expression.
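For comparison, here is a sketch of the same function next to a variant that increments in the loop body instead of in the condition, so that no final correction is needed. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Increment inside the condition: p overshoots the terminator by one. */
size_t length_inc_in_condition(const char *str)
{
    const char *p = str;

    while (*p++)
        ;                              /* empty body */
    return (size_t)(p - str - 1);      /* undo the final increment */
}

/* Increment inside the body: p stops exactly on the terminator. */
size_t length_inc_in_body(const char *str)
{
    const char *p = str;

    while (*p)
        p++;
    return (size_t)(p - str);
}

/* Small self-check; call it from main() to run the example. */
void check_lengths(void)
{
    assert(length_inc_in_condition("hello") == 5);
    assert(length_inc_in_body("hello") == 5);
    assert(length_inc_in_condition("") == 0);
    assert(length_inc_in_body("") == 0);
}
```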
To close this section, the important point once more: "*p++" is "*(p++)", and there is no difference between them other than readability. It is wrong to think that adding the parentheses makes the pointer advance first and then be dereferenced; to get that effect, use "*++p".
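A short sketch makes the three forms easy to compare side by side. The array contents and the function name are assumptions made for the illustration.

```c
#include <assert.h>

/* Small self-check; call it from main() to run the example. */
void check_pre_vs_post(void)
{
    int arr[3] = {10, 20, 30};
    int *p;
    int v;

    p = arr;
    v = *p++;        /* yields arr[0]; p then points at arr[1] */
    assert(v == 10 && p == arr + 1);

    p = arr;
    v = *(p++);      /* identical: the parentheses change nothing */
    assert(v == 10 && p == arr + 1);

    p = arr;
    v = *++p;        /* p advances first, then arr[1] is read */
    assert(v == 20 && p == arr + 1);
}
```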
[Ternary operator "?:"]
Let's give an example:
int x = 3;
int y = 2;
int z = x > y ? 100 : ++y > 2 ? 20 : 30;
What we want to know is the value of z.
Two conditional operators are nested here, and the conditional operator is "right-associative". Many people conclude from this that the inner conditional expression on the right, "++y>2?20:30", should be evaluated first: y is incremented to 3, the condition "greater than 2" holds, and the inner expression yields 20. Then the whole expression is evaluated; y is now 3, so "x>y" is false, and the final result is 20.
But that is not the case. This line of reasoning is wrong!
The mistake is to confuse precedence, associativity, and order of evaluation.
First, in most cases C does not strictly specify the order in which the subexpressions of an expression are evaluated. Second, even where the order of evaluation is specified, the semantic structure of the expression must be determined first; only once the structure is settled does it make sense to discuss the order of evaluation.
In the example above, the "right-associative" property of the conditional operator does not mean that the inner conditional expression is evaluated first. It means that the semantic structure of the expression is "x>y?100:(++y>2?20:30)" rather than "(x>y?100:++y>2)?20:30". This is the true meaning of "right-associative".
Once the compiler has determined the structure of the expression, it can generate the correct run-time behavior for it. The conditional operator is one of the few operators in C for which the order of evaluation is explicitly specified (the other three are logical AND "&&", logical OR "||", and the comma operator ",").
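The guaranteed left-to-right evaluation of "&&", "||", and "," can be observed by counting how often an operand is evaluated. This is only a sketch; the counter calls and the helper bump are hypothetical names introduced for the illustration.

```c
#include <assert.h>

int calls;                     /* how many operands were evaluated */

/* Hypothetical helper: records that it ran and returns its argument. */
int bump(int result)
{
    calls++;
    return result;
}

/* Small self-check; call it from main() to run the example. */
void check_evaluation_order(void)
{
    int r, x, y;

    calls = 0;
    r = bump(0) && bump(1);    /* left operand is false: right one is skipped */
    assert(r == 0 && calls == 1);

    calls = 0;
    r = bump(1) || bump(0);    /* left operand is true: right one is skipped */
    assert(r == 1 && calls == 1);

    x = 1;
    y = (x = 5, x + 1);        /* comma: left side finishes before the right */
    assert(x == 5 && y == 6);
}
```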
C specifies that a conditional expression first evaluates the condition. If the condition is true, the part between the question mark and the colon (expression 2) is evaluated and its result is the value of the whole expression; otherwise the part after the colon (expression 3) is evaluated and its result is the value of the whole expression.
Therefore, for the expression "x>y?100:(++y>2?20:30)", we first check whether x is greater than y. In this example it is, so the value of the whole expression is 100. Expression 3 is not evaluated at all, and the side effect of the increment operator it contains never takes effect.
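The conclusion can be verified by wrapping the expression in a function and inspecting y afterwards. The function name nested_conditional and the extra input values are assumptions added for the illustration; only the case x = 3, y = 2 comes from the example above.

```c
#include <assert.h>

/* Hypothetical wrapper around the expression from the example;
   y is passed by pointer so that the side effect of ++ is visible. */
int nested_conditional(int x, int *y)
{
    return x > *y ? 100 : ++*y > 2 ? 20 : 30;
}

/* Small self-check; call it from main() to run the example. */
void check_nested_conditional(void)
{
    int y;

    y = 2;
    assert(nested_conditional(3, &y) == 100);
    assert(y == 2);            /* ++y was never evaluated */

    y = 2;
    assert(nested_conditional(1, &y) == 20);
    assert(y == 3);            /* the inner condition ran, so y changed */

    y = 0;
    assert(nested_conditional(0, &y) == 30);
    assert(y == 1);
}
```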
[A few final words]
This article has made the following main points:
(1) Precedence determines which of the different operators in an expression binds first, while associativity determines the grouping direction when two adjacent operators have the same precedence;
(2) In terms of semantic effect, post-increment (post-decrement) can be understood as incrementing (decrementing) the operand and then returning the value from before the change as the result of the whole expression;
(3) Strictly speaking, precedence and associativity determine the semantic structure of an expression and must not be confused with the order of evaluation.
P.S.
1. This article draws on the blog post: http://blog.csdn.net/steedhorse/article/details/5903974
2. Wikipedia has a table of the operators in C and C++: http://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B
3. On Sina Weibo I saw Benbearchen mention a coding-standard requirement at some companies: if the body of a while loop is an empty statement, it must be written as a continue statement; a lone semicolon is not allowed. I quite agree with this. The strcpy and strlen examples above do not follow it, simply to "go with the crowd", since that is how these two functions are written by many people and in many books.
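Written to that convention, the copy loop might look like the sketch below. The function name copy_with_continue is hypothetical, and the explicit comparison with '\0' is an extra readability choice rather than part of the rule described above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical variant of the strcpy example with an explicit loop body. */
char *copy_with_continue(char *dest, const char *src)
{
    char *p = dest;

    while ((*p++ = *src++) != '\0')
        continue;              /* explicit empty body instead of a lone ';' */
    return dest;
}

/* Small self-check; call it from main() to run the example. */
void check_copy_with_continue(void)
{
    char buf[8];

    copy_with_continue(buf, "abc");
    assert(strcmp(buf, "abc") == 0);
}
```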
Precedence and associativity of [C language] operators