On Expression Evaluation in C/C++

Source: Internet
Author: User

You often see questions like this in discussion groups: "Who knows what value the following C statement gives n?"

m = 1; n = m++ + m++;

Recently someone I don't know emailed me to ask why, on a certain C++ system, the following statement prints two 4s instead of 4 and 5:

a = 4; cout << a++ << a;

Doesn't C++ specify that the << operator is left-associative? Is the C++ book wrong, or is there a problem with this system's implementation?

Note: running a = 4; cout << a++ << a; in Visual C++ 6.0 prints 4 and 4, while in Visual Studio it prints 4 and 5.

Which one is right? Consider the following analysis.

To understand this, one question must be settled first: if a variable is modified somewhere in a program (by assignment, increment/decrement, and so on), from what point on does reading the variable yield the new value? Someone might say, "What kind of question is that? I modified the variable, so when I read it afterwards I get the modified value, of course!" It is not that simple.

C/C++ is an expression-based language: all computation (including assignment) is done in expressions. "x = 1;" is the expression "x = 1" followed by a semicolon that marks the end of the statement. To understand what a program means, you must first understand what its expressions mean, that is: 1) the computation process the expression determines, and 2) its effect on the environment (think of the environment as all the variables available at that moment). If an expression (or subexpression) only computes a value without changing the environment, we say it is referentially transparent; such an expression does not affect other computations (it does not change the computing environment, although its own value may of course be affected by other computations). If an expression not only computes a value but also modifies the environment, we say it has a side effect (because it does extra work beyond producing a value). a++ is an expression with a side effect. These notions apply equally to similar problems in other languages.

Now the question becomes: if an expression (or part of one) in a C/C++ program has a side effect, when does that side effect actually become visible? To make the problem concrete, suppose the program contains the fragment "... a[i]++ ... a[j] ...", that i and j happen to be equal (so a[i] and a[j] refer to the same array element), that a[i]++ really is evaluated before a[j], and that nothing else modifies a[i] in between. Under these assumptions, is the change made by a[i]++ reflected in the evaluation of a[j]? Note that whether i and j are equal cannot be determined statically, so in the object code the two element accesses (memory accesses) must be carried out by two separate pieces of code. Modern computers compute in registers, so the question becomes: before the code that fetches a[j] runs, has the updated value of a[i] already been written back from the register to memory? The answer is clear once you know what the language specifies here.

Programming languages usually specify the latest moment by which a modification to a variable must take effect (called a sequence point, order point, or execution point). A series of sequence points (moments) exists during program execution. The language guarantees that once execution reaches a sequence point, all modifications (side effects) made before it have taken effect (they must be visible to subsequent accesses to the same storage location), and none of the modifications that come after it have yet occurred. Between sequence points there is no such guarantee. The concept of a sequence point is particularly important for languages such as C/C++ that allow expressions to have side effects.

Now the answer to the question above is clear: if there is a sequence point between a[i]++ and a[j], then a[j] is guaranteed to see the modified value; otherwise there is no such guarantee.

The C/C++ language definition (the language reference manual) defines the concept of a sequence point precisely. Sequence points are located:

1. At the end of each full expression. Full expressions include variable initializers, expression statements, the expression of a return statement, and the controlling expressions of conditional, loop, and switch statements (a for header has three controlling expressions);

2. After the first operand of the operators &&, ||, ?: and the comma operator has been evaluated;

3. In a function call, after all the arguments and the function-name expression (the function to be called may itself be given by an expression) have been evaluated, before the function body is entered.

Suppose t(i) and t(i+1) are two consecutive sequence points. By t(i+1), any C/C++ system (VC, BC, and so on are all C/C++ systems) must have realized all the side effects that occurred after t(i). Of course, a system need not wait until t(i+1); it may realize those side effects at any moment within [t(i), t(i+1)], because the language allows this choice.

The previous discussion assumed that a[i]++ is evaluated before a[j]. Whether a[i]++ really is evaluated first in a given program fragment also depends on the computation process determined by the expression it appears in. Everyone is familiar with the rules of precedence, associativity, and parentheses in C/C++, but the order in which the operands of an operation are evaluated is often overlooked. Consider the following examples:

(a + b) * (c + d)

fun(a++, b, a + 5)

Which of the two operands of "*" is evaluated first? In what order are fun and its three arguments evaluated? For the first expression the order does not matter, because all its subexpressions are referentially transparent. In the second, an argument expression has a side effect, so the order matters a great deal. A few languages specify the order in which operands are evaluated (Java specifies left to right), but C/C++ deliberately does not specify the evaluation order of the two operands for most binary operators (the exceptions are &&, || and the comma operator), nor the order in which the arguments of a call and the expression naming the called function are evaluated. When the second expression is evaluated, fun, a++, b, and a + 5 are first evaluated in some order, then comes a sequence point, and then the function body is entered.

Many books get these issues wrong (including some very popular ones). For example, they say that C/C++ evaluates the left (or right) operand first, or that a particular C/C++ system always evaluates a certain side first. These statements are all wrong! A C/C++ system may always evaluate the left side first, or always the right side first; it may also sometimes evaluate the left first and sometimes the right, even within the same expression. Different systems may use different orders (all of them conform to the language standard); different versions of the same system may differ; and the same version may use different orders under different optimization settings, because all of these practices conform to the language specification. Keep the sequence point issue in mind here as well: even if the expression on one side is evaluated first, its side effect may not yet be reflected in memory, and so may have no influence on the evaluation of the other side.

Back to the earlier example: "Who knows what value the following C statement gives n?"

m = 1; n = m++ + m++;

The correct answer is: nobody knows! The language does not specify what the result should be; it depends entirely on how the particular system handles the expression in the particular context, which involves both the evaluation order and the moment at which the modification of the variable takes effect. As for:

cout << a++ << a;

We know that it is shorthand for

(cout.operator<<(a++)).operator<<(a);

First look at the outer function call. Here the function to be called must be determined (by evaluating the inner call), and the argument a must also be evaluated. The language does not specify which is done first. If the function expression is evaluated first, that evaluation contains another function call, and there is a sequence point before its function body is executed, so the side effect of a++ is realized there and the outer argument a then evaluates to 5 (the output is 4 and 5). If the argument is evaluated first, the value of a is found to be 4, and the side effect of the inner call can no longer change it (in this case two 4s are output). Of course, these are only hypothetical scenarios. The practical point is that code like this should not be written at all, and there is little value in discussing its effect.

One may ask: why didn't the designers of C/C++ simply specify the order and eliminate these problems? The approach taken by C/C++ is entirely deliberate. Its purpose is to allow the compiler to use any evaluation order, so that it can rearrange the instruction sequence that implements the expression as needed and produce more efficient code when optimizing.

Strictly defining the evaluation order and effects of expressions, as Java does, not only restricts how the language can be implemented but also requires more frequent memory accesses (to realize side effects), which can cost considerable efficiency. On this issue, C/C++ and Java have each made a choice consistent with their own design principles; each has gained something (potential efficiency for C/C++, clearer program behavior for Java) and, of course, each has lost something. It should also be noted that most programming languages actually adopt rules similar to those of C/C++.

After so much discussion, what conclusion should we draw? The rules of C/C++ tell us that any expression that relies on a particular evaluation order, or on a modification taking effect between sequence points, has no guaranteed result. The rule to follow in programming is: if the same "variable" is referenced more than once in a "full expression" (a computation that ends at a sequence point), then no side effect on that "variable" should appear in the expression. Otherwise the expected result cannot be guaranteed. Note that this is not something to settle by trying it on a particular system, because we cannot test every possible combination of expressions and every possible context. What is discussed here is the language, not an implementation. In short, never write such expressions, or sooner or later you will run into trouble in some environment.

Postscript: at an academic conference last year, I saw a paper by a colleague discussing the order in which a certain C system evaluates expressions, summarizing some "laws". I also learned in discussion that a certain "programmer proficiency test" contained questions of this kind. This made me very uneasy. This year, while teaching a class for teachers, I found that many professional teachers are also unclear about this basic issue, which made me feel the problem is indeed serious. I have therefore put together this essay for your reference.

