In the earlier introduction to semantic checking, we briefly introduced the function CheckBinaryExpression, which checks the semantics of binary operator expressions. For ease of reading, Figure 4.2.2 is shown again here. In this section we discuss each of the functions listed on lines 1126 to 1144.
Figure 4.2.2 CheckBinaryExpression()
For a binary expression such as a+b, we use the function CommonRealType described in the previous section to find the type of the whole expression: if a has type int and b has type double, the type of a+b is double. In the macro PERFORM_ARITH_CONVERSION defined in Figure 4.2.40, line 14 calls CommonRealType to obtain the common type, and lines 15 and 16 apply the necessary conversions to the left and right operands of the binary operator. The macro SWAP_KIDS on line 6 exchanges the left and right operands. For example, in an expression such as i + ptr, where i is an integer and ptr is a pointer, if we want the left operand to be the pointer and the right operand to be the integer, we can use SWAP_KIDS to exchange them and obtain ptr + i. The macro REPORT_OP_ERROR on line 19 reports an error indicating that an invalid operator was encountered.
Figure 4.2.40 Macro PERFORM_ARITH_CONVERSION
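The two ideas above, choosing a common type and moving the pointer operand to the left, can be sketched in a few lines. This is only an illustration under simplifying assumptions: the enum TypeKind, the struct Expr, and the lower-case function names are hypothetical and are not UCC's actual definitions, and the rank order assumes long is wider than unsigned int.

```c
#include <assert.h>

/* Hypothetical, simplified type kinds listed in increasing conversion rank.
   Assumes long is wider than unsigned int; UCC's real type system is richer. */
enum TypeKind { T_INT, T_UINT, T_LONG, T_ULONG, T_FLOAT, T_DOUBLE, T_POINTER };

/* Hypothetical expression node with two operand slots. */
struct Expr {
    enum TypeKind ty;
    struct Expr *kids[2];
};

/* Simplified stand-in for CommonRealType: the higher-ranked arithmetic type wins. */
enum TypeKind common_real_type(enum TypeKind a, enum TypeKind b)
{
    assert(a != T_POINTER && b != T_POINTER);  /* arithmetic types only */
    return a > b ? a : b;
}

/* Exchange the left and right operands, as SWAP_KIDS is described to do. */
void swap_kids(struct Expr *e)
{
    struct Expr *tmp = e->kids[0];
    e->kids[0] = e->kids[1];
    e->kids[1] = tmp;
}

/* Normalise i + ptr into ptr + i so that the pointer is always on the left. */
void pointer_on_left(struct Expr *add)
{
    if (add->kids[0]->ty != T_POINTER && add->kids[1]->ty == T_POINTER)
        swap_kids(add);
}
```

With the pointer always on the left, the later checks only need to handle the ptr + i form.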
Next we analyze CheckEqualityOP and the other functions listed on lines 1126 to 1143 of Figure 4.2.2, shown in Figure 4.2.41. Line 2 gives the macro IsArithType, which determines whether a type is an arithmetic type; it is defined in ucl\type.h. The macro IsScalarType on line 3 determines whether a type is a scalar type. Together with enumeration constants such as POINTER on line 3 and their enum definition in type.h, the other macros in type.h, such as BothScalarType, are easy to read, so we do not go through them here. Lines 5 to 24 of Figure 4.2.41 perform the semantic check for expressions such as a==b or a!=b. If operands a and b are both of arithmetic type (that is, integer or floating-point types), the necessary type conversions are done by the macro PERFORM_ARITH_CONVERSION on line 12. The result of a==b or a!=b is true or false, and C uses the int type to represent Boolean values, so line 13 sets the type of the expression to int. Line 14 calls the FoldConstant function to perform constant folding where possible. For example, an expression such as 3 == 2 does not need to be evaluated at run time; at compile time we already know that its result is 0.
Figure 4.2.41 CheckEqualityOP()
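Both properties of the equality operators described above, the int result type and compile-time folding, can be observed from ordinary C code. The sketch below is not taken from UCC; the names EQ_FOLDED, NE_FOLDED, equality_has_int_size and check_folding are made up for this illustration.

```c
#include <assert.h>

/* An enumerator must be an integer constant expression, so the compiler
   has to evaluate these comparisons at compile time. */
enum { EQ_FOLDED = (3 == 2), NE_FOLDED = (3 != 2) };

/* Returns 1 if a == b has the size of an int even though both operands
   are double (a size check, used here as a hint about the result type). */
int equality_has_int_size(void)
{
    double a = 1.0, b = 2.0;
    return sizeof(a == b) == sizeof(int);
}

/* Small self-check that exercises the definitions above. */
void check_folding(void)
{
    assert(EQ_FOLDED == 0);
    assert(NE_FOLDED == 1);
    assert(equality_has_int_size());
}
```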
Lines 16 to 21 of Figure 4.2.41 are written according to the semantic rules in Section 3.3.9 "Equality operators" of the C standard (ucc\ansi.c.txt), as follows:
(1) Both operands are pointers to qualified or unqualified versions of compatible types. That is, the two pointers have compatible types; the concept of "compatible types" was covered in earlier chapters. This corresponds to line 16.
(2) One operand is a pointer to an object or incomplete type and the other is a qualified or unqualified version of void. That is, one operand is a pointer to a data object and the other is void *. This corresponds to lines 17 and 18.
(3) One operand is a pointer and the other is a null pointer constant. That is, one operand has pointer type and the other is NULL. This corresponds to lines 19 and 20.
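The three legal pointer comparisons listed above can each be written as a short piece of standard C. The following is a minimal sketch; the function names are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when all three legal pointer comparisons behave as expected. */
int pointer_equality_cases(void)
{
    int x = 0;
    int *p = &x;
    const int *q = &x;   /* pointer to a qualified version of a compatible type */
    void *v = &x;

    int ok = 1;
    ok = ok && (p == q);     /* rule (1): compatible pointer types */
    ok = ok && (p == v);     /* rule (2): object pointer and void * */
    ok = ok && (p != NULL);  /* rule (3): pointer and null pointer constant */
    return ok;
}

/* Small self-check that exercises the function above. */
void check_pointer_equality(void)
{
    assert(pointer_equality_cases());
}
```

A comparison that fits none of the three rules, such as comparing an int * with a double *, is a constraint violation that a conforming compiler has to diagnose.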
Lines 25 to 31 of Figure 4.2.41 handle bitwise operations such as a&b, a|b and a^b, which require integer operands. Lines 32 to 39 handle short-circuit operations such as a && b and a || b; the if condition on line 34 requires both operands to be of scalar type (that is, integer, floating-point, pointer types and so on). Lines 41 to 54 perform the semantic check for a/b, a*b and a%b: the remainder operator % requires both operands to be integers, while the operands of multiplication and division may be integers or floating-point numbers. Lines 55 to 61 check shift operations such as a<<b and a>>b. Note that here we do not use the macro PERFORM_ARITH_CONVERSION to find a common type, because whether a>>b is an arithmetic right shift or a logical right shift depends on whether the left operand a is a signed or an unsigned integer. Line 60 therefore sets the type of the expression a>>b to the type of the left operand a. The code in Figure 4.2.41 is not complicated, so we do not discuss it further.
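The difference between a shift and an ordinary arithmetic operator can be checked with sizeof in standard C. In this sketch (hypothetical function names, not UCC code), a char shifted by a long long keeps the promoted type of the left operand, while a char added to a long long takes the common type.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Size of the result of char >> long long: the promoted left operand, int. */
size_t shift_result_size(void)
{
    char c = 1;
    return sizeof(c >> 1LL);
}

/* Size of the result of char + long long: the common type, long long. */
size_t add_result_size(void)
{
    char c = 1;
    return sizeof(c + 1LL);
}

/* Right shift of an unsigned value is a logical shift: zeros enter from the left. */
unsigned int logical_shift_right(unsigned int a, unsigned int n)
{
    assert(n < sizeof a * CHAR_BIT);  /* larger shift counts are undefined behaviour */
    return a >> n;
}
```

Right-shifting a negative signed value is implementation-defined in C, which is why the sketch only demonstrates the unsigned case.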
Next, let's analyze CheckAddOP and the related functions shown in Figure 4.2.42. Lines 1 to 26 handle expressions such as a+b, while lines 27 to 48 perform the semantic check for expressions of the form a-b. For a+b, lines 9 to 11 handle the case where both operands are of arithmetic type. When one operand is a pointer and the other is an integer, the expression ptr+i is C pointer addition, and the addition actually executed at the assembly level is ptr + k, where k is i * sizeof(*ptr). The ScalePointerOffset function called on line 21 constructs the syntax tree node for this multiplication. For a-b, lines 30 to 33 handle the case where both operands of the subtraction are of arithmetic type, and lines 34 to 39 handle pointer operations such as ptr-i. As with ptr+i, we call ScalePointerOffset on line 37 to construct the multiplication i * sizeof(*ptr). Lines 41 to 48 handle pointer subtraction such as ptr1-ptr2. For example, given int arr[3], the value of &arr[2] - &arr[0] is the number of int elements between &arr[2] and &arr[0], not the number of bytes between their addresses. At the assembly level, the operation actually performed is therefore (ptr1 - ptr2) / sizeof(*arr), and the PointerDifference function called on line 44 constructs the corresponding division node. Lines 49 to 60 give the code of this function: CREATE_AST_NODE on line 53 creates a syntax tree node, and line 55 sets its operator to OP_DIV, that is, division.
Figure 4.2.42 CheckAddOP()
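The scaling that ScalePointerOffset and PointerDifference are described as building into the syntax tree can be reproduced by hand in C by working on byte addresses. This is a minimal sketch with hypothetical function names; it shows only the arithmetic, not UCC's tree construction.

```c
#include <assert.h>
#include <stddef.h>

/* Byte address produced by ptr + i once the index is scaled:
   ptr + i * sizeof(*ptr). */
const char *scaled_add(const int *ptr, ptrdiff_t i)
{
    return (const char *)ptr + i * (ptrdiff_t)sizeof(*ptr);
}

/* Element count produced by p1 - p2: byte distance divided by sizeof(*p1).
   Both pointers must point into the same array. */
ptrdiff_t scaled_difference(const int *p1, const int *p2)
{
    ptrdiff_t bytes = (const char *)p1 - (const char *)p2;
    return bytes / (ptrdiff_t)sizeof(*p1);
}

/* Small self-check using the int arr[3] example from the text. */
void check_pointer_scaling(void)
{
    int arr[3] = { 0, 0, 0 };
    assert(scaled_add(arr, 2) == (const char *)&arr[2]);
    assert(scaled_difference(&arr[2], &arr[0]) == 2);
}
```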
Next we discuss the semantic check of assignment expressions. According to the semantics of C, a is evaluated only once in the expression a += b, whereas it is evaluated twice in a = a + b. For example, in the code below, the function f() is called only once in the expression *f() += 3, but it must be called twice in the expression *f() = *f() + 3. Therefore, when we handle a += b by converting it to a = a + b, we must keep the semantics unchanged.
#include <stdio.h>

int *f(void) {
    static int number;
    printf("int *f(void)\n");
    return &number;
}
int main() {
    *f() += 3;          /* f() is called once */
    *f() = *f() + 3;    /* f() is called twice */
    return 0;
}
For an expression written in the C source code as a = a + b (for ease of presentation, we write it as a1 = a2 + b), the syntax tree contains two nodes for a: a1 and a2 each correspond to their own syntax tree node. For a += b, however, only one syntax tree node corresponds to a. During semantic checking, when we convert a += b to a = a' + b, we do not construct a new syntax tree node for a. To distinguish it from an expression written as a = a + b in the source code, we write the result of the semantic check of a += b as a = a' + b, where a and a' correspond to the same syntax tree node.
With this foundation, let's analyze the function CheckAssignmentExpression, which performs the semantic check for assignment expressions. The operand on the left of the assignment operator must be an lvalue and must not be declared with the const qualifier. If the left operand is a struct object, none of the member fields in the definition of that struct may have the const qualifier either. The function CanModify() on lines 38 to 44 of Figure 4.2.43 performs these checks. Line 13 calls CanModify() to check whether the left operand is a modifiable lvalue. Lines 18 to 27 convert an expression such as a += b to a = a' + b. We only create one new syntax tree node, on line 20, to hold the operator +; a and a' always correspond to the same syntax tree node. For a = a' + b, line 26 actually calls the CheckAddOP function to perform the semantic check of the addition. Line 29 calls the CanAssign function to check whether the types of the operands on the two sides of the assignment operator match; we analyzed CanAssign in the previous section.
Figure 4.2.43 CheckAssignmentExpression()
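The modifiability rule described above can be expressed as a small recursive check. The sketch below is an illustration only: struct Type, struct Field and can_modify are hypothetical and much simpler than UCC's real data structures, and the recursion into nested structs is an assumption based on the C rule for modifiable lvalues rather than on the UCC source.

```c
#include <assert.h>
#include <stddef.h>

struct Type;

/* Hypothetical struct member: only its type matters here. */
struct Field {
    const struct Type *ty;
};

/* Hypothetical type descriptor. nfields is 0 for non-struct types. */
struct Type {
    int is_const;
    size_t nfields;
    const struct Field *fields;
};

/* Returns 1 if an operand of this type may appear on the left of '='. */
int can_modify(const struct Type *ty, int is_lvalue)
{
    assert(ty != NULL);
    if (!is_lvalue || ty->is_const)
        return 0;
    /* A struct is not modifiable if any member is const. */
    for (size_t i = 0; i < ty->nfields; i++) {
        if (!can_modify(ty->fields[i].ty, 1))
            return 0;
    }
    return 1;
}
```

For example, a struct with one plain int member passes the check, while a struct that also contains a const int member is rejected.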
Figure 4.2.44 shows the syntax tree of the expression a+=b after parsing and after semantic checking. We can clearly see that, after a+=b is converted to a = a' + b, there is still only one node for a in the syntax tree. After the semantic check, type information is also attached to each node of the syntax tree; it is omitted from the figure for simplicity.
Figure 4.2.44 Syntax Tree of a+=b
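The node sharing shown in the figure can be sketched as a small tree rewrite. The code below is a hypothetical illustration, not UCC's implementation: struct Node, the operator names and lower_add_assign are invented for this example, and changing the operator of the outer node is a simplification.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical operator codes for this sketch. */
enum Op { OP_ID, OP_ADD, OP_ASSIGN, OP_ADD_ASSIGN };

/* Hypothetical syntax tree node with two operand slots. */
struct Node {
    enum Op op;
    struct Node *kids[2];
};

/* Rewrite a += b in place as a = a' + b. Only the '+' node is newly
   allocated; a and a' are the same node. Returns the new '+' node,
   or NULL if allocation fails (the tree is then left unchanged). */
struct Node *lower_add_assign(struct Node *expr)
{
    assert(expr != NULL && expr->op == OP_ADD_ASSIGN);
    struct Node *add = malloc(sizeof *add);
    if (add == NULL)
        return NULL;
    add->op = OP_ADD;
    add->kids[0] = expr->kids[0];  /* a': shared with the left-hand side a */
    add->kids[1] = expr->kids[1];  /* b */
    expr->op = OP_ASSIGN;
    expr->kids[1] = add;
    return add;
}
```

Because a and a' are one node, a later code generator can evaluate the address of a once and use it for both the load and the store, which preserves the single evaluation that a += b requires.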
Next, let's discuss the semantic check of conditional expressions such as a?b:c; the relevant code is shown in Figure 4.2.45. The conditional operator is in effect a ternary operator with three operands a, b and c. Lines 6 to 12 of Figure 4.2.45 perform the semantic check of the first operand a, and line 9 requires the first operand to be of scalar type. Lines 13 to 18 recursively call the CheckExpression function to perform the semantic check of the second operand b and the third operand c.
Figure 4.2.45 CheckConditionalExpression()
Lines 19 to 50 of Figure 4.2.45 check the types of the operands b and c in a?b:c. They are written according to the semantic rules in Section "3.3.15 Conditional operator" of the C standard (ucc\ansi.c.txt), as shown below:
(1) Both operands have arithmetic type. That is, b and c are both of arithmetic type. This corresponds to lines 22 to 28, and the type of the whole conditional expression is the common type of b and c.
(2) Both operands have compatible structure or union types. That is, the types of b and c are compatible struct or union types. This corresponds to lines 29 to 30.
(3) Both operands have void type. That is, b and c are both of type void. This corresponds to lines 31 to 32, and the type of the whole conditional expression is void.
(4) Both operands are pointers to qualified or unqualified versions of compatible types. That is, b and c are compatible pointer types. This corresponds to lines 33 to 36, and the type of the whole conditional expression is the composite type of the types of b and c.
(5) One operand is a pointer and the other is a null pointer constant. That is, one operand has pointer type and the other is NULL. This corresponds to lines 37 to 42.
(6) One operand is a pointer to an object or incomplete type and the other is a pointer to a qualified or unqualified version of void. That is, one operand is a pointer to a data object (not a pointer to a function) and the other is void *.
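Several of the cases above can be observed directly in standard C. The following is a minimal sketch covering cases (1), (5) and (6); the function names are hypothetical and the code is not taken from UCC.

```c
#include <assert.h>
#include <stddef.h>

/* Case (1): int and double operands give the common type double.
   sizeof does not evaluate its operand, so c is only inspected for type. */
size_t arithmetic_case_size(int c)
{
    return sizeof(c ? 1 : 2.0);
}

/* Case (5): a pointer and a null pointer constant give the pointer type. */
int *null_constant_case(int c, int *p)
{
    return c ? p : NULL;
}

/* Case (6): an object pointer and a void * give void *. */
void *void_pointer_case(int c, int *p, void *v)
{
    return c ? p : v;
}

/* Small self-check that exercises the three functions above. */
void check_conditional_cases(void)
{
    int x = 0;
    assert(arithmetic_case_size(1) == sizeof(double));
    assert(null_constant_case(1, &x) == &x);
    assert(null_constant_case(0, &x) == NULL);
    assert(void_pointer_case(0, &x, &x) == (void *)&x);
}
```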
Apart from these 6 cases, any other combination of types for b and c is considered illegal, and line 48 of Figure 4.2.45 reports an error. The functions CommonRealType and CompositeType called in Figure 4.2.45 were discussed in earlier chapters. At this point we have completed the semantic check of C expressions. The related code is mainly in exprchk.c. You may notice that the semantics of postfix expressions and unary expressions are relatively complex; for example, the postfix operator [] and the unary operator * both involve memory addressing. Studying the semantic checks of these operators helps us understand the C language more deeply.
C Compiler Anatomy_4.2 Semantic Check_Expression Semantic Check (7)_Binary Operators_Assignment Operation_Conditional Expression