C Compiler Anatomy_4.2 Semantic Checking_Expression Semantic Checking (6)_Unary Operator Expressions

Source: Internet
Author: User
Tags: scalar

In this section, we discuss the semantic checking of unary operator expressions; the associated code is shown in Figure 4.2.35. For the pre-increment and pre-decrement operators, we adopt the same strategy as for post-increment and post-decrement, converting --a to a -= 1 and ++a to a += 1. The function called on line 5 of Figure 4.2.35 is therefore TransformIncrement(), which we introduced when discussing the semantic checking of postfix expressions. For an expression such as +a or -a, we need to check that a has an arithmetic type. The result of +a is stored in a temporary variable, so +a is an rvalue whose value is that of a, and the Adjust() call on line 14 treats the syntax tree node of a as an rvalue. If the type of operand a is narrower than int, DoIntegerPromotion() on line 16 performs the integer promotion, and line 17 sets the type of the expression +a to the type of the subexpression a. If a is a constant, for example (3.0+4.0), then -(3.0+4.0) can be evaluated at compile time to give -7.0; FoldConstant() on line 18 performs this constant folding. For the bitwise complement operator ~, the semantic rules of C require the operand a in ~a to have an integer type. The remaining code on lines 21 to 28 is not hard to follow, so we do not go through it here.
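The language rules behind these checks can be observed from ordinary C. The following is a minimal sketch rather than UCC code; the function names are made up for illustration.

```c
#include <stddef.h>

/* Illustrative helpers only; these names do not come from UCC. */

/* ++a is handled like a += 1: both forms yield the same value. */
int pre_increment_matches_compound(int start)
{
    int a = start;
    int b = start;
    int r1 = ++a;
    int r2 = (b += 1);
    return r1 == r2 && a == b;
}

/* A char operand of unary + is promoted to int, so +c has the size of int. */
size_t size_of_promoted_char(void)
{
    char c = 'x';
    return sizeof(+c);
}

/* -(3.0+4.0) is a constant expression; it is accepted as a static
 * initializer, which requires a compile-time constant. */
double folded_constant(void)
{
    static const double k = -(3.0 + 4.0);
    return k;
}
```

Calling size_of_promoted_char() and comparing the result with sizeof(int) shows the effect of the integer promotion on the type of +a.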


Figure 4.2.35 CheckUnaryExpression()

For the logical NOT operator !, the corresponding code is on lines 29 to 35 of Figure 4.2.35. The type of the subexpression a in !a must be a scalar type, that is, an integer, floating-point, or pointer type. An array type, by contrast, is usually regarded as a vector type: in mathematics a vector is a coordinate (x1, x2, ..., xn) in a multidimensional space, whereas a scalar is just a coordinate x in a one-dimensional space. Struct types are not scalar types either. IsScalarType() on line 31 of Figure 4.2.35 determines whether a type is scalar. The result of !a ought to be a Boolean, but ANSI C has no Boolean type and uses zero and nonzero instead, which corresponds to the int type; so on line 32 we set the type of the syntax tree node for !a to int.
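As a small illustration of the scalar rule, the sketch below applies ! to an integer, a floating-point value, and a pointer, and checks that the result has the size of int. The helper names are hypothetical and are not taken from UCC.

```c
#include <stddef.h>

/* Illustrative helpers only; the names are not taken from UCC. */

/* ! accepts any scalar operand: integer, floating-point, or pointer. */
int not_of_int(int v)              { return !v; }
int not_of_double(double v)        { return !v; }
int not_of_pointer(const void *p)  { return !p; }

/* The result of ! has type int, whatever the operand type is. */
size_t size_of_not_result(void)
{
    double d = 2.5;
    return sizeof(!d);
}

/* Applying ! to a struct or union value is rejected by the compiler,
 * because those types are not scalar. */
```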

For the unary operator sizeof, the code on lines 36 to 42 of Figure 4.2.35 handles a unary expression such as sizeof(a+b), where the operand of sizeof is an expression. According to the C standard, a bit-field member of a struct cannot be the operand of sizeof; lines 39 to 41 check for this. The code on lines 43 to 45 handles an expression such as sizeof(int *), where the operand of sizeof is a type name. CheckTypeName() on line 44 performs the semantic check of the type name; we will analyze that function when we discuss declchk.c. Its return value is a pointer to a struct type object that holds the type information, for example how much memory a variable of this type occupies. The whole sizeof expression should be a compile-time constant of unsigned integer type, and lines 50 to 52 complete these settings.
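The two forms of sizeof can be tried out directly. This is a minimal sketch with made-up names (struct Flags and the helper functions are illustrative only); it also notes where the bit-field restriction would apply.

```c
#include <stddef.h>

struct Flags {
    unsigned ready : 1;   /* bit-field: sizeof(f.ready) would be rejected */
    int count;            /* ordinary member: sizeof(f.count) is fine     */
};

/* sizeof applied to an expression: short + short is promoted to int. */
size_t size_of_short_sum(void)
{
    short a = 1, b = 2;
    return sizeof(a + b);
}

/* sizeof applied to a type name. */
size_t size_of_int_pointer(void)
{
    return sizeof(int *);
}

/* sizeof applied to an ordinary (non-bit-field) struct member. */
size_t size_of_count_member(void)
{
    struct Flags f = { 0, 0 };
    return sizeof(f.count);
}

/* The operand is only type-checked, not evaluated, so i keeps its value. */
int operand_is_not_evaluated(void)
{
    int i = 0;
    size_t n = sizeof(i++);
    (void)n;
    return i;
}
```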

Next, let us look at the CheckTypeCast() function called on line 55 of Figure 4.2.35. It performs the semantic check of a cast expression such as (int)a, and its code is shown in Figure 4.2.36. On line 7 of Figure 4.2.36, we call CheckTypeName() to check the type name int in (int)a, which yields the struct type object corresponding to the type name. Line 9 calls Adjust() to make the necessary type adjustments to the operand a. On lines 11 to 14, we quote a semantic rule from the C standard document ucc\ansi.c.txt: in a cast expression of the form (T)expr, both T and the type of expr must be scalar types and cannot be array or struct types, although T may also be void. The code on lines 16 to 19 implements these rules. Line 20 calls Cast() to perform the type conversion; we analyzed Cast() in the previous section.
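The scalar rule for casts can be sketched as a small predicate. The enum and function names below are hypothetical, and UCC's real type representation is different; the sketch only mirrors the rule as stated above and assumes the operand category is the one obtained after type adjustment.

```c
/* Hypothetical type categories; UCC's real type representation differs. */
enum TypeCategory {
    CAT_VOID, CAT_INTEGER, CAT_FLOATING, CAT_POINTER, CAT_ARRAY, CAT_STRUCT
};

static int is_scalar(enum TypeCategory c)
{
    return c == CAT_INTEGER || c == CAT_FLOATING || c == CAT_POINTER;
}

/* Returns 1 if a cast (T)expr passes the rule quoted above, else 0.
 * operand is the category of expr after type adjustment. */
int cast_is_allowed(enum TypeCategory target, enum TypeCategory operand)
{
    if (target == CAT_VOID)
        return 1;                 /* any expression may be cast to void */
    return is_scalar(target) && is_scalar(operand);
}
```

For instance, cast_is_allowed(CAT_INTEGER, CAT_STRUCT) returns 0, matching the fact that a struct value cannot be cast to int.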


Figure 4.2.36 CheckTypeCast()

In Figure 4.2.35, we still need to examine the handling of the address-of operation &a on line 6 and of the dereference operation *ptr on line 9. Because that code is relatively complex, it is not shown in Figure 4.2.35. Let us first use an example to illustrate dereference operations of the form *ptr, as shown in Figure 4.2.37.


Figure 4.2.37 The dereference operation

On lines 8 to 10 of Figure 4.2.37, we use the * operator to express C's dereference operation in each case. Whether indirect addressing is needed in the generated assembly code, however, depends on the type of the operand. In other words, for syntactically similar expressions such as **arr, **ptr, and **ptr2, the different types of arr, ptr, and ptr2 lead to quite different syntax trees after semantic checking, and to different memory addressing modes.

Take **arr = 1 on line 8 of Figure 4.2.37. It needs no indirect addressing at all; the corresponding assembly instruction is the movl $1, arr+0 on line 40, and at link time arr is equivalent to an address constant. The left operand of the assignment on line 8 is **arr, whose memory address is arr+0. After semantic checking, the syntax tree for **arr is therefore ([] ([] arr 0) 0), which contains no dereference operator * but only the array index operator []. When intermediate code is generated, the array index operator amounts to an addition, so computing arr+0+0 gives the address of the left operand. The basis for this decision is that the type of arr is the array type int [3][4], not a pointer type.

From line 6 of Figure 4.2.37 we know that ptr2 has the pointer type int **, which is not a pointer to an array. For **ptr2 = 3 on line 10 of Figure 4.2.37, two indirect accesses are needed to reach the memory address of the left operand, as shown in the assembly code on lines 43 to 45 of Figure 4.2.37. Line 43 loads the contents of the memory cell of ptr2 into register eax. Line 44 loads the contents of the memory cell that eax points to into register ecx; this is one register-indirect access. Line 45 stores the immediate 3 into the memory cell that ecx points to, which is a second register-indirect access. In AT&T assembly syntax, register-indirect addressing is written in the form (%eax), meaning that register eax holds the address of a memory cell. After semantic checking, the syntax tree for **ptr2 is (* (* ptr2)), shown on line 24. The * operator still appears in the syntax tree, indicating that a real dereference is required, which the assembly instructions implement through indirect addressing.

For **ptr = 2 on line 9 of Figure 4.2.37, lines 3 and 4 tell us that ptr is a pointer to "array int[4]". After semantic checking, the syntax tree generated for **ptr is ([] ([] ptr 0) 0), which looks like the tree ([] ([] arr 0) 0) for **arr. However, because ptr has a pointer-to-array type while arr has an array type, we generate different code for ([] ([] ptr 0) 0) and ([] ([] arr 0) 0) during intermediate code generation. The assembly code for **ptr is on lines 41 and 42: "movl ptr, %eax" on line 41 loads the contents of the memory cell of ptr into register eax, and "movl $2, (%eax)" on line 42 uses register-indirect addressing to store the immediate 2 into the memory cell that eax points to. Here we perform one indirect access.
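The three cases can be reproduced with a short program. The sketch below is a reconstruction in the spirit of the example, not the original figure; in particular, the initializers of ptr, p, and ptr2 and the names value and store_through_each are assumptions.

```c
/* A reconstruction in the spirit of the example, not the original figure. */
int arr[3][4];
int (*ptr)[4] = &arr[1];   /* pointer to "array of 4 int" (assumed target) */
int value;
int *p = &value;
int **ptr2 = &p;           /* pointer to pointer to int */

void store_through_each(void)
{
    **arr  = 1;   /* arr is an array: the address arr+0 is a link-time constant */
    **ptr  = 2;   /* ptr is loaded once, then one indirect store               */
    **ptr2 = 3;   /* ptr2 is loaded, then p is loaded, then the store          */
}
```

Compiling this with an option such as -S and reading the generated assembly shows how many memory loads each statement needs before its final store.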

In short, for an expression of the form *p where "p has an array type, or *p has an array type", parsing produces the syntax tree (* p), but after semantic checking the tree we construct for *p is ([] p 0). Generalizing slightly, for an expression of the form *(p+i), the syntax tree after parsing is (* (+ p i)); if "p has an array type, or *(p+i) has an array type", we can treat it as p[i]. After semantic checking, the tree we construct is ([] p k), where k is i*sizeof(*p). For example, for int arr[3][4], the syntax tree of *(arr+1) after semantic checking is ([] arr 16), where 16 comes from 1*sizeof(*arr), that is, 1*sizeof(int[4]). The part of CheckUnaryExpression() that handles the dereference operator is shown in Figure 4.2.38.


Figure 4.2.38 CheckUnaryExpression_case_OP_DEREF

Lines 14 to 22 of Figure 4.2.38 convert the syntax tree (* (+ arr i)) of the expression *(arr+i) into ([] arr k). Because we traverse the syntax tree in post-order, the computation of k as i*sizeof(*arr) is carried out when the + operator node is visited; once that visit is finished, we visit its parent, which here is the * operator node. Lines 31 to 39 convert the syntax tree (* ptr) of the expression *ptr into ([] ptr 0). Lines 10 to 13 convert an expression such as *(&a) into a, and lines 27 to 30 convert a function call of the form (*f)() into f().
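The rewrites described here can be sketched on a toy syntax tree. Everything in the block below (enum Op, struct Node, simplify_deref, the is_array flag) is hypothetical and much simpler than UCC's real data structures; the function call case (*f)() is left out.

```c
#include <stddef.h>

/* Hypothetical toy AST; UCC's real node layout is different. */
enum Op { OP_ID, OP_CONST, OP_ADD, OP_ADDR, OP_DEREF, OP_INDEX };

struct Node {
    enum Op op;
    int is_array;            /* nonzero if this node's type is an array type */
    struct Node *kids[2];
};

/* Simplify a checked (* operand) node. zero is a constant node holding 0.
 * Returns the node that should replace n in the tree. */
struct Node *simplify_deref(struct Node *n, struct Node *zero)
{
    struct Node *operand = n->kids[0];

    if (operand->op == OP_ADDR)              /* *(&a) becomes a */
        return operand->kids[0];

    if (operand->op == OP_ADD &&
        (operand->kids[0]->is_array || n->is_array)) {
        /* (* (+ p k)) becomes ([] p k); k was already scaled when the
         * + node was checked, because the traversal is post-order. */
        n->op = OP_INDEX;
        n->kids[1] = operand->kids[1];
        n->kids[0] = operand->kids[0];
        return n;
    }

    if (operand->is_array || n->is_array) {  /* (* p) becomes ([] p 0) */
        n->op = OP_INDEX;
        n->kids[1] = zero;
        return n;
    }

    return n;            /* a genuine indirection keeps its * operator */
}
```

With this sketch, a tree for **ptr2 keeps both * nodes, while the trees for **arr and **ptr end up built from [] nodes only.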

Next, let us look at the address-of operator &. We use a simple example to illustrate the subtle semantic differences of an array name in different contexts. For an array int arr[3][4], the type stored in the symbol table for the identifier arr is the array type int [3][4]. In the expression (arr+1), the type of the syntax tree node for arr is set to int [3][4] in CheckPrimaryExpression() after the symbol table lookup. Since the arr node is the left operand of the binary operator +, the semantic check of that operator calls Adjust() in CheckBinaryExpression() to adjust the type of the left operand, so the type of the arr node changes from int [3][4] to int (*)[4]. This is what C programmers usually mean by "the array name arr stands for the address of element 0 of the array, arr[0]". Strictly speaking, that statement is not precise enough. In the symbol table, the type of the symbol arr is always the array type int [3][4]; the type of the syntax tree node for arr depends on the expression context in which arr appears. In (arr+1), the type of the arr node is adjusted to int (*)[4]. In the expression &arr, however, the semantics of C say that the type of the arr node is not adjusted, that is, the UCC compiler does not call Adjust() for it, so the expression &arr has type int (*)[3][4], a pointer to "array int [3][4]". A small experiment on a machine shows that arr+1 and &arr+1 have different values.

printf("%p %p %p\n", arr, arr+1, &arr+1);

Experimental result: 804a060 804a070 804a090

The reason is that the type of the arr node in the expression arr+1 is adjusted to int (*)[4]. By the semantics of pointer arithmetic, computing (ptr+1) for a pointer ptr to type T moves the pointer forward by one object of type T, so at the assembly level the addition actually performed is (ptr + sizeof(T)). Here T is int[4] and sizeof(int[4]) is 16, which is 0x10 in hexadecimal. In the expression &arr+1, the subexpression &arr has type int (*)[3][4], so T is int [3][4] and sizeof(T) is 48, or 0x30 in hexadecimal. If the starting address of the array arr is 0x804a060, the value of arr+1 is 0x804a070 and the value of &arr+1 is 0x804a090. Array names are also discussed in Section 1.5, in the overview of the C language.
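The two step sizes can be measured without relying on any particular load address. This is a minimal sketch with made-up function names; it reports the distances in bytes, so the figures 16 and 48 quoted above correspond to a platform where int occupies 4 bytes.

```c
#include <stddef.h>

int arr[3][4];

/* Byte distance from arr to arr+1: here arr is adjusted to int (*)[4]. */
size_t step_of_arr_plus_one(void)
{
    return (size_t)((char *)(arr + 1) - (char *)arr);
}

/* Byte distance from arr to &arr+1: &arr has type int (*)[3][4]. */
size_t step_of_addr_arr_plus_one(void)
{
    return (size_t)((char *)(&arr + 1) - (char *)arr);
}
```

The first function returns sizeof(int[4]) and the second returns sizeof(int[3][4]), whatever the size of int is on the target.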

The code for the address-of operator & is shown in Figure 4.2.39. As line 6 of Figure 4.2.39 shows, we do not call Adjust() to adjust the type here. Lines 8 to 11 convert an expression such as &(*ptr) into ptr. Because &(*ptr) is an rvalue, the converted ptr node must also be treated as an rvalue, so line 10 sets its lvalue field to 0. Lines 12 to 18 convert the syntax tree of &arr[i] into (+ arr k); because the traversal is post-order, the computation of k as i*sizeof(*arr) has already been done while the subtree arr[i] was processed. The operation &a is allowed only when the operand a is an lvalue; for the C programmer, an lvalue means that the storage unit is addressable. If the node a in &a is a function name, its lvalue field was set to 0 in CheckPrimaryExpression(), so line 19 needs a special test for this case. In addition, by the semantics of C, the programmer cannot take the address of a bit-field member of a struct or of a variable declared register; line 20 checks for this. Because the expression &a is an rvalue, line 23 sets the lvalue field of the node for &a to 0, and line 24 sets its type.
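The equivalences used by these rewrites hold at the source level as well. The sketch below uses illustrative names only (items and the three helper functions are not from UCC) and checks them with plain pointer comparisons.

```c
int items[3];

/* &(*p) yields p itself; no memory access is needed. */
int addr_of_deref_is_pointer(int *p)
{
    return &*p == p;
}

/* &items[i] is the same address as items + i. */
int addr_of_element_is_sum(int i)
{
    return &items[i] == items + i;
}

/* A function name is not an lvalue, yet taking its address is allowed. */
int addr_of_function_is_allowed(void)
{
    int (*fp)(int) = &addr_of_element_is_sum;
    return fp == addr_of_element_is_sum;
}

/* By contrast, a compiler rejects both of the following:
 *     register int r;  ... &r ...          (register variable)
 *     struct { unsigned b : 1; } s;  ... &s.b ...   (bit-field member)
 */
```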


Figure 4.2.39 CheckUnaryExpression_case_OP_ADDR

This completes the semantic checking of unary operator expressions. In the following sections, we turn to the semantic checking of binary operator expressions.
