C Compiler Anatomy: 4.4 Semantic Check, External Declarations, Type Structure Construction (2)

Last Update:2015-03-20 Source: Internet

Author: User

In this section, we examine the struct syntax tree shown in Figure 3.3.17 of Chapter 3 and construct the corresponding struct type structure from it.

Figure 3.3.17 Syntax tree constructed by ParseStructOrUnionSpecifier()

In Chapter 2 we gave the type structure that corresponds to the following struct data, shown in Figure 2.4.4. For readability we present both figures again here; together they also illustrate the starting point and the end point of this section.

struct data{
    int abc:8;
    int def:24;
    double f;
} dt;

Figure 2.4.4 Type structure of the struct

According to the standard C grammar, a struct or union specifier (struct-or-union-specifier) falls into one of the following 4 cases, depending on whether a struct name and a brace-enclosed member list are present. The 4th case has already been reported as an error during parsing.

(1) struct Data1                  // has a struct name but no braces

(2) struct {int a; int b;}        // no struct name but has braces

(3) struct data2 {int a; int b;}  // has both a struct name and braces

(4) struct                        // neither a struct name nor braces: parse error
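The four cases can be told apart from just two facts about the specifier. The following is a minimal sketch of that classification; the enum, the function name and its parameters are hypothetical and are not taken from the compiler's source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a struct/union specifier, based only on
 * whether a tag name and a brace-enclosed member list are present. */
enum spec_form {
    SPEC_TAG_ONLY,      /* (1) struct Data1          */
    SPEC_BODY_ONLY,     /* (2) struct { ... }        */
    SPEC_TAG_AND_BODY,  /* (3) struct data2 { ... }  */
    SPEC_INVALID        /* (4) struct, already rejected by the parser */
};

/* tag is NULL when the specifier has no name; has_lbrace is nonzero
 * when a member list in braces follows. */
static enum spec_form classify_spec(const char *tag, int has_lbrace)
{
    if (tag != NULL)
        return has_lbrace ? SPEC_TAG_AND_BODY : SPEC_TAG_ONLY;
    return has_lbrace ? SPEC_BODY_ONLY : SPEC_INVALID;
}
```

A semantic checker would normally never see SPEC_INVALID, because the parser reports that form first; it is kept here only so that the function is total.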

The code of the CheckStructOrUnionSpecifier function shown in Figure 4.4.10 mainly handles the cases above. Lines 9 to 19 handle the "named, without braces" form, such as struct Data1. LookupTag on line 9 queries the symbol table; if the tag is not found, StartRecord is called on line 12 to create a struct recordType object as shown in Figure 2.4.4, and AddTag on line 13 adds the symbol information for struct Data1 to the symbol table. At this point the type struct Data1 is still incomplete: we have only a struct recordType object and have not yet formed the complete type structure shown in Figure 2.4.4.

Figure 4.4.10 CheckStructOrUnionSpecifier()

For the "unnamed, with braces" case, lines 20 to 23 do the preprocessing. Line 22 sets the complete flag to 1, which means that by the time CheckStructOrUnionSpecifier returns we will have a complete struct type structure. The work that actually builds the type structure of Figure 2.4.4 is done on lines 52 to 59. The members of a struct are called struct-declaration in the standard C grammar (StructFieldDeclaration might be a more fitting name), and line 55 calls CheckStructDeclaration to check each member. The member type information obtained this way is stored in the struct field objects shown in Figure 2.4.4. Once the while loop on line 54 has obtained the type of every member, we still need to compute the offset of each member within the whole struct, which ultimately determines the layout of a struct object in memory; the EndRecord function on line 58 does this. Lines 24 to 46 preprocess the "named, with braces" case.

In short, the main steps in building the type structure of Figure 2.4.4 are:

(1) Call StartRecord to construct a struct recordType object;

(2) Call CheckStructDeclaration for each member to create the corresponding struct field object;

(3) Call EndRecord to compute the struct's memory layout, mainly the offset of each member.
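The three steps can be sketched with heavily simplified stand-ins for the compiler's data structures. Everything below (the lowercase function names, the struct layouts, and passing size and alignment directly instead of a type) is an assumption made for illustration; the real functions work on full type objects and also handle bit-fields, unions and unnamed structs.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for the compiler's field and record objects. */
struct field {
    const char *id;
    int size;             /* size of the member in bytes */
    int align;            /* alignment, a power of two */
    int offset;           /* filled in by end_record() */
    struct field *next;
};

struct record_type {
    const char *id;
    struct field *flds;   /* head of the member list */
    struct field **tail;  /* where the next member is linked in */
    int size;
    int align;
    int complete;         /* 0 until end_record() has run */
};

/* Step 1: create an empty, still incomplete record type. */
static struct record_type *start_record(const char *id)
{
    struct record_type *rty = calloc(1, sizeof *rty);
    assert(rty != NULL);
    rty->id = id;
    rty->tail = &rty->flds;
    rty->align = 1;
    return rty;
}

/* Step 2: append one member; its offset is not known yet. */
static struct field *add_field(struct record_type *rty, const char *id,
                               int size, int align)
{
    struct field *fld = calloc(1, sizeof *fld);
    assert(fld != NULL);
    fld->id = id;
    fld->size = size;
    fld->align = align;
    *rty->tail = fld;
    rty->tail = &fld->next;
    return fld;
}

/* Step 3: assign offsets in declaration order and complete the type. */
static void end_record(struct record_type *rty)
{
    int offset = 0;
    for (struct field *f = rty->flds; f != NULL; f = f->next) {
        offset = (offset + f->align - 1) & ~(f->align - 1);
        f->offset = offset;
        offset += f->size;
        if (f->align > rty->align)
            rty->align = f->align;
    }
    /* Pad the tail so that arrays of the record stay aligned. */
    rty->size = (offset + rty->align - 1) & ~(rty->align - 1);
    rty->complete = 1;
}
```

Calling start_record("data"), then add_field for a 1-byte member and a 4-byte member aligned to 4, then end_record, places the second member at offset 4 and gives the record a size of 8 in this model.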

Next, we discuss the related functions in turn. Figure 4.4.11 gives the relevant code of the StartRecord function. Line 3 creates a struct recordType object on the heap and lines 5 to 8 initialize it: rty->flds on line 7 points to a linked list of struct field objects, and the complete flag on line 8 is set to 0, indicating that this is still an incomplete struct type. The member type information obtained by CheckStructDeclaration is stored in a struct field object, and the AddField function on line 11 is called to add that struct field object to the struct recordType object.

Figure 4.4.11 StartRecord()

C allows a flexible array member (an array of unspecified length) as the last member of a struct, as in struct packet below. The hasFlexArray flag on line 17 of Figure 4.4.11 records whether the struct contains one. When a member of a struct has the const qualifier, the whole struct object is treated as const, and line 21 sets the hasConstFld flag to 1. Lines 23 to 28 create a struct field object and initialize it, and lines 29 and 30 append it to the singly linked list pointed to by rty->flds.

struct packet{
    int len;
    char data[];
};
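From the user's side, a struct with a flexible array member is normally allocated with extra room after the fixed part. The sketch below shows that idiom for struct packet; the helper packet_new is hypothetical and only illustrates why the compiler has to know that the last member has no fixed size.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct packet {
    int len;
    char data[];   /* flexible array member: must be the last member */
};

/* Allocate a packet with room for len payload bytes after the header.
 * Returns NULL if the allocation fails; the caller frees the result. */
static struct packet *packet_new(const char *payload, int len)
{
    assert(payload != NULL && len >= 0);
    struct packet *p = malloc(sizeof *p + (size_t)len);
    if (p == NULL)
        return NULL;
    p->len = len;
    memcpy(p->data, payload, (size_t)len);
    return p;
}
```

sizeof(struct packet) covers only the members before data (plus any padding), which is why the payload length has to be added by hand.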

The LookupField function on line 33 of Figure 4.4.11 checks whether a struct recordType object has a member named id. When a member has no name and its type is an unnamed struct, LookupField must be called recursively (line 40) to search inside that unnamed struct. This is what makes dt.b work in the example below.

struct data{
    struct {   // unnamed struct; if changed to struct ABC, dt.b becomes an error
        int a;
        int b;
    };         // unnamed member
    int c;
};

struct data dt;

dt.b;

Note that if the unnamed struct is given a name, such as struct ABC, dt.b is reported as an error. In C, struct ABC is not treated as an "inner class" of struct data; the two tags have the same scope, and the inner declaration then only declares the type struct ABC without adding any member to struct data. The C compiler therefore concludes that struct data has no member named b.
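The recursive search can be sketched as follows. The types and the name lookup_field are simplified assumptions, not the compiler's actual definitions; the point is only that the search descends into a member when that member has no name and is itself a struct, and never into a named struct member.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct record;

struct field {
    const char *id;             /* NULL for an unnamed member */
    const struct record *rec;   /* non-NULL if the member is itself a struct */
    const struct field *next;
};

struct record {
    const struct field *flds;
};

/* Find the member called id; descend into unnamed struct members only. */
static const struct field *lookup_field(const struct record *rty, const char *id)
{
    for (const struct field *f = rty->flds; f != NULL; f = f->next) {
        if (f->id == NULL && f->rec != NULL) {
            const struct field *hit = lookup_field(f->rec, id);
            if (hit != NULL)
                return hit;
        } else if (f->id != NULL && strcmp(f->id, id) == 0) {
            return f;
        }
    }
    return NULL;
}
```

With the member list of struct data above, a lookup of b succeeds through the unnamed member. If the inner struct is instead reachable only through a named member, the same lookup fails, which matches the error described for the struct ABC variant.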

Next, let's look at the function CheckStructDeclaration, shown in Figure 4.4.12, which checks the members of a struct. For the declaration specifiers of a struct member, we call CheckDeclarationSpecifiers on line 28 to obtain the type information; for multiple declarators of the form shown on line 33, we call CheckStructDeclarator in the while loop on lines 36 to 39.

Figure 4.4.12 CheckStructDeclaration()

The code of CheckStructDeclarator is shown on lines 1 to 24 of Figure 4.4.12. On line 6 we call CheckDeclarator to obtain the type information in the declarator, and on line 8 we call DeriveType to combine the type information from the two parts, the declaration specifiers and the declarator. Finally, the AddField function on line 23 adds the member's type information to the struct's type structure, which yields the type structure shown in Figure 2.4.4. The comments on lines 12 to 22 give some examples of erroneous struct definitions. CheckStructDeclarator has to detect these errors, but the code is not complicated and is omitted here. As we can see, the type structure for struct members is again built with the same three steps: CheckDeclarationSpecifiers, CheckDeclarator and DeriveType.

Since we know the type of every member of the struct, we know how much memory each member occupies; and because members are laid out in declaration order, we can compute each member's offset within a struct object. This work is done by the EndRecord function. Things get slightly more involved when an unnamed struct is encountered, as on lines 3 to 7 of Figure 4.4.13.

Figure 4.4.13 AddOffset()

In C, for the code below, when we use dt.b3 we need to know the offset of b3 within the whole struct data. From line 2 of Figure 4.4.13, member a has offset 0 and size 4, so the unnamed struct that follows starts at offset 4; b3 on line 6 has offset 8 within that unnamed struct. Adding the two gives the offset of b3 within struct data: 4 + 8 = 12. For dt.d.d3, however, we only need the offset of d (line 13) within struct data and, separately, the offset of d3 within the unnamed struct type that starts on line 9. During intermediate code generation we first process dt.d and then (dt.d).d3.

struct data dt;

dt.b3;

dt.d.d3;
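The arithmetic can be checked against a host C compiler with offsetof. The struct definition below is a reconstruction from the description, since the figure itself is not reproduced here: the member names b1, b2, d1 and d2 and all member types are assumptions. Offsets are expressed in units of int so that the check does not depend on int being exactly 4 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed shape of the struct described in the text (C11 or later). */
struct data {
    int a;
    struct {          /* unnamed member: its fields are reached as dt.b3 */
        int b1;
        int b2;
        int b3;
    };
    struct {          /* member named d: its fields are reached as dt.d.d3 */
        int d1;
        int d2;
        int d3;
    } d;
};

/* Offset of b3 from the start of struct data, counted in ints.
 * The unnamed struct's own offset has already been added in. */
static size_t b3_offset_in_ints(void)
{
    return offsetof(struct data, b3) / sizeof(int);
}

/* Offset of d3 relative to the start of member d, counted in ints.
 * Nothing from the enclosing struct is folded into this value. */
static size_t d3_offset_within_d_in_ints(void)
{
    return (offsetof(struct data, d.d3) - offsetof(struct data, d)) / sizeof(int);
}
```

With a 4-byte int, b3_offset_in_ints() being 3 corresponds to the byte offset 12 computed above, while d3 keeps an offset relative to d and the offset of d is added only when dt.d is processed.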

The AddOffset function on line 16 of Figure 4.4.13 computes the offset of each member of an unnamed struct of the form shown on lines 3 to 7. Because an unnamed struct may itself contain nested unnamed structs, AddOffset is called recursively on line 21 to compute their offsets as well. For the struct on lines 9 to 13 of Figure 4.4.13, member d has a name, so we do not need to call AddOffset to compute the offsets of d1, d2 and d3 within the whole struct data.

With this in place, we analyze the EndRecord function. To show the overall flow of the code more clearly, we have omitted the handling of bit-field members on lines 13 to 21 of Figure 4.4.14. Lines 6 to 31 handle struct types and lines 32 to 46 handle union types. For a union, we traverse the linked list of struct field objects shown in Figure 2.4.4 to find the member that occupies the most memory; its size determines the size of the whole union object. A union cannot contain a flexible array member, which is checked on lines 42 to 46.

Figure 4.4.14 EndRecord()

For a struct, the memory occupied by the whole object is at least the sum of the sizes of its members, and can be larger once member alignment is taken into account. For example, in struct data below the member sizes add up to 5 bytes, but since int is usually aligned to 4 bytes, 3 bytes of padding are inserted between ch and num, and after alignment struct data occupies 8 bytes. The while loop on lines 7 to 26 of Figure 4.4.14 accumulates the sizes of the members; it sets the offset of the current member on line 8, aligned with the ALIGN macro on line 9. For an unnamed struct inside the struct, we call the AddOffset function discussed earlier on line 11 to compute the offsets of the unnamed struct's members within the enclosing struct.

struct data{
    char ch;   // occupies 1 byte
    int num;   // occupies 4 bytes
};

In struct buffer below, the flexible array member is the only member of the struct. This is illegal, and we check for it on lines 47 to 51.
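The rounding step can be written as a one-line macro. The definition below is the common power-of-two idiom and is only assumed to resemble the compiler's ALIGN macro; the name ALIGN_UP and the helper function are hypothetical. The helper recomputes the size of struct data by hand so that it can be compared with what the host compiler reports.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align; align must be a power of two. */
#define ALIGN_UP(size, align) (((size) + (align) - 1) & ~((align) - 1))

struct data {
    char ch;    /* 1 byte */
    int num;    /* typically 4 bytes, aligned to 4 */
};

/* Recompute sizeof(struct data): align, place, advance, then pad the tail. */
static size_t data_size_by_hand(void)
{
    size_t size = 0;
    size = ALIGN_UP(size, _Alignof(char)) + sizeof(char);  /* ch at offset 0 */
    size = ALIGN_UP(size, _Alignof(int)) + sizeof(int);    /* num after padding */
    return ALIGN_UP(size, _Alignof(struct data));          /* tail padding */
}
```

On a platform where int is 4 bytes with 4-byte alignment, ALIGN_UP(1, 4) is 4, so num lands at offset 4 and the total comes to 8 bytes.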

struct buffer{
    char buf[];
};

The code corresponding to the comments on lines 13 to 21 of Figure 4.4.14 mainly handles the following situations:

(1) Line 16 of Figure 4.4.14: the current member b1 is not a bit-field. Although the preceding bit-field a1 uses only 12 bits, we let it occupy 4 bytes by itself, that is, the size of an int;

(2) Line 18 of Figure 4.4.14: the current member b2 is a bit-field and fits in the remaining bits of the current int. Here a2 uses 12 bits, leaving 20 bits, which is enough to store b2;

(3) Line 19 of Figure 4.4.14: the current member b3 is a bit-field, but the remaining bits of the current int are not enough. a2 and b2 use 24 bits, and the remaining 8 bits cannot hold b3, so we start a new int to hold the 20 bits of b3.

In short, the three steps for building a type structure, CheckDeclarationSpecifiers, CheckDeclarator and DeriveType, are the key points. With these understood, the CheckEnumSpecifier function for checking enum types and the CheckTypedef function for checking typedef names are relatively easy to follow, and we will not go into them here.
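The three situations can be modeled with a small layout routine. This is a sketch under stated assumptions, not the code of Figure 4.4.14: it supports only int members and int bit-fields, assumes a 32-bit int as the storage unit, uses a width of 0 to mean "not a bit-field", and ignores zero-width bit-fields. All names are hypothetical.

```c
#include <assert.h>

#define INT_BITS 32   /* assumption: bit-fields are packed into 32-bit units */
#define INT_SIZE 4

struct member {
    int bits;     /* bit-field width, or 0 for an ordinary int member */
    int offset;   /* byte offset of the storage unit (output) */
    int pos;      /* first bit used inside the unit (output, bit-fields only) */
};

/* Lay out n members in declaration order; returns the total size in bytes. */
static int layout_members(struct member *m, int n)
{
    int offset = 0;   /* byte offset of the current storage unit */
    int used = 0;     /* bits already taken in the current unit */

    for (int i = 0; i < n; i++) {
        assert(m[i].bits >= 0 && m[i].bits <= INT_BITS);
        if (m[i].bits == 0) {
            /* Case 1: ordinary member; a partly used unit before it is closed,
             * so the preceding bit-fields keep a whole int to themselves. */
            if (used > 0) {
                offset += INT_SIZE;
                used = 0;
            }
            m[i].offset = offset;
            m[i].pos = 0;
            offset += INT_SIZE;
        } else {
            /* Case 3: not enough bits left, so open a new unit first. */
            if (used + m[i].bits > INT_BITS) {
                offset += INT_SIZE;
                used = 0;
            }
            /* Case 2 (and the continuation of case 3): place the field. */
            m[i].offset = offset;
            m[i].pos = used;
            used += m[i].bits;
        }
    }
    if (used > 0)
        offset += INT_SIZE;   /* close the last partly used unit */
    return offset;
}
```

For the widths named in the text, a 12-bit a1 followed by an ordinary b1 puts b1 at byte offset 4, and the sequence a2:12, b2:12, b3:20 places b2 at bit 12 of the first unit and b3 at the start of a second unit.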

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.