Chapter 1: Lexical Traps
The part of the compiler that breaks a program into symbols (tokens) is commonly called the "lexical analyzer." Consider, for example, the statement:
if (x == big) big = x;
Its first symbol is the C keyword if, the next is the left parenthesis, the next is the identifier x, the next is the equality operator ==, the next is the identifier big, and so on. In C, whitespace between symbols (spaces, tabs, and newlines) is ignored, so the statement above can also be written as:
if
(
x
==
big
)
big
=
x
;
Two points about "C ignores whitespace between symbols" deserve emphasis. First, a "symbol" is not a "letter": "==" and "big" in the statement above each contain several characters, yet each is a single, indivisible symbol. Second, ignoring whitespace between symbols means that, after lexical analysis, the whitespace between two symbols is not itself treated as a symbol. Note, however, that the second point does not mean whitespace can be inserted anywhere: in the statement above, "==" cannot be written as "= =", because the latter is two assignment operators and the meaning of the program changes.
1.1 = differs from ==
The former is assignment; the latter tests for equality. C beginners easily confuse the two.
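The trap can be sketched as follows. The function names are hypothetical and chosen only for illustration; many compilers warn about the first form when warnings are enabled.

```c
/* Buggy: `x = 0` assigns 0 to x, and the value of that assignment (0)
   becomes the condition, so the branch is never taken. */
int is_zero_buggy(int x)
{
    if (x = 0)
        return 1;
    return 0;
}

/* Intended: `x == 0` compares x with 0 without changing it. */
int is_zero_correct(int x)
{
    if (x == 0)
        return 1;
    return 0;
}
```

Calling is_zero_buggy(0) yields 0 even though the argument is zero, while is_zero_correct(0) yields 1.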
1.2 & and | differ from && and ||
The first two are bitwise operators; the latter two are logical operators.
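A small sketch of the difference, using hypothetical wrapper functions. With the operands 1 and 2, the bitwise and logical forms give different answers:

```c
/* Bitwise operators combine the operands bit by bit. */
int bitwise_and(int a, int b) { return a & b; }
int bitwise_or(int a, int b)  { return a | b; }

/* Logical operators treat each operand as true (nonzero) or false (zero)
   and always yield 0 or 1. */
int logical_and(int a, int b) { return a && b; }
int logical_or(int a, int b)  { return a || b; }
```

Here bitwise_and(1, 2) is 0 because 01 and 10 share no set bits, whereas logical_and(1, 2) is 1 because both operands are nonzero.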
1.3 "Greedy method" in lexical analysis
Some C symbols, such as /, *, and =, consist of a single character and are called single-character symbols. Others, such as /* and ==, as well as identifiers, consist of more than one character and are called multi-character symbols. When the compiler's lexical analyzer reads a '/' followed by a '*', it must decide whether to treat them as two separate symbols or as one combined symbol. C resolves this with a simple rule, the "greedy method": each symbol should contain as many characters as possible.
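The rule can be illustrated with a minimal sketch. It is not how any particular compiler is implemented; the operator table is deliberately tiny, and the names OPERATORS and longest_operator are assumptions made for this example.

```c
#include <string.h>

/* A small, incomplete table of symbols (an assumption for this sketch;
   real C has many more). The comment opener is listed alongside the
   operators, as in the text above. */
static const char *const OPERATORS[] = {
    "/*", "==", "--", "/", "*", "=", "-"
};

/* Return the length of the longest table entry that s starts with,
   or 0 if no entry matches. Choosing the longest match is the
   "greedy method". */
size_t longest_operator(const char *s)
{
    size_t best = 0;
    size_t count = sizeof OPERATORS / sizeof OPERATORS[0];
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(OPERATORS[i]);
        if (len > best && strncmp(s, OPERATORS[i], len) == 0)
            best = len;
    }
    return best;
}
```

For the input "---b" the function returns 2, so the first symbol taken is "--" and the remaining "-" becomes the next symbol.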
Note that, except inside string and character constants, whitespace (spaces, tabs, and newlines) cannot be embedded in the middle of a symbol. For example, "==" is a single symbol, while "= =" is two symbols. Likewise, the following expression:
a---b
means, according to the "greedy method":
(a--) - b
whereas
a - --b
means:
a - (--b)
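The two readings can be checked with a short sketch. The struct and function names are hypothetical; each function returns the value of the expression together with the final values of a and b.

```c
/* Holds the expression value and the operands after evaluation. */
struct outcome { int value; int a; int b; };

/* No spaces: the lexer takes "--" first, so this is (a--) - b. */
struct outcome greedy(int a, int b)
{
    struct outcome r;
    r.value = a---b;
    r.a = a;
    r.b = b;
    return r;
}

/* With a space before "--": this is a - (--b). */
struct outcome spaced(int a, int b)
{
    struct outcome r;
    r.value = a - --b;
    r.a = a;
    r.b = b;
    return r;
}
```

With a = 5 and b = 2, greedy gives the value 3 and leaves a at 4, while spaced gives the value 4 and leaves b at 1.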
As another example, the following statement appears to divide x by the value that p points to and assign the quotient to y:
y = x/*p/* p points to divisor */
In fact, because of the greedy method, the /* immediately after x in the statement above is treated by the C lexical analyzer as the beginning of a comment.
The correct form separates the / from the *:
y = x / *p /* p points to divisor */
Or, a little clearer:
y = x/(*p) /* p points to divisor */
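Both safe spellings can be wrapped in functions to confirm that they perform the division. This is a sketch with hypothetical function names; it assumes p points to a nonzero divisor.

```c
/* A space keeps the division operator and the dereference apart. */
int divide_spaced(int x, const int *p)
{
    return x / *p;   /* p points to divisor */
}

/* Parentheses make the dereference explicit. */
int divide_parenthesized(int x, const int *p)
{
    return x/(*p);   /* p points to divisor */
}
```

With x = 12 and p pointing to 4, both functions return 3.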
1.4 Integer Constants
021 differs from 21: the former is octal (17 in decimal), the latter is decimal.
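A quick check of the two constants, wrapped in hypothetical functions for illustration:

```c
/* A leading 0 makes the constant octal: 2*8 + 1 = 17. */
int octal_constant(void)   { return 021; }

/* Without the leading 0 the constant is decimal. */
int decimal_constant(void) { return 21; }
```

The leading zero is easy to introduce by accident when padding numbers to line up columns in a table.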
Exercise 1-1 Write a test program that compiles correctly both on compilers that allow nested comments and on compilers that do not, but that produces different results in the two cases.
Answer: /*/*/0*/**/1
If nested comments are allowed, the expression evaluates to 1.
If nested comments are not allowed, the expression is 0*1, which evaluates to 0.
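The answer can be dropped into a function to see which case applies. The function name is hypothetical. Standard C does not nest comments, so a conforming compiler is expected to take the second reading.

```c
/* Without nested comments the first five characters form one complete
   comment, leaving 0 * 1 (with an empty comment in between).
   With nested comments everything before the final 1 is a comment. */
int nested_comment_probe(void)
{
    return /*/*/0*/**/1;
}
```

A result of 0 indicates that the compiler does not allow nested comments; a result of 1 indicates that it does.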
C Traps and Pitfalls: reading notes, Chapter 1