While reviewing compiler principles, yundun posed a question like this:
Source program:
    void f(int a, int b, int c, int d, int x, int y, int z)
    {
        while (a < b)
        {
            if (c < d)
                x = x + z;
            else
                x = y - z;
        }
    }
Compiled with GCC 3.3.5 at the maximum optimization level, the output is:
    _f:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %ecx
        movl    12(%ebp), %edx
        cmpl    %edx, %ecx
        jge     L8
    L9:
        cmpl    %edx, %ecx
        jl      L9
    L8:
        popl    %ebp
        ret
The question: which optimizations were applied?
This question floored me on the spot. Compilers these days are just too good: the whole function has been optimized down to a single bare while loop, namely:
    L9:
        cmpl    %edx, %ecx
        jl      L9
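As a rough answer to the question, here is a sketch in C of what seems to have happened, under my own reading of the listing: x is a by-value parameter whose final value nothing ever observes, so both assignments are dead and the if/else disappears with them, while the loop test on a and b stays. The step limit and the function names are hypothetical, added only so the two versions can be compared without actually spinning forever.

```c
#include <assert.h>

/* Hypothetical bounded model of the original f(): returns 1 if the loop
   exits, 0 if it is still running after `limit` iterations. */
static int original_exits(int a, int b, int c, int d,
                          int x, int y, int z, int limit)
{
    int steps = 0;
    while (a < b) {
        if (c < d)
            x = x + z;      /* dead: the result only feeds x itself */
        else
            x = y - z;      /* dead as well */
        if (++steps >= limit)
            return 0;
    }
    return 1;
}

/* Same model with the dead if/else removed, mirroring the L9 loop. */
static int reduced_exits(int a, int b, int limit)
{
    int steps = 0;
    while (a < b) {
        if (++steps >= limit)
            return 0;
    }
    return 1;
}

/* Both versions agree on whether the loop exits. */
static void compare_models(void)
{
    assert(original_exits(1, 2, 3, 4, 0, 0, 1, 100) == reduced_exits(1, 2, 100));
    assert(original_exits(2, 1, 3, 4, 0, 0, 1, 100) == reduced_exits(2, 1, 100));
}
```

Calling compare_models() passes silently when the two models agree, which is all that an observer outside f() could ever check.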
Next I tried GCC 3.4.4, but the result is worse:
    _f:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %ecx
        movl    12(%ebp), %edx
        cmpl    %edx, %ecx
        jge     L8
        movl    20(%ebp), %eax
        cmpl    %eax, 16(%ebp)
        jl      L15
    L9:
        cmpl    %edx, %ecx
        jl      L9
    L8:
        popl    %ebp
        ret
    L15:
        cmpl    %edx, %ecx
        jge     L8
        cmpl    %edx, %ecx
        jl      L15
        jmp     L8
The jump structure is very strange, and the optimization is incomplete: the c < d comparison is still there. What can a newer version of GCC do? Let's look at GCC 4.3.2 (20080827 beta). The result is close to perfect: once execution gets past the while test, even the condition check is dropped, and the code simply settles it with a jmp:
    _f:
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %eax
        cmpl    %eax, 8(%ebp)
        jl      L5
        popl    %ebp
        ret
    L5:
        jmp     L5
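Read back into C, this listing corresponds roughly to the sketch below. It is my own reconstruction, not compiler output: since nothing in the loop changes a or b, the test a < b is loop-invariant, so it can be evaluated once, after which the loop has no exit. The names f_reduced and f_reduced_demo are hypothetical.

```c
#include <assert.h>

/* Hypothetical C equivalent of the GCC 4.3.2 listing. */
void f_reduced(int a, int b)
{
    if (a < b)          /* cmpl %eax, 8(%ebp) ; jl L5 */
        for (;;)        /* L5: jmp L5 */
            ;
    /* a >= b: fall through to popl %ebp ; ret */
}

/* Exercises only inputs with a >= b, the one case that returns. */
int f_reduced_demo(void)
{
    int calls = 0;
    f_reduced(2, 1);    /* 2 < 1 is false, so this returns at once */
    calls++;
    f_reduced(5, 5);    /* equal values also fail a < b */
    calls++;
    assert(calls == 2); /* reached only because both calls returned */
    return calls;
}
```

Calling f_reduced with a < b would hang by design, which is exactly what the jmp L5 instruction does.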
Then I wanted to see what VS2008, which I use every day, could do, so I compiled the code with cl 15.0 with optimization turned on. The results are as follows:
    _f PROC
        mov     eax, DWORD PTR _a$[esp-4]
        cmp     eax, DWORD PTR _b$[esp-4]
        jge     SHORT $LN3@f
        npad    6
    $LL4@f:
        jmp     SHORT $LL4@f
    $LN3@f:
        ret     0
    _f ENDP
That is extremely aggressive! As for the npad 6 that appears in it, I looked it up online. It turns out to be a macro that emits a do-nothing instruction sequence of the given size in bytes:
    npad macro size
    if size eq 1
      nop
    else
     if size eq 2
       mov edi, edi
     else
      if size eq 3
        ; lea ecx, [ecx+00]
        DB 8DH, 49H, 00H
      else
       if size eq 4
         ; lea esp, [esp+00]
         DB 8DH, 64H, 24H, 00H
       else
        if size eq 5
          add eax, DWORD PTR 0
        else
         if size eq 6
           ; lea ebx, [ebx+00000000]
           DB 8DH, 9BH, 00H, 00H, 00H, 00H
         else
          if size eq 7
            ; lea esp, [esp+00000000]
            DB 8DH, 0A4H, 24H, 00H, 00H, 00H, 00H
          else
           %out error: unsupported npad size
           .err
          endif
         endif
        endif
       endif
      endif
     endif
    endif
    endm
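To make the macro's effect concrete, the sketch below lists the byte sequence each case should assemble to, so that npad N yields exactly N bytes. The DB cases are copied from the macro; the encodings for sizes 1, 2 and 5 (nop, mov edi, edi, and add eax, 0) are my own assumption about what the assembler emits. The table and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One entry per supported npad size: length plus the emitted bytes. */
struct npad_case {
    size_t len;
    unsigned char bytes[7];
};

static const struct npad_case npad_table[7] = {
    {1, {0x90}},                                      /* nop (assumed encoding) */
    {2, {0x8B, 0xFF}},                                /* mov edi, edi (assumed) */
    {3, {0x8D, 0x49, 0x00}},                          /* lea ecx, [ecx+00] */
    {4, {0x8D, 0x64, 0x24, 0x00}},                    /* lea esp, [esp+00] */
    {5, {0x05, 0x00, 0x00, 0x00, 0x00}},              /* add eax, 0 (assumed) */
    {6, {0x8D, 0x9B, 0x00, 0x00, 0x00, 0x00}},        /* lea ebx, [ebx+00000000] */
    {7, {0x8D, 0xA4, 0x24, 0x00, 0x00, 0x00, 0x00}},  /* lea esp, [esp+00000000] */
};

/* Returns the padding length for `size`, or 0 for unsupported sizes
   (where the macro itself raises an assembly error). */
static size_t npad_len(unsigned size)
{
    if (size < 1 || size > 7)
        return 0;
    return npad_table[size - 1].len;
}

/* Every supported size maps to a sequence of exactly that many bytes. */
static void check_npad_table(void)
{
    unsigned size;
    for (size = 1; size <= 7; size++)
        assert(npad_len(size) == size);
}
```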
It is hard to see why this macro is needed here. My guess is that it has to do with the instruction-level parallelism features of current CPUs, such as superscalar execution and multiple issue.
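Another plausible reading, offered only as a hedged sketch, is plain alignment: if the function starts on a 16-byte boundary, and if the three instructions before the pad occupy 4 + 4 + 2 bytes (my own count of their encodings, not something shown in the listing), then npad 6 places the loop label exactly at offset 16. The helper names below are hypothetical.

```c
#include <assert.h>

/* Bytes of padding needed to move `offset` up to the next multiple of
   `align` (0 if it is already aligned). */
static unsigned pad_to(unsigned offset, unsigned align)
{
    return (align - offset % align) % align;
}

/* Assumed encodings, my own count rather than anything in the listing:
   mov eax, [esp+4] = 4 bytes, cmp eax, [esp+8] = 4 bytes, jge short = 2. */
static unsigned bytes_before_loop(void)
{
    return 4 + 4 + 2;
}

/* Under those assumptions the required padding matches npad 6. */
static void check_npad(void)
{
    assert(pad_to(bytes_before_loop(), 16) == 6);
}
```

If the assumed byte counts or the 16-byte starting alignment are wrong, the arithmetic no longer lines up, so treat this as a consistency check rather than proof.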
In addition, if you write a main function that just calls f() and then returns, both compilers ignore the call outright. Snubbed... tears...
Conclusion: today's compilers are really awesome!!
P.S. Thanks to Stephen for his inspiration and technical guidance!