Yesterday a colleague asked me a question about the following two loops:
for (i = n; i > 0; i--)
{
...
}
for (i = 0; i < n; i++)
{
...
}
Why is the former faster than the latter?
My explanation at the time was:
The i-- operation itself affects the CPSR (Current Program Status Register). The common CPSR flags are N (result is negative), Z (result is zero), C (carry), and V (overflow). The test i > 0 can be judged directly from the Z flag.
The i++ operation also affects the CPSR, but only the V (overflow) flag, which does not help with the test i < n. An extra comparison instruction is therefore needed, which means each loop iteration executes one more instruction.
(This is what Tjww told me five years ago. When he wrote an LCD driver on the AVR, the LCD flickered with the latter form, while the former worked without problems.)
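The decrement half of this explanation can be sketched as a toy model in C. This is only an illustration of the idea, not how any real CPU or compiler is implemented: the `flags` struct and the functions `subs_model` and `count_down_iterations` are hypothetical names invented here, and the model assumes n > 0 on entry.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the two status flags that matter for the loop test.
   The struct and function names are hypothetical, for illustration only. */
struct flags {
    bool n; /* result was negative */
    bool z; /* result was zero */
};

/* Models a flag-setting subtract: it computes rn - imm and
   records N and Z as a side effect of the arithmetic. */
static int32_t subs_model(int32_t rn, int32_t imm, struct flags *f)
{
    int32_t rd = rn - imm;
    f->n = (rd < 0);
    f->z = (rd == 0);
    return rd;
}

/* Counts how many times a count-down loop body runs when the exit test
   reads only the Z flag left behind by the decrement.
   Assumes n > 0; a compiler would test that once before entering the loop. */
static int count_down_iterations(int32_t n)
{
    struct flags f = { false, false };
    int iterations = 0;
    int32_t i = n;

    do {
        iterations++;              /* loop body */
        i = subs_model(i, 1, &f);  /* i-- and flag update in one step */
    } while (!f.z);                /* "branch if not zero": no separate compare */

    return iterations;
}
```

Calling `count_down_iterations(5)` runs the body five times, and the loop test never compares i against anything: it only reads the flag that the decrement already produced.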
To check whether my understanding was correct, I ran an experiment:
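/* Sums n, n-1, ..., 1 with a loop that counts down. */
int loop_dec(int n)
{
    int i = 0;
    int v = 0;
    for (i = n; i > 0; i--)
        v += i;
    return v;
}

/* Sums 0, 1, ..., n-1 with a loop that counts up. */
int loop_inc(int n)
{
    int i = 0;
    int v = 0;
    for (i = 0; i < n; i++)
        v += i;
    return v;
}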
Compile with arm-linux-gcc, then disassemble:
Loop condition for i--:
4c: e51b3014 ldr r3, [fp, #-20]
50: e3530000 cmp r3, #0 ; 0x0
54: cafffff5 bgt <loop_dec+0x30>
Loop condition for i++:
b8: e51b3018 ldr r3, [fp, #-24]
bc: e1520003 cmp r2, r3
c0: bafffff4 blt <loop_inc+0x30>
The results are not what I expected: both loops use a cmp instruction. What is going on here? I suspected it was because no optimization option had been given, so I recompiled with the -O option, and the result became:
Loop condition for i--:
14: e2500001 subs r0, r0, #1 ; 0x1
18: 1afffffc bne <loop_dec+0x10>
Loop condition for i++:
3c: e2833001 add r3, r3, #1 ; 0x1
40: e1500003 cmp r0, r3
44: 1afffffb bne <loop_inc+0x14>
That confirms it: the i++ loop has an extra cmp instruction.
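As a practical illustration of the result, here is a minimal sketch of the same array sum written both ways. The function names `sum_count_up` and `sum_count_down` are made up for this example, and it only applies when the iteration order does not matter. Whether the count-down form is actually faster depends on the compiler, the optimization level, and the target CPU, so it is worth checking the disassembly as above rather than assuming.

```c
#include <stddef.h>

/* Sums buf[0..len-1] with an index that counts up; the exit test
   compares i against len on every pass. */
static long sum_count_up(const int *buf, size_t len)
{
    long total = 0;
    for (size_t i = 0; i < len; i++)
        total += buf[i];
    return total;
}

/* Same sum with an index that counts down to zero, so the exit test
   is a comparison against zero. Elements are visited in reverse order. */
static long sum_count_down(const int *buf, size_t len)
{
    long total = 0;
    for (size_t i = len; i > 0; i--)
        total += buf[i - 1];
    return total;
}
```

Both functions return the same total for any input, including an empty array, so the count-down version can replace the count-up one wherever only the final sum is needed.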
Article source: http://www.limodev.cn/blog