Code optimization in loop summation

Source: Internet
Author: User

In the demonstration before the embedded class this morning, I mentioned an optimization for summing in a loop (in fact, it was simply something I had come across while searching online the day before). The examples in the demo were as follows:

    int sum = 0;
    for (int i = 0; i < 100; i++) {
        sum += array[i];
    }

    **************************

    int sum1 = 0, sum2 = 0;
    for (int i = 0; i < 100; i += 2) {
        sum1 += array[i];
        sum2 += array[i + 1];
    }
    int sum = sum1 + sum2;

What I had read online claimed that the second method is better, for two reasons. First, the two additions in the loop body do not depend on each other, so the processor can execute them in parallel, which reduces the running time. Second, the number of loop iterations (at the assembly level, the number of conditional jumps) is halved. Conditional jumps are costly because the processor only knows at the last moment where the code will jump to.
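The demo loops assume an array of exactly 100 elements. As a minimal sketch of the same idea for an arbitrary length, the two functions below (the names `sum_simple` and `sum_unrolled2` and the `long long` return type are illustrative assumptions, not part of the original demo) compute the same total, with the unrolled version handling a leftover element when the length is odd:

```c
#include <stddef.h>

/* Baseline: one accumulator, one loop test per element. */
long long sum_simple(const int *array, size_t n)
{
    long long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += array[i];
    }
    return sum;
}

/* Unrolled by two: two independent accumulators, one loop test per pair. */
long long sum_unrolled2(const int *array, size_t n)
{
    long long sum1 = 0, sum2 = 0;
    size_t i = 0;

    /* Process elements in pairs while a full pair remains. */
    for (; i + 1 < n; i += 2) {
        sum1 += array[i];
        sum2 += array[i + 1];
    }

    /* If n is odd, one element is left over. */
    if (i < n) {
        sum1 += array[i];
    }
    return sum1 + sum2;
}
```

Both functions return the same value for any input, so any difference in running time comes from the loop structure alone.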


After the demonstration, the teacher asked me how much improvement the second method actually achieves, and whether I had tested the code.

The answer was no.

So after I got back, I tested it with a much larger number of iterations. The code is as follows:

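    #include <stdio.h>      /* printf */
    #include <windows.h>    /* DWORD, GetTickCount */

    int main()
    {
        DWORD start_time, end_time;
        int sum, i;

        /* Method 1: one addition per iteration.
           Note: sum overflows int for this many iterations;
           only the elapsed time is of interest here. */
        start_time = GetTickCount();
        sum = 0;
        for (i = 0; i < 1000000000; i++)
            sum += i;
        end_time = GetTickCount();
        printf("%lu\n", (unsigned long)(end_time - start_time));

        /* Method 2: two accumulators, half as many iterations. */
        sum = 0;
        int sum2 = 0, sum3 = 0;
        start_time = GetTickCount();
        for (int i = 0; i < 1000000000; i += 2)
            sum2 += i, sum3 += i + 1;
        sum = sum2 + sum3;
        end_time = GetTickCount();
        printf("%lu\n", (unsigned long)(end_time - start_time));

        return 0;
    }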

Running result (GetTickCount reports milliseconds; first method, then second method):

5594
3328

The second method is indeed considerably faster: 3328 versus 5594, roughly 1.7 times as fast. The question now is which of the two reasons plays the major role, the first or the second.

To find out, I modified the second method so that it uses a single accumulator while keeping the halved iteration count. Code 2 is as follows:

    sum = 0;
    start_time = GetTickCount();
    for (int i = 0; i < 1000000000; i += 2)
        sum += i + i + 1;
    end_time = GetTickCount();
    printf("%d\n", end_time - start_time);

Running result:

5422
2953

Repeated tests show that code 2 is indeed a little faster than the second method in code 1. I therefore believe that parallel execution is not what produces the speedup here. That is to say, in specific circumstances, you can achieve a considerable performance improvement simply by reducing the number of conditional jumps.
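To make the comparison easier to follow, the three loop shapes tested so far can be written side by side as functions. This is only a sketch: the function names are hypothetical, and `uint32_t` is used instead of `int` (an assumption not in the original) so that overflow wraps around in a defined way and the three results can be compared for equality.

```c
#include <stdint.h>

/* Variant A: one addition and one loop test per value. */
uint32_t sum_one_per_iter(uint32_t n)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < n; i++)
        sum += i;
    return sum;
}

/* Variant B: two independent accumulators, half the loop tests.
   n is assumed to be even. */
uint32_t sum_two_accumulators(uint32_t n)
{
    uint32_t sum2 = 0, sum3 = 0;
    for (uint32_t i = 0; i < n; i += 2) {
        sum2 += i;
        sum3 += i + 1;
    }
    return sum2 + sum3;
}

/* Variant C: a single accumulator, half the loop tests.
   Each step adds i and i + 1 in one expression. n is assumed to be even. */
uint32_t sum_one_accumulator_step2(uint32_t n)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < n; i += 2)
        sum += i + i + 1;
    return sum;
}
```

Variants B and C perform the same number of loop tests, and only B has two independent accumulators. If the independent accumulators were the main source of the speedup, B should beat C, which is not what the measurements above show.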


Next I tried compiling with -O optimization. The result is as follows:


-O1 optimization does reduce the time! However, that may be because the logic of this code is so simple. Also, I do not know why the running result after -O2 optimization is abnormal... If you know the reason, please let me know!
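One possible explanation, which the measurements here do not confirm: in the benchmark above, sum is never read after the loops, so an optimizing compiler is allowed to remove the loops entirely, and the signed sum also overflows, which is undefined behavior in C. The sketch below shows one way to reduce both risks. The helper name `timed_sum`, the unsigned accumulator, and the portable `clock()` timer are all assumptions introduced for illustration.

```c
#include <time.h>

/* Hypothetical helper: sums 0..n-1 and reports the elapsed time
   measured by clock(), converted to milliseconds. */
unsigned int timed_sum(unsigned int n, double *elapsed_ms)
{
    unsigned int sum = 0;   /* unsigned: wraparound on overflow is well defined */

    clock_t start = clock();
    for (unsigned int i = 0; i < n; i++)
        sum += i;
    clock_t end = clock();

    *elapsed_ms = 1000.0 * (double)(end - start) / CLOCKS_PER_SEC;

    /* Returning the sum keeps the loop's result in use by the caller. */
    return sum;
}
```

Calling `timed_sum(1000000000u, &ms)` from main and printing both the returned sum and `ms` makes the result observable. Even so, a compiler may still transform such a simple loop (for example, replace it with a closed-form expression), so inspecting the generated assembly is the more reliable way to see what was actually measured.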
