Analysis of C++ performance (I)

Last Update:2014-08-21 Source: Internet

Author: User


Performance problems cannot be solved by "technology" alone; they are usually a combined problem of architecture, testing, assumptions and more. An engineer, however, has to start with the small things and solve the "obvious" small problems first. Otherwise the small flaws add up: a thousand-mile dike collapses because of an ant hole.

Why does C++ always trail C in performance (see the latest test results on sites such as http://benchmarksgame.alioth.debian.org/u32/performance.php?test=binarytrees)? I think there are three reasons:

1) The C++ compilers used in the tests do not apply the latest optimization techniques.

2) The tests do not take the added value of C++ into account.

3) The "subtlety" of C++ at the application level (see my other blog posts on C++) intimidates the average programmer, who falls back on "textbook" usage, so some side effects are never removed at the application level.

I remember that more than 10 years ago, when I was developing at Microsoft, I consulted Stan Lippman, author of the first C++ compiler and at the time the architect of Microsoft VC++, about a series of C++ performance challenges our team was facing. With his help we applied techniques such as inline and RVO in key places, solved the performance problems completely, and also found several small bugs in VC++. I came to realize that most C++ performance problems come from our shallow understanding of C++, and that most of them are not hard to solve.

Here is a comparative example that shows how subtle details affect program performance.

struct Intpair
{
    int ip1;
    int ip2;

    Intpair(int i1, int i2) : ip1(i1), ip2(i2) {}

    // Single-argument constructor: also allows implicit int -> Intpair conversion.
    Intpair(int i1) : ip1(i1), ip2(i1) {}
};

// Calc sum (using value semantics)
int Sum1(Intpair p)
{
    return p.ip1 + p.ip2;
}

// Calc sum (using reference semantics)
int Sum2(Intpair& p)
{
    return p.ip1 + p.ip2;
}

// Calc sum (using const reference semantics)
int Sum3(const Intpair& p)
{
    return p.ip1 + p.ip2;
}

For this simple struct there are three Sum functions that do exactly the same thing, but do they perform the same? We use the following function to test:

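#include <ctime>

// Runs test case t for `loop` iterations and returns the elapsed time in seconds.
// The Intpair constructor arguments (1, 2) are illustrative; the original values are not given.
double Sum(int t, int loop)
{
    using namespace std;
    if (t == 1)
    {
        // Case 1: Sum1 (by value), constructing a temporary on every call.
        clock_t begin = clock();
        int x = 0;
        for (int i = 0; i < loop; ++i)
        {
            x += Sum1(Intpair(1, 2));
        }
        clock_t end = clock();
        return double(end - begin) / CLOCKS_PER_SEC;
    }
    else if (t == 2)
    {
        // Case 2: Sum1 (by value), passing an existing object.
        clock_t begin = clock();
        int x = 0;
        Intpair p(1, 2);
        for (int i = 0; i < loop; ++i)
        {
            x += Sum1(p);
        }
        clock_t end = clock();
        return double(end - begin) / CLOCKS_PER_SEC;
    }
    else if (t == 3)
    {
        // Case 3: Sum2 (by reference).
        clock_t begin = clock();
        int x = 0;
        Intpair p(1, 2);
        for (int i = 0; i < loop; ++i)
        {
            x += Sum2(p);
        }
        clock_t end = clock();
        return double(end - begin) / CLOCKS_PER_SEC;
    }
    else if (t == 4)
    {
        // Case 4: Sum3 (by const reference), passing an existing object.
        clock_t begin = clock();
        int x = 0;
        Intpair p(1, 2);
        for (int i = 0; i < loop; ++i)
        {
            x += Sum3(p);
        }
        clock_t end = clock();
        return double(end - begin) / CLOCKS_PER_SEC;
    }
    else if (t == 5)
    {
        // Case 5: Sum3 (by const reference), relying on implicit int -> Intpair conversion.
        clock_t begin = clock();
        int x = 0;
        for (int i = 0; i < loop; ++i)
        {
            x += Sum3(10);
        }
        clock_t end = clock();
        return double(end - begin) / CLOCKS_PER_SEC;
    }
    return 0;
}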

We use five cases: Sum1 and Sum3 are each called in two different ways, and Sum2 is called in one way. We test 100,000 calls:

double sec = Sum(1, 100000);
printf("Sum1 (use ctor) time: %f\n", sec);

sec = Sum(2, 100000);
printf("Sum1 (use no c'tor) time: %f\n", sec);

sec = Sum(3, 100000);
printf("Sum2 time: %f\n", sec);

sec = Sum(4, 100000);
printf("Sum3 without conversion time: %f\n", sec);

sec = Sum(5, 100000);
printf("Sum3 with conversion time: %f\n", sec);
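One caveat when reproducing this measurement: the accumulator x is never read, so an optimizing build is free to delete the loops altogether. The following is a minimal sketch of a harness that guards against that, assuming a C++11 compiler. Timing, TimeCalls and the volatile sink are hypothetical names that do not appear in the original test, and std::chrono::steady_clock is swapped in for clock(). Intpair and Sum3 are copied from the article so the block stands alone.

```cpp
#include <cassert>
#include <chrono>

// Copied from the article so that this sketch is self-contained.
struct Intpair
{
    int ip1;
    int ip2;
    Intpair(int i1, int i2) : ip1(i1), ip2(i2) {}
    Intpair(int i1) : ip1(i1), ip2(i1) {}
};

int Sum3(const Intpair& p)
{
    return p.ip1 + p.ip2;
}

// Hypothetical result type: elapsed wall time plus the accumulated sum.
struct Timing
{
    double seconds;
    int sum;
};

// Hypothetical helper: calls `call` `loop` times and times the whole loop.
// Writing each result to a volatile variable keeps the loop from being
// removed, although the compiler may still inline or fold the call itself.
template <typename F>
Timing TimeCalls(int loop, F call)
{
    assert(loop >= 0);
    volatile int sink = 0;
    auto begin = std::chrono::steady_clock::now();
    for (int i = 0; i < loop; ++i)
    {
        sink = sink + call();
    }
    auto end = std::chrono::steady_clock::now();
    std::chrono::duration<double> elapsed = end - begin;
    return Timing{elapsed.count(), sink};
}
```

A case such as use case 4 could then be timed with TimeCalls(100000, [&] { return Sum3(p); }) for an existing Intpair p. Returning the sum alongside the time also makes it easy to check that every variant computed the same value.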

We tested in Visual Studio, with the following results:

Use case 1: 18 ms
Use case 2: 9 ms
Use case 3: 6 ms
Use case 4: 7 ms
Use case 5: 12 ms

In other words, use cases 1 and 5 are the slowest, and the others barely differ.

The attentive reader will easily see the following:

1) The performance problem in use case 5 arises because Sum3 relies on C++ implicit conversion to turn the integer into an Intpair automatically. This is an application-level problem: if we really have to convert integers, we have to pay this cost.

2) The problem in use case 1 is similar to that in use case 5: a temporary object has to be created on every call.

3) Use case 2 benefits from VC++'s optimization of input arguments, which avoids calling the copy constructor, but I do not know whether every compiler applies this optimization. It makes use case 2 perform nearly as well as use case 3.

4) Use case 3 performs consistently, but it works "indirectly" (see my blog post on references for details), so it generates two more instructions than use case 2. The effect on performance is small, however, which I suspect is related to Intel's instruction pipeline.

5) Use cases 4 and 3 generate exactly the same code, so there should be no difference between them. const matters only at compile time; the generated code does not depend on whether it is present.
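Points 1) and 2) can be checked directly by counting constructor calls. The sketch below is a minimal illustration rather than the original test: CountedPair, SumConstRef and the two counting helpers are hypothetical names, and the struct simply mirrors Intpair with a counter added.

```cpp
#include <cassert>

// Hypothetical instrumented variant of the article's Intpair:
// every constructor call bumps a static counter.
struct CountedPair
{
    static int constructed;
    int ip1;
    int ip2;
    CountedPair(int i1, int i2) : ip1(i1), ip2(i2) { ++constructed; }
    // Non-explicit single-argument constructor: allows int -> CountedPair.
    CountedPair(int i1) : ip1(i1), ip2(i1) { ++constructed; }
};
int CountedPair::constructed = 0;

// Same shape as Sum3: takes a const reference.
int SumConstRef(const CountedPair& p)
{
    return p.ip1 + p.ip2;
}

// Counts constructions when an int is passed, as in use case 5.
int ConstructionsWithConversion(int loop)
{
    assert(loop >= 0);
    CountedPair::constructed = 0;
    int x = 0;
    for (int i = 0; i < loop; ++i)
    {
        x += SumConstRef(10); // a temporary CountedPair is built here
    }
    (void)x;
    return CountedPair::constructed;
}

// Counts constructions when an existing object is passed, as in use case 4.
int ConstructionsWithoutConversion(int loop)
{
    assert(loop >= 0);
    CountedPair p(10);
    CountedPair::constructed = 0; // do not count p itself
    int x = 0;
    for (int i = 0; i < loop; ++i)
    {
        x += SumConstRef(p); // binds the reference, builds nothing
    }
    (void)x;
    return CountedPair::constructed;
}
```

Under these assumptions the first helper reports one construction per call, while the second reports none, which matches the explanation given for use cases 5 and 4.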

Performance is too large a topic for a single article, and this one only skims the surface, but it has touched on the two biggest performance pitfalls in C++:

Temporary variables
Implicit conversion (silent conversion)
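One common way to keep the second pitfall from going unnoticed is to mark the single-argument constructor explicit, so that a call that passes a bare integer no longer compiles and the conversion has to be written out. The following is a minimal sketch, assuming C++11; ExplicitPair, SumOf and SumRepeated are hypothetical names, not part of the original code.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical variant of Intpair whose single-argument constructor is explicit.
struct ExplicitPair
{
    int ip1;
    int ip2;
    ExplicitPair(int i1, int i2) : ip1(i1), ip2(i2) {}
    explicit ExplicitPair(int i1) : ip1(i1), ip2(i1) {}
};

// Same shape as Sum3 in the article.
int SumOf(const ExplicitPair& p)
{
    return p.ip1 + p.ip2;
}

// SumOf(10) would now be a compile error: int no longer converts silently.
static_assert(!std::is_convertible<int, ExplicitPair>::value,
              "implicit int -> ExplicitPair conversion is disabled");
// Writing the conversion out is still allowed.
static_assert(std::is_constructible<ExplicitPair, int>::value,
              "explicit construction from int still works");

// Builds the object once, outside the loop, instead of once per call.
int SumRepeated(int value, int loop)
{
    assert(loop >= 0);
    const ExplicitPair p(value);
    int x = 0;
    for (int i = 0; i < loop; ++i)
    {
        x += SumOf(p);
    }
    return x;
}
```

With the conversion made visible, the temporary can be hoisted out of the loop as SumRepeated does, so the construction cost is paid once rather than on every call.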

2014-6-20 Seattle

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
