Efficiency of Code Execution

Source: Internet
Author: User

In the article on performance tuning strategy, I said that to tune a program you first need to find its hotspots, the places that are executed most often. In such a place, even a small optimization noticeably improves overall performance. Here are three examples of code execution efficiency (all of them come from the Internet).

First example

Efficiency of getters and setters in PHP (source: Reddit)

This example is relatively simple and you can skip it.

Consider the following PHP code. It shows that going through a getter/setter is more than twice as slow as reading and writing the member variable directly.

    <?php
        // dog_naive.php

        class dog {
            public $name = "";

            public function setName($name) {
                $this->name = $name;
            }

            public function getName() {
                return $this->name;
            }
        }

        $rover = new dog();

        // Access through the getter/setter
        for ($x = 0; $x < 10; $x++) {
            $t = microtime(true);
            for ($i = 0; $i < 1000000; $i++) {
                $rover->setName("rover");
                $n = $rover->getName();
            }
            echo microtime(true) - $t;
            echo "\n";
        }

        // Access the member variable directly
        for ($x = 0; $x < 10; $x++) {
            $t = microtime(true);
            for ($i = 0; $i < 1000000; $i++) {
                $rover->name = "rover";
                $n = $rover->name;
            }
            echo microtime(true) - $t;
            echo "\n";
        }
    ?>

This is not surprising, because a function call has a cost: arguments have to be pushed onto and popped off the stack, values have to be passed, and sometimes an interrupt is involved. There is a lot to do. More code to execute naturally means slower execution. All languages behave this way, which is why C++ introduced inline. Java can also optimize such calls away when optimization is turned on. For dynamic languages, though, this is harder.
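The same kind of comparison can be sketched in another dynamic language. The Python snippet below is not part of the original Reddit benchmark: the `Dog` class and the two helper functions are hypothetical names made up for illustration. It times method-based access against direct attribute access with `timeit`. The numbers depend on the interpreter and the machine, so none are quoted here.

```python
import timeit


class Dog:
    """Hypothetical Python counterpart of the PHP dog class."""

    def __init__(self):
        self.name = ""

    def set_name(self, name):
        self.name = name

    def get_name(self):
        return self.name


def via_accessors(dog, n):
    """Write and read the name n times through method calls."""
    value = None
    for _ in range(n):
        dog.set_name("rover")
        value = dog.get_name()
    return value


def via_attribute(dog, n):
    """Write and read the name n times by touching the attribute directly."""
    value = None
    for _ in range(n):
        dog.name = "rover"
        value = dog.name
    return value


if __name__ == "__main__":
    rover = Dog()
    n = 1_000_000
    t_methods = timeit.timeit(lambda: via_accessors(rover, n), number=1)
    t_direct = timeit.timeit(lambda: via_attribute(rover, n), number=1)
    print(f"accessors: {t_methods:.3f}s  direct: {t_direct:.3f}s")
```

Both helpers do the same logical work, so any difference in the two timings comes from the extra method calls.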

You might think that the following code, which uses magic methods, would be better, but it actually performs worse.

    class dog {
        private $_name = "";

        function __set($property, $value) {
            if ($property == 'name') $this->_name = $value;
        }

        function __get($property) {
            if ($property == 'name') return $this->_name;
        }
    }

The efficiency of dynamic languages is always a problem. If you need better performance from PHP, you may need to use Facebook's HipHop to compile PHP into C++.

Second example

Why does Python code run faster inside a function? (source: StackOverflow)

Consider the following two pieces of code, one inside a function body and one at the global (module) level.

Inside a function, the code runs in 1.8s:

    def main():
        for i in xrange(10**8):
            pass
    main()

Outside a function body, the same code runs in 4.5s:

    for i in xrange(10**8):
        pass

Never mind the absolute times. This is only one example, but the difference in efficiency is large. Why is that? If we disassemble the bytecode of the function body with the dis module, and disassemble the bytecode of the global code with the help of the compile builtin, we get the following listings (note the store instruction in each one).

Disassembly of the main function:

    13 FOR_ITER                 6 (to 22)
    16 STORE_FAST               1 (i)
    19 JUMP_ABSOLUTE           13

Disassembly of the global code:

    13 FOR_ITER                 6 (to 22)
    16 STORE_NAME               1 (i)
    19 JUMP_ABSOLUTE           13

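As we can see, the difference is STORE_FAST versus STORE_NAME, and the former is much faster than the latter. In the global code, the variable i is a global variable, while inside the function i lives in the local variable table, so looking the variable up in the global table is much slower. If you declare global i inside main, the efficiency drops as well. The reason is that local variables are stored in an array and accessed with an integer constant index, whereas global variables are stored in a dictionary, which is slower to query.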

(Note: in C++, this is not a problem.)
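The difference in store instructions can be checked directly. The sketch below uses Python 3 (`range` instead of `xrange`) and a shorter loop. The helper `store_ops` and the variant `main_global` are hypothetical names introduced here for illustration. The exact bytecode layout varies between CPython versions, so the snippet only collects the names of the `STORE_*` opcodes rather than reproducing the full listings.

```python
import dis


def main():
    for i in range(10):
        pass


def main_global():
    global i  # hypothetical variant: force i to be a global variable
    for i in range(10):
        pass


# Compile the same loop as module-level (global) code.
module_code = compile("for i in range(10):\n    pass\n", "<global>", "exec")


def store_ops(code):
    """Return the names of all STORE_* opcodes found in a function or code object."""
    return [ins.opname for ins in dis.get_instructions(code)
            if ins.opname.startswith("STORE_")]


print("function:         ", store_ops(main))
print("function + global:", store_ops(main_global))
print("module level:     ", store_ops(module_code))
```

The plain function should report a fast local store, while the other two report a store that goes through a name or global lookup.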

Third example

Why is traversing sorted data faster? (source: StackOverflow)

See the C++ code below:

    for (unsigned i = 0; i < 100000; ++i) {
        // primary loop
        for (unsigned j = 0; j < arraySize; ++j) {
            if (data[j] >= 128)
                sum += data[j];
        }
    }

If the data array is sorted, the run time is 1.93s. If it is not sorted, the run time is 11.54s, more than 5 times slower. The same basically holds in C, C++, Java, or any other language.

The reason for this is branch prediction. The great StackOverflow gave a very good explanation.

Consider a railroad junction. When a train arrives, the switchman knows which track leads where but does not know where the train is going. The driver knows where the train is going but does not know which track to take. So the train has to stop so that the driver and the switchman can talk. This performs poorly.

So we can optimize by guessing. We have at least a 50% chance of guessing right. If we guess right, the train runs at full speed. If we guess wrong, the train has to back up. If the probability of a correct guess is high, performance is high. If we keep guessing wrong, performance is very poor.

Image by Mecanismo, from Wikimedia Commons: http://commons.wikimedia.org/wiki/file:entroncamento_do_transpraia.jpg

Our if-else is like this railroad junction, and the red arrow in the picture points to the track switch.

So how does the switchman guess in advance? By using past history: if the historical data shows that more than 90% of the trains went left, then guess left. That is why it is easier to guess right when the data is sorted.

Sorted:

    T = branch taken (conditional expression is true)
    N = branch not taken (conditional expression is false)

    data[] = 0, 1, 2, 3, 4, ... 126, 127, 128, 129, 130, ... 250, 251, 252, ...
    branch = N  N  N  N  N  ...   N    N    T    T    T  ...   T    T    T  ...

           = NNNNNNNNNNNN ... NNNNNNNTTTTTTTTT ... TTTTTTTTTT  (easy to predict)

Unsorted:

    data[] = 226, 185, 125, 158, 198, 144, 217, 79, 202, 118,  14, 150, 177, 182, 133, ...
    branch =   T,   T,   N,   T,   T,   T,   T,  N,   T,   N,   N,   T,   T,   T,   T  ...

           = TTNTTTTNTNNTTTT ...   (completely random - hard to predict)

From the above we can see that branches over sorted data are much easier to predict.
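The "guess from history" idea can be illustrated with a toy model. The sketch below simulates a 2-bit saturating counter, a textbook scheme that is used here only as an assumption: real CPUs use far more elaborate predictors, and nothing in the source says which one applies. The function name `prediction_accuracy`, the array size, and the random seed are arbitrary choices for illustration.

```python
import random


def prediction_accuracy(data, threshold=128):
    """Fraction of branches a toy 2-bit saturating counter predicts correctly.

    Counter states 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    """
    state = 0
    correct = 0
    for value in data:
        taken = value >= threshold      # actual outcome of the if statement
        predicted = state >= 2          # guess based on recent history
        if predicted == taken:
            correct += 1
        # Move the counter one step towards the actual outcome.
        if taken:
            state = min(state + 1, 3)
        else:
            state = max(state - 1, 0)
    return correct / len(data)


random.seed(0)  # fixed seed so runs are repeatable
data = [random.randrange(256) for _ in range(32768)]

print("unsorted:", prediction_accuracy(data))
print("sorted:  ", prediction_accuracy(sorted(data)))
```

On sorted input the counter only mispredicts around the single point where the outcome flips from N to T, whereas on random input it is right only about half of the time.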

So what can we do about it? We need to remove the if-else statement from the loop. For example:

We change the conditional statement:

    if (data[j] >= 128)
        sum += data[j];

into:

    int t = (data[j] - 128) >> 31;
    sum += ~t & data[j];

The performance of the branchless version is basically the same as that of the branching version on sorted data, whether in C/C++ or Java.
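The bit trick relies on an arithmetic right shift: for a 32-bit int, shifting a negative value right by 31 gives -1 (all bits set) and shifting a non-negative value gives 0. Python integers also shift arithmetically, so the equivalence of the two forms can be checked there. The sketch below only verifies that both forms compute the same sum for values 0 to 255. It says nothing about speed, since the interpreter overhead in Python hides any branch effect. The function names are hypothetical.

```python
def sum_with_branch(data):
    """Sum the values >= 128 using an explicit if statement."""
    total = 0
    for value in data:
        if value >= 128:
            total += value
    return total


def sum_branchless(data):
    """Sum the values >= 128 using the shift-and-mask trick."""
    total = 0
    for value in data:
        # t is -1 (all bits set) when value < 128, and 0 otherwise.
        t = (value - 128) >> 31
        # ~t is 0 when value < 128 and -1 otherwise, so the mask
        # keeps value only when it is >= 128.
        total += ~t & value
    return total


data = list(range(256))
assert sum_with_branch(data) == sum_branchless(data)
print(sum_branchless(data))  # prints 24512, the sum of 128..255
```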

Note: with GCC, if you compile with the -O3 or -ftree-vectorize option, GCC will turn the branching statement into a branchless one for you. VC++ 2010 does not have this capability.

Finally, I recommend a website, Google Speed, which has tutorials that show you how to write faster web programs.

(End of full text)
