Before you start optimizing, write a high-level test that proves the original code is slow. You may need a minimal data set that reproduces the slowness reliably; one or two runs that report their runtime in seconds are usually enough to work against.
You also need some basic tests to ensure that your optimization does not change the behavior of the original code. Since you will run these tests many times while optimizing, you can slightly adapt them to double as benchmarks.
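The two ideas above, a slowness reproduction and a behavior check, can be sketched as follows. All the names here (`slow_sum`, the test functions) are illustrative, not from the original article:

```python
import time

def slow_sum(n):
    # stand-in for the code you are about to optimize
    total = 0
    for i in range(n):
        total += i
    return total

def test_behavior():
    # correctness must not change while we optimize
    assert slow_sum(10) == 45
    assert slow_sum(0) == 0

def test_is_measurably_slow():
    # a data set just large enough to show the runtime in seconds
    start = time.time()
    slow_sum(2_000_000)
    print('took', time.time() - start, 'seconds')

test_behavior()
test_is_measurably_slow()
```

Keep the behavior test untouched between optimization rounds; only the timing side should change.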
Now, let's take a look at the profiling tools.
A simple timer
A timer is simple, and it is the most flexible way to record execution time. You can put one anywhere, and its side effects are small. Rolling your own timer is easy, and you can customize it to work exactly the way you want. For example, a simple timer might look like this:
```python
import time

def timefunc(f):
    def f_timer(*args, **kwargs):
        start = time.time()
        result = f(*args, **kwargs)
        end = time.time()
        print(f.__name__, 'took', end - start, 'seconds')
        return result
    return f_timer

def get_number():
    for x in range(5000000):
        yield x

@timefunc
def expensive_function():
    for x in get_number():
        i = x ^ x ^ x
    return 'some result!'

# prints "expensive_function took 0.72583088875 seconds"
result = expensive_function()
```
Of course, you can use a context manager to make this more powerful, adding checkpoints or other functionality:
```python
import time

class timewith():
    def __init__(self, name=''):
        self.name = name
        self.start = time.time()

    @property
    def elapsed(self):
        return time.time() - self.start

    def checkpoint(self, name=''):
        print('{timer} {checkpoint} took {elapsed} seconds'.format(
            timer=self.name,
            checkpoint=name,
            elapsed=self.elapsed,
        ).strip())

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        self.checkpoint('finished')
        pass

def get_number():
    for x in range(5000000):
        yield x

def expensive_function():
    for x in get_number():
        i = x ^ x ^ x
    return 'some result!'

# prints something like:
# fancy thing done with something took 0.582462072372 seconds
# fancy thing done with something else took 1.75355315208 seconds
# fancy thing finished took 1.7535982132 seconds
with timewith('fancy thing') as timer:
    expensive_function()
    timer.checkpoint('done with something')
    expensive_function()
    expensive_function()
    timer.checkpoint('done with something else')

# or directly
timer = timewith('fancy thing')
expensive_function()
timer.checkpoint('done with something')
```
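For comparison, the same idea can be written more compactly with the standard library's `contextlib`. This is a sketch of an alternative, not part of the original article:

```python
import time
from contextlib import contextmanager

@contextmanager
def timer(name=''):
    # record the start time, run the body, then print the elapsed time
    start = time.time()
    try:
        yield start
    finally:
        print('{} took {} seconds'.format(name, time.time() - start).strip())

with timer('fancy thing'):
    sum(x ^ x ^ x for x in range(100000))
```

The class-based version above is more convenient if you want intermediate checkpoints; the `contextmanager` version is less code when a single timing is enough.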
Timers do require some digging on your part: wrap a few of the higher-level functions first to determine where the bottleneck is, then drill down into those functions, reproducing the slowness as you go. When you find the offending code, fix it, then test again to confirm the fix.
A tip: don't forget the handy timeit module! It is more useful for benchmarking small pieces of code than for full-scale investigation.
Timer pros: easy to understand and implement, and very easy to compare results before and after a change. The technique works in many languages.
Timer cons: sometimes a bit too simplistic for very complex code; you might end up spending more time placing and moving measurement code than fixing the problem!
Built-in Profiler
Using the built-in profiler is like rolling out a cannon. It is very powerful, but a little unwieldy, and more complicated to use and interpret.
You can read up on the cProfile module in depth, but the basics are simple: you enable and disable the profiler, and it records all function calls and execution times. It can then compile and print the results for you. A simple decorator looks like this:
```python
import cProfile

def do_cprofile(func):
    def profiled_func(*args, **kwargs):
        profile = cProfile.Profile()
        try:
            profile.enable()
            result = func(*args, **kwargs)
            profile.disable()
            return result
        finally:
            profile.print_stats()
    return profiled_func

def get_number():
    for x in range(5000000):
        yield x

@do_cprofile
def expensive_function():
    for x in get_number():
        i = x ^ x ^ x
    return 'some result!'

# perform profiling
result = expensive_function()
```
For code like the above, you should see output like the following in the terminal:
```
5000003 function calls in 1.626 seconds

Ordered by: standard name

ncalls   tottime  percall  cumtime  percall  filename:lineno(function)
5000001    0.571    0.000    0.571    0.000  timers.py:92(get_number)
      1    1.055    1.055    1.626    1.626  timers.py:96(expensive_function)
      1    0.000    0.000    0.000    0.000  {method 'disable' of '_lsprof.Profiler' objects}
```
As you can see, it gives the call counts and timings of the different functions, but it misses a key piece of information: which lines make the function so slow?
Still, this is a good start for basic profiling. Sometimes it will even lead you to a solution with little further effort. I often use it to figure out which function is slow, or is called far too many times, before reaching for the debugger.
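When the default printout is too noisy, the standard pstats module can sort and trim it. A minimal sketch, reusing the example functions from above with a smaller data set:

```python
import cProfile
import io
import pstats

def get_number():
    for x in range(100000):
        yield x

def expensive_function():
    for x in get_number():
        i = x ^ x ^ x
    return 'some result!'

profile = cProfile.Profile()
profile.enable()
expensive_function()
profile.disable()

# sort by cumulative time so the slowest call chains float to the top,
# and keep only the five worst entries
stream = io.StringIO()
stats = pstats.Stats(profile, stream=stream)
stats.sort_stats('cumulative').print_stats(5)
print(stream.getvalue())
```

Sorting by `'cumulative'` surfaces the functions whose call trees dominate the runtime; `'tottime'` instead highlights functions that are slow in their own body.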
Built-in profiler pros: no extra dependencies, and quite fast. Very useful for quick high-level checks.
Built-in profiler cons: the information is fairly limited and usually requires further investigation; the reports are a bit unintuitive, especially for complex code.
Line Profiler
If the built-in profiler is a cannon, line_profiler can be seen as an ion cannon. It is very heavyweight and very powerful.
In this example, we will use the excellent line_profiler library. For ease of use, we'll again wrap it in a decorator; this simple approach also means that accidentally leaving it in production code is harmless.
```python
try:
    from line_profiler import LineProfiler

    def do_profile(follow=[]):
        def inner(func):
            def profiled_func(*args, **kwargs):
                try:
                    profiler = LineProfiler()
                    profiler.add_function(func)
                    for f in follow:
                        profiler.add_function(f)
                    profiler.enable_by_count()
                    return func(*args, **kwargs)
                finally:
                    profiler.print_stats()
            return profiled_func
        return inner

except ImportError:
    def do_profile(follow=[]):
        "Helpful if you accidentally leave in production!"
        def inner(func):
            def nothing(*args, **kwargs):
                return func(*args, **kwargs)
            return nothing
        return inner

def get_number():
    for x in range(5000000):
        yield x

@do_profile(follow=[get_number])
def expensive_function():
    for x in get_number():
        i = x ^ x ^ x
    return 'some result!'

result = expensive_function()
```
If you run the above code, you will see a report like this:
```
Timer unit: 1e-06 s

File: test.py
Function: get_number at line 43
Total time: 4.44195 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    43                                           def get_number():
    44   5000001      2223313      0.4     50.1      for x in range(5000000):
    45   5000000      2218638      0.4     49.9          yield x

File: test.py
Function: expensive_function at line 47
Total time: 16.828 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    47                                           def expensive_function():
    48   5000001     14090530      2.8     83.7      for x in get_number():
    49   5000000      2737480      0.5     16.3          i = x ^ x ^ x
    50         1            0      0.0      0.0      return 'some result!'
```
As you can see, this is a very detailed report that gives you complete insight into how the code runs. Unlike the built-in cProfile, it accounts for time spent in core language constructs such as the loops themselves, and attributes the time spent to individual lines.
These details make it much easier to understand the inside of a function. And if you are investigating a third-party library, you can import it directly and wrap its functions with the decorator to analyze them.
A tip: decorate only your test function, and pass the problem function(s) in the follow argument.
Line profiler pros: very direct, detailed reports. Able to follow functions in third-party libraries.
Line profiler cons: because it makes the code run much slower than it really does, do not use it for benchmarking. It is also an extra dependency.
Summary and Best practices
Use the simpler tools for basic sanity checks on your test cases, and use the slower but more informative line_profiler to drill down into the internals of a function.
Nine times out of ten, you will find that a loop in one function, or a wrong choice of data structure, consumes 90% of the time, and a few targeted tweaks will fix it.
If things still feel too slow, reach for your own secret weapons, such as comparing attribute access techniques or tweaking equality-checking logic. If all else fails, fall back to the following options:
1. Live with the slowness, or cache results
2. Rethink the entire implementation
3. Use more optimized data structures
4. Write a C extension
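As an illustration of option 1, the standard `functools.lru_cache` can memoize a pure function so that repeated work becomes nearly free. The Fibonacci function here is only an example, not from the original article:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # naive recursion is exponential-time without the cache;
    # with it, each value is computed exactly once
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(100))  # → 354224848179261915075, returned almost instantly
```

Caching only helps when the function is pure (same inputs, same output) and is called repeatedly with the same arguments, so profile first to confirm that is actually the case.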
Note that optimizing code is a guilty pleasure! Speeding up your Python code in the right ways is fun, but be careful not to break the logic along the way. Readable code matters more than raw speed; it is usually better to cache first and optimize only afterwards.
This article is from the "yishengayou" blog; please keep this source reference: http://10078369.blog.51cto.com/10068369/1627892
Optimize Python execution efficiency