This article surveys various methods and tools for debugging Python code on Linux, for use once your program is written. Treat it as an overview of my debugging and analysis toolbox rather than a complete and comprehensive list; if you know better tools, mention them in the comments.
Logs
Yes, indeed, it is hard to overstate how important good logging is to an application. You should log the important things; if your logging is good enough, you can often diagnose problems from the logs alone and save a lot of time.
If you have been debugging with print statements, stop now and use logging.debug instead. You can always disable the debug output later without touching the code...
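For example, here is a minimal sketch of that switch (the function and module names are illustrative, not from the article):

```python
import logging

# One-time setup at application startup; DEBUG makes everything visible.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)

log = logging.getLogger(__name__)

def process(items):
    # Previously: print("processing", items)
    log.debug("processing %d items", len(items))
    return [i * 2 for i in items]

process([1, 2, 3])
```

Later you can silence the debug noise simply by raising the level to logging.INFO, without deleting a single line.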
Tracing
Sometimes it helps to see how the program actually executes. You could step through it with an IDE debugger, but then you need to know what you are looking for; otherwise it becomes a very long process.
There is a trace module in the standard library that can print every line as it executes (rather like producing a coverage report).
python -mtrace --trace script.py
This generates a huge amount of output (every executed line is printed), so you'd better pipe it through grep to see only the parts you are interested in, for example:
python -mtrace --trace script.py | egrep '^(mod1.py|mod2.py)'
If you like shiny new things, try smiley: it can show you how variable contents change, and you can also use it to trace programs remotely.
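The same machinery is also available programmatically via trace.Trace from the standard library; a minimal sketch (the target function is made up for illustration):

```python
import trace

def target():
    total = 0
    for i in range(3):
        total += i
    return total

# count=1 records how many times each line ran; trace=0 keeps stdout quiet.
tracer = trace.Trace(count=1, trace=0)
result = tracer.runfunc(target)

counts = tracer.results().counts  # maps (filename, lineno) -> hit count
print(result, sum(counts.values()))
```

This is the programmatic face of the same tool: instead of flooding the terminal, you can inspect the per-line hit counts yourself.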
PDB
import pdb
pdb.set_trace()  # opens up pdb prompt
Or:
try:
    ...  # code that fails
except:
    import pdb
    pdb.pm()  # or pdb.post_mortem()
Or (type c to run the script):
python -mpdb script.py
In the pdb REPL you can use:
- c or continue — resume execution
- q or quit — quit the debugger
- l or list — show the source around the current line
- w or where — display the backtrace
- d or down — move down one frame in the backtrace
- u or up — move up one frame in the backtrace
- Enter — repeat the last command
- anything else — evaluated as Python code in the current frame (there are a few other commands too)
Alternatives to pdb:
- ipdb (easy_install ipdb) — like IPython: autocompletion, syntax coloring, etc.
- pudb (easy_install pudb) — curses-based (GUI-like), great for browsing the source code.
Remote PDB
sudo apt-get install winpdb
Replace pdb.set_trace() with:
import rpdb2
rpdb2.start_embedded_debugger("secretpassword")
Then run Winpdb, go to File > Attach, and enter the password.
Don't like Winpdb? Run pdb over TCP instead!
Use the following code:
import logging
import pdb
import socket
import sys


class Rdb(pdb.Pdb):
    """
    This will run pdb as an ephemeral telnet service. Once you connect no one
    else can connect. On construction this object will block execution till a
    client has connected.

    Based on https://github.com/tamentis/rpdb I think ...

    To use this::

        Rdb(4444).set_trace()

    Then run: telnet 127.0.0.1 4444
    """

    def __init__(self, port=0):
        self.old_stdout = sys.stdout
        self.old_stdin = sys.stdin
        self.listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.listen_socket.bind(('0.0.0.0', port))
        if not port:
            logging.critical("PDB remote session open on: %s",
                             self.listen_socket.getsockname())
            print >> sys.__stderr__, "PDB remote session open on:", self.listen_socket.getsockname()
            sys.stderr.flush()
        self.listen_socket.listen(1)
        self.connected_socket, address = self.listen_socket.accept()
        self.handle = self.connected_socket.makefile('rw')
        pdb.Pdb.__init__(self, completekey='tab',
                         stdin=self.handle, stdout=self.handle)
        sys.stdout = sys.stdin = self.handle

    def do_continue(self, arg):
        sys.stdout = self.old_stdout
        sys.stdin = self.old_stdin
        self.handle.close()
        self.connected_socket.close()
        self.listen_socket.close()
        self.set_continue()
        return 1

    do_c = do_cont = do_continue


def set_trace():
    """
    Opens a remote PDB on the first available port.
    """
    rdb = Rdb()
    rdb.set_trace()
Want REPL? How about IPython?
If you do not need a full-blown debugger and just want to poke around, embed IPython with the following code:
import IPython
IPython.embed()
Standard Linux tools
I am surprised how underused these are. With this toolset you can track down a lot of problems, from performance issues (too many system calls, memory allocation, etc.) to deadlocks, network problems, and disk problems.
sudo apt-get install htop
sudo htop
The most useful is plain strace. You can attach to a running process with strace -p 12345, or run a command under it with strace -f command (-f means strace also follows forked subprocesses). There is usually a lot of output, so you'd better redirect it to a file (append &> filename to the command) for more in-depth analysis.
Then there is ltrace, which is similar to strace but traces library calls instead of system calls. Its parameters are mostly the same.
And lsof shows you which handles the process you just ltrace'd/strace'd has open, e.g.: lsof -p 12345
Deeper tracking
It is easy to use and can do a lot, provided you have htop installed!
Now just find the process you want and press:
- s — display the system call trace (strace)
- L — display library calls (ltrace)
- l — display open files (lsof)
Monitoring
There is no substitute for continuously monitoring your servers. Have you ever found yourself strace-ing random processes to figure out why the machine is slow and what is eating the resources? Don't juggle iotop, iftop, htop, iostat, vmstat and the rest: just use dstat! It can do most of what the tools above do, and arguably does it better!
It shows the data compactly and in color (unlike iostat and vmstat), and you can always see past data scrolling by (unlike iftop, iotop and htop).
Just run this:
dstat --cpu --io --mem --net --load --fs --vm --disk-util --disk-tps --freespace --swap --top-io --top-bio-adv
One more thing: there are shorter ways to invoke it, e.g. via shell history or an alias.
GDB
This is a very complex and powerful tool, but I will only cover the basics here (setup and basic commands).
sudo apt-get install gdb python-dbg
zcat /usr/share/doc/python2.7/gdbinit.gz > ~/.gdbinit
# run the app with python2.7-dbg, then attach:
sudo gdb -p 12345
Use:
- bt — stack trace (C level)
- pystack — Python stack trace; only works if you have the ~/.gdbinit above and run the app with python-dbg
- c — continue
Getting segfaults? Use faulthandler!
It is new in Python 3.3, and has been backported to Python 2.x as the faulthandler package.
Just do the following and you will get at least some idea of what caused the segfault:
>>> import faulthandler
>>> faulthandler.enable()
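Beyond crash handling, faulthandler can also dump the current stack on demand. Note that it writes through a real file descriptor, so a StringIO will not do; a small sketch (not from the article) using a temporary file:

```python
import faulthandler
import tempfile

with tempfile.TemporaryFile(mode="w+") as f:
    # Install the handlers; tracebacks for SIGSEGV etc. will be written to f.
    faulthandler.enable(file=f)

    # You can also dump the current stack on demand:
    faulthandler.dump_traceback(file=f, all_threads=False)
    faulthandler.disable()  # restore the default signal handlers
    f.seek(0)
    dump = f.read()

print(dump)
```

In production you would normally just call faulthandler.enable() once at startup and let the tracebacks go to stderr.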
Memory leaks
Well, there are many tools here, some specialized for WSGI applications, such as Dozer, but my favorite is objgraph. It is incredibly convenient and easy to use, and it is not tied to WSGI or anything else, so you just need to find your own way to run the following code:
>>> import objgraph
>>> objs = objgraph.by_type("Request")[:15]
>>> objgraph.show_backrefs(objs, max_depth=20, highlight=lambda v: v in objs, filename="/tmp/graph.png")
Graph written to /tmp/objgraph-zbdM4z.dot (107 nodes)
Image generated as /tmp/graph.png
You will get a graph like this (warning: it is very large). You also get the dot output.
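Under the hood this style of tool walks the objects tracked by the garbage collector. Here is a rough stdlib-only sketch of the counting part (an illustration of the idea, not objgraph's API):

```python
import gc
from collections import Counter

def count_live_objects(limit=10):
    # Tally every gc-tracked object by type name, most common first;
    # a leak usually shows up as one type's count growing without bound.
    counts = Counter(type(o).__name__ for o in gc.get_objects())
    return counts.most_common(limit)

for name, n in count_live_objects(5):
    print(name, n)
```

Taking these counts before and after the suspect operation and diffing them is a quick leak check; objgraph automates exactly this, plus the back-reference graphs.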
Memory utilization
Sometimes you just want to use less memory. Fewer allocations usually also make the program run faster and better, and users like lean programs :)
There are many tools for this [1], but in my opinion the best is pytracemalloc: compared with the alternatives it has very low overhead (it does not rely on the slow sys.settrace) and its output is very detailed. The painful part is the setup, because you need to recompile Python, but apt makes it easy.
Run the following, then go grab lunch or do something else (it takes a while):
apt-get source python2.7
cd python2.7-*
wget https://github.com/wyplay/pytracemalloc/raw/master/python2.7_track_free_list.patch
patch -p1 < python2.7_track_free_list.patch
debuild -us -uc
cd ..
sudo dpkg -i python2.7-minimal_2.7*.deb python2.7-dev_*.deb
Then install pytracemalloc (note: if you are doing this inside a virtualenv, you need to recreate it after reinstalling Python — just run virtualenv myenv again):
pip install pytracemalloc
Now, you can encapsulate your application using the following code:
import time
import tracemalloc

tracemalloc.enable()
top = tracemalloc.DisplayTop(
    5000,  # log the top 5000 locations
    file=open('/tmp/memory-profile-%s' % time.time(), "w")
)
top.show_lineno = True
try:
    ...  # code that needs to be traced
finally:
    top.display()
The output is as follows:
2013-05-31 18:05:07: Top 5000 allocations per file and line
#1: .../site-packages/billiard/_connection.py:198: size=1288 KiB, count=70 (+0), average=18 KiB
#2: .../site-packages/billiard/_connection.py:199: size=1288 KiB, count=70 (+0), average=18 KiB
#3: .../python2.7/importlib/__init__.py:37: size=459 KiB, count=5958 (+0), average=78 B
#4: .../site-packages/amqp/transport.py:232: size=217 KiB, count=6960 (+0), average=32 B
#5: .../site-packages/amqp/transport.py:231: size=206 KiB, count=8798 (+0), average=24 B
#6: .../site-packages/amqp/serialization.py:210: size=199 KiB, count=822 (+0), average=248 B
#7: .../lib/python2.7/socket.py:224: size=179 KiB, count=5947 (+0), average=30 B
#8: .../celery/utils/term.py:89: size=172 KiB, count=1953 (+0), average=90 B
#9: .../site-packages/kombu/connection.py:281: size=153 KiB, count=2400 (+0), average=65 B
#10: .../site-packages/amqp/serialization.py:462: size=147 KiB, count=4704 (+0), average=32 B
...
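For what it is worth, tracemalloc was later merged into the standard library (Python 3.4+) with a different API, so no patched interpreter is needed there; a rough equivalent of the snippet above:

```python
import tracemalloc

tracemalloc.start()

# Allocate something noticeable so it shows up in the statistics.
data = [bytearray(1000) for _ in range(100)]

snapshot = tracemalloc.take_snapshot()
stats = snapshot.statistics("lineno")  # grouped per file and line
for stat in stats[:5]:
    print(stat)

current, peak = tracemalloc.get_traced_memory()
print("current=%d B, peak=%d B" % (current, peak))
tracemalloc.stop()
```

Each printed line shows a file:lineno plus size, count, and average, much like the pytracemalloc output above.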