I. Python has automatic garbage collection (when an object's reference count drops to zero, the interpreter frees its memory automatically). Memory leaks usually come from one of two places: a leak inside a C extension library, or a reference cycle (a third case is objects held in a global container and never removed).
The former is beyond the scope of this note; the latter is illustrated below (the memory for Obj('B') and Obj('c') is never reclaimed).
[dongsong@localhost python_study]$ cat leak_test2.py
#encoding=utf-8
class Obj:
    def __init__(self, name='A'):
        self.name = name
        print '%s inited' % self.name
    def __del__(self):
        print '%s deleted' % self.name

if __name__ == '__main__':
    a = Obj('A')
    b = Obj('B')
    c = Obj('c')
    c.attrObj = b
    b.attrObj = c
[dongsong@localhost python_study]$ vpython leak_test2.py
A inited
B inited
c inited
A deleted
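The third case mentioned above, objects parked in a global container, is not a reference-counting bug at all: the container keeps the objects reachable, so they are never freed. A minimal sketch of that situation (the _global_cache name and handle_request() function are made up for illustration):

#encoding=utf-8
_global_cache = []                     # hypothetical module-level container

class Obj(object):
    def __init__(self, name):
        self.name = name

def handle_request(name):
    obj = Obj(name)
    _global_cache.append(obj)          # never removed later, so it can never be collected
    return obj.name

if __name__ == '__main__':
    for i in xrange(10000):
        handle_request('req-%d' % i)
    print len(_global_cache)           # 10000 objects still referenced -> memory keeps growing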
II. The objgraph module
This module can find the fastest-growing objects and the most numerous objects, draw a graph of everything a given object references, draw a graph of all the references pointing back at a given object, and look an object up by its memory address.
Using it to hunt a memory leak still feels like searching for a needle in a haystack, though: you have to identify the suspicious objects yourself from the "fastest-growing / most numerous" logs. Usually those are common types such as list/dict/tuple, which are hard to trace further; if the fastest-growing or most numerous type is one of your own unusual classes, the cause is much easier to pin down.
1. show_refs() / show_backrefs() / show_most_common_types() / show_growth()
[dongsong@localhost python_study]$ !cat
cat objgraph1.py
#encoding=utf-8
import objgraph

if __name__ == '__main__':
    x = []
    y = [x, [x], dict(x=x)]
    objgraph.show_refs([y], filename='/tmp/sample-graph.png')              # draw everything the objects in [y] reference
    objgraph.show_backrefs([x], filename='/tmp/sample-backref-graph.png')  # draw all references pointing back at x
    #objgraph.show_most_common_types()  # counts for every common type; too much output to be useful here
    objgraph.show_growth(limit=4)  # print objects added since program start (or since the last show_growth), sorted by growth
[dongsong@localhost python_study]$ !vpython
vpython objgraph1.py
Graph written to /tmp/tmpuSFr9A.dot (5 nodes)
Image generated as /tmp/sample-graph.png
Graph written to /tmp/tmpAn6niV.dot (7 nodes)
Image generated as /tmp/sample-backref-graph.png
tuple                        3393  +3393
wrapper_descriptor            945   +945
function                      830   +830
builtin_function_or_method    622   +622
sample-graph.png
sample-backref-graph.png
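The module can also look an object up again from a bare memory address (the kind of address that shows up in the generated graphs or in a log line). A minimal sketch, using an address taken with id() purely for the sake of the example:

#encoding=utf-8
import objgraph

if __name__ == '__main__':
    x = ['suspicious', 'payload']
    addr = id(x)                          # in real debugging the address comes from a graph or a log
    same = objgraph.at(addr)              # fetch the live object sitting at that address
    print same is x                       # True
    print len(objgraph.by_type('list'))   # or grab every live object of a given type name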
2. show_chain()
[dongsong@localhost python_study]$ cat objgraph2.py
#encoding=utf-8
import objgraph, inspect, random

class MyBigFatObject(object):
    pass

def computate_something(_cache = {}):
    _cache[42] = dict(foo=MyBigFatObject(), bar=MyBigFatObject())
    x = MyBigFatObject()

if __name__ == '__main__':
    objgraph.show_growth(limit=3)
    computate_something()
    objgraph.show_growth(limit=3)
    # draw the chain of references leading from a module down to one randomly chosen MyBigFatObject
    objgraph.show_chain(
        objgraph.find_backref_chain(random.choice(objgraph.by_type('MyBigFatObject')),
                                    inspect.ismodule),
        filename='/tmp/chain.png')
    #roots = objgraph.get_leaking_objects()
    #print 'len(roots)=%d' % len(roots)
    #objgraph.show_most_common_types(objects = roots)
    #objgraph.show_refs(roots[:3], refcounts=True, filename='/tmp/roots.png')
[dongsong@localhost python_study]$ !vpython
vpython objgraph2.py
tuple                 3400  +3400
wrapper_descriptor     945   +945
function               831   +831
wrapper_descriptor     956    +11
tuple                 3406     +6
member_descriptor      165     +4
Graph written to /tmp/tmpklkHqC.dot (7 nodes)
Image generated as /tmp/chain.png
chain.png
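The lines commented out at the bottom of objgraph2.py point at another entry point, objgraph.get_leaking_objects(), which returns objects the collector sees no references to (typically objects kept alive by C extensions). A sketch of running them on their own:

#encoding=utf-8
import objgraph

if __name__ == '__main__':
    roots = objgraph.get_leaking_objects()
    print 'len(roots)=%d' % len(roots)
    objgraph.show_most_common_types(objects=roots)     # type histogram of just those objects
    objgraph.show_refs(roots[:3], refcounts=True,
                       filename='/tmp/roots.png')      # draw a few of them with their reference counts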
III. The gc module
This module can identify objects the garbage collector finds unreachable and objects it cannot free (uncollectable), which gives it strengths that objgraph does not have.
gc.collect() forces a collection and returns the number of unreachable objects found.
gc.garbage is the list of uncollectable objects among the unreachable ones (objects that define a __del__() destructor and are trapped in a reference cycle). If DEBUG_SAVEALL is set, then all unreachable objects will be added to this list rather than freed.
Warning: if you turn automatic collection off with gc.disable() and then never call gc.collect() yourself, you will watch memory get chewed up at an alarming rate...
[dongsong@bogon python_study]$ cat gc_test.py
#encoding=utf-8
import gc

class MyObj:
    def __init__(self, name):
        self.name = name
        print "%s inited" % self.name
    def __del__(self):
        print "%s deleted" % self.name

if __name__ == '__main__':
    gc.disable()
    gc.set_debug(gc.DEBUG_COLLECTABLE | gc.DEBUG_UNCOLLECTABLE |
                 gc.DEBUG_INSTANCES | gc.DEBUG_OBJECTS | gc.DEBUG_SAVEALL)

    a = MyObj('a')
    b = MyObj('b')
    c = MyObj('c')
    a.attr = b
    b.attr = a
    a = None
    b = None
    c = None

    if gc.isenabled():
        print 'automatic collection is enabled'
    else:
        print 'automatic collection is disabled'

    rt = gc.collect()
    print "%d unreachable" % rt

    garbages = gc.garbage
    print "\n%d garbages:" % len(garbages)
    for garbage in garbages:
        if isinstance(garbage, MyObj):
            print "obj-->%s name-->%s attrrMyObj-->%s" % (garbage, garbage.name, garbage.attr)
        else:
            print str(garbage)
[dongsong@bogon python_study]$ vpython gc_test.py
a inited
b inited
c inited
c deleted
automatic collection is disabled
gc: uncollectable <MyObj instance at 0x7f3ebd455b48>
gc: uncollectable <MyObj instance at 0x7f3ebd455b90>
gc: uncollectable <dict 0x261c4b0>
gc: uncollectable <dict 0x261bdf0>
4 unreachable

4 garbages:
obj--><__main__.MyObj instance at 0x7f3ebd455b48> name-->a attrrMyObj--><__main__.MyObj instance at 0x7f3ebd455b90>
obj--><__main__.MyObj instance at 0x7f3ebd455b90> name-->b attrrMyObj--><__main__.MyObj instance at 0x7f3ebd455b48>
{'name': 'a', 'attr': <__main__.MyObj instance at 0x7f3ebd455b90>}
{'name': 'b', 'attr': <__main__.MyObj instance at 0x7f3ebd455b48>}
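Objects that land in gc.garbage stay alive because the list itself still references them. One way to get the memory back (a sketch, and only safe if you know the objects well enough to break their cycles by hand) is:

import gc

for garbage in gc.garbage:
    if hasattr(garbage, '__dict__'):
        garbage.__dict__.clear()   # drop the attributes that form the reference cycle
del gc.garbage[:]                  # gc.garbage itself holds references; empty it in place
gc.collect()                       # now the objects really can be freed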
IV. The pdb module
Detailed guide: http://www.ibm.com/developerworks/cn/linux/l-cn-pythondebugger/
The commands are much like gdb's (except that you don't have to prefix values with p to print them, and the debugging prompt and workflow feel like the Python interactive shell).
h(elp) show help
c(ontinue) continue execution
n(ext) execute the next statement (step over)
s(tep) step into the next function call
b(reak) set a breakpoint
l(ist) show source code
bt show the call stack
Enter (blank line) repeat the previous command
....
I like to drop a pdb.set_trace() wherever I need to debug and dive in from there... (there are plenty of other ways to start the debugger).
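A minimal sketch of that style: drop pdb.set_trace() where you want execution to stop, run the script normally, and you land at a (Pdb) prompt where the commands above apply (buggy_sum() is just a made-up function for the example):

#encoding=utf-8
import pdb

def buggy_sum(values):
    total = 0
    for v in values:
        pdb.set_trace()            # execution pauses here; try n, s, l, bt, or just type expressions
        total += v
    return total

if __name__ == '__main__':
    print buggy_sum([1, 2, 3])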
V. Django memory leaks
Why is Django leaking memory?
Django isn't known to leak memory. If you find your Django processes are allocating more and more memory, with no sign of releasing it, check to make sure your DEBUG setting is set to False. If DEBUG is True, then Django saves a copy of every SQL statement it has executed.
(The queries are saved in django.db.connection.queries. See How can I see the raw SQL queries Django is running?.)
To fix the problem, set DEBUG to False.
If you need to clear the query list manually at any point in your functions, just call reset_queries(), like this:
from django import db
db.reset_queries()
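In a long-running process (a management command, a worker loop) that is the typical place to call it periodically; a sketch, where process_item() stands in for whatever real work hits the database:

from django import db

def long_running_worker(items):
    for i, item in enumerate(items):
        process_item(item)              # hypothetical per-item work that runs queries
        if i % 1000 == 0:
            db.reset_queries()          # with DEBUG=True this keeps connection.queries from growing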