A rough analysis of memory leaks in Python

Source: Internet
Author: User

Introduction

I used to believe blindly that Python could never leak memory. Then I watched the memory footprint of a production project keep growing the longer it ran, and realized that my program was leaking. I had already debugged one leak earlier, caused by the logging module.

It turned out that other places were leaking memory as well. After a day of fighting I finally found the leak. Now, after the project has run for a long time, its memory footprint under light load drops back to roughly what it was at startup.
When can you skip this trouble?

If your program just runs and quits, you don't need to go out of your way to hunt for memory leaks, because all of the memory Python allocated is released when the process exits. If your program needs to run for a long time, look carefully for leaks.
Scenario

How did the memory leak arise? The project is a TCP server that creates a connection instance to manage each incoming connection. Each time a client disconnected, the connection instance stayed in memory instead of being released. An instance is only kept alive if a reference to it still exists somewhere, so each new connection allocated memory, each disconnect failed to release it, and over time the result was a memory leak.
Debugging Methods

Because you don't know exactly where the leak comes from, you have to debug patiently, a little at a time.

Because I knew that memory was not being released on disconnect, I repeatedly simulated a client that connects, sends some packets, and disconnects, while watching the server's memory footprint with the following shell one-liner:

# The awk field numbers are an assumption: $5 and $6 are the VSZ and RSS columns of ps aux.
PID=50662; while true; do ps aux | grep $PID | grep -v grep | awk '{print $5, $6}' >> t; sleep 1; done

If the memory footprint levels off after growing for a while, there is no leak.

You can also check an object's reference count at the point where it should be released, using sys.getrefcount(obj). If the count is 2 (one reference from the variable itself and one temporary reference held by the getrefcount call), the object will be reclaimed correctly once it goes out of scope.
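
As a small sketch of this check (the `Connection` class and `registry` list here are hypothetical placeholders), the snippet compares the count before and after an extra reference is stored. The absolute number can differ between interpreter versions, so the sketch looks only at how the count changes.

```python
import sys


class Connection:
    """Hypothetical placeholder for a real connection class."""


conn = Connection()
before = sys.getrefcount(conn)    # baseline: the name `conn` plus the call's own temporary

registry = [conn]                 # a container now also refers to the object
after = sys.getrefcount(conn)     # one higher than the baseline

registry.clear()                  # dropping the extra reference restores the baseline
restored = sys.getrefcount(conn)

print(before, after, restored)
```

A count that stays above the baseline at the moment a connection is closed is a hint that something, such as a list or another object, is still holding on to it.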
Causes

Two situations in the project kept objects from being reclaimed properly:

    • The object is referenced by something that is reclaimed only at exit
    • Cross references (reference cycles)

The object is referenced by something that is reclaimed only at exit

To keep track of connections, each connection object is placed in a list. The list itself is reclaimed only when the program exits, so unless it is handled correctly, every object stored in it is also reclaimed only when the program exits.

Global variables and class variables are reclaimed only when the program exits:

_connections = []  # global list: lives until the program exits

# ...
class Connection(object):
    def __init__(self, sock, address):
        pass

def server_loop():
    # ...
    sock, address = server_sock.accept()
    connection = Connection(sock, address)
    _connections.append(connection)  # this reference is never dropped
    # ...
    sock.close()

Every established connection is placed in the global variable _connections. If a connection object is not removed from the list when it is closed (which would drop the reference held by the list), neither that connection object nor anything it references will be reclaimed.

The same applies if you store objects in a class attribute, because class objects are created when the program starts and are reclaimed only when it exits.

The workaround is to remove the object from the list (or from whatever other object holds it) when the connection is closed:

_connections = []

# ...
class Connection(object):
    def __init__(self, sock, address):
        pass

def server_loop():
    # ...
    sock, address = server_sock.accept()
    connection = Connection(sock, address)
    _connections.append(connection)
    try:
        # ...
        sock.close()
    finally:
        # Always drop the list's reference, even if handling raised.
        _connections.remove(connection)  # XXX

Cross references (reference cycles)

Sometimes an object stores a helper object as an instance attribute, and that helper in turn needs a reference back to the object that owns it. This is awkward; look at the code:

class ConnectionHandler(object):
    def __init__(self, connection):
        self._conn = connection  # handler -> connection


class Connection(object):
    def __init__(self, sock, address):
        # connection -> handler, which closes the cycle
        self._conn_handler = ConnectionHandler(self)  # XXX

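The code above creates a cross reference (a reference cycle). Reference counting alone can never free the objects in a cycle, because each one keeps the other's count above zero. They are reclaimed only when the interpreter's cyclic garbage collector runs, and objects that have survived into the older generations are examined less often, so this process can be slow.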
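
The behaviour can be observed directly with the standard `gc` and `weakref` modules. In this minimal sketch the automatic collector is switched off temporarily so that the timing is deterministic, and the simplified `Connection` constructor (no socket or address) is an assumption made to keep the example short.

```python
import gc
import weakref


class ConnectionHandler:
    def __init__(self, connection):
        self._conn = connection  # strong reference back to the owner


class Connection:
    def __init__(self):
        self._conn_handler = ConnectionHandler(self)  # closes the cycle


gc.disable()  # keep the automatic collector out of the way for the demo
try:
    conn = Connection()
    probe = weakref.ref(conn)  # probe that does not keep `conn` alive

    del conn
    # The cycle keeps both reference counts above zero.
    alive_after_del = probe() is not None

    gc.collect()  # an explicit cycle collection finds and frees the pair
    alive_after_collect = probe() is not None
finally:
    gc.enable()

print(alive_after_del, alive_after_collect)
```

Deleting the last outside name is not enough here; the pair only disappears once a cycle collection has run.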

The way to solve this problem is to use a weak reference for the back reference:

import weakref

class ConnectionHandler(object):
    def __init__(self, connection):
        self._conn = connection  # a weak proxy, not a strong reference


class Connection(object):
    def __init__(self, sock, address):
        # The proxy does not raise the connection's reference count.
        self._conn_handler = ConnectionHandler(weakref.proxy(self))  # XXX
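
A short sketch of what the proxy changes in practice follows. The `peer` method, the `address` attribute and the simplified constructor are hypothetical additions for illustration. With the weak proxy there is no cycle, so on CPython the connection is freed by reference counting alone, and using the proxy afterwards raises `ReferenceError`.

```python
import gc
import weakref


class ConnectionHandler:
    def __init__(self, connection):
        self._conn = connection  # expected to be a weakref.proxy

    def peer(self):
        return self._conn.address  # attribute access goes through the proxy


class Connection:
    def __init__(self, address):
        self.address = address
        self._conn_handler = ConnectionHandler(weakref.proxy(self))


gc.disable()  # show that no cycle collection is needed
try:
    conn = Connection(("127.0.0.1", 50000))
    handler = conn._conn_handler
    probe = weakref.ref(conn)
    peer_while_alive = handler.peer()

    del conn  # last strong reference gone: freed immediately
    freed_without_gc = probe() is None

    try:
        handler.peer()
        stale_access_failed = False
    except ReferenceError:
        # The proxy's target no longer exists.
        stale_access_failed = True
finally:
    gc.enable()

print(peer_while_alive, freed_without_gc, stale_access_failed)
```

The trade-off is that the handler must not outlive its connection, or it has to be prepared to handle `ReferenceError`.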
