Rough analysis of memory leaks in Python

Source: Internet
Author: User
Intro

I used to assume blindly that Python programs could not leak memory. Then I watched a production project's memory grow steadily with uptime and realized that the program I had written was leaking. An earlier leak in the same project had been caused by the debug logging module.

This time the leaks came from other places. After a day of struggling I finally found them. The project has now been running for a long time, and when traffic is low its memory usage falls back to roughly what it was at startup.
When is this not worth the trouble?

If your program just runs briefly and exits, it is not worth much effort to find out whether it leaks memory, because all of the memory the process allocated is released when it exits. If your program needs to run for a long time, look carefully for memory leaks.
Scenario

How did the leak arise? The project is a TCP server that creates a connection instance to manage each incoming connection, and those instances were not being released on disconnect. An object can only fail to be released if a reference to it still exists somewhere. So every new connection allocated memory, every disconnect failed to free it, and over time the process leaked memory.
Debugging methods

Since I did not know exactly where the leak came from, the only option was to debug patiently, a little at a time.

Knowing that objects were not released on disconnect, I repeatedly simulated a client that connects, sends a few packets, and disconnects, while observing the server's memory footprint with the following shell one-liner:

    PID=50662; while true; do ps aux | grep $PID | grep -v grep | awk '{print $5" "$6}' >> t; sleep 1; done

If memory usage levels off after some initial growth, there is no leak. If it keeps climbing with every round of connections, something is still being retained.

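You can also check an object's reference count, at the point where it should be freed, with sys.getrefcount(obj). The result is usually one higher than you might expect, because the call's own argument temporarily holds a reference. A count of 2 therefore means that only the name you are holding still refers to the object, and it will be reclaimed correctly once that name goes out of scope.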
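As a rough illustration of this check, the sketch below compares reference counts before and after a second reference is stored in a list. The Connection class and the registry list are hypothetical stand-ins, not the project's code, and the absolute numbers are a CPython implementation detail, so only the differences are compared.

```python
import sys


class Connection:
    """Hypothetical stand-in for the project's connection class."""


conn = Connection()

# Count while only the name `conn` (plus the call's temporary
# argument reference) refers to the object.
baseline = sys.getrefcount(conn)

# A longer-lived container adds one more reference.
registry = [conn]
with_registry = sys.getrefcount(conn)

# Removing the entry drops the count back to the baseline.
registry.remove(conn)
after_removal = sys.getrefcount(conn)

print(baseline, with_registry, after_removal)
```

If the count at the point of disconnect is higher than the baseline, some other object is still holding the connection.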
Cause

Two situations in the project prevented objects from being reclaimed properly:

    • References held by objects that are only reclaimed at program exit
    • Cross-references (reference cycles)

References held by objects that are only reclaimed at program exit

To keep track of connections, each connection object is also placed in a list. That list is reclaimed only when the program exits, so unless its entries are removed properly, the connection objects are reclaimed only at exit as well.

Global variables and class variables are reclaimed only when the program exits:

    _connections = []  # module-level list, lives until the program exits

    # ...

    class Connection(object):
        def __init__(self, sock, address):
            pass

    def server_loop():
        # ...
        sock, address = server_sock.accept()
        connection = Connection(sock, address)
        _connections.append(connection)  # reference that is never dropped
        # ...
        sock.close()

Every established connection is placed in the global variable _connections. If you do not remove a connection from the list when you close it (dropping that reference), the connection object is never reclaimed, and neither is anything it refers to. Each new connection therefore leaves another set of objects behind.

The same applies if you store an object in a class attribute, because the class object is created when the program starts and is reclaimed only when the program exits.

The fix is to remove the object from the list (or other container) when the connection is closed:
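The class-attribute case can be observed with a small sketch like the following. The names (Connection.instances, open_and_close) are hypothetical, and a weakref.ref is used only as a probe that tells us whether the object is still alive without keeping it alive. The immediate release after clearing the list is CPython reference-counting behavior.

```python
import weakref


class Connection:
    # Class attribute: lives as long as the class, normally until exit.
    instances = []

    def __init__(self, address):
        self.address = address
        Connection.instances.append(self)


def open_and_close(address):
    connection = Connection(address)
    # ... the socket would be used and closed here ...
    # Return only a weak reference, so this function keeps nothing alive.
    return weakref.ref(connection)


probe = open_and_close(("127.0.0.1", 50000))

# The local name is gone, but the class-level list still holds the object.
alive_after_close = probe() is not None

# Dropping the list entry removes the last strong reference.
Connection.instances.clear()
freed_after_clear = probe() is None

print(alive_after_close, freed_after_clear)
```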

    _connections = []

    # ...

    class Connection(object):
        def __init__(self, sock, address):
            pass

    def server_loop():
        # ...
        sock, address = server_sock.accept()
        connection = Connection(sock, address)
        _connections.append(connection)
        try:
            # ...
            sock.close()
        finally:
            # Always drop the tracking reference, even if an error occurred.
            _connections.remove(connection)  # XXX

Cross-references

Sometimes, when we assign an object to an instance attribute, that object in turn needs to keep a reference back to the instance that owns it. This is awkward to describe, so look at the code:

    class ConnectionHandler(object):
        def __init__(self, connection):
            self._conn = connection  # handler refers back to the connection

    class Connection(object):
        def __init__(self, sock, address):
            self._conn_handler = ConnectionHandler(self)  # XXX

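The code above produces a cross-reference (a reference cycle): the connection refers to its handler and the handler refers back to the connection. Reference counting alone can never free such a pair, because each object keeps the other's count above zero. They are reclaimed only by the interpreter's cyclic garbage collector, possibly not until they have been promoted to the collector's second or third generation, which can be slow.

The way to solve this problem is to use weak references: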
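The effect of such a cycle can be observed with the gc and weakref modules. The sketch below is a simplified version of the two classes (the constructors take fewer arguments than in the project); automatic collection is switched off temporarily so that only the explicit gc.collect() call can reclaim the pair.

```python
import gc
import weakref


class ConnectionHandler:
    def __init__(self, connection):
        self._conn = connection  # strong reference back to the connection


class Connection:
    def __init__(self):
        self._conn_handler = ConnectionHandler(self)  # completes the cycle


was_enabled = gc.isenabled()
gc.disable()  # keep automatic collection out of the demonstration
try:
    conn = Connection()
    probe = weakref.ref(conn)  # liveness probe, does not keep conn alive

    del conn
    # The handler still refers to the connection, so it is not freed yet.
    alive_after_del = probe() is not None

    gc.collect()  # the cyclic collector finds and frees the pair
    alive_after_collect = probe() is not None
finally:
    if was_enabled:
        gc.enable()

print(alive_after_del, alive_after_collect)
```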

    import weakref

    class ConnectionHandler(object):
        def __init__(self, connection):
            self._conn = connection

    class Connection(object):
        def __init__(self, sock, address):
            # The handler gets a weak proxy, so no reference cycle is formed.
            self._conn_handler = ConnectionHandler(weakref.proxy(self))  # XXX
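A short sketch of what the weak proxy changes in practice, using simplified versions of the two classes (the address attribute and the peer method are hypothetical additions for illustration). On CPython the connection is freed as soon as the last strong reference is dropped, without waiting for the garbage collector, and a handler that outlives its connection gets a ReferenceError when it touches the proxy.

```python
import weakref


class ConnectionHandler:
    def __init__(self, connection):
        self._conn = connection  # a weak proxy, not a strong reference

    def peer(self):
        return self._conn.address


class Connection:
    def __init__(self, address):
        self.address = address
        self._conn_handler = ConnectionHandler(weakref.proxy(self))


conn = Connection(("127.0.0.1", 50000))
handler = conn._conn_handler
probe = weakref.ref(conn)

# While the connection is alive, the proxy behaves like the object itself.
peer_while_alive = handler.peer()

del conn
# No cycle exists, so reference counting frees the connection right away.
freed_without_gc = probe() is None

# Using the proxy after its referent is gone raises ReferenceError.
try:
    handler.peer()
    stale_access_failed = False
except ReferenceError:
    stale_access_failed = True

print(peer_while_alive, freed_without_gc, stale_access_failed)
```

The trade-off is that the handler must not be used after its connection has gone away, or it has to be prepared to catch ReferenceError.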