Rough analysis of memory leakage in Python

Source: Internet
Author: User
This article gives a rough analysis of memory leaks in Python and their causes, including the role of garbage collection.
Introduction

I used to assume that Python programs could not leak memory. Then the memory usage of a project running in production kept growing the longer it ran, and I realized that the program I had written was leaking. One cause was debug logging through the logging module.

There turned out to be other causes as well. After a day of hard work I finally found the leak. The project has now been running for a long time, and when business volume is low, memory usage drops back to its level at startup.
When is all this trouble unnecessary?

If your program simply runs and exits, you do not need to spend much effort checking for memory leaks, because all the memory the Python process allocated is released when it exits. If your program has to run continuously for a long time, check carefully for leaks.
Scenario

How did the leak come about? The project is a TCP server that creates a connection instance to manage each incoming connection. After a disconnect, the instance was still held and never released, which must mean that a reference to it survived somewhere. Each new connection allocated memory and no disconnect freed it, so memory usage grew over time.
Debugging method

Since I did not know where the memory was leaking, I had to debug patiently.

Since I knew that connection objects were not being released on disconnect, I simulated a client that connects, sends a few packets, and disconnects. Meanwhile I sampled the server's memory usage once per second with the following shell one-liner, which writes the latest sample to the file t (fields 5 and 6 of ps aux are the virtual memory size and the resident set size):

    PID=50662; while true; do ps aux | grep $PID | grep -v grep | awk '{print $5 " " $6}' > t; sleep 1; done

If memory usage levels off after an initial period of growth, nothing is leaking.

You can also check an object's reference count at the point where it should be released, using sys.getrefcount(obj). If the count is 2 (one reference from the name you hold and one temporary reference created by passing the object to getrefcount), the object will be reclaimed as soon as that name goes out of scope.
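
A minimal sketch of this check follows, with a hypothetical empty Connection class and a registry list standing in for the real objects. Because the absolute number reported by sys.getrefcount can differ between interpreter versions, the sketch compares the count against a baseline instead of against a fixed value.

```python
import sys


class Connection:
    """Hypothetical stand-in for the real connection class."""


registry = []
conn = Connection()

# The result includes the temporary reference held by the call itself.
baseline = sys.getrefcount(conn)

registry.append(conn)
held = sys.getrefcount(conn)      # one higher: the list now refers to conn

registry.remove(conn)
released = sys.getrefcount(conn)  # back to the baseline

print(baseline, held, released)
```

If the count at the point of release is higher than the baseline you expect, some other object is still holding a reference.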
Cause

In this project, objects failed to be reclaimed properly in two situations:

  • References held by objects that are only reclaimed at exit
  • Circular references

References held by objects that are only reclaimed at exit

To keep track of connections, each connection object is also put into a list, and that list is only reclaimed when the program exits. If the list is not maintained correctly, every object stored in it also lives until the program exits.

Global variables and class variables are recycled only when the program exits:

    _CONNECTIONS = []

    # ...

    class Connection(object):
        def __init__(self, sock, address):
            pass

    def server_loop():
        # ...
        sock, address = server_sock.accept()
        connection = Connection(sock, address)
        _CONNECTIONS.append(connection)
        # ...
        sock.close()

Every established connection is added to the global list _CONNECTIONS. If the connection object is not removed from the list when the connection closes (which would drop that reference), it is never reclaimed. Every new connection then leaves behind one connection object, along with everything it references.

The same applies if you store the object in a class attribute, because the class object is created when the program starts and is only reclaimed when it exits.

The solution is to remove the reference to the object from the list (or other container) when the connection closes:

    _CONNECTIONS = []

    # ...

    class Connection(object):
        def __init__(self, sock, address):
            pass

    def server_loop():
        # ...
        sock, address = server_sock.accept()
        connection = Connection(sock, address)
        _CONNECTIONS.append(connection)
        try:
            # ...
            sock.close()
        finally:
            # Drop the list's reference so the connection can be reclaimed.
            _CONNECTIONS.remove(connection)  # XXX

Circular references

Sometimes an object stores, as an instance attribute, another object that in turn holds a reference back to the first one. The code makes this clearer:

    class ConnectionHandler(object):
        def __init__(self, connection):
            self._conn = connection

    class Connection(object):
        def __init__(self, sock, address):
            # The handler keeps a reference back to this connection.
            self._conn_handler = ConnectionHandler(self)  # XXX

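The code above creates a circular reference: the connection refers to its handler, and the handler refers back to the connection. Reference counting alone can never free such a pair, so it has to wait for the cyclic garbage collector, and objects that have already been promoted to the second or third generation are examined less often. Reclamation may therefore be slow.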
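
The effect can be observed directly. The following sketch simplifies the two classes (the sock and address parameters are dropped) and disables the automatic collector for the duration of the demonstration, which is an assumption made only so that the timing is deterministic.

```python
import gc
import weakref


class ConnectionHandler:
    def __init__(self, connection):
        self._conn = connection  # strong reference back to the owner


class Connection:
    def __init__(self):
        self._conn_handler = ConnectionHandler(self)  # completes the cycle


gc.disable()  # keep the automatic collector out of the way for the demo
conn = Connection()
probe = weakref.ref(conn)

del conn
# Reference counting alone cannot free the cycle, so the object survives.
alive_after_del = probe() is not None

gc.collect()
# An explicit pass of the cycle collector reclaims it.
alive_after_collect = probe() is not None
gc.enable()

print(alive_after_del, alive_after_collect)
```

In a long-running server the collector runs on its own schedule, so objects caught in such a cycle can linger well after the connection has closed.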

To solve this problem, use weak references.

    import weakref

    class ConnectionHandler(object):
        def __init__(self, connection):
            self._conn = connection

    class Connection(object):
        def __init__(self, sock, address):
            # Pass a weak proxy so the handler does not keep the connection alive.
            self._conn_handler = ConnectionHandler(weakref.proxy(self))  # XXX
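
With the proxy in place, the connection no longer takes part in a cycle and can be freed by reference counting alone. The sketch below checks this with simplified versions of the two classes (constructor parameters dropped, automatic collector disabled for the demonstration). It also shows the trade-off: once the connection is gone, using the proxy raises ReferenceError, so the handler must not outlive its connection.

```python
import gc
import weakref


class ConnectionHandler:
    def __init__(self, connection):
        self._conn = connection  # receives a weakref.proxy, not a strong reference


class Connection:
    def __init__(self):
        self._conn_handler = ConnectionHandler(weakref.proxy(self))


gc.disable()  # show that no cycle collection is needed
conn = Connection()
handler = conn._conn_handler
probe = weakref.ref(conn)

del conn
freed_without_gc = probe() is None  # reclaimed immediately in CPython

try:
    handler._conn.address  # any attribute access on a dead proxy fails
    raised = False
except ReferenceError:
    raised = True
gc.enable()

print(freed_without_gc, raised)
```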
