Theano Tutorial: Python Memory Management

Source: Internet
Author: User
Tags theano

A major challenge in writing large programs is keeping memory usage to a minimum. On the surface, memory management in Python is simple: Python allocates memory transparently and manages objects with a reference counting system. When the number of references to an object drops to 0, the memory occupied by that object is released. In theory this sounds good and simple, but in practice we need to know a few things about Python memory management for a program to use memory efficiently while it runs. First, we need to know how much space the basic Python objects occupy. Second, we need to know how Python manages memory internally.
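The reference counting described above can be observed directly. The following is a minimal sketch, assuming CPython 3 (the class name Payload and the variable names are made up for illustration); it uses a weak reference to check that the object disappears once its last strong reference is deleted.

```python
import sys
import weakref


class Payload:
    """A trivial object we can hold a weak reference to."""


obj = Payload()
alias = obj                   # second strong reference to the same object
probe = weakref.ref(obj)      # weak reference: does not keep the object alive

# sys.getrefcount includes a temporary reference held by its own argument.
count_with_alias = sys.getrefcount(obj)

del alias                     # drop one strong reference
count_without_alias = sys.getrefcount(obj)

del obj                       # last strong reference gone: CPython frees the object
freed = probe() is None

print(count_with_alias, count_without_alias, freed)
```

The absolute counts depend on the interpreter, so only the difference between the two counts and the final freed flag are meaningful here.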

Basic Objects

How much space does an int object occupy? A C/C++ programmer would say that it depends on the machine, probably 32 or 64 bits, so at most 8 bytes. Is that also true in Python?

Let's write a function that shows the space occupied by objects (recursing where necessary, for example when an object is a container rather than a basic data type):

    import sys

    def show_sizeof(x, level=0):

        print "\t" * level, x.__class__, sys.getsizeof(x), x

        if hasattr(x, '__iter__'):
            if hasattr(x, 'items'):
                for xx in x.items():
                    show_sizeof(xx, level + 1)
            else:
                for xx in x:
                    show_sizeof(xx, level + 1)

 

We can use the following calls to observe the space occupied by different basic data types:

    show_sizeof(None)
    show_sizeof(3)
    show_sizeof(2**63)
    show_sizeof(102947298469128649161972364837164)
    show_sizeof(918659326943756134897561304875610348756384756193485761304875613948576297485698417)

Output on 64-bit Python 2.7.8:

    <type 'NoneType'> 16 None
    <type 'int'> 24 3
    <type 'long'> 36 9223372036854775808
    <type 'long'> 40 102947298469128649161972364837164
    <type 'long'> 60 918659326943756134897561304875610348756384756193485761304875613948576297485698417

 

We can see that None occupies 16 bytes and an int occupies 24 bytes, three times the size of a C int64_t on a 64-bit system, even though it is a machine-native integer. A long integer (unlimited precision) is used to represent integers greater than 2^63 - 1. Its minimum size is 36 bytes, and from there its size grows linearly with the logarithm of the integer it represents.

Python's float is implementation-specific and appears to be a C double, yet a Python float does not occupy just 8 bytes:
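The growth of integer size with magnitude can be checked on a current interpreter as well. This is a small sketch assuming CPython 3, where int and long are unified into one arbitrary-precision type; the byte counts will differ from the Python 2.7.8 figures above, so none are hard-coded.

```python
import sys

# Python 3 has a single arbitrary-precision int type; exact byte counts
# depend on the interpreter version and platform.
values = [3, 2**63, 2**200, 2**1000]
sizes = [sys.getsizeof(v) for v in values]

for v, s in zip(values, sizes):
    print(v.bit_length(), "bits ->", s, "bytes")
```

Each additional block of bits adds a fixed number of bytes, which is the linear growth in the logarithm of the value described above.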

show_sizeof(3.14159265358979323846264338327950288)

 

Output on a 64-bit system:

    <type 'float'> 24 3.14159265359

 

We can see that it is three times the space occupied by the double type (8 bytes) in C.

What about strings?

    show_sizeof("")
    show_sizeof("My hovercraft is full of eels")

Output on a 64-bit system:

    <type 'str'> 33 
    <type 'str'> 62 My hovercraft is full of eels

The empty string occupies 33 bytes. As the string gets longer, the space it occupies grows linearly.
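The linear growth is easy to confirm. The sketch below assumes CPython 3 and ASCII-only strings (other characters use a wider internal representation); the fixed overhead differs from the Python 2 figure above, but the per-character increment is constant.

```python
import sys

lengths = [10, 20, 30]
sizes = [sys.getsizeof("a" * n) for n in lengths]

# Equal steps in length should produce equal steps in size.
steps = [later - earlier for earlier, later in zip(sizes, sizes[1:])]

print(sizes, steps)
```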

 

The following tests show the space occupied by the common container types tuple, list, and dictionary (all output is from a 64-bit system):

    show_sizeof([])
    show_sizeof([4, "toaster", 230.1])

 

Output:

    <type 'list'> 64 []
    <type 'list'> 88 [4, 'toaster', 230.1]

The empty list occupies 64 bytes, whereas an empty C++ std::list on a 64-bit system occupies only 16 bytes, a quarter of that.

What about tuples and dictionaries?

    show_sizeof({})
    show_sizeof({'a':213, 'b':2131})

 

Output:

    <type 'dict'> 272 {}
    <type 'dict'> 272 {'a': 213, 'b': 2131}
        <type 'tuple'> 64 ('a', 213)
            <type 'str'> 34 a
            <type 'int'> 24 213
        <type 'tuple'> 64 ('b', 2131)
            <type 'str'> 34 b
            <type 'int'> 24 2131

 

Each key/value pair in the dictionary occupies 64 bytes. Note, however, that ('a', 213) occupies 64 bytes while 'a' occupies 34 bytes and 213 occupies 24 bytes, which leaves 64 - (34 + 24) = 6 bytes for the pair itself. We can also see that the whole dictionary occupies 272 bytes rather than 64 + 64 = 128 bytes. A dictionary is designed for efficient lookup, so it needs some extra space. If it were implemented as a tree, it would have to pay for a node per value plus two pointers to child nodes. If it is implemented as a hash table, it must keep enough free slots to maintain performance.

The dictionary is roughly equivalent to the C++ std::map. An empty C++ map occupies 48 bytes when created, an empty C++ string occupies 8 bytes, and an integer occupies 4 bytes.

Why does all this matter? Whether an empty string takes 8 bytes or 33 bytes seems to make little difference, and that is true as long as the data does not need to scale. Once it does, we must watch how many objects we create in order to limit the memory the program uses. In practice this is a tricky problem. To design a good memory management strategy, we must consider not only the size of the objects but also how many are created and in what order. This turns out to be very important for Python programs. A key element is understanding how Python allocates memory internally, which is discussed next.
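Because sys.getsizeof reports only the container itself, adding up a dictionary and its contents takes a recursive walk. The helper below, total_size, is a hypothetical name introduced for this sketch (it is not part of the standard library), written for Python 3 and limited to the built-in containers.

```python
import sys


def total_size(obj, seen=None):
    """Approximate deep size: the object plus everything it contains."""
    if seen is None:
        seen = set()
    if id(obj) in seen:          # count shared objects only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += total_size(key, seen) + total_size(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += total_size(item, seen)
    return size


d = {'a': 213, 'b': 2131}
shallow = sys.getsizeof(d)       # the hash table only
deep = total_size(d)             # hash table plus keys and values
print(shallow, deep)
```

Tracking object ids avoids counting an object twice when it is referenced from several places, which matters for interned strings and small integers.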

Internal memory management

To speed up memory allocation (and reuse), Python keeps a number of lists for small objects. Each list holds objects of similar size: for example, one list holds objects of 1 to 8 bytes and another holds objects of 9 to 16 bytes. When a small object needs to be created, Python either reuses a free block from the matching list or allocates new space.
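The binning of sizes can be pictured with a one-line rounding rule. This sketch follows the 8-byte bins given in the text; the function name size_class and the bin_width parameter are illustrative, and the actual allocator in a given interpreter may use different bin widths.

```python
def size_class(nbytes, bin_width=8):
    """Round a request up to the upper bound of its bin (1-8 -> 8, 9-16 -> 16)."""
    if nbytes <= 0:
        raise ValueError("request must be at least 1 byte")
    return ((nbytes + bin_width - 1) // bin_width) * bin_width


for request in (1, 8, 9, 16, 17):
    print(request, "->", size_class(request))
```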

In fact, even when an object is freed, its memory is not returned to Python's global memory pool; it is merely marked as free and added to the free list. The slot of a dead object is reused when a new object of a compatible size is created. If no dead object is available, new space is allocated.

If the memory used by small objects is never released, these lists can only grow, never shrink, and the program's memory footprint ends up dominated by the largest number of small objects allocated at any one time.

Therefore, we should allocate only the small objects a task really needs, and prefer loops that create and process a few elements at a time over building an entire list first and processing it afterwards.

The growth of the free lists may not look like a problem, because the memory they hold is still available to the Python program. From the operating system's point of view, however, the size of the program is the total (maximum) memory that has ever been allocated to Python.
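The behaviour described above can be modelled in a few lines. ToyPool below is a deliberately simplified, hypothetical model written for this illustration, not CPython's allocator: freed blocks go back to a per-size free list, and the reserved total never goes down.

```python
from collections import defaultdict


class ToyPool:
    """Toy model: freed blocks return to a per-size free list, never to the OS."""

    def __init__(self):
        self.free_lists = defaultdict(list)   # size class -> reusable block ids
        self.reserved = 0                      # bytes ever obtained; never shrinks
        self._next_id = 0

    def allocate(self, size_class):
        if self.free_lists[size_class]:
            return (size_class, self.free_lists[size_class].pop())  # reuse
        self.reserved += size_class            # no free block: grow the pool
        self._next_id += 1
        return (size_class, self._next_id)

    def free(self, block):
        size_class, block_id = block
        self.free_lists[size_class].append(block_id)  # mark free, keep memory


pool = ToyPool()
blocks = [pool.allocate(16) for _ in range(1000)]
peak = pool.reserved
for b in blocks:
    pool.free(b)
after_free = pool.reserved        # unchanged: nothing was given back
again = [pool.allocate(16) for _ in range(1000)]
after_reuse = pool.reserved       # still unchanged: free blocks were reused
print(peak, after_free, after_reuse)
```

In this model the footprint is set by the largest number of blocks that were ever live at once, which is the point the paragraph above makes.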
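To see how much the number of simultaneously live objects matters, the two styles can be compared with the standard-library tracemalloc module. This is a sketch for Python 3; the function names and the value of N are chosen for illustration, and the absolute numbers vary between interpreters.

```python
import tracemalloc


def peak_bytes(func, n):
    """Return the peak number of bytes traced while func(n) runs."""
    tracemalloc.start()
    func(n)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


def build_then_process(n):
    squares = [i * i for i in range(n)]   # all n objects alive at once
    return sum(squares)


def process_one_at_a_time(n):
    total = 0
    for i in range(n):                    # only a few objects alive at once
        total += i * i
    return total


N = 200000
list_peak = peak_bytes(build_then_process, N)
loop_peak = peak_bytes(process_one_at_a_time, N)
print(list_peak, loop_peak)
```

Both functions compute the same sum; only the peak number of live objects differs.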

To demonstrate this, we can use memory_profiler (which depends on the python-psutil package):

    import copy
    import memory_profiler

    # Add @profile here to monitor the memory usage of a specific function
    @profile
    def function():
        x = list(range(1000000))  # allocate a big list
        y = copy.deepcopy(x)
        del x
        return y

    if __name__ == "__main__":
        function()

 

Run on Ubuntu:

 

The program creates 1,000,000 int values (1,000,000 × 12 bytes = ~11.4 MB) plus the list of references x (1,000,000 × 4 bytes = ~3.8 MB), for a total of about 15.2 MB. copy.deepcopy then copies both and creates the new variable y, which again takes about 15.2 MB, so memory usage on line 8 increases by 15.367 MB. Note that del x on line 9 reduces memory usage by only 3.82 MB: del releases only the list of references, not the integers stored in the list. Those integers remain on the heap and keep nearly 11.4 MB occupied.

In total, this example allocates about 15.309 + 15.367 - 3.82 = ~26.9 MB, whereas storing a single list needs only about 15.2 MB. That is close to twice as much! Memory usage can therefore grow quickly when we are not paying attention.
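The arithmetic behind these figures can be reproduced as follows. This sketch assumes 12-byte int objects and 4-byte references (the pointer size that gives the quoted ~3.8 MB, as on a 32-bit build) and takes 1 MB as 1024 × 1024 bytes; the three increments are the ones quoted in the text.

```python
MIB = 1024 * 1024
n = 1000000

ints_mb = n * 12 / MIB        # assumed 12-byte int objects
refs_mb = n * 4 / MIB         # assumed 4-byte references (32-bit pointers)
one_list_mb = ints_mb + refs_mb

# Increments quoted in the text: two allocations, minus what del x released.
total_mb = 15.309 + 15.367 - 3.82
ratio = total_mb / one_list_mb

print(round(ints_mb, 2), round(refs_mb, 2), round(one_list_mb, 2))
print(round(total_mb, 2), round(ratio, 2))
```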

Pickle

Pickle is the standard way to serialize Python objects to and from files. What is its memory footprint? Does it create extra copies of the data, or is it smarter than that? Consider:

    import memory_profiler
    import pickle
    import random

    def random_string():
        return "".join([chr(64 + random.randint(0, 25)) for _ in xrange(20)])

    @profile
    def create_file():
        x = [(random.random(),
              random_string(),
              random.randint(0, 2 ** 64))
             for _ in xrange(1000000)]

        pickle.dump(x, open('machin.pkl', 'w'))

    @profile
    def load_file():
        y = pickle.load(open('machin.pkl', 'r'))
        return y

    if __name__=="__main__":
        create_file()
        #load_file()

 

This program generates some pickle data and then reads it back (the read is commented out at first, so the first run only writes the file). memory_profiler shows that generating the pickle data uses a large amount of memory:

Now let's look at reading the pickle data (comment out the create_file() call and uncomment the load_file() call in the program above):

Pickle is therefore very memory-hungry. The profiler output shows that creating the data takes approximately 127 MB, while pickle.dump requires additional memory roughly equal to the size of the data (117 MB).

Unpickling (deserializing the data from the .pkl file) appears to be more efficient: it does use more memory (188 MB) than the original data (127 MB), but it does not double it.

In short, pickle should be avoided in memory-sensitive programs. Is there an alternative? Pickle preserves the full structure of a data structure (not only the data but also how it is organized), so that it can be restored exactly from the pickle file later. That is not always needed. For a list like the one in the example above, a simple text-based file format that stores the elements in order is enough, with no need for pickle:

    import memory_profiler
    import random
    import pickle

    def random_string():
        return "".join([chr(64 + random.randint(0, 25)) for _ in xrange(20)])

    @profile
    def create_file():
        x = [(random.random(),
              random_string(),
              random.randint(0, 2 ** 64))
             for _ in xrange(1000000)]
        # Here we use text to save data instead of pickle
        f = open('machin.flat', 'w')
        for xx in x:
            print >>f, xx
        f.close()

    @profile
    def load_file():
        y = []
        f = open('machin.flat', 'r')
        for line in f:
            y.append(eval(line))
        f.close()
        return y

    if __name__ == "__main__":
        create_file()
        #load_file()

 

Memory footprint during file creation:

Next, let's look at the memory footprint when reading the data (comment out the create_file() call and uncomment the load_file() call):

The raw data is 127 MB and memory usage after reading is 139 MB, which is very close to the raw data. About 10 MB of that goes to the temporary variables created in the loop.

This example shows that when processing data we should not read everything first and process it afterwards. Instead, we should read a few items, process them, release their space, then read the next few items, and so on. That way, previously allocated memory can be reused. For example, to read data into a NumPy array, we can first create an empty array, then read the data row by row and fill the array row by row, so that we need only about as much memory as the data itself. With pickle, at least twice the size of the data has to be allocated: once for pickle during the load, and once for the array that stores the data.
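For readers on Python 3, the same flat-file round trip can be sketched as follows. This is an adaptation, not the original listing: it writes to a temporary directory, uses a smaller row count, and swaps eval for ast.literal_eval, which parses only Python literals.

```python
import ast
import os
import random
import tempfile


def random_string():
    return "".join(chr(64 + random.randint(0, 25)) for _ in range(20))


rows = [(random.random(), random_string(), random.randint(0, 2 ** 64))
        for _ in range(1000)]

path = os.path.join(tempfile.mkdtemp(), "machin.flat")

with open(path, "w") as f:
    for row in rows:
        f.write(repr(row) + "\n")       # one tuple literal per line

loaded = []
with open(path) as f:
    for line in f:
        # literal_eval accepts only literals, so it cannot run arbitrary code
        loaded.append(ast.literal_eval(line.strip()))

print(len(loaded), loaded[0] == rows[0])
```

Since repr of a float round-trips exactly in Python 3, the loaded rows compare equal to the originals.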
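The row-by-row fill can be sketched with the standard-library array module standing in for a NumPy array, so that the example has no third-party dependency. The file name values.txt and the generated data are made up for the illustration.

```python
import os
import tempfile
from array import array

n_rows = 1000
path = os.path.join(tempfile.mkdtemp(), "values.txt")   # hypothetical data file

with open(path, "w") as f:
    for i in range(n_rows):
        f.write("%r\n" % (i * 0.5))

# Allocate the destination once, then fill it in place line by line.
data = array("d", [0.0]) * n_rows
with open(path) as f:
    for i, line in enumerate(f):
        data[i] = float(line)       # the line's string can be freed right away

print(len(data), data[0], data[-1])
```

Only one line of text and one destination buffer are alive at any time, so the peak stays close to the size of the data itself.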

Summary

Python's design goals are fundamentally different from those of C. C gives programmers fine control over what the program does, at the cost of more complex and explicit programming. Python is designed to let you write code quickly while hiding as many implementation details as possible. That sounds good, but in a production environment, ignoring the inefficiencies of the implementation can hurt badly. Knowing where Python is inefficient with memory, and avoiding those patterns as far as possible, is important for code that has to run in production.

 

Source: http://deeplearning.net/software/theano/tutorial/python-memory-management.html#python-memory-management
