Python Beginner
Reply content:
Thank you for your invitation.
Many introductory tutorials describe serialization simply as this process:
Object 1 --serialization--> byte string --deserialization--> Object 2
As a result, many people don't understand why you would serialize anything in the first place.
You have probably heard that Python performs poorly on computationally intensive tasks and generally cannot take full advantage of multi-core CPUs, and that the usual workaround is to use multiple processes.
One multi-process design splits the processes into a master and workers: the master is responsible for scheduling tasks, and the workers are dedicated to the computation. The Celery library works this way.
Here is the problem: the master has created a task that a worker needs to compute, but memory is isolated between processes, so the worker cannot access the Task object directly.
The master therefore has to hand the worker some representation of the object, from which the worker can construct an equivalent object (a copy). That is serialization and deserialization.
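The hand-off described above can be sketched in a single process. This is only an illustration: the task is a plain dict, and `master_side` and `worker_side` are hypothetical names, not part of Celery or any other library. It shows that only bytes cross the boundary and that the worker ends up with an equal but separate object.

```python
import pickle


def master_side(task):
    # The master turns the object into bytes that can cross the process boundary.
    return pickle.dumps(task)


def worker_side(payload):
    # The worker rebuilds an equivalent object from the bytes and does the work.
    task = pickle.loads(payload)
    return task, sum(task["numbers"])


# Hypothetical task description, for illustration only.
original = {"name": "sum", "numbers": [1, 2, 3]}

payload = master_side(original)
copy, result = worker_side(payload)

print(type(payload).__name__)  # bytes
print(copy == original)        # True: same contents
print(copy is original)        # False: a separate object
print(result)                  # 6
```

In a real multi-process setup the bytes would travel through a pipe, socket, or message queue instead of a local variable.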
Pickle is Python's own serialization scheme and has good support for Python objects, which is why Celery used pickle by default (since version 4.0 Celery's default serializer is JSON). See also the related question "Does Celery depend on pickle?"
From the point of view of serialization, there is no essential difference between the pickle scheme and JSON, YAML, or XML.
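As a rough illustration of that point, the sketch below round-trips the same value through `json` and `pickle`. The sample data is made up. It also shows one practical difference, namely which types each format can represent.

```python
import json
import pickle

# Made-up sample data.
record = {"name": "demo", "scores": [1, 2, 3], "done": True}

# Both libraries follow the same pattern: object -> serialized form -> object.
as_json = json.dumps(record)      # text (str)
as_pickle = pickle.dumps(record)  # binary (bytes)

print(json.loads(as_json) == record)      # True
print(pickle.loads(as_pickle) == record)  # True

# The formats differ in which types they support: a set is not valid JSON,
# while pickle handles it.
tags = {"a", "b"}
try:
    json.dumps(tags)
    json_handles_set = True
except TypeError:
    json_handles_set = False

print(json_handles_set)                          # False
print(pickle.loads(pickle.dumps(tags)) == tags)  # True
```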
However, pickle is not secure: never deserialize a pickle byte string from an untrusted source. This makes pickle unsuitable for data exchanged over an untrusted network.

Thank you for your invitation.
Pickle can represent a value of almost any type (all built-in types, plus instances of classes that support pickling) as a byte string, which is often used to store intermediate results.
What does that mean? Suppose you have written a program that takes a long time to run, so you decide to add a "save current progress to a file" feature: if it cannot finish today, tomorrow it can read the saved file and pick up where it left off.
But here's the problem: the "current progress" is not necessarily a string. It could be a list, a dictionary, a set, or even an instance of a class... How do you write such a mixed bag to a file?
This is where pickle comes in handy.
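A minimal sketch of such a "save progress" feature, assuming the progress is a dictionary mixing several types. The contents of `progress` and the file name `progress.pkl` are made up for illustration.

```python
import os
import pickle
import tempfile

# Hypothetical progress state mixing several types.
progress = {
    "step": 1500,
    "visited": {"a", "b", "c"},
    "pending": [("d", 1), ("e", 2)],
}

# A temporary directory keeps the example self-contained.
path = os.path.join(tempfile.mkdtemp(), "progress.pkl")

# Save: pickle files must be opened in binary mode.
with open(path, "wb") as f:
    pickle.dump(progress, f)

# Later (for example, the next day): restore the saved state.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == progress)  # True
```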
>>> import pickle
>>> data = {
...     '1': True,
...     23.45: str,
...     print: set(),
...     b'hello': [0, 0, 0],
... }
>>> pickle.dumps(data)
b'\x80\x03}q\x00(G@7s33333cbuiltins\nstr\nq\x01cbuiltins\nprint\nq\x02cbuiltins\nset\nq\x03]q\x04\x85q\x05Rq\x06X\x01\x00\x00\x001q\x07\x88C\x05helloq\x08]q\t(K\x00K\x00K\x00eu.'
>>> pickle.loads(_)
{23.45: <class 'str'>, <built-in function print>: set(), '1': True, b'hello': [0, 0, 0]}
Have you ever played a video game? You know save/load? The file functions that come with Python can only write and read data as strings.
Pickle can store and load other kinds of data, such as lists and dicts. This is called serialization and deserialization: your data structure is converted to a byte string, which can be saved to a file for quick recovery next time, or transmitted over the network.
A while back I was writing a web crawler... and accidentally crashed it... I had no database at the time, so I used pickle to recover automatically.
For example, take building a machine learning model, say a decision tree. The usual workflow is to build the tree first, then prune it, then make predictions. The annoying part is that even though the test data is run against the very same tree, the tree gets rebuilt every time, and most of the running time goes into building it. So I use pickle to save the whole tree during the first full run. After that I just load it and go straight to predicting or pruning, which saves a lot of time.
Serialization is useful; one scenario that needs it is sessions. In other high-level languages, serializing an object is tedious: you have to encode the object, break it down into a string, and write that to a file. Deserialization, that is, reading it back and restoring the object, means decoding and slicing up the string to rebuild the object. With pickle, dump and load do all of this for you.
To put it bluntly, pickle is a tool for storing and retrieving Python data: the data is written to a file in another, simple form that is convenient to transfer and pass around, and is then restored to its original form with pickle.
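The decision-tree workflow described above amounts to a "load if cached, otherwise build and save" pattern. The sketch below assumes a hypothetical `build_tree` function standing in for the expensive training step; a nested dict plays the role of the tree, and `load_or_build` is an illustrative helper, not a library function.

```python
import os
import pickle
import tempfile


def build_tree(training_rows):
    """Hypothetical stand-in for an expensive tree-building step."""
    # A nested dict plays the role of the fitted tree here.
    return {
        "feature": 0,
        "threshold": 2.5,
        "left": {"label": "no"},
        "right": {"label": "yes"},
        "n_rows": len(training_rows),
    }


def load_or_build(path, training_rows):
    """Return (tree, was_cached): load the pickled tree if present, else build and save it."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f), True
    tree = build_tree(training_rows)
    with open(path, "wb") as f:
        pickle.dump(tree, f)
    return tree, False


cache_path = os.path.join(tempfile.mkdtemp(), "tree.pkl")
rows = [(1.0, "no"), (3.0, "yes"), (4.0, "yes")]

first, first_cached = load_or_build(cache_path, rows)    # builds and saves
second, second_cached = load_or_build(cache_path, rows)  # loads from disk

print(first_cached, second_cached)  # False True
print(first == second)              # True
```

Only load cache files that your own program wrote, since unpickling data from an untrusted source is unsafe.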