Protocol Buffers -- Python Practice (II): Protocol Buffers vs JSON

Source: Internet
Author: User
Tags: serialization

Why go to the trouble of adopting PB at all? The only real reason to give up JSON, which is well supported on every platform, is that PB promises to be lighter, faster, and simpler while remaining strongly typed and cross-platform. Since speed is the goal, I need a reasonable account of just how fast it actually is.

Using the official pure-Python protobuf library, I compared the speed of PB with the standard-library json module and with the simplejson library.

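The .proto file used is as follows:

    // Field labels and types were lost from the source listing; the ones
    // shown here are inferred from the values assigned in the test code.
    syntax = "proto2";
    package hello_word;

    message Sayhi {
        optional int32 id = 1;
        optional string something = 2;
        optional string extra_info = 3;
    }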

Compiling this file generates the Python module say_hi_pb2, which contains the corresponding Sayhi class.

The code to test the serialization speed of each library is as follows:

    # coding: utf-8
    import timeit

    # Serialization
    x = """say_hi.SerializeToString()"""
    y = """json.dumps(ppa)"""
    z = """simplejson.dumps(pl)"""

    # Each statement runs 100000 times per trial; the fastest of 5 trials is printed.
    print(min(timeit.repeat(stmt=x, setup="import say_hi_pb2;"
                                          "say_hi = say_hi_pb2.Sayhi();"
                                          "say_hi.id = 13423;"
                                          "say_hi.something = 'Axiba';"
                                          "say_hi.extra_info = 'Xiba';",
                            repeat=5, number=100000)))

    print(min(timeit.repeat(stmt=y, setup="import json;"
                                          "ppa={"
                                          "'id': 13423,"
                                          "'something': 'Axiba',"
                                          "'extra_info': 'Xiba',"
                                          "};",
                            repeat=5, number=100000)))

    print(min(timeit.repeat(stmt=z, setup="import simplejson;"
                                          "pl={"
                                          "'id': 13423,"
                                          "'something': 'Axiba',"
                                          "'extra_info': 'Xiba',"
                                          "};",
                            repeat=5, number=100000)))

Output:

1.08438277245
0.398800134659
0.707333087921

The code to test the deserialization speed of each library is as follows:

    # coding: utf-8
    import timeit

    # Deserialization
    x = """say_hi.ParseFromString(p)"""
    y = """json.loads(p1)"""
    z = """simplejson.loads(p2)"""

    # The payload is serialized once in setup; only parsing is timed.
    print(min(timeit.repeat(stmt=x, setup="import say_hi_pb2;"
                                          "say_hi = say_hi_pb2.Sayhi();"
                                          "say_hi.id = 13423;"
                                          "say_hi.something = 'Axiba';"
                                          "say_hi.extra_info = 'Xiba';"
                                          "p = say_hi.SerializeToString()",
                            repeat=5, number=100000)))

    print(min(timeit.repeat(stmt=y, setup="import json;"
                                          "ppa={"
                                          "'id': 13423,"
                                          "'something': 'Axiba',"
                                          "'extra_info': 'Xiba',"
                                          "};"
                                          "p1 = json.dumps(ppa)",
                            repeat=5, number=100000)))

    print(min(timeit.repeat(stmt=z, setup="import simplejson;"
                                          "pl={"
                                          "'id': 13423,"
                                          "'something': 'Axiba',"
                                          "'extra_info': 'Xiba',"
                                          "};"
                                          "p2 = simplejson.dumps(pl)",
                            repeat=5, number=100000)))

Output:

0.924090862274
0.492631912231
0.283575057983
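
To make the two sets of timings easier to compare, they can be turned into ratios. The following is a minimal sketch: the numbers are copied from the outputs above, while the dictionary layout and the helper name slowdown are only illustrative choices.

```python
# Minimum timings in seconds per 100000 calls, copied from the outputs above.
serialize = {"protobuf": 1.08438277245, "json": 0.398800134659, "simplejson": 0.707333087921}
deserialize = {"protobuf": 0.924090862274, "json": 0.492631912231, "simplejson": 0.283575057983}


def slowdown(timings, baseline):
    """Return how many times longer pure-Python protobuf took than `baseline`."""
    return timings["protobuf"] / timings[baseline]


for label, timings in (("serialize", serialize), ("deserialize", deserialize)):
    for baseline in ("json", "simplejson"):
        ratio = slowdown(timings, baseline)
        print("%s: protobuf takes %.2fx as long as %s" % (label, ratio, baseline))
```

A ratio above 1 means pure-Python protobuf was the slower of the two.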

As the data above shows, with the version 3.1.0.post1 that I used, pure-Python PB serialization takes about 2.7 times as long as the standard json library and about 1.5 times as long as simplejson. In the deserialization test PB is still the slowest: it takes nearly twice as long as the standard json library and more than 3 times as long as simplejson. So the gap does not seem to have been optimized away. I remember that with the earlier 2.x PB library it was normal for Python serialization to be more than 3 times slower than simplejson, and every article analyzing its performance described it as too slow. Thanks to binary storage and PB's own binary encoding, a PB message is far smaller than the equivalent JSON. But if it is not even as fast as JSON, what reason is there to give up the convenient, easy-to-use JSON in favor of PB? That is really not convincing.
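
The size difference can be illustrated without the protobuf library by encoding the test message by hand. This is only a sketch of the protobuf wire format (base-128 varints, a tag byte per field, length-prefixed strings), not the library's own code, and it assumes that id is a varint-encoded integer field and that the other two fields are strings. The helper names are made up for this example.

```python
import json


def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_varint_field(field_number, value):
    # Tag = (field number << 3) | wire type; wire type 0 is varint.
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number, text):
    # Wire type 2 is length-delimited: tag, byte length, then the bytes.
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


# Assumption: field numbers 1, 2, 3 as in the .proto file, id as a varint field.
record = {"id": 13423, "something": "Axiba", "extra_info": "Xiba"}
pb_bytes = (encode_varint_field(1, record["id"])
            + encode_string_field(2, record["something"])
            + encode_string_field(3, record["extra_info"]))
json_bytes = json.dumps(record).encode("utf-8")

print("protobuf bytes: %d, JSON bytes: %d" % (len(pb_bytes), len(json_bytes)))
```

The binary form is shorter mainly because it stores small field numbers instead of field names and carries no quotes, braces, or separators.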

However, PB officially provides a C++ implementation of the runtime for Python. Following the method in the earlier Practice article, install the latest PB library, compile it as the documentation describes, and then install the Python C++ implementation. PB will then use the C++ implementation for serialization and deserialization. The generated code and the calling code do not change at all; you only need to install it. After installation you will see:

Using /users/piperck/desktop/grpc/lib/python2.7/site-packages
Finished processing dependencies for protobuf==3.1.0

Running pip list again shows that the installed PB package has been replaced.
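
Besides pip list, the active backend can usually be checked from Python itself. The sketch below relies on google.protobuf.internal.api_implementation, which is an internal module of the protobuf package, so treat it as a convenience that may change between releases rather than a supported API. The wrapper function name is hypothetical.

```python
def active_protobuf_backend():
    """Return the name of the active protobuf backend, or None if protobuf is missing."""
    try:
        # Internal module of the protobuf package; not a stable public API.
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()


print(active_protobuf_backend())
```

A value of "cpp" indicates the C++ implementation and "python" the pure-Python one.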

Let's rerun the serialization and deserialization code:

Serialization output:

0.085785150528
0.403172016144
0.755691051483

Deserialization output:

0.090231180191
0.499733924866
0.297739028931

You can see that it is more than 10 times faster than the pure-Python implementation. If serialization and deserialization are added together, it is also roughly 5 to 6 times faster than the standard json library and the simplejson library we normally use. In applications that serialize and deserialize frequently this is a substantial performance gain: your code becomes lighter and faster, and the strongly typed mapping can catch errors.
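
The round-trip comparison can be worked out by summing each library's serialization and deserialization timings. A minimal sketch, with the numbers copied from the outputs of both runs; the variable and function names are only illustrative.

```python
# (serialization, deserialization) minimum timings from the rerun, in seconds.
rerun = {
    "protobuf_cpp": (0.085785150528, 0.090231180191),
    "json": (0.403172016144, 0.499733924866),
    "simplejson": (0.755691051483, 0.297739028931),
}
# Pure-Python protobuf timings from the first run, for the backend comparison.
pure_python = (1.08438277245, 0.924090862274)


def round_trip(pair):
    """Total time for serializing plus deserializing."""
    return pair[0] + pair[1]


cpp_total = round_trip(rerun["protobuf_cpp"])
for name in ("json", "simplejson"):
    print("%s round trip takes %.1fx as long as protobuf (C++)"
          % (name, round_trip(rerun[name]) / cpp_total))

# Speedup of the C++ backend over pure Python, per operation.
print("serialization speedup:   %.1fx" % (pure_python[0] / rerun["protobuf_cpp"][0]))
print("deserialization speedup: %.1fx" % (pure_python[1] / rerun["protobuf_cpp"][1]))
```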

Don't think we are done here. There is an even faster library called pyrobuf, although for now it supports only proto2. It is implemented with Cython, and according to its author it is 2-4 times faster than the C++ backend for Python. I spent half a day trying to install it without success; it seems something was wrong with my Cython setup. If you need even more speed, see the second link in the references and explore it yourself.

References:

https://github.com/google/protobuf/tree/master/python (PB Python library on GitHub)

https://github.com/appnexus/pyrobuf (pyrobuf library)
