Learning Google's protocol buffers

Last Update:2018-08-13 Source: Internet

Author: User


Protocol buffers are a data interchange format introduced by Google. The official definition is: a language-neutral, platform-neutral, extensible way of serializing structured data for use in communications protocols, data storage, and more. A note from the Protocol Buffer developer documentation: it is aimed at Java, C++, or Python developers who want to use protocol buffers in their applications.
The overview introduces protocol buffers and tells you what you need to do to get started; you can then go on to follow the tutorials or delve deeper into protocol buffer encoding.
API reference documentation is also provided for all three languages, as well as language and style guides for writing .proto files.
What are protocol buffers?

Protocol buffers are a flexible, efficient, automated mechanism for serializing structured data: think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages. You can even update your data structure without breaking deployed programs that are compiled against the "old" format.
How do they work?

You specify how you want the information you're serializing to be structured by defining protocol buffer message types in .proto files.
Each protocol buffer message is a small logical record of information, containing a series of name-value pairs.
Here's a very basic example of a .proto file that defines a message containing information about a person:

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

As you can see, the message format is simple: each message type has one or more uniquely numbered fields, and each field has a name and a value type, where value types can be numbers (integer or floating-point), booleans, strings, raw bytes, or even (as in the example above) other protocol buffer message types, allowing you to structure your data hierarchically.
You can specify optional fields, required fields, and repeated fields.
You can find more information about writing .proto files in the Protocol Buffer Language Guide.
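
To make the required/optional/repeated distinction and the nesting concrete, here is a rough hand-written C++ mirror of the Person message. It is only a sketch of the data shape, not the class that the protocol buffer compiler generates; the has_email flag, the MakeExamplePerson helper, and the phone number value are assumptions made up for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Mirrors the nested enum PhoneType from the .proto example.
enum class PhoneType { MOBILE = 0, HOME = 1, WORK = 2 };

// Mirrors the nested message PhoneNumber.
struct PhoneNumber {
    std::string number;                // required string number = 1
    PhoneType type = PhoneType::HOME;  // optional, [default = HOME]
};

// Mirrors message Person. Illustration only, not generated code.
struct Person {
    std::string name;                // required string name = 1
    std::int32_t id = 0;             // required int32 id = 2
    bool has_email = false;          // optional: presence is tracked separately
    std::string email;               // optional string email = 3
    std::vector<PhoneNumber> phone;  // repeated: zero or more entries
};

// Hypothetical helper that fills in a Person without an email address.
Person MakeExamplePerson() {
    Person p;
    p.name = "John Doe";
    p.id = 1234;
    PhoneNumber home;
    home.number = "555-4321";  // made-up value; type keeps its default
    p.phone.push_back(home);
    return p;
}
```

An unset optional field falls back to its default (HOME here), and a repeated field behaves like a list that may be empty.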
Once you've defined your messages, you run the protocol buffer compiler for your application's language on your .proto file to generate data access classes (much as with CORBA IDL).
These provide simple accessors for each field (like query() and set_query()) as well as methods to serialize/parse the whole structure to/from raw bytes.
So, for instance, if your chosen language is C++, running the compiler on the above example would generate a class called Person.
You can then use this class in your application to populate, serialize, and retrieve Person protocol buffer messages.
You might then write some code like this:

Person person;
person.set_name("John Doe");
person.set_id(1234);
person.set_email("jdoe@example.com");
fstream output("myfile", ios::out | ios::binary);
person.SerializeToOstream(&output);

Then, later on, you could read your message back in from the file:

fstream input("myfile", ios::in | ios::binary);
Person person;
person.ParseFromIstream(&input);
cout << "Name: " << person.name() << endl;
cout << "E-mail: " << person.email() << endl;

You can add new fields to your message formats without breaking backwards-compatibility; old binaries simply ignore the new field when parsing. So if you have a communications protocol that uses protocol buffers as its data format, you can extend your protocol without having to worry about breaking existing code.
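
The reason old binaries can ignore new fields is that every field on the wire is prefixed with a key holding its field number and a wire type, and the wire type alone tells a reader how many bytes to skip. The following is a minimal hand-written sketch of that idea, assuming the documented wire format (key = field number << 3 | wire type, base-128 varints). OldPerson, ParseOldPerson, and NewerMessageBytes are hypothetical names for this illustration; this is not the parser the protocol buffer library actually uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Reads a base-128 varint at pos and advances pos. False on truncated input.
bool ReadVarint(const std::string& buf, std::size_t& pos, std::uint64_t& out) {
    out = 0;
    for (int shift = 0; shift < 64 && pos < buf.size(); shift += 7) {
        std::uint8_t byte = static_cast<std::uint8_t>(buf[pos++]);
        out |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return true;  // high bit clear: last byte
    }
    return false;
}

// Hypothetical "old" reader that only knows fields 1 (name), 2 (id), 3 (email).
struct OldPerson {
    std::string name;
    std::int32_t id = 0;
    std::string email;
};

bool ParseOldPerson(const std::string& buf, OldPerson& person) {
    std::size_t pos = 0;
    while (pos < buf.size()) {
        std::uint64_t key = 0;
        if (!ReadVarint(buf, pos, key)) return false;
        const std::uint64_t field = key >> 3;
        switch (key & 0x7) {  // wire type
            case 0: {  // varint
                std::uint64_t value = 0;
                if (!ReadVarint(buf, pos, value)) return false;
                if (field == 2) person.id = static_cast<std::int32_t>(value);
                break;  // any other varint field is unknown and dropped
            }
            case 1:  // fixed 64-bit: skip 8 bytes
                if (buf.size() - pos < 8) return false;
                pos += 8;
                break;
            case 2: {  // length-delimited (strings, bytes, nested messages)
                std::uint64_t len = 0;
                if (!ReadVarint(buf, pos, len)) return false;
                if (len > buf.size() - pos) return false;
                const std::size_t n = static_cast<std::size_t>(len);
                if (field == 1) person.name = buf.substr(pos, n);
                else if (field == 3) person.email = buf.substr(pos, n);
                pos += n;  // unknown fields are skipped by length alone
                break;
            }
            case 5:  // fixed 32-bit: skip 4 bytes
                if (buf.size() - pos < 4) return false;
                pos += 4;
                break;
            default:  // group wire types are not handled in this sketch
                return false;
        }
    }
    return true;
}

// Bytes as a newer writer might produce them: name, id, plus a field 5
// that the old reader has never heard of (an assumption for this example).
std::string NewerMessageBytes() {
    std::string msg;
    msg += std::string("\x0A\x08", 2) + "John Doe";  // field 1, 8-byte string
    msg += std::string("\x10\xD2\x09", 3);           // field 2, varint 1234
    msg += std::string("\x2A\x02", 2) + "jd";        // field 5, 2-byte string
    return msg;
}
```

Calling ParseOldPerson(NewerMessageBytes(), person) succeeds and fills in name and id; field 5 is stepped over without the reader needing to know what it means.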

You'll find a complete reference for using generated protocol buffer code in the API reference section, and you can find out more about how protocol buffer messages are encoded in Protocol Buffer Encoding.

Why not just use XML?

Protocol buffers have many advantages over XML for serializing structured data. Protocol buffers:

* are simpler
* are 3 to 10 times smaller
* are 20 to 100 times faster
* are less ambiguous
* generate data access classes that are easier to use programmatically (ASN.1 and CORBA have similar tools)

For example, let's say you want to model a person with a name and an email. In XML, you need to do:

<person>
  <name>John Doe</name>
  <email>jdoe@example.com</email>
</person>

While the corresponding protocol buffer message (in protocol buffer text format) is:

# Textual representation of a protocol buffer.
# This is *not* the binary format used on the wire.
person {
  name: "John Doe"
  email: "jdoe@example.com"
}

When this message is encoded to the protocol buffer binary format (the text format above is just a convenient human-readable representation for debugging and editing), it would probably be 28 bytes long and take around 100-200 nanoseconds to parse.
The XML version is at least 69 bytes if you remove whitespace, and would take around 5,000-10,000 nanoseconds to parse.
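
The two byte counts can be checked by hand. The sketch below encodes just the name and email fields following the documented wire format (a one-byte key, a one-byte length, then the string bytes) and builds the whitespace-free XML for comparison. The function names are assumptions for this example, and the required id field is left out so that the message matches the two-field XML document; real code would call the generated Person class instead.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Appends value as a base-128 varint, least significant 7 bits first.
void AppendVarint(std::string& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));
}

// Appends one length-delimited string field: key, length, payload.
void AppendStringField(std::string& out, std::uint32_t field_number,
                       const std::string& text) {
    assert(field_number > 0);  // field numbers start at 1
    AppendVarint(out, (static_cast<std::uint64_t>(field_number) << 3) | 2);
    AppendVarint(out, text.size());
    out += text;
}

// Hand-encodes name (field 1) and email (field 3) of the Person message.
std::string EncodeNameAndEmail(const std::string& name,
                               const std::string& email) {
    std::string out;
    AppendStringField(out, 1, name);
    AppendStringField(out, 3, email);
    return out;
}

// The same two values as XML with all optional whitespace removed.
std::string PersonAsXml(const std::string& name, const std::string& email) {
    return "<person><name>" + name + "</name><email>" + email +
           "</email></person>";
}
```

For "John Doe" and "jdoe@example.com", EncodeNameAndEmail returns 2 + 8 + 2 + 16 = 28 bytes, while PersonAsXml returns 69 bytes, because the XML repeats every tag name twice whereas the binary form spends two bytes of overhead per field.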

Also, manipulating a protocol buffer is much easier:

cout << "Name: " << person.name() << endl;
cout << "E-mail: " << person.email() << endl;

Whereas with XML you would have to do something like:

cout << "Name: "
     << person.getElementsByTagName("name")->item(0)->innerText()
     << endl;
cout << "E-mail: "
     << person.getElementsByTagName("email")->item(0)->innerText()
     << endl;

However, protocol buffers are not always a better solution than XML. For instance, protocol buffers would not be a good way to model a text-based document with markup (e.g. HTML), since you cannot easily interleave structure with text. In addition, XML is human-readable and human-editable; protocol buffers, at least in their native format, are not. XML is also, to some extent, self-describing: the element names travel with the data, so a document can be read without a separate schema. A protocol buffer is only meaningful if you have the message definition (the .proto file).

Protocol buffers are also widely used within Google.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
