Protocol Buffers (protobuf) Official Documentation: Language Guide

Source: Internet
Author: User
Tags scalar

Convention: for brevity, Protocol Buffers is referred to as protobuf in the rest of this document.

This guide describes how to use protobuf to structure your data, including the .proto file syntax and how to generate data access classes from your .proto files.

This is a reference guide that introduces the features of protobuf through examples. For a step-by-step walkthrough, see the tutorial for your chosen language.

--------------------------------------a small dividing line-----------------------------------------

Define a message type

First, a very simple example. Suppose you want to define a search request message format in which each search request has a query string (the keywords, much like what you would type into Baidu), the particular page of results you are interested in, and the number of results per page. Here is the .proto file that defines this message type:

    message SearchRequest {
      required string query = 1;
      optional int32 page_number = 2;
      optional int32 result_per_page = 3;
    }

The SearchRequest message definition specifies three fields (name/value pairs), one for each piece of data that you want to include in this type of message. Each field has a name and a type.

Specify field types

In the example above, all the fields are scalar types: two integers (page_number and result_per_page) and a string (query, the search keywords). However, you can also specify composite types for your fields, including enumerations and other message types.

Assign tag numbers

As you can see, each field in the message definition has a unique numbered tag. These tags identify your fields in the message binary format and should not be changed once your message type is in use. A tag in the range 1 to 15 takes one byte to encode, and that byte holds both the tag number and the field's type (for more on encoding, see Protocol Buffer Encoding). A tag in the range 16 to 2047 takes two bytes. You should therefore reserve tags 1 to 15 for fields that occur very frequently, and remember to leave some of them free for frequently occurring fields that you might add in the future.

The smallest tag number you can specify is 1, and the largest is 2^29 - 1, or 536,870,911. You also cannot use the numbers 19000 through 19999 (FieldDescriptor::kFirstReservedNumber through FieldDescriptor::kLastReservedNumber), because they are reserved for the protobuf implementation. The protobuf compiler reports an error if you use one of these reserved numbers in your .proto file.
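The one-byte and two-byte boundaries follow from how a field key is laid out on the wire: the tag number is shifted left by three bits, combined with a three-bit wire type, and written as a base-128 varint. The following is a minimal Python sketch of that rule, not the protobuf library itself; the helper names encode_varint and encode_key are chosen here for illustration.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("varint input must be non-negative")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_key(field_number: int, wire_type: int) -> bytes:
    """Encode a field key: (tag number << 3) | wire type, written as a varint."""
    return encode_varint((field_number << 3) | wire_type)


WIRE_VARINT = 0  # wire type used for int32, int64, bool, enum and similar types

# Size of the key in bytes for tag numbers around the boundaries.
key_sizes = {n: len(encode_key(n, WIRE_VARINT)) for n in (1, 15, 16, 2047, 2048)}
print(key_sizes)  # {1: 1, 15: 1, 16: 2, 2047: 2, 2048: 3}
```

Tag 15 still fits in one byte because (15 << 3) is 120, which is below 128. Tag 16 becomes 128 and needs a second byte.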
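The limits above can be expressed as a small check. This is an illustrative Python sketch only; the function name is_usable_tag is hypothetical, and the real check is performed by the protobuf compiler.

```python
MAX_TAG_NUMBER = 2**29 - 1             # 536,870,911
RESERVED_TAGS = range(19000, 20000)    # 19000 through 19999, reserved for protobuf


def is_usable_tag(number: int) -> bool:
    """Return True if the number may be used as a field tag in a .proto file."""
    return 1 <= number <= MAX_TAG_NUMBER and number not in RESERVED_TAGS


print(is_usable_tag(1), is_usable_tag(19500), is_usable_tag(2**29))
```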

Specify field rules

Each message field is declared with one of the following three rules:

    • required: a well-formed message must have exactly one of this field. Both the sender and the receiver must have it.
    • optional: a well-formed message can have zero or one of this field, but not more than one.
    • repeated: this field can be repeated any number of times (including zero) in a well-formed message. It behaves like an array on both sides.

For historical reasons, repeated fields of scalar numeric types are not encoded as efficiently as they could be. New code should use the special option [packed=true] to get a more efficient encoding:

    repeated int32 samples = 4 [packed=true];
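To see where the saving comes from, the sketch below encodes the same three values both ways in plain Python. It assumes the standard wire format: an unpacked repeated field writes one key per element (wire type 0 for varints), while a packed field writes a single key with wire type 2, a byte length, and then the elements back to back. The helper names are illustrative, and only non-negative values are handled.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_unpacked(field_number: int, values: list) -> bytes:
    """One key (wire type 0) in front of every element."""
    key = encode_varint((field_number << 3) | 0)
    return b"".join(key + encode_varint(v) for v in values)


def encode_packed(field_number: int, values: list) -> bytes:
    """One key (wire type 2), then the payload length, then all elements."""
    payload = b"".join(encode_varint(v) for v in values)
    key = encode_varint((field_number << 3) | 2)
    return key + encode_varint(len(payload)) + payload


samples = [1, 2, 3]
unpacked = encode_unpacked(4, samples)
packed = encode_packed(4, samples)
print(unpacked.hex(), len(unpacked))  # 200120022003 6
print(packed.hex(), len(packed))      # 2203010203 5
```

The difference grows with the number of elements, because the unpacked form repeats the key for each one.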

Note: be very careful about marking fields as required. If at some point you want to stop writing or sending a required field, changing it to optional is problematic: old readers will consider messages without the field incomplete and may reject or drop them unintentionally. Consider writing application-specific validation routines for your messages instead. Some engineers at Google have concluded that required does more harm than good and prefer to use only optional and repeated. However, this view is not universal.

Add more message types

Multiple message types can be defined in a single .proto file, which is useful when you are defining several related messages. For example, to define the reply message format that corresponds to SearchRequest, you can add a SearchResponse message to the same .proto file:

    message SearchRequest {
      required string query = 1;
      optional int32 page_number = 2;
      optional int32 result_per_page = 3;
    }

    message SearchResponse {
      ...
    }

Add comments

Comments in .proto files use the same // syntax as C++:

    message SearchRequest {
      required string query = 1;
      optional int32 page_number = 2;      // Which page number do we want?
      optional int32 result_per_page = 3;  // Number of results to return per page.
    }

What is generated from your .proto file?

When you run the protobuf compiler on a .proto file, it generates code in your chosen language for working with the message types described in the file. The generated code covers setting and getting field values, serializing messages to an output stream, and parsing messages from an input stream.

C++: the compiler generates a .h and a .cc file from each .proto file, with a class for each message type described in the file.

Java: the compiler generates a .java file with a class for each message type, as well as a special Builder class for creating instances of each message class.

Python: the compiler works a little differently. It generates a module with a static descriptor for each message type in your .proto file, which is then used with a metaclass to create the necessary data access class at runtime.

You can find out more about using the API for each language in the tutorial for your chosen language. For even more detail, see the relevant API reference.

Scalar value types

A scalar message field can have one of the following types. The table shows the type specified in the .proto file and the corresponding type in the automatically generated class:

.proto Type | Notes | C++ Type | Java Type | Python Type[2]
double | | double | double | float
float | | float | float | float
int32 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint32 instead. | int32 | int | int
int64 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint64 instead. | int64 | long | int/long[3]
uint32 | Uses variable-length encoding. | uint32 | int[1] | int/long[3]
uint64 | Uses variable-length encoding. | uint64 | long[1] | int/long[3]
sint32 | Uses variable-length encoding. Signed int value. These encode negative numbers more efficiently than regular int32s. | int32 | int | int
sint64 | Uses variable-length encoding. Signed int value. These encode negative numbers more efficiently than regular int64s. | int64 | long | int/long[3]
fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32 | int[1] | int
fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64 | long[1] | int/long[3]
sfixed32 | Always four bytes. | int32 | int | int
sfixed64 | Always eight bytes. | int64 | long | int/long[3]
bool | | bool | boolean | boolean
string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string | String | str/unicode[4]
bytes | May contain any arbitrary sequence of bytes. | string | ByteString | str

You can find out more about how these types are encoded when you serialize your message in Protocol Buffer Encoding.

[1] In Java, unsigned 32-bit and 64-bit integers are represented using their signed counterparts, with the top bit simply being stored in the sign bit.

[2] In all cases, setting a value on a field performs type checking to make sure the value is valid.

[3] 64-bit or unsigned 32-bit integers are always represented as long when decoded, but can be an int if an int is given when setting the field. In all cases, the value must fit in the type represented when it is set. See [2].

[4] Python strings are represented as unicode when decoded, but can be str if an ASCII string is given (this is subject to change).
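The size notes in the scalar type table can be checked with a few lines of Python. The sketch below assumes the standard wire format: a negative int32 is sign-extended to 64 bits before varint encoding, sint32 applies ZigZag mapping first, and fixed32 is always four little-endian bytes. The helper names are illustrative and are not part of any protobuf API.

```python
import struct


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_int32(value: int) -> bytes:
    """int32: negative values are treated as 64-bit two's complement."""
    return encode_varint(value & 0xFFFFFFFFFFFFFFFF)


def encode_sint32(value: int) -> bytes:
    """sint32: ZigZag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ... before the varint."""
    zigzag = ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF
    return encode_varint(zigzag)


def encode_fixed32(value: int) -> bytes:
    """fixed32: always four bytes, little-endian."""
    return struct.pack("<I", value)


sizes = {
    "int32(-1)": len(encode_int32(-1)),
    "sint32(-1)": len(encode_sint32(-1)),
    "uint32(2**28)": len(encode_varint(2**28)),
    "fixed32(2**28)": len(encode_fixed32(2**28)),
}
print(sizes)  # {'int32(-1)': 10, 'sint32(-1)': 1, 'uint32(2**28)': 5, 'fixed32(2**28)': 4}
```

A varint carries 7 bits per byte, so a value of 2^28 or more needs five bytes as a uint32, which is why fixed32 is the better choice for fields that usually hold large values.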

Optional fields and default values

As mentioned above, fields in a message description can be labeled optional. A well-formed message may or may not contain an optional field. When a message is parsed and it does not contain an optional field, the corresponding field in the parsed object is set to the default value for that field. The default value can be specified as part of the message description. For example, say you want to give SearchRequest's result_per_page a default value of 10:

    optional int32 result_per_page = 3 [default = 10];

If no default value is specified for an optional field, a type-specific default value is used instead:

1. string: the empty string.

2. bool: false.

3. Numeric types: zero.

4. enum: the first value listed in the enum's type definition.
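The fallback rules above can be summarized as a small lookup. This is only an illustrative Python sketch of the rule, not how the protobuf runtime implements it; the function names and the string-based type labels are assumptions made for the example.

```python
NUMERIC_TYPES = {
    "double", "float", "int32", "int64", "uint32", "uint64",
    "sint32", "sint64", "fixed32", "fixed64", "sfixed32", "sfixed64",
}


def implicit_default(field_type: str, enum_values=()):
    """Type-specific default used when the .proto gives no [default = ...]."""
    if field_type == "string":
        return ""
    if field_type == "bool":
        return False
    if field_type in NUMERIC_TYPES:
        return 0
    if field_type == "enum":
        return enum_values[0]  # first value listed in the enum definition
    raise ValueError("no implicit default known for type: " + field_type)


def effective_default(field_type: str, explicit_default=None, enum_values=()):
    """Prefer the explicit default from the .proto, else fall back to the type's default."""
    if explicit_default is not None:
        return explicit_default
    return implicit_default(field_type, enum_values)


print(effective_default("int32", explicit_default=10))              # 10
print(effective_default("int32"))                                    # 0
print(effective_default("enum", enum_values=("UNIVERSAL", "WEB")))  # UNIVERSAL
```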

Enumerations

When you define a message type, you might want one of its fields to take only one of a predefined list of values. For example, say you want to add a corpus field to each SearchRequest, where the corpus can be UNIVERSAL, WEB, IMAGES, LOCAL, NEWS, PRODUCTS or VIDEO. You can do this simply by adding an enum to your message definition. A field with an enum type can only have one of a specified set of constants as its value (if you try to provide a different value, the parser treats the field as an unknown field). In the following example we add an enum called Corpus with all the possible values, and a field of type Corpus:

    message SearchRequest {
      required string query = 1;
      optional int32 page_number = 2;
      optional int32 result_per_page = 3 [default = 10];
      enum Corpus {
        UNIVERSAL = 0;
        WEB = 1;
        IMAGES = 2;
        LOCAL = 3;
        NEWS = 4;
        PRODUCTS = 5;
        VIDEO = 6;
      }
      optional Corpus corpus = 4 [default = UNIVERSAL];
    }

You can define aliases by assigning the same value to different enum constants. To do this you need to set the allow_alias option to true; otherwise the compiler reports an error when it finds an alias.

    enum EnumAllowingAlias {
      option allow_alias = true;
      UNKNOWN = 0;
      STARTED = 1;
      RUNNING = 1;
    }
    enum EnumNotAllowingAlias {
      UNKNOWN = 0;
      STARTED = 1;
      // RUNNING = 1;  // Uncommenting this line causes a compile error.
    }
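As an analogy only, Python's standard enum module has the same two behaviors: two names with the same value become aliases by default, and the enum.unique decorator rejects them. The sketch below uses plain Python enums, not protobuf-generated code, to show what an alias means in practice.

```python
import enum


class EnumAllowingAlias(enum.IntEnum):
    UNKNOWN = 0
    STARTED = 1
    RUNNING = 1  # same value as STARTED, so RUNNING is an alias of STARTED


duplicate_error = None
try:
    @enum.unique  # plays the role of leaving allow_alias unset
    class EnumNotAllowingAlias(enum.IntEnum):
        UNKNOWN = 0
        STARTED = 1
        RUNNING = 1
except ValueError as exc:
    duplicate_error = str(exc)

print(EnumAllowingAlias.RUNNING is EnumAllowingAlias.STARTED)  # True
print(EnumAllowingAlias(1).name)                               # STARTED
print(duplicate_error is not None)                             # True
```

An alias is a second name for the same number, so looking the number up returns the first name that was declared.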

Enumeration constants must be in the range of a 32-bit integer. Because enum values are encoded as variable-length integers, negative values are very inefficient and are not recommended. You can define an enum inside a message definition, as in the example above, or outside of it; enums defined this way can be reused in any message definition in your .proto file. You can also use an enum type declared in one message as the type of a field in a different message, using the syntax MessageType.EnumType. When you run the compiler on a .proto file that uses an enum, the generated code has a corresponding enum for Java or C++, or a special EnumDescriptor class for Python that is used to create a set of symbolic constants at runtime.

For more information about working with enums in your applications, see the generated code guide for your chosen language.

To be continued...
