This guide describes how to use the protocol buffer language to structure your protocol buffer data, including the syntax of .proto files and how to generate data access classes from a .proto file.
Defining a Message Type
syntax = "proto3";
message SearchRequest {
string query = 1;
int32 page_number = 2;
int32 result_per_page = 3;
}
- Only blank lines or comments may appear before the syntax statement.
- Each field definition has four parts: a field rule, a field type, a field name, and a field number.
Specifying Field Types
In the example above, the message defines three fields: two of type int32 and one of type string.
Assigning Field Numbers
Each field in a message has a unique field number. Field numbers 1 to 15 take one byte to encode and 16 to 2047 take two bytes, so numbers 1 to 15 should be reserved for frequently used fields.
The smallest field number you can specify is 1, and the largest is 2^29 - 1, or 536,870,911. The numbers 19000 to 19999 cannot be used, because they are reserved for the Protocol Buffers implementation.
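The byte counts above follow from how a field's tag is written on the wire: the field number is shifted left by three bits, combined with a 3-bit wire type, and stored as a base-128 varint. The Python sketch below only illustrates that arithmetic; the helper names encode_varint and tag_size are made up for this example and are not part of any protobuf library.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("varints encode non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def tag_size(field_number: int, wire_type: int = 0) -> int:
    """Bytes needed for a field tag, i.e. varint((field_number << 3) | wire_type)."""
    return len(encode_varint((field_number << 3) | wire_type))


for number in (1, 15, 16, 2047, 2048):
    print(number, tag_size(number))
```

Field number 15 is the last one whose tag fits in seven bits, and 2047 is the last one that fits in fourteen, which is where the 1-to-15 and 16-to-2047 boundaries come from.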
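Specifying Field Rules
- required: the field must be given a value. This rule exists only in proto2; a proto3 file cannot use it.
- optional: the field may be left unset. In proto3, a field declared without a rule is singular and can likewise be left unset.
- repeated: the field can appear zero or more times, with the order of values preserved (a variable-length list).
Adding More Message Types
A .proto file can define multiple message types: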
message SearchRequest {
string query = 1;
int32 page_number = 2;
int32 result_per_page = 3;
}
message SearchResponse {
...
}
Adding Comments
.proto files use the C/C++-style // comment syntax:
message SearchRequest {
string query = 1;
int32 page_number = 2; // Which page number do we want?
int32 result_per_page = 3; // Number of results to return per page.
}
Reserved Fields
If you update a message by deleting a field or commenting it out, a later user may reuse its field number for a new field. That can cause serious problems, such as data corruption and privacy bugs, when old and new versions of the message meet. One way to avoid this is to declare the field numbers (and names) of deleted fields as reserved. The protocol buffer compiler complains if anyone later tries to use them.
message Foo {
reserved 2, 15, 9 to 11;
reserved "foo", "bar";
}
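To make the effect of the two reserved statements concrete, here is a small Python sketch of the kind of check the compiler performs. It is only an illustration: check_field and the two constants are hypothetical names, and the real checks live inside protoc.

```python
# Inclusive ranges taken from: reserved 2, 15, 9 to 11;
RESERVED_NUMBERS = [(2, 2), (15, 15), (9, 11)]
# Names taken from: reserved "foo", "bar";
RESERVED_NAMES = {"foo", "bar"}


def check_field(name: str, number: int) -> list[str]:
    """Return the problems a new field would cause in message Foo."""
    problems = []
    if any(low <= number <= high for low, high in RESERVED_NUMBERS):
        problems.append(f"field number {number} is reserved")
    if name in RESERVED_NAMES:
        problems.append(f'field name "{name}" is reserved')
    return problems


print(check_field("baz", 10))  # number falls inside "9 to 11"
print(check_field("foo", 3))   # name is reserved
print(check_field("baz", 12))  # no conflict
```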
What's Generated from Your .proto?
For C++, the compiler generates one .h file and one .cc file from each .proto file.
Scalar Value Types
.proto Type | Notes | C++ Type
double | | double
float | | float
int32 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint32 instead. | int32
int64 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint64 instead. | int64
uint32 | Uses variable-length encoding. | uint32
uint64 | Uses variable-length encoding. | uint64
sint32 | Uses variable-length encoding. Signed int value. These encode negative numbers more efficiently than regular int32s. | int32
sint64 | Uses variable-length encoding. Signed int value. These encode negative numbers more efficiently than regular int64s. | int64
fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32
fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64
sfixed32 | Always four bytes. | int32
sfixed64 | Always eight bytes. | int64
bool | | bool
string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string
bytes | May contain any arbitrary sequence of bytes. | string
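The notes about negative numbers and about the 2^28 threshold can be checked with a little arithmetic. The Python sketch below assumes the standard wire encoding: a negative int32 is sign-extended to 64 bits before it is written as a varint, while sint32 first applies ZigZag mapping. The function names are illustrative only.

```python
def varint_size(value: int) -> int:
    """Bytes needed to store an unsigned value as a base-128 varint."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def int32_wire_size(n: int) -> int:
    """int32: negative values are sign-extended to 64 bits, then varint-encoded."""
    return varint_size(n & 0xFFFFFFFFFFFFFFFF)


def zigzag32(n: int) -> int:
    """ZigZag mapping used by sint32: 0, -1, 1, -2, ... become 0, 1, 2, 3, ..."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def sint32_wire_size(n: int) -> int:
    return varint_size(zigzag32(n))


for n in (1, -1, -64, 64):
    print(n, int32_wire_size(n), sint32_wire_size(n))
print(varint_size(2**28 - 1), varint_size(2**28))
```

A value of -1 costs ten bytes as int32 but one byte as sint32. A uint32 value of 2^28 or more needs five varint bytes, which is why fixed32 (always four bytes) wins for large values.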
Default Values
If a field has no value set, the default for its type is used: for string, the empty string; for bool, false; for numeric types, 0; for enum, the first value in the enum definition; for repeated fields, an empty list.
Enumerations
message SearchRequest {
string query = 1;
int32 page_number = 2;
int32 result_per_page = 3;
enum Corpus {
UNIVERSAL = 0;
WEB = 1;
IMAGES = 2;
LOCAL = 3;
NEWS = 4;
PRODUCTS = 5;
VIDEO = 6;
}
Corpus corpus = 4;
}
Setting the allow_alias option to true lets you define aliases, that is, several enum constants that share the same numeric value:
enum EnumAllowingAlias {
option allow_alias = true;
UNKNOWN = 0;
STARTED = 1;
RUNNING = 1;
}
enum EnumNotAllowingAlias {
UNKNOWN = 0;
STARTED = 1;
// RUNNING = 1; // Uncommenting this line will cause a compile error inside Google and a warning message outside.
}
Enum values are varint-encoded on the wire, so negative values are inefficient and not recommended. An enum defined inside one message can also be reused in other message definitions, using the syntax MessageType.EnumType.
Using Other Message Types
You can use the definition of one message as the field type of another message.
message SearchResponse {
repeated Result results = 1;
}
message Result {
string url = 1;
string title = 2;
repeated string snippets = 3;
}
Importing Definitions
As with C++ header files, you can import definitions from other .proto files:
import "myproject/other_protos.proto";
If you want to move a .proto file without updating every import statement in the project, leave a placeholder .proto file at the old location and use import public in it to forward all imports to the new location:
// new.proto
// All definitions are moved here
// old.proto
// This is the proto that all clients are importing.
import public "new.proto";
import "other.proto";
// client.proto
import "old.proto";
// You use definitions from old.proto and new.proto, but not other.proto
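The visibility rule in this example can be modelled in a few lines of Python: a file sees its direct imports, plus anything those imports re-export through import public. This is only a sketch of the rule as described above, not how protoc is implemented; the IMPORTS table and visible_files are hypothetical names.

```python
# Each entry maps a file to its imports as (imported_file, is_public) pairs,
# mirroring new.proto, old.proto, other.proto and client.proto above.
IMPORTS = {
    "new.proto": [],
    "other.proto": [],
    "old.proto": [("new.proto", True), ("other.proto", False)],
    "client.proto": [("old.proto", False)],
}


def visible_files(file: str, imports: dict = IMPORTS) -> set[str]:
    """Files whose definitions can be used from `file`."""
    visible = set()
    pending = [name for name, _ in imports[file]]  # direct imports are always visible
    while pending:
        current = pending.pop()
        if current in visible:
            continue
        visible.add(current)
        # Only public imports are passed on to the importing file.
        pending.extend(name for name, is_public in imports[current] if is_public)
    return visible


print(sorted(visible_files("client.proto")))
```

Under this model client.proto sees old.proto and new.proto but not other.proto, matching the comment in the example.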
Nested Types
You can define message types inside other message types, as in the following example:
message SearchResponse {
message Result {
string url = 1;
string title = 2;
repeated string snippets = 3;
}
repeated Result results = 1;
}
To use the nested Result type outside its parent message, refer to it as Parent.Type:
message SomeOtherMessage {
SearchResponse.Result result = 1;
}
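Messages can be nested as deeply as you like, but deep nesting is discouraged because it makes the structure harder to follow. In the example below, the two types named Inner are independent of each other because they are defined in different parent messages: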
message Outer { // Level 0
message MiddleAA { // Level 1
message Inner { // Level 2
int64 ival = 1;
bool booly = 2;
}
}
message MiddleBB { // Level 1
message Inner { // Level 2
int32 ival = 1;
bool booly = 2;
}
}
}
Updating a Message Type
In real projects, requirements change and a message format has to evolve while some applications built against the old format cannot be upgraded right away. Following the rules below keeps old and new programs interoperable while both formats are in use:
- Do not change the field number of any existing field.
- Any newly added field must be optional or repeated; otherwise compatibility between old and new programs is not guaranteed when they exchange messages.
- Existing required fields must not be removed. Optional and repeated fields can be removed, but the field numbers they used must be kept and never reused by new fields.
- int32, uint32, int64, uint64, and bool are compatible with one another; sint32 and sint64 are compatible with each other; string and bytes are compatible as long as the bytes are valid UTF-8; fixed32 is compatible with sfixed32, and fixed64 with sfixed64. If you change the type of an existing field, change it only to a type that is compatible with the original, or compatibility between the old and new formats is broken.
- The optional and repeated rules are also compatible with each other.
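The type-compatibility rule works because int32, uint32, int64, uint64, and bool all share the same varint wire format, so a reader simply reinterprets the decoded number. The Python sketch below illustrates this under that assumption; the helper names are made up, and the truncation it shows mirrors what a cast to the narrower type would do.

```python
def decode_varint(data: bytes) -> int:
    """Decode one base-128 varint from the start of `data`."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # no continuation bit: last byte
            return result
        shift += 7
    raise ValueError("truncated varint")


def as_uint32(raw: int) -> int:
    return raw & 0xFFFFFFFF


def as_int32(raw: int) -> int:
    value = raw & 0xFFFFFFFF  # values wider than 32 bits are truncated
    return value - (1 << 32) if value & 0x80000000 else value


def as_int64(raw: int) -> int:
    value = raw & 0xFFFFFFFFFFFFFFFF
    return value - (1 << 64) if value >> 63 else value


def as_bool(raw: int) -> bool:
    return raw != 0


payload = b"\xff" * 9 + b"\x01"  # what a writer emits for an int64 field set to -1
raw = decode_varint(payload)
print(as_int64(raw), as_int32(raw), as_uint32(raw), as_bool(raw))
```

The same ten bytes read back as -1 for int64 and int32, as 4294967295 for uint32, and as true for bool, so the field still parses after a type change, although the value can differ when it does not fit the new type.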
Any
The Any message type lets you embed a message of any type without having its .proto definition at hand. To use it, import google/protobuf/any.proto:
import "google/protobuf/any.proto";
message ErrorStatus {
string message = 1;
repeated google.protobuf.Any details = 2;
}
In C++, the PackFrom() and UnpackTo() methods pack and unpack Any values:
// Storing an arbitrary message type in Any.
NetworkErrorDetails details = ...;
ErrorStatus status;
status.add_details()->PackFrom(details);
// Reading an arbitrary message from Any.
ErrorStatus status = ...;
for (const Any& detail : status.details()) {
  if (detail.Is<NetworkErrorDetails>()) {
    NetworkErrorDetails network_error;
    detail.UnpackTo(&network_error);
    ... processing network_error ...
  }
}
Oneof
A oneof is somewhat like a union in C++: it groups several fields of a message of which at most one is set at any time. Depending on the language, use the case() or WhichOneof() method to find out which field, if any, is set.
Using Oneof
message SampleMessage {
oneof test_oneof {
string name = 4;
SubMessage sub_message = 9;
}
}
You can add fields of any type to a oneof definition, except repeated fields.
Oneof Features
- Setting a oneof member automatically clears every other member of the oneof, so only the most recently set member keeps its value:
SampleMessage message;
message.set_name("name");
CHECK(message.has_name());
message.mutable_sub_message(); // Will clear name field.
CHECK(!message.has_name());
- A oneof cannot be repeated.
- The reflection APIs work for oneof fields.
- In C++, watch out for dangling pointers. Setting another member of the oneof deletes the previously set field object, so any pointer you still hold to it becomes invalid:
SampleMessage message;
SubMessage* sub_message = message.mutable_sub_message();
message.set_name("name"); // Will delete sub_message
sub_message->set_... // Crashes here
- In C++, if you Swap() two messages that contain oneofs, each message ends up with the other's oneof case. In the following example, msg1 will hold sub_message and msg2 will hold name:
SampleMessage msg1;
msg1.set_name("name");
SampleMessage msg2;
msg2.mutable_sub_message();
msg1.Swap(&msg2);
CHECK(msg1.has_sub_message());
CHECK(msg2.has_name());
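The set-clears-the-others behaviour described in this list can be pictured with a toy model. The Python class below is a hypothetical stand-in written for this guide, not generated protobuf code: it stores a single (member, value) pair, which is why setting one member necessarily drops the previous one.

```python
class OneofModel:
    """Toy model of one oneof group: at most one member holds a value."""

    def __init__(self, members):
        self._members = tuple(members)
        self._case = None   # name of the member that is currently set
        self._value = None

    def set(self, member, value):
        if member not in self._members:
            raise KeyError(member)
        # Overwriting the single slot is what clears the previous member.
        self._case, self._value = member, value

    def has(self, member):
        return self._case == member

    def which(self):
        """Name of the set member, or None when nothing is set."""
        return self._case


message = OneofModel(["name", "sub_message"])
message.set("name", "name")
print(message.has("name"))        # True
message.set("sub_message", {})    # clears name
print(message.has("name"), message.which())
```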
Backwards-Compatibility Issues
Be careful when adding or removing oneof fields. If checking the value of a oneof returns None/NOT_SET, it could mean either that the oneof has not been set or that it has been set to a field from a different version of the oneof. There is no way to tell the two cases apart, because there is no way to know whether an unknown field on the wire is a member of the oneof.
Tag Reuse Issues
- Moving fields into or out of a oneof: you may lose some information after the message is serialized and parsed, because some fields will be cleared.
- Deleting a oneof field and adding it back: this may clear the currently set oneof field after the message is serialized and parsed.
- Splitting or merging oneofs: this has the same issues as moving regular fields.
Maps
Protocol buffers provides a shorthand syntax for declaring map fields:
map<key_type, value_type> map_field = N;
key_type can be any integral or string type, that is, any scalar type except the floating-point types and bytes. value_type can be any type except another map.
map<string, Project> projects = 3;
- Map fields cannot be repeated.
- Wire-format ordering and iteration ordering of map entries are undefined, so you cannot rely on map items being in any particular order.
- When text format is generated, maps are sorted by key; numeric keys are sorted numerically.
- When parsing from the wire or merging, the last value seen for a duplicate key is used. When parsing a map from text format, parsing may fail if there are duplicate keys.
Backwards Compatibility
The map syntax is equivalent to the following on the wire, so protocol buffers implementations that do not support maps can still handle your data:
message MapFieldEntry {
key_type key = 1;
value_type value = 2;
}
repeated MapFieldEntry map_field = N;
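Because a map is just a repeated list of key/value entries on the wire, a reader can rebuild it by walking the entries in order. The Python sketch below shows that idea, including the rule that the last entry wins for a duplicate key. The function names are illustrative, and the entries are plain tuples rather than real MapFieldEntry messages.

```python
def entries_to_map(entries):
    """Rebuild a map from (key, value) entries in wire order."""
    result = {}
    for key, value in entries:
        result[key] = value  # a later duplicate key overwrites the earlier value
    return result


def map_to_entries(mapping):
    """Flatten a map back into (key, value) entries; order is not significant."""
    return list(mapping.items())


wire_entries = [("alpha", 1), ("beta", 2), ("alpha", 3)]
print(entries_to_map(wire_entries))
```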
Packages
You can add an optional package specifier to a .proto file. Like a C++ namespace, it prevents name clashes between message types:
package foo.bar;
message Open { ... }
You can then use the package specifier when declaring fields of that message type:
message Foo {
...
foo.bar.Open open = 1;
...
}
Defining Services
If you want to use your message types with an RPC system, define an RPC service interface in the .proto file; the protocol buffer compiler then generates the service interface code and stubs in your chosen language.
service SearchService {
rpc Search (SearchRequest) returns (SearchResponse);
}
JSON Mapping
Proto3 supports a canonical encoding in JSON. If a value is missing from the JSON-encoded data, or is null, it is interpreted as the default value of the field when parsed into a protocol buffer. Conversely, a field that holds its default value is omitted from the JSON output to save space.
proto3 | JSON | JSON example | Notes
message | object | {"fBar": v, "g": null, ...} | Generates JSON objects. Message field names are mapped to lowerCamelCase and become JSON object keys. null is accepted and treated as the default value of the corresponding field type.
enum | string | "FOO_BAR" | The name of the enum value as specified in proto is used.
map<K,V> | object | {"k": v, ...} | All keys are converted to strings.
repeated V | array | [v, ...] | null is accepted as the empty list [].
bool | true, false | true, false |
string | string | "Hello World!" |
bytes | base64 string | "YWJjMTIzIT8kKiYoKSctPUB+" |
int32, fixed32, uint32 | number | 1, -10, 0 | JSON value will be a decimal number. Either numbers or strings are accepted.
int64, fixed64, uint64 | string | "1", "-10" | JSON value will be a decimal string. Either numbers or strings are accepted.
float, double | number | 1.1, -10.0, 0, "NaN", "Infinity" | JSON value will be a number or one of the special string values "NaN", "Infinity", and "-Infinity". Either numbers or strings are accepted. Exponent notation is also accepted.
Any | object | {"@type": "url", "f": v, ...} | If the Any contains a value that has a special JSON mapping, it is converted as follows: {"@type": xxx, "value": yyy}. Otherwise, the value is converted into a JSON object, and the "@type" field is inserted to indicate the actual data type.
Timestamp | string | "1972-01-01T10:00:20.021Z" | Uses RFC 3339, where generated output is always Z-normalized and uses 0, 3, 6, or 9 fractional digits.
Duration | string | "1.000340012s", "1s" | Generated output always contains 0, 3, 6, or 9 fractional digits, depending on the required precision. Any number of fractional digits (including none) is accepted as long as the value fits into nanosecond precision.
Struct | object | { ... } | Any JSON object. See struct.proto.
Wrapper types | various types | 2, "2", "foo", true, "true", null, 0, ... | Wrappers use the same representation in JSON as the wrapped primitive type, except that null is allowed and preserved during data conversion and transfer.
FieldMask | string | "f.fooBar,h" | See field_mask.proto.
ListValue | array | [foo, bar, ...] |
Value | value | | Any JSON value
NullValue | null | | JSON null
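A few rows of the table can be illustrated with plain Python: field names become lowerCamelCase keys, 64-bit integers are written as decimal strings, and bytes are written as base64. This is a rough sketch of those three rules only, not the protobuf JSON serializer; the helper names and the example fields total_hits and raw_token are invented for the illustration.

```python
import base64
import json

# Types the table says are written as JSON strings.
STRING_ENCODED_INTS = {"int64", "fixed64", "uint64"}


def to_lower_camel(field_name: str) -> str:
    """Drop each underscore and capitalize the character that follows it."""
    out = []
    upper_next = False
    for ch in field_name:
        if ch == "_":
            upper_next = True
        elif upper_next:
            out.append(ch.upper())
            upper_next = False
        else:
            out.append(ch)
    return "".join(out)


def to_json_value(proto_type: str, value):
    if proto_type in STRING_ENCODED_INTS:
        return str(value)
    if proto_type == "bytes":
        return base64.b64encode(value).decode("ascii")
    return value


def message_to_json(fields) -> str:
    """`fields` is a list of (field_name, proto_type, value) triples."""
    return json.dumps(
        {to_lower_camel(name): to_json_value(kind, value) for name, kind, value in fields}
    )


example = [
    ("page_number", "int32", 2),
    ("total_hits", "int64", -10),               # hypothetical field
    ("raw_token", "bytes", b"abc123!?$*&()'-=@~"),  # hypothetical field
]
print(message_to_json(example))
```

Writing 64-bit integers as strings avoids precision loss in JSON parsers that store every number as a double.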
Options
Protocol buffers lets you set options in a .proto file. They tell the protocol buffer compiler how to generate code that better fits the target language. The built-in options fall into three levels:
- File-level options affect all messages and enumerations defined in the current file.
- Message-level options affect one message and all the fields it contains.
- Field-level options affect only the field they are attached to.
Some commonly used options are listed below.
- optimize_for (file option): can be set to SPEED, CODE_SIZE, or LITE_RUNTIME, for example option optimize_for = CODE_SIZE;. The value affects the generated C++ code as follows:
  - SPEED (default): the compiler generates code for serializing, parsing, and performing other common operations on your message types. This code is the most highly optimized, at the cost of larger generated code.
  - CODE_SIZE: the compiler generates minimal classes. The generated code is much smaller but slower than with SPEED.
  - LITE_RUNTIME: the compiler generates classes that depend only on the "lite" runtime library (libprotobuf-lite instead of libprotobuf). The lite runtime is much smaller than the full library but omits features such as descriptors and reflection. This option is typically used on constrained platforms such as mobile phones.
- cc_enable_arenas (file option): enables arena allocation for the generated C++ code.
- deprecated (file option):