Protocol Buffer (protobuf)

Last Update:2015-07-17 Source: Internet

Author: User



Install and use Protocol Buffer

The previous blog post presented a comprehensive case; this post introduces Protocol Buffer in detail.

Why protocol buffer? How to install it?

First, to use Protocol Buffer you need a working Maven installation. Download Maven from http://maven.apache.org/download.cgi.

1. After decompression, add the Maven bin directory to your PATH environment variable.

2. Make sure that your JAVA_HOME variable points to your JDK home directory. If your system variables do not contain JAVA_HOME, create it.

3. Open the command line and enter "mvn --version". If the version information is printed correctly, the installation is successful.

After installing Maven, install Protocol Buffer from http://code.google.com/p/protobuf/downloads/list. Download the protobuf-2.4.1.zip and protoc-2.4.1-win32.zip packages.

1. After decompression, there are two options: (1) add the directory that contains protoc.exe (inside protoc-2.4.1-win32) to the PATH environment variable, or (2) copy protoc.exe to the C:\windows\system32 directory. The second method is recommended here.

2. Copy the protoc.exe file to the decompressed protobuf-2.4.1\src directory.

3. Go to the protobuf-2.4.1\java directory and run the mvn package command to build the package. Maven generates the protobuf-java-2.4.1.jar file in the target directory (note that a network connection is required while it runs, and the first build may take some time).

4. If your data files are in the XXX\data directory, copy the jar file generated in the previous step to this directory.

5. Go to the XXX\protobuf-2.4.1\examples directory, where you can see the addressbook.proto file. Run the command "protoc --java_out=. addressbook.proto" on the command line (note the space between the "." and addressbook.proto; I had to repeat this step because I missed it the first time). If the com folder is generated and contains the AddressBookProtos class, the installation is successful.

6. Open Eclipse, choose Window --> Preferences --> Java --> Installed JREs, edit your default JRE, and add the protobuf-java-2.4.1.jar file mentioned above.

The above content is excerpted from the Internet. I have verified that Protocol Buffer can be installed correctly by following these steps.

The following content is excerpted from http://www.cnblogs.com/stephen-liu74/archive/2013/01/02/2841485.html

1. Advantages of Protobuf

Protobuf is like XML, but smaller, faster, and simpler. You define your own data structure, then use the code produced by the code generator to read and write it. You can even update the data structure without redeploying the program. By describing the data structure once with Protobuf, you can easily read and write your structured data in different languages and from different data streams.

It also has very good "backward" compatibility: you can upgrade a data structure without breaking deployed programs that rely on the "old" data format. Your program therefore does not face large-scale refactoring or migration when the message structure changes, because adding a field to a message does not require any change to programs that have already been released.

Protobuf has clearer semantics and does not need anything like an XML parser, because the Protobuf compiler compiles the .proto file into data access classes that serialize and deserialize Protobuf data.

Using Protobuf does not require learning a complex document object model. Its programming model is friendly and easy to learn, and it comes with good documentation and examples. For people who like simple things, Protobuf is more attractive than other technologies.

2. Define the First Protocol Buffer message.
Create a file with the .proto extension, for example MyMessage.proto, and save the following content to it.
message LogonReqMessage {
    required int64 acctID = 1;
    required string passwd = 2;
}
The key points of the message definition above are described below.
1. message is the keyword that defines a message. It is equivalent to struct/class in C++ or class in Java.
2. LogonReqMessage is the message name, which is equivalent to a struct name or class name.
3. The required prefix indicates that the field is mandatory: it must be assigned a value before the message is serialized, and it must be present when the message is deserialized. Protocol Buffer has two other similar keywords, optional and repeated; fields with these qualifiers carry no such restriction. Compared with optional, repeated is mainly used to represent array fields. Their usage is shown in the examples that follow.
4. int64 and string indicate that the fields are a 64-bit integer and a string, respectively. Protocol Buffer provides a type table that maps its data types to the types used in other programming languages (C++/Java). The table also notes which type is more efficient in which data scenario. It is given later in this article.
5. acctID and passwd are the message field names, which are equivalent to field names in Java or member variable names in C++.
6. The tag numbers 1 and 2 identify the fields in the serialized binary data and determine their layout position. In this example, the encoded passwd data comes after the acctID data. A tag number cannot be repeated within the same message. In addition, fields with tag numbers 1 to 15 are encoded more compactly: the tag number and type information together occupy only one byte, while tag numbers 16 to 2047 occupy two bytes. The largest tag number that Protocol Buffer supports is 2 to the 29th power minus one. In view of this, when designing a message structure, try to give repeated fields tag numbers between 1 and 15, which can effectively reduce the number of bytes after encoding.
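The one-byte and two-byte ranges in point 6 can be checked with a small Python sketch. It assumes the standard wire format, in which each field key is the value (tag number << 3) | wire type written as a base-128 varint. The helper names encode_varint and key_size are illustrative only and are not part of any Protocol Buffer library.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits of the remaining value
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


WIRE_VARINT = 0  # wire type used by int32, int64, bool, and enum fields


def key_size(field_number: int, wire_type: int = WIRE_VARINT) -> int:
    """Number of bytes taken by the field key (tag number plus wire type)."""
    return len(encode_varint((field_number << 3) | wire_type))


for tag in (1, 15, 16, 2047, 2048):
    print(tag, key_size(tag))
```

Because three bits of the key are reserved for the wire type, only four bits of the first byte are left for the tag number, which is why the one-byte range ends at 15.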

3. Define the Second Protocol Buffer message (containing an enumeration field).
// When defining a Protocol Buffer message, you can add comments in the same way as in C++/Java code.
enum UserStatus {
    OFFLINE = 0;  // the user is offline
    ONLINE = 1;   // the user is online
}
message UserInfo {
    required int64 acctID = 1;
    required string name = 2;
    required UserStatus status = 3;
}
The key points of the definitions above are described below (covering only what the previous section did not).
1. enum is the keyword that defines an enumeration type, which is equivalent to enum in C++/Java.
2. UserStatus is the enumeration name.
3. Unlike enumerations in C++/Java, the separator between enumeration values is a semicolon rather than a comma.
4. OFFLINE and ONLINE are the enumeration values.
5. 0 and 1 are the integer values that correspond to the enumeration values. As in C/C++, you can assign any integer to an enumeration value instead of always starting from 0. For example:
enum OperationCode {
    LOGON_REQ_CODE = 101;
    LOGOUT_REQ_CODE = 102;
    RETRIEVE_BUDDIES_REQ_CODE = 103;

    LOGON_RESP_CODE = 1001;
    LOGOUT_RESP_CODE = 1002;
    RETRIEVE_BUDDIES_RESP_CODE = 1003;
}

4. Define the Third Protocol Buffer message (containing nested message fields).
You can define multiple messages in the same .proto file, which makes it easy to define nested messages. For example:
enum UserStatus {
    OFFLINE = 0;
    ONLINE = 1;
}
message UserInfo {
    required int64 acctID = 1;
    required string name = 2;
    required UserStatus status = 3;
}
message LogonRespMessage {
    // LoginResult is not defined in this excerpt; it is assumed to be declared elsewhere.
    required LoginResult logonResult = 1;
    required UserInfo userInfo = 2;
}
The key points of the definitions above are described below (covering only what the previous two sections did not).
1. The definition of LogonRespMessage uses another message type as the type of one of its fields, as in UserInfo userInfo.
2. In the example above, UserInfo and LogonRespMessage are defined in the same .proto file. Can we also use messages defined in other .proto files? Protocol Buffer provides the import keyword for this: common messages can be defined in one .proto file, and other message definition files can include them by importing that file, for example:
import "myproject/CommonMessages.proto";
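To make the nesting concrete, here is a minimal Python sketch of how a nested message is laid out on the wire, assuming the standard encoding in which an embedded message is written as a length-delimited field (wire type 2). It encodes a UserInfo by hand and then embeds it as field 2 of an outer message, as LogonRespMessage does. The helper names and the sample values (42, "bob") are illustrative, and the logonResult field is left out because LoginResult is not defined in this article.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def varint_field(field_number: int, value: int) -> bytes:
    """Key with wire type 0, followed by the value as a varint."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def bytes_field(field_number: int, payload: bytes) -> bytes:
    """Key with wire type 2, followed by the payload length and the payload."""
    return (encode_varint((field_number << 3) | 2)
            + encode_varint(len(payload))
            + payload)


ONLINE = 1  # UserStatus.ONLINE

# UserInfo { acctID = 42, name = "bob", status = ONLINE }
user_info = (varint_field(1, 42)
             + bytes_field(2, "bob".encode("utf-8"))
             + varint_field(3, ONLINE))

# The userInfo field (tag 2) of the outer message wraps the encoded UserInfo.
user_info_field = bytes_field(2, user_info)

print(user_info.hex())
print(user_info_field.hex())
```

The outer message adds only a key byte and a length prefix around the bytes of the inner message; strings and enumeration values use the same two building blocks.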

5. Basic rules of the qualifiers (required/optional/repeated).
1. Each message can contain 0 or more required fields, and every required field must be set before the message is serialized.
2. Each message can contain 0 or more optional fields.
3. A repeated field can contain 0 or more elements. Note that this differs from built-in arrays in C++, which must contain at least one element.
4. If you add a new field to an existing message and want programs built against the old version to keep reading and writing it normally, the new field must be optional or repeated. The reason is simple: programs built against the old version cannot read or write a new required field.

6. Type table.

.proto Type	Notes	C++ Type	Java Type
double		double	double
float		float	float
int32	Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint32 instead.	int32	int
int64	Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint64 instead.	int64	long
uint32	Uses variable-length encoding.	uint32	int
uint64	Uses variable-length encoding.	uint64	long
sint32	Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s.	int32	int
sint64	Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s.	int64	long
fixed32	Always four bytes. More efficient than uint32 if values are often greater than 2^28.	uint32	int
fixed64	Always eight bytes. More efficient than uint64 if values are often greater than 2^56.	uint64	long
sfixed32	Always four bytes.	int32	int
sfixed64	Always eight bytes.	int64	long
bool		bool	boolean
string	A string must always contain UTF-8 encoded or 7-bit ASCII text.	string	String
bytes	May contain any arbitrary sequence of bytes.	string	ByteString
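The efficiency notes in the table can be reproduced with a short Python sketch. It assumes the standard wire format: int64 values are written as varints of their 64-bit two's complement form, sint64 values are ZigZag-mapped first, and fixed32 values are always four little-endian bytes. The function names are illustrative, not library APIs.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int64(value: int) -> bytes:
    """int32/int64: a negative value is sent as its 64-bit two's complement."""
    return encode_varint(value & MASK64)


def zigzag64(value: int) -> int:
    """Map signed to unsigned so that small magnitudes stay small."""
    return ((value << 1) ^ (value >> 63)) & MASK64


def encode_sint64(value: int) -> bytes:
    """sint32/sint64: ZigZag mapping followed by a varint."""
    return encode_varint(zigzag64(value))


def encode_fixed32(value: int) -> bytes:
    """fixed32: always four little-endian bytes."""
    return struct.pack("<I", value)


print(len(encode_int64(-1)), len(encode_sint64(-1)))
print(len(encode_varint(2**28 - 1)), len(encode_varint(2**28)),
      len(encode_fixed32(2**28)))
```

A negative int64 always costs ten bytes as a varint, while the ZigZag form of a small negative number fits in one byte. A varint needs five bytes from 2^28 upward, which is the point at which the four-byte fixed32 becomes the smaller encoding.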

7. Protocol Buffer message upgrade principles.
In actual development, a message format often has to be upgraded because requirements change, while some applications that use the original format cannot be upgraded immediately. To let old and new programs, built against the old and new message formats, run side by side, follow these rules when upgrading a message format:
1. Do not modify the tag number of an existing field.
2. Any newly added field must use the optional or repeated qualifier. Otherwise, compatibility cannot be guaranteed when new and old programs exchange messages.
3. Existing required fields cannot be removed from the original message. optional and repeated fields can be removed, but the tag numbers they used must stay reserved and must not be reused by new fields.
4. int32, uint32, int64, uint64, and bool are compatible with one another; sint32 and sint64 are compatible; string and bytes are compatible; fixed32 and sfixed32 are compatible, as are fixed64 and sfixed64. This means that to change the type of an existing field while preserving compatibility, you can only change it to a type compatible with the original; otherwise, compatibility between the new and old message formats is broken.
5. The optional and repeated qualifiers are also compatible with each other.
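The following Python sketch illustrates why rules 1 and 2 keep old programs working. It is a simplified reader, not the real Protocol Buffer parser, and it handles only varint and length-delimited fields. The third field, clientVersion, is a hypothetical optional field added in a newer LogonReqMessage; an old reader that only knows tags 1 and 2 skips it.

```python
def decode_varint(data: bytes, pos: int):
    """Read one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # no continuation bit: last byte
            return result, pos
        shift += 7


def parse_known_fields(data: bytes, schema: dict) -> dict:
    """Return {field name: value} for tags in schema; skip unknown tags."""
    fields = {}
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:           # varint
            value, pos = decode_varint(data, pos)
        elif wire_type == 2:         # length-delimited
            length, pos = decode_varint(data, pos)
            value = data[pos:pos + length]
            pos += length
        else:
            raise ValueError("wire type not handled by this sketch")
        if field_number in schema:   # unknown tags are read past and dropped
            fields[schema[field_number]] = value
    return fields


# acctID = 42, passwd = "abc", plus the hypothetical new field clientVersion = 7
NEW_MESSAGE = bytes.fromhex("082a12036162631807")

OLD_SCHEMA = {1: "acctID", 2: "passwd"}
NEW_SCHEMA = {1: "acctID", 2: "passwd", 3: "clientVersion"}

old_view = parse_known_fields(NEW_MESSAGE, OLD_SCHEMA)
new_view = parse_known_fields(NEW_MESSAGE, NEW_SCHEMA)
print(old_view)
print(new_view)
```

The old reader can skip the new field because the wire type in each key says how many bytes to pass over. If an existing tag number were reassigned to a different field, the old reader would silently misinterpret the data instead.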

8. Packages.
We can define a package name in the .proto file, for example:
package ourproject.lyphone;
In the generated C++ file, the package name becomes nested namespaces, that is, namespace ourproject { namespace lyphone { ... } }. In the generated Java file, it becomes the Java package name.

9. Options.
Protocol Buffer lets us define some common options in the .proto file. They instruct the Protocol Buffer compiler to generate code that better fits the target language. The built-in options fall into the following three levels:
1. File level. The option affects all messages and enumerations defined in the current file.
2. Message level. The option affects only one message and all of its fields.
3. Field level. The option affects only the field it is attached to.
The following describes some common Protocol Buffer options.
1. option java_package = "com.companyname.projectname";
java_package is a file-level option. It sets the package name of the generated Java code; in the example above, the package is com.companyname.projectname. The generated Java file is also placed in the com/companyname/projectname subdirectory of the specified output directory. If this option is not specified, the Java package name is the name given by the package keyword. This option has no effect on generated C++ code.
2. option java_outer_classname = "LYPhoneMessage";
java_outer_classname is a file-level option whose main purpose is to explicitly specify the name of the outer class in the generated Java code. If this option is not specified, the outer class name is derived from the file name of the current file, converted to camel case. For example, for my_project.proto the default outer class name is MyProject. This option has no effect on generated C++ code.
Note: Java allows only one top-level public class or interface per .java file, whereas C++ has no such restriction. Therefore, the messages defined in a .proto file are generated as nested classes of the outer class, so that they can all live in the same Java file. To avoid typing the outer class qualifier in practice, you can statically import the outer class into the current Java file, for example: import static com.company.project.LYPhoneMessage.*;
3. option optimize_for = LITE_RUNTIME;
optimize_for is a file-level option. Protocol Buffer defines three optimization levels: SPEED, CODE_SIZE, and LITE_RUNTIME. The default is SPEED.
SPEED: the generated code runs efficiently, but takes more space after compilation.
CODE_SIZE: the opposite of SPEED. The generated code runs less efficiently, but takes less space after compilation. It is usually used on resource-constrained platforms such as mobile devices.
LITE_RUNTIME: the generated code runs efficiently and also takes very little space after compilation, at the cost of the reflection features that Protocol Buffer provides. Therefore, when linking the Protocol Buffer library in C++, we only need to link libprotobuf-lite rather than libprotobuf. In Java, we only need to include protobuf-java-2.4.1-lite.jar rather than protobuf-java-2.4.1.jar.
Note: with the LITE_RUNTIME option, the generated code inherits from MessageLite rather than Message.
4. [packed = true]: for historical reasons, repeated fields of numeric types such as int32 and int64 are not encoded very efficiently. In newer versions of Protocol Buffer, you can add the [packed = true] field option to tell Protocol Buffer to encode such fields more efficiently. For example:
repeated int32 samples = 4 [packed = true];
Note: this option applies only to Protocol Buffer 2.3.0 and later.
5. [default = default_value]: applies to optional fields. If the field was not set during serialization, or does not exist in messages from an earlier version, then on deserialization the optional field is assigned a type-specific default value, for example false for bool and 0 for int32. Protocol Buffer also supports custom default values, for example:
optional int32 result_per_page = 3 [default = 10];
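The effect of [packed = true] can be seen by encoding the samples field both ways in Python. This sketch assumes the standard wire format: without packing, every element carries its own key; with packing, the elements are concatenated into one length-delimited field. The helper names and the sample values are illustrative.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_unpacked(field_number: int, values) -> bytes:
    """One key (wire type 0) per element."""
    out = bytearray()
    for value in values:
        out += encode_varint((field_number << 3) | 0)
        out += encode_varint(value)
    return bytes(out)


def encode_packed(field_number: int, values) -> bytes:
    """One key (wire type 2), one length prefix, then all elements."""
    payload = b"".join(encode_varint(value) for value in values)
    return (encode_varint((field_number << 3) | 2)
            + encode_varint(len(payload))
            + payload)


samples = [3, 270, 86942]  # values for: repeated int32 samples = 4
unpacked = encode_unpacked(4, samples)
packed = encode_packed(4, samples)
print(len(unpacked), unpacked.hex())
print(len(packed), packed.hex())
```

With three elements the saving is a single byte, but the unpacked form pays one key per element, so the gap grows with the length of the list.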

10. Command-line compilation tool.
protoc --proto_path=IMPORT_PATH --cpp_out=DST_DIR --java_out=DST_DIR --python_out=DST_DIR path/to/file.proto
The parameters of the command above are explained below.
1. protoc is the command-line compilation tool provided by Protocol Buffer.
2. --proto_path is equivalent to the -I option. It specifies the directory in which to look for the .proto message definition files to be compiled. This option can be specified multiple times.
3. The --cpp_out option generates C++ code, the --java_out option generates Java code, and the --python_out option generates Python code. The directory that follows each option is where the generated code is stored.
4. path/to/file.proto is the message definition file to be compiled.
Note: for C++, the Protocol Buffer compiler generates a pair of .h and .cc files for each .proto file. The generated files can be added directly to the application's project. For example, the files generated from MyMessage.proto are MyMessage.pb.h and MyMessage.pb.cc.
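As a small illustration of the command line above, the Python sketch below assembles the protoc argument list and derives the C++ file names mentioned in the note. It only builds strings and does not run protoc. The function names and the "generated" output directory are assumptions made for this example.

```python
from pathlib import PurePosixPath


def protoc_command(proto_file, proto_path, cpp_out=None, java_out=None,
                   python_out=None):
    """Build the protoc argument list; nothing is executed here."""
    args = ["protoc", f"--proto_path={proto_path}"]
    for flag, directory in (("--cpp_out", cpp_out),
                            ("--java_out", java_out),
                            ("--python_out", python_out)):
        if directory is not None:    # only request the outputs that were asked for
            args.append(f"{flag}={directory}")
    args.append(proto_file)
    return args


def cpp_output_names(proto_file):
    """File names of the .h/.cc pair generated for one .proto file."""
    stem = PurePosixPath(proto_file).stem
    return [f"{stem}.pb.h", f"{stem}.pb.cc"]


command = protoc_command("MyMessage.proto", proto_path=".", cpp_out="generated")
print(" ".join(command))
print(cpp_output_names("MyMessage.proto"))
```

The resulting list could be passed to subprocess.run on a machine where protoc is installed and on the PATH.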

