1. Introduction to the three main open-source persistence frameworks
1.1 Google Protocol Buffer
Protocol Buffer is Google's data interchange format. It is independent of both language and platform. Google provides implementations for three languages (Java, C++, and Python), each including compiler support and library files for that language. Because the format is binary, exchanging data with it is much faster than with XML. It can be used for data communication between distributed applications or in heterogeneous environments. As an efficient binary transport format with good compatibility, it suits network transmission, configuration files, data storage, and many other uses.
1.2 Apache Thrift
Thrift is a software framework for developing scalable cross-language services. It combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, and OCaml.
Thrift lets you define data types and service interfaces in a simple definition file. Taking that file as input, the compiler generates the code needed to build RPC clients and servers that communicate seamlessly across programming languages.
1.3 Apache Avro
Avro is a data serialization system. It provides rich data structures, a compact and fast binary data format, a container file for storing persistent data, and remote procedure calls (RPC).
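Avro describes each data structure with a schema written in JSON. As a minimal sketch of what that looks like, the following uses only Python's standard json module to load a schema and list its fields. The record name, namespace, and fields are hypothetical and chosen only for illustration; reading or writing real Avro data would require an Avro library, which is not shown here.

```python
import json

# Hypothetical record schema, invented for illustration only.
USER_SCHEMA_JSON = """
{
  "type": "record",
  "name": "User",
  "namespace": "example.avro",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "favorite_number", "type": ["int", "null"]},
    {"name": "favorite_color", "type": ["string", "null"]}
  ]
}
"""


def field_types(schema_json):
    """Parse a record schema and map each field name to its declared type."""
    schema = json.loads(schema_json)
    return {field["name"]: field["type"] for field in schema["fields"]}


if __name__ == "__main__":
    for name, avro_type in field_types(USER_SCHEMA_JSON).items():
        print(name, avro_type)
```

Because the schema is ordinary JSON, any language with a JSON parser can inspect it without generated code.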
Avro relies on schemas. When Avro data is read, the schema used to write it is always present, so each datum can be written with no per-value overhead, which makes serialization fast and compact. Because the data together with its schema is fully self-describing, Avro also works well with dynamic scripting languages. When Avro data is stored in a file, its schema is stored with it, so any program can process the file later. If the reading program expects a different schema, the difference is easy to resolve because both schemas are available. Avro schemas are defined in JSON, which makes Avro easy to implement in languages that already have JSON libraries.
1.4 Comparison and analysis of the three persistence frameworks
Each of the three frameworks has advantages in its own area. For persistence alone, Avro should be the most flexible because it needs no compilation step, and Protocol Buffer should be the most efficient because it stores data in a variable-length binary format.
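The efficiency of the variable-length format comes largely from how Protocol Buffer encodes integers: as base-128 "varints", where each byte carries 7 bits of the value and the high bit signals that more bytes follow. The following is a minimal sketch of that encoding for non-negative integers, not the library's own implementation; the function names are hypothetical.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint (low 7 bits first)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # set the high bit: more bytes follow
        else:
            out.append(byte)         # last byte: high bit stays clear
            return bytes(out)


def decode_varint(data):
    """Decode one varint from the start of data; return (value, bytes_used)."""
    value = 0
    shift = 0
    for index, byte in enumerate(data):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, index + 1
        shift += 7
    raise ValueError("truncated varint")


if __name__ == "__main__":
    for number in (1, 127, 128, 300):
        print(number, encode_varint(number).hex())
```

Small values such as 1 fit in a single byte and 300 takes two bytes, whereas a fixed-width 32-bit integer always takes four.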
Because the current work is mainly about storage and the objects are small and simple, the plan is to use Google Protocol Buffer for persistence.
2. Google Protocol Buffer usage guide
This section is based mainly on Google's official guide at https://developers.google.com/protocol-buffers/docs/proto. Google Protocol Buffer currently supports three development languages: C++, Java, and Python.
2.1 Protocol Buffer development process
Protocol Buffer development starts with defining a .proto file, which describes the message types (classes) and the relationships between them in detail. Compiling the .proto file with the protoc tool generates the corresponding classes in the chosen language, and you can then use these classes in your own program for serialization and deserialization.
2.2 Defining a .proto file
2.2.1 Defining a message type
The following example message illustrates the syntax.
message SearchRequest {
  required string query = 1;
  optional int32 page_number = 2;
  optional int32 result_per_page = 3 [default = 10];
  enum Corpus {
    UNIVERSAL = 0;
    WEB = 1;
    IMAGES = 2;
    LOCAL = 3;
    NEWS = 4;
    PRODUCTS = 5;
    VIDEO = 6;
  }
  optional Corpus corpus = 4 [default = UNIVERSAL];
}
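The compile step from section 2.1 can be sketched as follows, assuming the SearchRequest definition is saved as search_request.proto (a hypothetical filename) and the protoc compiler is installed. The helper names build_protoc_command and compile_proto are invented for this sketch; the --proto_path, --cpp_out, --java_out, and --python_out flags are standard protoc options.

```python
import shutil
import subprocess
from pathlib import Path

# Standard protoc output flags for the three languages covered here.
OUTPUT_FLAGS = {"cpp": "--cpp_out", "java": "--java_out", "python": "--python_out"}


def build_protoc_command(proto_file, out_dir, language="python"):
    """Return the protoc argument list for compiling one .proto file."""
    proto_path = Path(proto_file)
    return [
        "protoc",
        f"--proto_path={proto_path.parent}",       # directory to search for imports
        f"{OUTPUT_FLAGS[language]}={out_dir}",      # where generated code is written
        str(proto_path),
    ]


def compile_proto(proto_file, out_dir, language="python"):
    """Run protoc if it is on PATH; return True only when it succeeds."""
    if shutil.which("protoc") is None:
        return False  # protoc is not installed, so nothing is generated
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    result = subprocess.run(build_protoc_command(proto_file, out_dir, language))
    return result.returncode == 0


if __name__ == "__main__":
    print(" ".join(build_protoc_command("search_request.proto", "gen")))
```

The generated classes in the output directory are what the program then uses to serialize and parse SearchRequest messages.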
2.2.2 Supported field modifiers
required: the field must appear in the serialized stream.
optional: the field may be absent from the serialized stream or appear at most once. A default value can be set.
repeated: the field can appear zero or more times in the serialized stream. For repeated fields of scalar numeric types, the [packed=true] option enables a more compact encoding.
2.2.3 Supported object types
message: defines a message type
enum: defines an enumeration type
2.2.4 Supported value types
The following table lists the types that can be used in a .proto file and the corresponding types in the generated code for each language.
.proto type | Notes | C++ type | Java type |
double | | double | double |
float | | float | float |
int32 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint32 instead. | int32 | int |
int64 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint64 instead. | int64 | long |
uint32 | Uses variable-length encoding. | uint32 | int[1] |
uint64 | Uses variable-length encoding. | uint64 | long[1] |
sint32 |