Big data: the use and principles of Protobuf

Source: Internet
Author: User
Tags xml parser

Introduction

Protobuf (Protocol Buffers) is a flexible and efficient format for serializing structured data. Compared with XML and JSON, Protobuf is smaller, faster, and more convenient. It is cross-language and comes with a compiler (protoc): compile a .proto definition once and protoc generates Java, Python, C++, or other code that can be used directly, so there is no need to write any parsing code by hand. According to the source, a message serialized with Protobuf is about 1/10 the size of its JSON form, 1/20 the size of its XML form, and 1/10 the size of ordinary binary serialization.

Installation

1. Download the source code from https://github.com/google/protobuf
2. Build and install Protobuf:

    tar -xzf protobuf-2.1.0.tar.gz
    cd protobuf-2.1.0
    ./configure --prefix=/usr/local/protobuf
    make
    make check
    make install

3. Configure the environment:

   1) Add the following lines to /etc/profile and ~/.profile (for example with vim):

       export PATH=$PATH:/usr/local/protobuf/bin/
       export PKG_CONFIG_PATH=/usr/local/protobuf/lib/pkgconfig/

   2) Configure the dynamic linker: edit /etc/ld.so.conf and add /usr/local/protobuf/lib to the file (note: add it on a new line).
   3) Run ldconfig.

Comparison with similar technologies

Advantages:

1) Compared with XML, the main advantage of Protobuf is performance. Data is stored in a compact binary form that is 3 to 10 times smaller than XML and 20 to 100 times faster to process.
2) You define your own data structures and then read and write them with the code produced by the code generator. You can even update a data structure without redeploying the program. Describe your data structure once with Protobuf, and you can read and write that structured data from many different languages and many different data streams.
3) Backward compatibility is good. A data structure can be upgraded without breaking deployed programs that still rely on the "old" data format, because adding a field to a message does not require any change to programs that have already been released. This avoids the large-scale refactoring or migration that changes to a message structure would otherwise cause.
4) Protobuf semantics are clearer, and nothing like an XML parser is needed. The Protobuf compiler compiles the .proto file into data access classes that serialize and deserialize the Protobuf data.
5) Using Protobuf does not require learning a complex document object model. Its programming model is friendly and easy to learn, and it has good documentation and examples. For people who like simple things, Protobuf is more appealing than the alternatives.

Disadvantages:

1) Protobuf also has shortcomings compared with XML. Its feature set is simple, and it cannot represent complex concepts.
2) XML has become a standard authoring tool across many industries, whereas Protobuf is mainly a tool used inside Google, so it is far less general-purpose.
3) Protobuf describes data structures rather than text, so it is not suitable for modeling text-based markup documents such as HTML.
4) XML is self-describing to some degree and can be read directly in an editor. Protobuf cannot: it is stored in binary form, and without the .proto definition you cannot read the contents of a Protobuf message at all.

Example comparison

Storing data with XML and with Protobuf:
// Modeling a person's name and email fields in XML:
<person>
    <name>John Doe</name>
    <email>[email protected]</email>
</person>

// The same data in Protocol Buffers (human-readable text form, not the binary wire format):
person {
    name: "John Doe"
    email: "[email protected]"
}

Read data:

// Working with Protocol Buffers is very simple:
cout << "Name: " << person.name() << endl;
cout << "E-mail: " << person.email() << endl;

// With XML you need something like this:
cout << "Name: "
     << person.getElementsByTagName("name")->item(0)->innerText()
     << endl;
cout << "E-mail: "
     << person.getElementsByTagName("email")->item(0)->innerText()
     << endl;
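To make the size difference in this comparison concrete, here is a minimal sketch that hand-encodes the person record in the Protocol Buffers binary wire format (a key byte, a length, then the raw bytes for each string field) and builds the equivalent XML string. It is only an illustration, not how you would normally use Protobuf: the function names are made up for this sketch, and the field numbers 1 (name) and 2 (email) are assumptions, since the original definition is not shown.

```cpp
#include <cstdint>
#include <string>

// Append an unsigned integer as a base-128 varint: 7 bits per byte,
// with the high bit set on every byte except the last.
void AppendVarint(std::string& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));
}

// Append one string field: the key (field number << 3 | wire type 2,
// "length-delimited"), then the byte length, then the raw bytes.
void AppendStringField(std::string& out, std::uint32_t field_number,
                       const std::string& text) {
    AppendVarint(out, (static_cast<std::uint64_t>(field_number) << 3) | 2);
    AppendVarint(out, text.size());
    out.append(text);
}

// Hand-encode a person. Field numbers 1 and 2 are assumed for illustration.
std::string EncodePerson(const std::string& name, const std::string& email) {
    std::string out;
    AppendStringField(out, 1, name);
    AppendStringField(out, 2, email);
    return out;
}

// The same record as compact XML text, for a size comparison.
std::string PersonAsXml(const std::string& name, const std::string& email) {
    return "<person><name>" + name + "</name><email>" + email +
           "</email></person>";
}
```

With the hypothetical values "John Doe" and "jdoe@example.com", EncodePerson returns 28 bytes, while PersonAsXml returns 69 characters even with all whitespace removed. That is about 2.5 times larger for this tiny record; indented XML or numeric fields would widen the gap further.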

Usage scenarios

1. You need to exchange messages with other systems and message size matters. Protobuf fits well here: it is language-independent, and its messages take much less space than XML or JSON.
2. Small data. Protobuf is not a good fit for very large data sets.
3. The project language is C++, Java, or Python. These languages can use Google's own libraries, so serialization and deserialization are very efficient. Other languages depend on third-party or hand-written implementations, and the efficiency of serialization and deserialization is not guaranteed.

Program example (C++)

The example defines a Persion message and an AddressBook that holds Persion entries. A writer program writes this information to a file, and a separate reader program reads it back from the file and prints it.

1. The addressbook.proto file:
package tutorial;

message Persion {
    required string name = 1;
    required int32 age = 2;
}

message AddressBook {
    repeated Persion persion = 1;
}
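Before looking at the compile step, it may help to see roughly what kind of C++ API a definition like the one above turns into. The sketch below is a hand-written stand-in, not the code protoc generates: real generated classes derive from google::protobuf::Message and also provide serialization, parsing, and more. Only the accessor names (set_name, set_age, add_persion, persion_size, persion) mirror the calls used in the programs that follow. The namespace tutorial_sketch and the private members are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>

namespace tutorial_sketch {  // hypothetical namespace, not produced by protoc

// Stand-in for the class generated from "message Persion".
class Persion {
public:
    const std::string& name() const { return name_; }
    void set_name(const std::string& value) { name_ = value; }

    std::int32_t age() const { return age_; }
    void set_age(std::int32_t value) { age_ = value; }

private:
    std::string name_;
    std::int32_t age_ = 0;
};

// Stand-in for the class generated from "message AddressBook".
class AddressBook {
public:
    // Appends an empty entry and returns a pointer so the caller can fill it.
    // A deque keeps earlier pointers valid when more entries are appended.
    Persion* add_persion() {
        entries_.emplace_back();
        return &entries_.back();
    }

    // Number of entries in the repeated field.
    int persion_size() const { return static_cast<int>(entries_.size()); }

    // Read-only access to the entry at the given index.
    const Persion& persion(int index) const {
        return entries_[static_cast<std::size_t>(index)];
    }

private:
    std::deque<Persion> entries_;
};

}  // namespace tutorial_sketch
```

The pattern to notice is that a scalar field x gets x() and set_x(), while a repeated field x gets add_x(), x_size(), and x(index). This is why the writer below calls add_persion() and the reader loops up to persion_size().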
Compile the .proto file with the command:

    protoc -I=$SRC_DIR --cpp_out=$DST_DIR $SRC_DIR/addressbook.proto

In this example the command executed was protoc --cpp_out=/tmp addressbook.proto, which generates the files addressbook.pb.h and addressbook.pb.cc in /tmp.

2. The write.cpp file, which writes the address book information to a file in binary form:
#include <fstream>
#include <iostream>
#include <string>

#include "addressbook.pb.h"

using namespace std;

// Read one entry from standard input and fill in the given message.
void PromptForAddress(tutorial::Persion* persion) {
    cout << "Enter persion name:" << endl;
    string name;
    cin >> name;
    persion->set_name(name);

    int age;
    cin >> age;
    persion->set_age(age);
}

int main(int argc, char** argv) {
    // GOOGLE_PROTOBUF_VERIFY_VERSION;

    if (argc != 2) {
        cerr << "Usage: " << argv[0] << " ADDRESS_BOOK_FILE" << endl;
        return -1;
    }

    tutorial::AddressBook address_book;

    {
        // Load the existing address book, if the file is already there.
        fstream input(argv[1], ios::in | ios::binary);
        if (!input) {
            cout << argv[1] << ": File not found. Creating a new file." << endl;
        } else if (!address_book.ParseFromIstream(&input)) {
            cerr << "Failed to parse address book." << endl;
            return -1;
        }
    }

    // Add an address.
    PromptForAddress(address_book.add_persion());

    {
        // Write the updated address book back to the file.
        fstream output(argv[1], ios::out | ios::trunc | ios::binary);
        if (!address_book.SerializeToOstream(&output)) {
            cerr << "Failed to write address book." << endl;
            return -1;
        }
    }

    // Optional: delete all global objects allocated by libprotobuf.
    // google::protobuf::ShutdownProtobufLibrary();

    return 0;
}
Compile write.cpp with the command:

    g++ addressbook.pb.cc write.cpp -o write `pkg-config --cflags --libs protobuf`

Note that the ` characters here are backticks, typed with the key to the left of the 1 key (the same key as ~).

3. The read.cpp file, which reads the address book information from the file and prints it:
#include <fstream>
#include <iostream>
#include <string>

#include "addressbook.pb.h"

using namespace std;

// Print the name and age of every entry in the address book.
void ListPeople(const tutorial::AddressBook& address_book) {
    for (int i = 0; i < address_book.persion_size(); i++) {
        const tutorial::Persion& persion = address_book.persion(i);
        cout << persion.name() << " " << persion.age() << endl;
    }
}

int main(int argc, char** argv) {
    // GOOGLE_PROTOBUF_VERIFY_VERSION;

    if (argc != 2) {
        cerr << "Usage: " << argv[0] << " ADDRESS_BOOK_FILE" << endl;
        return -1;
    }

    tutorial::AddressBook address_book;

    {
        // Parse the binary file written by the write program.
        fstream input(argv[1], ios::in | ios::binary);
        if (!address_book.ParseFromIstream(&input)) {
            cerr << "Failed to parse address book." << endl;
            return -1;
        }
        input.close();
    }

    ListPeople(address_book);

    // Optional: delete all global objects allocated by libprotobuf.
    // google::protobuf::ShutdownProtobufLibrary();

    return 0;
}
Compile read.cpp with the command:

    g++ addressbook.pb.cc read.cpp -o read `pkg-config --cflags --libs protobuf`

4. Run the programs and check the result.

Ref:
http://www.cnblogs.com/luoxn28/p/5303517.html
http://www.ibm.com/developerworks/cn/linux/l-cn-gpb/index.html#resources
