RPC overview: Pb (Protocol Buffers), Thrift, Avro

Source: Internet
Author: User
Document directory
  • Size comparison
  • Runtime Performance

Comparison between Apache Avro and Thrift, http://www.tbdata.org/archives/1307

Thrift vs. Protocol Buffers, http://stuartsierra.com/2008/07/10/thrift-vs-protocol-buffers

Thrift vs Protocol Buffers vs Avro: Biased Comparison (SlideShare)

Schema evolution in Avro, Protocol Buffers and Thrift


Protocol Buffers




RPC Problems

Viewed narrowly, this is a problem of serialization and deserialization.

Viewed more broadly, it is the RPC problem: how modules interact with and call each other in cross-platform, cross-language scenarios (transparent interaction between multiple programming languages).

This is a long-standing problem, and it has three parts:
1. Serialization: how to convert class objects or other data into a common format for transmission, such as binary, text, or XML
2. Data types: the data types of different languages differ
3. Method calls: the way methods are invoked differs between languages

A simple idea is to implement a serialization module on the sending end that converts the object into binary data, and a deserialization module on the receiving end that parses the binary data and restores it to an object.
For the data type problem, a mapping has to be implemented between the different languages, for example between C++ objects, Java objects, and C structures...
For RPC, the differences between method calls must also be resolved, for example when the call is made from Java and the server is written in C++.
In addition, separate sending and receiving code has to be written for each RPC call...
This is quite complicated for users...
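The hand-written approach described above can be sketched as follows. This is only an illustration: the `User` record, its two fields, and the byte layout (a big-endian 32-bit id followed by a length-prefixed UTF-8 name) are assumptions made for the example, not a format taken from any of the frameworks discussed here.

```python
import struct
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical record used only for this illustration.
    user_id: int
    name: str


HEADER = "!IH"  # network byte order: uint32 id, uint16 name length


def serialize(user: User) -> bytes:
    """Sending end: turn the object into binary data."""
    name_bytes = user.name.encode("utf-8")
    return struct.pack(HEADER, user.user_id, len(name_bytes)) + name_bytes


def deserialize(data: bytes) -> User:
    """Receiving end: parse the binary data and rebuild the object."""
    user_id, name_len = struct.unpack_from(HEADER, data, 0)
    offset = struct.calcsize(HEADER)
    name = data[offset:offset + name_len].decode("utf-8")
    return User(user_id, name)


wire = serialize(User(7, "alice"))
restored = deserialize(wire)
```

Every new message type needs another pair of functions like these, and a receiver written in another language has to reimplement the same layout by hand, which is exactly the maintenance burden described above.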

Of course, the IDL proposed by CORBA partially solves this problem. It abstracts the parts that are common to all languages and defines an abstract interface description language.
You only need to use IDL to describe the data types to be transmitted and the interfaces to be called; the CORBA engine then takes care of the mapping to each language.
Because it is so large and complex, however, CORBA never achieved broad practical adoption...

Then came a new idea, Web Services.
Separate serialization and deserialization modules are no longer required; instead, a common, machine-readable text language, XML, is used, together with the SOAP protocol...
This idea attacks the problem from a different direction.
Later, RESTful services built on the HTTP protocol took a similar approach, but simplified the set of operation types...

Of course, with the arrival of the big data era, it became clear that the transmission efficiency of text protocols based on XML, or even JSON, is very low.
Therefore, companies such as Google and Facebook began to work on binary RPC solutions, which produced Pb (Google) and Thrift (Facebook); Avro came later, out of the Apache Hadoop project. In essence, the underlying ideas still derive from CORBA.
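As a rough illustration of the size difference between text and binary encodings, the sketch below encodes the same small record three ways. The record, its field names, and the fixed binary layout are assumptions made for the example; real Pb, Thrift, and Avro encodings differ in detail.

```python
import json
import struct

# Hypothetical record used only for this comparison.
record = {"user_id": 1337, "age": 30}

# Text encodings repeat the field names (and, for XML, closing tags) in every message.
as_xml = "<user><user_id>1337</user_id><age>30</age></user>".encode("utf-8")
as_json = json.dumps(record, separators=(",", ":")).encode("utf-8")

# A binary encoding can rely on a shared schema: uint32 id, uint8 age.
as_binary = struct.pack("!IB", record["user_id"], record["age"])

sizes = {"xml": len(as_xml), "json": len(as_json), "binary": len(as_binary)}
print(sizes)
```

The binary form is the smallest of the three only because both sides agree on the layout in advance, which is the role the IDL or schema plays in the three frameworks.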


The problems of the various earlier solutions are listed below:

• SOAP

XML, XML and more XML. Do we really need to parse so much XML?


Amazing idea, horrible execution

Overdesigned and heavyweight


Embraced mainly in Windows client software


Okay, proven. Hurray!

But lacking a protocol description.

You have to maintain both client and server code.

You still have to write your own wrapper to the Protocol.

XML has high parsing overhead.

(Relatively) expensive to process; large due to repeated tags


Thrift vs Protocol Buffers vs Avro

First, the three solutions have the following in common, which is how they address the problems of the earlier approaches listed above:

  • Interface description (IDL): all three use an IDL and support code generation
  • Performance: high efficiency
  • Versioning: support for evolving schemas across versions
  • Binary format: binary is used as the transmission format

For the binary encodings of the three solutions and how each copes with schema evolution, refer to the following blog post:

Schema evolution in Avro, Protocol Buffers and Thrift


Thrift vs Protocol Buffers

Overall comparison

Overall, I think thrift wins on features and Protocol buffers win on documentation. Implementation-wise, they're quite similar.
The major difference is that thrift provides a full client/server RPC implementation, whereas protocol buffers only generate stubs to use in your own RPC system.

This is the classic assessment: the two are very similar, with Thrift winning on features and Pb on documentation...
Features tend to count for more than documentation, which may be why Thrift has more users...


Data Type comparison

Thrift clearly supports more types; in particular, its direct support for containers (list, set, map) is very powerful.


IDL comparison

The two IDLs are largely similar, with a few differences.

The field ID is written differently: Thrift puts it at the start of the field (1:), while Pb puts it at the end (= 1).

For containers, Thrift uses list directly, while Pb can only express them with repeated.
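The two differences can be seen side by side in the sketch below, which holds a Thrift definition and a Pb (proto2) definition as Python strings and extracts the field IDs from each. The `Person` type and its fields are hypothetical, and the regular expressions only handle these simple one-line fields; they are not a real IDL parser.

```python
import re

# Hypothetical definitions of the same type in each IDL.
THRIFT_IDL = """
struct Person {
  1: required string name,
  2: optional i32 age,
  3: list<string> emails
}
"""

PROTO_IDL = """
message Person {
  required string name = 1;
  optional int32 age = 2;
  repeated string emails = 3;
}
"""


def thrift_field_ids(idl: str) -> dict:
    # Thrift: "<id>: [required|optional] <type> <name>", ID comes first.
    pattern = re.compile(
        r"^\s*(\d+):\s*(?:required\s+|optional\s+)?(\S+)\s+(\w+)", re.M)
    return {name: int(fid) for fid, _type, name in pattern.findall(idl)}


def proto_field_ids(idl: str) -> dict:
    # Pb: "<label> <type> <name> = <id>;", ID comes last.
    pattern = re.compile(
        r"^\s*(?:required|optional|repeated)\s+(\S+)\s+(\w+)\s*=\s*(\d+)\s*;", re.M)
    return {name: int(fid) for _type, name, fid in pattern.findall(idl)}


thrift_ids = thrift_field_ids(THRIFT_IDL)
proto_ids = proto_field_ids(PROTO_IDL)
```

Note how the `emails` field is a `list<string>` in Thrift but a `repeated string` in Pb, while the numeric IDs play the same role in both.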


Performance

Size comparison

It can be seen that Thrift, when it uses TCompactProtocol, is roughly equivalent to Pb in encoded size.

See "Schema evolution in Avro, Protocol Buffers and Thrift" for details; the encodings of the two are indeed very similar.
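To give a feel for why these encodings are compact, here is a minimal sketch of how Pb encodes a single integer field: a tag built from the field number and wire type, followed by the value as a variable-length integer (varint). The helper names are made up for this example, and the sketch only handles non-negative integers. Thrift's TCompactProtocol also encodes integers with variable length, which is one reason the sizes end up close.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, continuation bit clear
            return bytes(out)


def encode_pb_int_field(field_number: int, value: int) -> bytes:
    """Encode one integer field: tag (field number + wire type), then value."""
    wire_type_varint = 0
    tag = (field_number << 3) | wire_type_varint
    return encode_varint(tag) + encode_varint(value)


encoded = encode_pb_int_field(1, 150)
print(encoded.hex())  # 089601
```

Field 1 with value 150 takes three bytes on the wire; the field name never appears, only its numeric ID.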




Runtime Performance


Thrift vs Avro

See the more comprehensive comparison from Alibaba at http://www.tbdata.org/archives/1307.

The biggest feature of Avro is its dynamic schema. After the schema changes, you do not need to recompile the client and server code.
It is also well integrated with Hadoop.

The drawback is that it is more complicated and harder to use.
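The idea of a dynamic schema can be sketched as follows: the schema is ordinary JSON data loaded at runtime, and one generic encoder walks it, so a changed schema needs no regenerated or recompiled code. This is a hand-rolled illustration, not the Avro library; the `User` schemas and the `encode_record` helper are assumptions for the example, and only the `int`, `long`, and `string` types are handled. The byte layout follows Avro's binary encoding for those types (zigzag varints, length-prefixed UTF-8 strings, fields written in schema order without tags).

```python
import json

# Hypothetical schemas; the second one adds a field.
SCHEMA_V1 = """
{"type": "record", "name": "User",
 "fields": [{"name": "name", "type": "string"},
            {"name": "age", "type": "int"}]}
"""
SCHEMA_V2 = """
{"type": "record", "name": "User",
 "fields": [{"name": "name", "type": "string"},
            {"name": "age", "type": "int"},
            {"name": "city", "type": "string"}]}
"""


def zigzag_varint(n: int) -> bytes:
    """Zigzag-map a signed 64-bit integer, then write it as a varint."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_record(schema: dict, record: dict) -> bytes:
    """Generic encoder driven entirely by the schema passed in at runtime."""
    out = bytearray()
    for field in schema["fields"]:
        value = record[field["name"]]
        if field["type"] in ("int", "long"):
            out += zigzag_varint(value)
        elif field["type"] == "string":
            data = value.encode("utf-8")
            out += zigzag_varint(len(data)) + data
        else:
            raise NotImplementedError(field["type"])
    return bytes(out)


encoded_v1 = encode_record(json.loads(SCHEMA_V1), {"name": "Al", "age": 30})
encoded_v2 = encode_record(json.loads(SCHEMA_V2),
                           {"name": "Al", "age": 30, "city": "Rome"})
```

The same `encode_record` function handles both schema versions. With Thrift or Pb, adding the `city` field would normally mean editing the IDL and regenerating the classes.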


  • Thrift is suitable for static data exchange between programs. It requires the schema to be known in advance and relatively fixed.
  • Avro adds dynamic schema support on top of what Thrift offers, and its performance is no worse than Thrift's.
  • Avro's explicit schema design makes it better suited to building general-purpose tools and platforms for data exchange and storage, especially on the back end.
  • At present, Thrift has the advantage of broader language support and greater maturity.
