Introduction to Apache Thrift and a comparison with other RPC approaches

Source: Internet
Author: User
Tags: serialization

What Thrift is
Thrift originated at Facebook, which released it as open source in 2007 and contributed it to the Apache Software Foundation. Facebook created Thrift to handle the large volume of data transmitted between its internal systems, which were written in different languages and ran on different platforms. Thrift therefore supports many programming languages, including C++, C#, Cocoa, Erlang, Haskell, Java, OCaml, Perl, PHP, Python, Ruby, and Smalltalk. Between these languages, Thrift serves as high-performance binary communication middleware that supports data (object) serialization and several kinds of RPC services.

Thrift is suited to static data exchange: the data structures must be defined in advance. When a data structure changes, you must edit the IDL file, regenerate the code, and then recompile and reload it, which can be considered a weakness compared with other IDL tools. Thrift is a good fit for building general-purpose tools for large-scale data exchange and storage. For data transfer inside large systems, it has clear advantages over JSON and XML in both performance and transmitted size.

Thrift is a concrete implementation of an IDL (Interface Definition Language). IDL as a topic goes back to 1999-2001, when CORBA (Common Object Request Broker Architecture) was popular. With CORBA IDL it is hard to forget keywords such as module, interface, string, long, and int. As I recall, IDL used module to create namespaces, which mapped exactly to Java packages. These features are almost identical to those of Thrift today, so Thrift's design is not some new idea from Mars. It is worth looking at the concepts people proposed in the CORBA era: the original illustration shows the parts of a CORBA request, which we will then compare with Thrift.


Thrift architecture
Thrift is a client/server system. From my personal point of view, Thrift is roughly XML-RPC + Java-to-IDL + serialization tools. Thrift has its own internally defined protocol layer (TProtocol) and transport layer (TTransport). From an IDL script that declares the data structures (struct) and the business logic interface (service), it quickly generates the code needed for each runtime environment. Its internal serialization mechanism simplifies and compacts the transmitted data, which lowers the cost of data exchange in large, highly concurrent systems. The figure depicts the overall architecture of Thrift, divided into 6 parts:
1. Your business logic implementation (your code)
2. The service code for the client and the server
3. The structures that read and write the results of the computation
4. TProtocol
5. TTransport
6. Underlying I/O communication

The top 3 parts of the figure are: 1. the code you generate from the Thrift script file; 2. the brown boxes, which are the client and processor code that you build on the generated code; 3. the red part, which is the result computed on the two ends. The 3 parts from TProtocol downward are Thrift's protocol layer, its transport layer, and the low-level I/O communication. Thrift also provides blocking, non-blocking, single-threaded, and multi-threaded server modes, and it can run alongside other servers and containers, combining seamlessly with existing Java EE servers and web containers.
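The layering described above can be illustrated with a small Python sketch. It is not Thrift's generated code or wire format: the class names (MemoryTransport, SimpleBinaryProtocol, CalculatorProcessor, CalculatorHandler) and the call_add helper are hypothetical stand-ins, and the "network" is just an in-memory buffer. It only shows how business logic, a processor, a protocol, and a transport stack on top of each other.

```python
import io
import struct


class MemoryTransport:
    """Lowest layer: moves raw bytes (here an in-memory buffer, not a socket)."""

    def __init__(self, data: bytes = b""):
        self._buf = io.BytesIO(data)

    def write(self, data: bytes) -> None:
        self._buf.write(data)

    def read(self, size: int) -> bytes:
        return self._buf.read(size)

    def getvalue(self) -> bytes:
        return self._buf.getvalue()


class SimpleBinaryProtocol:
    """Middle layer: turns typed values into bytes on top of a transport."""

    def __init__(self, transport: MemoryTransport):
        self.trans = transport

    def write_i32(self, value: int) -> None:
        self.trans.write(struct.pack(">i", value))  # 4 bytes, big-endian

    def read_i32(self) -> int:
        return struct.unpack(">i", self.trans.read(4))[0]

    def write_string(self, text: str) -> None:
        raw = text.encode("utf-8")
        self.write_i32(len(raw))  # length prefix, then the bytes
        self.trans.write(raw)

    def read_string(self) -> str:
        return self.trans.read(self.read_i32()).decode("utf-8")


class CalculatorHandler:
    """Top layer: your business logic."""

    def add(self, a: int, b: int) -> int:
        return a + b


class CalculatorProcessor:
    """Stand-in for generated code: decodes a call, invokes the handler, encodes the result."""

    def __init__(self, handler: CalculatorHandler):
        self.handler = handler

    def process(self, iprot: SimpleBinaryProtocol, oprot: SimpleBinaryProtocol) -> None:
        method = iprot.read_string()
        if method != "add":
            raise ValueError(f"unknown method: {method}")
        a, b = iprot.read_i32(), iprot.read_i32()
        oprot.write_i32(self.handler.add(a, b))


def call_add(a: int, b: int) -> int:
    """Client side: encode the request, 'send' it, decode the reply."""
    request = MemoryTransport()
    oprot = SimpleBinaryProtocol(request)
    oprot.write_string("add")
    oprot.write_i32(a)
    oprot.write_i32(b)

    # Pretend the request bytes crossed the network to the server.
    server_in = MemoryTransport(request.getvalue())
    server_out = MemoryTransport()
    CalculatorProcessor(CalculatorHandler()).process(
        SimpleBinaryProtocol(server_in), SimpleBinaryProtocol(server_out)
    )

    reply = MemoryTransport(server_out.getvalue())
    return SimpleBinaryProtocol(reply).read_i32()
```

Calling call_add(2, 3) runs the request through every layer and back. Swapping the protocol or the transport class would not require touching CalculatorHandler, which is the point of the layered design.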

Data types
* Base types: the basic types
* Struct: structure type
* Container: container types, that is, list, set, and map
* Exception: exception type
* Service: defines an object's interface as a set of methods
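As a rough analogy, the type categories above correspond to familiar Python constructs. The sketch below is not code generated by Thrift; UserProfile, UserNotFound, UserService, and InMemoryUserService are hypothetical names chosen only to show one construct per category.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class UserProfile:
    """Struct: a group of named, typed fields."""
    uid: int = 0                                              # base type (integer)
    name: str = ""                                            # base type (string)
    emails: List[str] = field(default_factory=list)           # container: list
    roles: Set[str] = field(default_factory=set)              # container: set
    settings: Dict[str, str] = field(default_factory=dict)    # container: map


class UserNotFound(Exception):
    """Exception: an error type that a service method may raise."""

    def __init__(self, uid: int):
        super().__init__(f"no user with uid {uid}")
        self.uid = uid


class UserService(ABC):
    """Service: an interface made up of a set of methods."""

    @abstractmethod
    def get_user(self, uid: int) -> UserProfile:
        ...


class InMemoryUserService(UserService):
    """A trivial implementation of the service, backed by a dict."""

    def __init__(self) -> None:
        self._users: Dict[int, UserProfile] = {}

    def add_user(self, profile: UserProfile) -> None:
        self._users[profile.uid] = profile

    def get_user(self, uid: int) -> UserProfile:
        try:
            return self._users[uid]
        except KeyError:
            raise UserNotFound(uid) from None
```

In Thrift itself these declarations live in the IDL file, and the equivalent classes are generated for each target language.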

Protocols
Thrift lets you choose the communication protocol used between the client and the server. Protocols fall broadly into text and binary formats. Binary protocols are used in most cases to save bandwidth and improve transmission efficiency, but text-based protocols are sometimes used, depending on the actual requirements of the project or product:
* TBinaryProtocol - transmits data in a binary encoding.
* TCompactProtocol - a very compact encoding that uses Variable-Length Quantity (VLQ) encoding to reduce the size of the data.
* TJSONProtocol - encodes data as JSON.
* TSimpleJSONProtocol - a write-only JSON protocol, useful for parsing by scripting languages.
* TDebugProtocol - a human-readable text format that helps developers debug during development.
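To see why a variable-length encoding saves space, the following Python sketch contrasts a fixed 4-byte big-endian integer (the style a plain binary protocol uses) with a zigzag plus base-128 varint encoding of the kind compact protocols use. The function names are my own, and the sketch covers only a single 32-bit integer, not a full message.

```python
import struct


def fixed_i32(value: int) -> bytes:
    """Fixed-width encoding: always 4 bytes, big-endian."""
    return struct.pack(">i", value)


def zigzag32(value: int) -> int:
    """Map signed to unsigned so small negative numbers stay small: 0,-1,1,-2 -> 0,1,2,3."""
    return ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF


def varint(unsigned: int) -> bytes:
    """Base-128 varint: 7 data bits per byte, high bit set when more bytes follow."""
    out = bytearray()
    while True:
        group = unsigned & 0x7F
        unsigned >>= 7
        if unsigned:
            out.append(group | 0x80)  # continuation bit
        else:
            out.append(group)
            return bytes(out)


def compact_i32(value: int) -> bytes:
    """Variable-width encoding: zigzag first, then varint."""
    return varint(zigzag32(value))
```

With these definitions, compact_i32(1) is a single byte and compact_i32(300) is two bytes, while fixed_i32 always produces four. The trade-off is that large magnitudes can take five bytes in the variable-length form.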

Transport layer
* TSocket - uses blocking I/O for transmission; this is the most common mode.
* TFramedTransport - transmits data in frames (blocks of a stated size), as used in non-blocking mode; similar to NIO in Java.
* TFileTransport - as the name suggests, transmits data through files. A Java implementation is not provided, but it is very simple to implement.
* TMemoryTransport - uses memory for I/O, like ByteArrayOutputStream in Java.
* TZlibTransport - performs zlib compression; a Java implementation is not provided.
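The idea behind a framed transport can be sketched in a few lines of Python: each message is preceded by its length, so the reader knows exactly how many bytes belong to one message. The sketch assumes a 4-byte big-endian length prefix and uses an in-memory stream in place of a socket; write_frame and read_frame are hypothetical helper names, not Thrift API calls.

```python
import io
import struct


def write_frame(stream, payload: bytes) -> None:
    """Write one frame: a 4-byte big-endian length, then the payload."""
    stream.write(struct.pack(">i", len(payload)))
    stream.write(payload)


def read_frame(stream) -> bytes:
    """Read one frame; raise EOFError if the stream ends mid-frame."""
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("stream ended before a frame header")
    (size,) = struct.unpack(">i", header)
    payload = stream.read(size)
    if len(payload) < size:
        raise EOFError("stream ended inside a frame")
    return payload


# Two messages written back to back can be separated again on the reading side.
buffer = io.BytesIO()
write_frame(buffer, b"first message")
write_frame(buffer, b"second")
buffer.seek(0)
frames = [read_frame(buffer), read_frame(buffer)]
```

Knowing the frame size up front is what lets a non-blocking server buffer a whole request before handing it to the processor.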

Server types
* TSimpleServer - single-threaded server using standard blocking I/O.
* TThreadPoolServer - multi-threaded server using standard blocking I/O.
* TNonblockingServer - multi-threaded server using non-blocking I/O (implemented with NIO channels in Java).
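The difference between the first two server styles can be sketched in Python without any networking: a simple server handles requests one after another on the calling thread, while a thread-pool server hands each request to a worker thread. The functions handle, serve_simple, and serve_thread_pool are hypothetical, requests are plain strings, and the non-blocking style is not covered here.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle(request: str):
    """Process one request and report which thread did the work."""
    return request.upper(), threading.current_thread().name


def serve_simple(requests):
    """Single-threaded style: every request is handled in turn on the calling thread."""
    return [handle(request) for request in requests]


def serve_thread_pool(requests, workers: int = 4):
    """Thread-pool style: requests are dispatched to a fixed set of worker threads."""
    with ThreadPoolExecutor(max_workers=workers, thread_name_prefix="worker") as pool:
        # map() preserves the order of the requests in its results.
        return list(pool.map(handle, requests))
```

Both functions return the same responses. Only the thread that produced each one differs, which matters once a single request blocks on slow I/O.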

Who's using Thrift?

Quora uses Thrift for backend data communication; the server is implemented in C++ and the client in Python.
Quora background: Quora is an online question-and-answer service company, similar to Sina Weibo and Baidu. According to informed sources, Quora received 14 million dollars in investment last year and currently has only 9 employees.
Original: http://www.philwhln.com/quoras-technology-examined#thrift


Evernote uses Thrift for communication and data transfer between the Evernote servers and the clients developed on the various Evernote API platforms. The Evernote API defines its own Evernote Data Access and Management (EDAM) protocol specification, which lets clients upload and download files and use online instant search with less network bandwidth.
Evernote background: Evernote is a very well-known free application. Its main feature is multi-platform support, with data synchronized between devices over the network. For example, you can add a note in Evernote on your phone at any time and see it on your computer when you get home.
Original: http://www.evernote.com/about/developer/api/evernote-api.htm

Thrift in HBase
HBase uses Thrift to provide a cross-platform service interface. You can start a Thrift server in HBase with the [hbase-root]/bin/hbase thrift start command. Clients generate client code for their language with the thrift command and then operate on the remote HBase server according to the defined data formats. This is an alternative to making remote calls through REST.
See also: http://wiki.apache.org/hadoop/Hbase/ThriftApi

For more information, please read: http://wiki.apache.org/thrift/PoweredBy

Comparison of Thrift and other transmission methods
XML is much bulkier than JSON, but it is traditional and not complicated.
JSON is small and newer, but not perfect.
Thrift produces small payloads but is more cumbersome to use and not as easy as the first two. Still, given 1. high concurrency, 2. large data transmission volumes, and 3. a multi-language environment, Thrift is worth using when at least 2 of these apply.

Assume that the same content must be transferred in different ways. Two aspects are easy to compare: 1. the size of the transferred content, and 2. the cost incurred by the server and the client during transmission. The content sizes produced by Thrift and the other methods compare as follows:

In the figure above we can clearly see that RMI is the most bloated, followed by XML. Thrift's TCompactProtocol and Google's Protocol Buffers differ only slightly, with Protocol Buffers giving the best result.
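The size gap between text and binary formats is easy to reproduce on a toy record. The Python sketch below is not the benchmark behind the figure: the record, its field names, and the hand-rolled binary layout (4-byte integers and a length-prefixed string, with no field headers) are assumptions made for illustration only.

```python
import json
import struct
import xml.etree.ElementTree as ET

# Hypothetical record used only for this comparison.
record = {"id": 12345, "name": "alice", "score": 87}


def as_json(rec) -> bytes:
    """Compact JSON: field names travel with every record."""
    return json.dumps(rec, separators=(",", ":")).encode("utf-8")


def as_xml(rec) -> bytes:
    """XML: every field name appears twice, as an opening and a closing tag."""
    root = ET.Element("record")
    for key, value in rec.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def as_binary(rec) -> bytes:
    """Hand-rolled binary layout: values only, in an order both sides agree on."""
    name = rec["name"].encode("utf-8")
    return (
        struct.pack(">i", rec["id"])
        + struct.pack(">i", len(name)) + name
        + struct.pack(">i", rec["score"])
    )


sizes = {
    "xml": len(as_xml(record)),
    "json": len(as_json(record)),
    "binary": len(as_binary(record)),
}
```

For this record the binary form is the smallest and XML the largest. A real binary protocol adds a little per-field metadata, so actual numbers differ, but the ordering reflects the same effect: text formats repeat field names in every message, while a schema-driven format does not.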

The running costs of the Thrift protocols and the other methods compare as follows:

In the figure above we can clearly see that REST consumes the most resources. Thrift's TCompactProtocol and Google's Protocol Buffers differ only slightly, with Thrift's TCompactProtocol giving the best result.


Reposted from: http://www.javabloger.com/article/apache-thrift-architecture.html
