The following material was gathered from around the internet and organized in one place. I have been studying Thrift recently, and I am sharing what I collected for anyone who needs it.
Section I RPC technology and implementation
Start with the RPC (Remote Procedure Call) problem in distributed systems. A complete RPC module can be divided into three layers:
Service layer (Service): RPC interface definition and implementation
Protocol layer (Protocol): RPC message format and data encoding format
Transport layer (Transport): low-level communication (such as sockets) and system-related functions (such as event loops and multithreading)
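The three layers above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names LoopbackTransport, CalculatorService, and CalculatorClient are hypothetical, JSON stands in for the encoding, and an in-memory function call stands in for a socket.

```python
import json


# Transport layer: moves raw bytes. An in-memory stand-in for a socket.
class LoopbackTransport:
    def __init__(self, handler):
        self._handler = handler  # callable: request bytes -> response bytes

    def round_trip(self, payload: bytes) -> bytes:
        return self._handler(payload)


# Protocol layer: message format and data encoding (JSON in this sketch).
def encode_call(method, args):
    return json.dumps({"method": method, "args": args}).encode("utf-8")


def decode_call(data):
    message = json.loads(data.decode("utf-8"))
    return message["method"], message["args"]


def encode_reply(result):
    return json.dumps({"result": result}).encode("utf-8")


def decode_reply(data):
    return json.loads(data.decode("utf-8"))["result"]


# Service layer: the interface and its implementation.
class CalculatorService:
    def add(self, a, b):
        return a + b


def serve(service, request: bytes) -> bytes:
    """Server side: decode the call, run the service method, encode the reply."""
    method, args = decode_call(request)
    return encode_reply(getattr(service, method)(*args))


class CalculatorClient:
    """Client side: looks like a local object, but goes through all three layers."""

    def __init__(self, transport):
        self._transport = transport

    def add(self, a, b):
        reply = self._transport.round_trip(encode_call("add", [a, b]))
        return decode_reply(reply)


service = CalculatorService()
transport = LoopbackTransport(lambda request: serve(service, request))
client = CalculatorClient(transport)
print(client.add(2, 3))
```

Because each layer only talks to the one below it, the JSON functions or the loopback transport could be swapped out without touching CalculatorService.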
In real large-scale distributed systems, different services are often implemented in different languages, so an RPC system usually provides cross-language calls. For example, client code written in C++ can remotely invoke a service implemented in Java. There are two ways to implement cross-language RPC:
Static code generation: developers use an intermediate language (IDL, Interface Definition Language) to define the RPC interfaces and data types, then use a compiler to generate code for each target language (such as C++, Java, or Python). The generated code is responsible for the RPC protocol layer and transport layer. For example, if a service is implemented in C++, the server side generates C++ code that implements the protocol and transport layers, and the service layer uses that generated code to communicate with clients. If the client is written in Python, the client side generates Python code.
Introspection in a dynamic type system: the protocol and transport layers are implemented in only one language, but that language must be tied to a dynamic type system with an introspection or reflection mechanism that can expose bindings to other languages. Clients and servers then use RPC through those language bindings. For example, one could implement an RPC library in C with GObject and provide bindings for other languages through GObject.
The advantage of the first approach is that the RPC protocol layer and transport layer do not have to be bound to a dynamic type system such as GObject. It also avoids dynamic type checking and conversion, so the program runs efficiently. Its disadvantage is that the protocol layer and transport layer must be implemented separately for each language. The main difficulty of the second approach is implementing the language bindings and a general-purpose object serialization mechanism, and efficiency also has to be considered.
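The trade-off between the two approaches can be illustrated within a single language. In the sketch below, GeneratedGreeterStub imitates what a code generator might emit (one explicit method per RPC, with nothing looked up at run time), while ReflectiveProxy discovers methods and checks argument types through introspection on every call. All class names are hypothetical, and Python's inspect module merely stands in for a reflective type system such as GObject.

```python
import inspect


class Greeter:
    """The service implementation that both approaches call into."""

    def greet(self, name: str, times: int) -> str:
        return ", ".join(["hello " + name] * times)


# Approach 1: a hand-written imitation of generated code.
# The method name and argument layout are fixed when the code is generated.
class GeneratedGreeterStub:
    def __init__(self, target):
        self._target = target

    def greet(self, name: str, times: int) -> str:
        return self._target.greet(name, times)


# Approach 2: one generic proxy for any service, driven by introspection.
class ReflectiveProxy:
    def __init__(self, target):
        self._target = target

    def __getattr__(self, method_name):
        method = getattr(self._target, method_name)  # looked up at run time
        signature = inspect.signature(method)

        def call(*args):
            bound = signature.bind(*args)  # raises TypeError on wrong arity
            for param_name, value in bound.arguments.items():
                expected = signature.parameters[param_name].annotation
                if expected is not inspect.Parameter.empty and not isinstance(value, expected):
                    raise TypeError(
                        f"{method_name}: {param_name} must be {expected.__name__}"
                    )
            return method(*args)

        return call


print(GeneratedGreeterStub(Greeter()).greet("bob", 2))
print(ReflectiveProxy(Greeter()).greet("bob", 2))
```

The proxy needs no per-service code, but it pays for the lookup and type checks on each call, which is the efficiency concern mentioned above.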
Thrift is a cross-language RPC protocol stack based on static code generation. It can generate code for mainstream languages, including C++, Java, Python, Ruby, and PHP, and the generated code implements the RPC protocol layer and transport layer. This lets the user focus on calling and implementing the service itself. The Cassandra service access protocol is built on Thrift.
Section II Thrift Introduction
Thrift originated at Facebook, which released it as open source in 2007 and handed the project to the Apache Software Foundation. Facebook created Thrift to handle the large volume of data transferred between its internal systems and to bridge the different language environments those systems ran in. Thrift therefore supports many programming languages, such as C++, C#, Cocoa, Erlang, Haskell, Java, OCaml, Perl, PHP, Python, Ruby, and Smalltalk. Between services written in different languages, Thrift acts as high-performance binary communication middleware that supports data (object) serialization and several kinds of RPC service.
Thrift is suited to static data exchange: the data structures must be defined in advance. When a data structure changes, you must edit the IDL file, regenerate the code, and then recompile and reload it. Compared with other IDL tools, this can be considered a weakness of Thrift. Thrift is well suited to building general-purpose tools for large-scale data exchange and storage. For data transfer inside large systems, it has a clear advantage over JSON and XML in both performance and message size.
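The size advantage of a schema-based binary encoding can be seen with a small experiment. The sketch below is not Thrift's wire format; it is an assumed, simplified layout built with Python's struct module, and the record and its field names are made up for illustration. Because both sides agree on the schema ahead of time, field names never travel over the wire, which is also why a schema change forces both sides to be regenerated.

```python
import json
import struct

# A made-up record used only for this comparison.
record = {"id": 12345, "score": 98.5, "name": "alice"}


def encode_binary(rec):
    """Fixed layout known to both sides: int32 id, float64 score, uint32 name length, name bytes."""
    name_bytes = rec["name"].encode("utf-8")
    return struct.pack(">idI", rec["id"], rec["score"], len(name_bytes)) + name_bytes


def decode_binary(data):
    """Decoding only works because the reader knows the same layout."""
    header_size = struct.calcsize(">idI")
    rec_id, score, name_len = struct.unpack(">idI", data[:header_size])
    name = data[header_size:header_size + name_len].decode("utf-8")
    return {"id": rec_id, "score": score, "name": name}


binary_bytes = encode_binary(record)
json_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")

print("binary:", len(binary_bytes), "bytes")
print("json:  ", len(json_bytes), "bytes")
```

The JSON form repeats every field name and spells numbers out as text, so it grows faster than the binary form as records get larger or more numerous.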
Thrift consists mainly of five parts:
Type system and IDL compiler: generates interface code in the target language from the IDL file supplied by the user.
TProtocol: implements the RPC protocol layer. Several object serialization formats are available, such as JSON and binary.
TTransport: implements the RPC transport layer. Several implementations are likewise available, such as socket, non-blocking socket, and memory buffer.
TProcessor: the link between the protocol layer and the service implementation supplied by the user. It is responsible for invoking the service implementation's interface.
TServer: ties together the TProtocol, TTransport, and TProcessor objects.
These five components are implemented in the Thrift source code as libraries for the different languages, located under the lib directory of the Thrift source tree. Before using Thrift, you should become familiar with the interfaces provided by the library for your language.
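How these parts fit together can be sketched without the Thrift library. The classes below (MemoryTransport, JsonProtocol, BinaryProtocol, Processor, SimpleServer) are hypothetical simplifications that only borrow the roles described above; they are not the Thrift API, and the message layouts are assumptions made for this example. Arguments are limited to integers to keep the binary encoding short.

```python
import io
import json
import struct


class MemoryTransport:
    """Transport role: a byte buffer that is either written to or read from."""

    def __init__(self, data=b""):
        self._buffer = io.BytesIO(data)

    def write(self, data):
        self._buffer.write(data)

    def read(self, size):
        return self._buffer.read(size)

    def getvalue(self):
        return self._buffer.getvalue()


class JsonProtocol:
    """Protocol role, text flavour: length-prefixed JSON message."""

    def write_message(self, transport, name, values):
        body = json.dumps([name, values]).encode("utf-8")
        transport.write(struct.pack(">I", len(body)) + body)

    def read_message(self, transport):
        (size,) = struct.unpack(">I", transport.read(4))
        name, values = json.loads(transport.read(size))
        return name, values


class BinaryProtocol:
    """Protocol role, binary flavour: name, count, then 64-bit integers."""

    def write_message(self, transport, name, values):
        name_bytes = name.encode("utf-8")
        transport.write(struct.pack(">H", len(name_bytes)) + name_bytes)
        transport.write(struct.pack(">H", len(values)))
        for value in values:
            transport.write(struct.pack(">q", value))

    def read_message(self, transport):
        (name_len,) = struct.unpack(">H", transport.read(2))
        name = transport.read(name_len).decode("utf-8")
        (count,) = struct.unpack(">H", transport.read(2))
        values = [struct.unpack(">q", transport.read(8))[0] for _ in range(count)]
        return name, values


class Processor:
    """Processor role: reads a call, invokes the user's handler, writes the result."""

    def __init__(self, handler):
        self._handler = handler

    def process(self, protocol, in_transport, out_transport):
        name, args = protocol.read_message(in_transport)
        result = getattr(self._handler, name)(*args)
        protocol.write_message(out_transport, name, [result])


class SimpleServer:
    """Server role: aggregates a processor, a protocol, and transports."""

    def __init__(self, processor, protocol):
        self._processor = processor
        self._protocol = protocol

    def handle(self, request_bytes):
        in_transport = MemoryTransport(request_bytes)
        out_transport = MemoryTransport()
        self._processor.process(self._protocol, in_transport, out_transport)
        return out_transport.getvalue()


class MathHandler:
    """The user-supplied service implementation."""

    def multiply(self, a, b):
        return a * b


def call(server, protocol, name, args):
    """Client side: encode the request, hand it to the server, decode the reply."""
    request = MemoryTransport()
    protocol.write_message(request, name, args)
    reply = MemoryTransport(server.handle(request.getvalue()))
    return protocol.read_message(reply)[1][0]


for protocol in (JsonProtocol(), BinaryProtocol()):
    server = SimpleServer(Processor(MathHandler()), protocol)
    print(type(protocol).__name__, call(server, protocol, "multiply", [6, 7]))
```

The handler and processor are identical in both runs; only the protocol object changes, which mirrors the idea that the serialization format can be chosen independently of the service code.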
Section III Projects that use Thrift
(1) Quora uses Thrift for data communication in its backend; the server is implemented in C++ and the client in Python.
Quora background: Quora is an online question-and-answer service, similar to the question-and-answer services offered by Sina and Baidu. Sources reported that Quora received 14 million dollars in investment last year, and that it currently has only 9 employees.
(2) Evernote uses Thrift for communication and data transfer between the Evernote servers and clients built on the various Evernote API platforms. The Evernote API defines its own Evernote Data Access and Management (EDAM) protocol specification, which lets clients upload and download files and use online instant search with relatively little network bandwidth.
Evernote background: Evernote is a well-known free application. Its biggest feature is multi-platform support, with data synchronized across devices over the network. For example, you can add a note in Evernote on your phone at any time and see it on your computer when you get home.
(3) Thrift in HBase: HBase uses Thrift to provide a cross-platform service interface. The Thrift service is started on the HBase server with [hbase-root]/bin/hbase thrift start. Clients generate client code for their own language with the thrift command and operate on the remote HBase server according to the defined data formats. This is an alternative to making remote calls through REST.
(4) Other systems: for example Facebook's Scribe system, Taobao's TimeTunnel system, and Hive.