Original: http://blog.csdn.net/guxch/article/details/12157151
------------------------------------------------------------------------------------
I. Overview
Thrift is an Apache project. It was first developed at Facebook, which later contributed it to Apache as an open source project. The official site describes Thrift as a framework for "scalable cross-language services development". Put more plainly, Thrift has the following characteristics:
It has its own cross-machine communication framework and provides a set of libraries.
It is a code generator that, following its own rules, generates the communication code for multiple programming languages.
Cross-machine communication frameworks are generally cross-platform (Linux, Windows). What sets Thrift apart is that it is also cross-language: you can implement the communication in almost any popular language (C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, and so on). The advantage is that the choice of programming language stops being a concern. If you have to write both the server and the client, pick the language you know best or the one the project uses, and Thrift generates the communication framework for it. If you write only the server, then once the communication rules are defined (a .thrift file, in Thrift's terms), the language you implement the server in does not constrain the clients, and later users can implement clients in other languages. This is a wonderful thing.
A similar open source project is Google's Protocol Buffers (protobuf). At the time of writing, protobuf provides APIs for three languages (C++, Java, Python). It is simpler than Thrift but not as widely applicable, and some commentators say that writing complex applications with protobuf is harder.
The current version of Thrift is 0.9.1. The discussion below is based on that version, and the code is in C++.
II. Thrift application scenarios
Thrift really consists of three parts: the Thrift code generator, the Thrift application framework (library), and the code that the generator produces. The basic process of building a Thrift application is shown in the following figure.
As the figure shows, building a Thrift application requires the following:
A .thrift file: this file defines the communication interface, most importantly the format of the data that is exchanged.
A programming language: this needs no explanation.
The Thrift code generator (the "Thrift compiler", although "code generator" seems the more fitting name): it is built when Thrift is installed, and it produces the code that conforms to the communication format you defined.
The Thrift application framework library: it is also built during installation.
Other third-party support libraries: for C++, the most important are boost.thread, libevent, log4cxx, and so on. Depending on the mode of operation, the generated code may need to call these libraries.
III. Installing on Linux
Installing Thrift means building the code generator and the application framework library mentioned above. The page http://thrift.apache.org/docs/install/ describes the installation dependencies. Apart from GCC and its build tools, the biggest dependency for compiling Thrift is Boost. The installation process is not complicated; please refer to the relevant articles online.
IV. Using Thrift on Windows
The Windows environment is discussed separately because earlier Thrift versions (before 0.8) did not support Windows. Some people produced patches, but judging by their documentation the process was rather cumbersome. Windows support began with 0.8, and the current official documentation still describes the need for things like Cygwin. In fact, 0.9.1 already supports Windows very well.
Compiling the Thrift compiler: under \compiler\cpp there is a VS2010 solution, compiler.sln, containing a VC project called compiler. Unfortunately, compiling it requires flex and bison, which can be downloaded from http://sourceforge.net/projects/winflexbison/?source=dlp. In the project's properties, change the command line under "Build Events" -> "Pre-Build Event" to the following (mind the paths of win_flex and win_bison):
    win_flex -o "src\\thriftl.cc" src/thriftl.ll
    win_bison -y -o "src\thrifty.cc" --defines="src/thrifty.hh" src/thrifty.yy
Then copy inttypes.h (downloadable online) and thrifty.h (from the parent directory) into the src directory, and compile. The steps above can also be run by hand, which is safer (see compiler\cpp\README_Windows.txt, although it contains a few small errors).
Compiling the Thrift libraries: the \lib\cpp directory contains a VS2010 solution file, thrift.sln, with two VC projects, libthriftnb and libthrift. libthrift depends on Boost; libthriftnb depends on Boost and libevent. Once the library references are set up correctly (build Boost and libevent first), you can compile the two projects and get two libraries. Together they are the Thrift application framework library, which every Thrift application needs.
V. Basic concepts and applications of Thrift
Other articles already cover this material. This article only adds some personal understanding and notes from the point of view of the Thrift white paper.
1. Thrift has the following concepts:
Type system
Thrift defines a data transmission description language (somewhat like IDL) that is language neutral; this is its type system. It has five kinds of types (three for expressing data, one predefined class/structure, and one for expressing interfaces):
Base types: bool, byte, i16, i32, i64, double, and string. Every language has these base types. The interesting one is string, which represents text and also binary bytes. Another characteristic is that there are no unsigned integer types, because some languages do not support them.
Structs: the same idea as struct in C, combining base types.
Containers: the collection types (list/set/map), whose elements can be any type Thrift recognizes, whether base, struct, or container. (I do not know whether any language lacks list/set/map and, if so, how Thrift handles it.)
Exceptions: in terms of data structure these are structs. They can be seen as predefined struct types with a special meaning, provided to make exception handling convenient.
Services: this type is what actually defines the interface, and the Thrift code generator generates the code framework from this definition.
Transport
This is the channel over which information travels and the way it is read and written. The medium can be, for example, a socket, shared memory, or a file. Thrift specifies a few basic operations (open/close/isOpen/read/write/flush, plus listen/accept on the server side). Specifically, there is a TSocket class for sockets and a TFileTransport class for files. Those classes are fairly low level; there are also several utility classes: TBufferedTransport, TFramedTransport, TMemoryBuffer, and so on.
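To make the layering concrete, here is a minimal sketch of a transport interface with an in-memory implementation and a framing helper on top of it. The names Transport, MemoryTransport, writeFrame and readFrame are hypothetical and are not Thrift's classes. The frame layout (a 4-byte big-endian length followed by the payload) is an assumption that mirrors my understanding of what TFramedTransport does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical minimal transport interface (not Thrift's TTransport).
class Transport {
public:
    virtual ~Transport() = default;
    virtual void write(const std::uint8_t* buf, std::size_t len) = 0;
    // Reads up to len bytes and returns how many were actually read.
    virtual std::size_t read(std::uint8_t* buf, std::size_t len) = 0;
    virtual void flush() {}
};

// In-memory transport: writes append to a buffer, reads consume from it.
class MemoryTransport : public Transport {
public:
    void write(const std::uint8_t* buf, std::size_t len) override {
        data_.insert(data_.end(), buf, buf + len);
    }
    std::size_t read(std::uint8_t* buf, std::size_t len) override {
        const std::size_t n = std::min(len, data_.size() - pos_);
        std::copy(data_.begin() + pos_, data_.begin() + pos_ + n, buf);
        pos_ += n;
        return n;
    }
    std::size_t available() const { return data_.size() - pos_; }
private:
    std::vector<std::uint8_t> data_;
    std::size_t pos_ = 0;
};

// Writes one frame: 4-byte big-endian length, then the payload (assumed layout).
void writeFrame(Transport& t, const std::string& payload) {
    assert(payload.size() <= 0xFFFFFFFFu);
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    const std::uint8_t header[4] = {
        static_cast<std::uint8_t>(n >> 24), static_cast<std::uint8_t>(n >> 16),
        static_cast<std::uint8_t>(n >> 8), static_cast<std::uint8_t>(n)};
    t.write(header, 4);
    t.write(reinterpret_cast<const std::uint8_t*>(payload.data()), payload.size());
    t.flush();
}

// Reads one frame; returns false if a complete frame is not available.
bool readFrame(Transport& t, std::string& out) {
    std::uint8_t header[4];
    if (t.read(header, 4) != 4) return false;
    const std::uint32_t n = (std::uint32_t(header[0]) << 24) |
                            (std::uint32_t(header[1]) << 16) |
                            (std::uint32_t(header[2]) << 8) |
                            std::uint32_t(header[3]);
    std::string buf(n, '\0');
    if (n > 0 && t.read(reinterpret_cast<std::uint8_t*>(&buf[0]), n) != n) return false;
    out.swap(buf);
    return true;
}
```

Because the framing code only talks to the Transport interface, the same helper would work unchanged over a socket-backed or file-backed implementation.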
Protocol
The protocol encapsulates the wire encoding, that is, whether information is represented as binary, XML, or text. It has two functions: 1. bidirectional sequenced messaging; 2. encoding and decoding of data (reading and writing the types described above). Regarding the stream format, Thrift data is self-delimiting: Thrift itself inserts markers at the boundaries between data fields, so a decoder can separate the fields successfully even without the field definitions. Several articles mention that Thrift's binary encoding is quite efficient (and can be combined with compression), so the binary protocol should be the first choice.
Versioning
When programs are developed separately, versioning is a problem that cannot be avoided. Thrift implements versioning through field identifiers: every field in a struct has an identifier that uniquely determines it. When decoding, each field's identifier is checked, and if it is not recognized the field is discarded. Thrift also provides an "isset" mechanism for determining whether a field has been set (the sender uses it to indicate that a field is set, and the receiver uses it to detect that).
There are four cases:
Field added, old client, new server: the data sent by the client lacks the field. The server can detect this and fall back to a default value.
Field removed, old client, new server: the data sent by the client contains the field, and the server ignores it.
Field added, new client, old server: the data sent by the client contains the field, and the server ignores it.
Field removed, new client, old server: the data sent by the client lacks the field, and the server may not know how to handle the situation.
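The sketch below shows how field identifiers and self-delimiting fields let an old decoder ignore a field it does not know. It uses a simplified toy encoding (a 1-byte type tag, a 2-byte big-endian field id, then the value, with tag 0 ending the struct) that is only modeled on the idea and is not Thrift's actual protocol code. UserV1, Reader and the write helpers are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Type tags for this sketch only; 0 marks the end of a struct.
enum FieldType : std::uint8_t { kStop = 0, kI32 = 8, kString = 11 };

void putU16(Bytes& b, std::uint16_t v) {
    b.push_back(static_cast<std::uint8_t>(v >> 8));
    b.push_back(static_cast<std::uint8_t>(v & 0xFF));
}
void putU32(Bytes& b, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        b.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
}
void writeI32Field(Bytes& b, std::uint16_t id, std::int32_t v) {
    b.push_back(kI32);
    putU16(b, id);
    putU32(b, static_cast<std::uint32_t>(v));
}
void writeStringField(Bytes& b, std::uint16_t id, const std::string& s) {
    b.push_back(kString);
    putU16(b, id);
    putU32(b, static_cast<std::uint32_t>(s.size()));
    b.insert(b.end(), s.begin(), s.end());
}
void writeStop(Bytes& b) { b.push_back(kStop); }

class Reader {
public:
    explicit Reader(const Bytes& b) : b_(b) {}
    std::uint8_t u8() { assert(pos_ < b_.size()); return b_[pos_++]; }
    std::uint16_t u16() {
        const std::uint16_t hi = u8();
        const std::uint16_t lo = u8();
        return static_cast<std::uint16_t>((hi << 8) | lo);
    }
    std::uint32_t u32() {
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i) v = (v << 8) | u8();
        return v;
    }
    std::string str() {
        const std::uint32_t n = u32();
        assert(n <= b_.size() - pos_);
        std::string s(b_.begin() + pos_, b_.begin() + pos_ + n);
        pos_ += n;
        return s;
    }
    // The type tag alone tells us how many bytes to skip.
    void skip(std::uint8_t type) {
        if (type == kI32) { u32(); }
        else if (type == kString) { str(); }
        else { assert(false && "type not handled in this sketch"); }
    }
private:
    const Bytes& b_;
    std::size_t pos_ = 0;
};

// "Old" struct version: knows only field 1 (id) and field 2 (name).
struct UserV1 {
    std::int32_t id = 0;
    std::string name;
    bool hasId = false;
    bool hasName = false;
    int skipped = 0;  // number of unrecognized fields that were discarded
};

UserV1 decodeUserV1(const Bytes& wire) {
    Reader r(wire);
    UserV1 u;
    for (;;) {
        const std::uint8_t type = r.u8();
        if (type == kStop) break;
        const std::uint16_t id = r.u16();
        if (id == 1 && type == kI32) {
            u.id = static_cast<std::int32_t>(r.u32());
            u.hasId = true;
        } else if (id == 2 && type == kString) {
            u.name = r.str();
            u.hasName = true;
        } else {
            r.skip(type);  // unknown identifier: discard the field
            ++u.skipped;
        }
    }
    return u;
}
```

A newer sender that adds, say, a string field with id 3 can still be read by decodeUserV1: the unknown field is skipped and counted, and the known fields decode as before. A sender that omits field 2 leaves hasName false, which is the role the isset mechanism plays.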
Processor
This is how the parts are coordinated to form the code (or the framework for the user code). There are two important classes: TProcessor and TServer. TProcessor implements the RPC calls. TServer is the base class of all server classes; it mainly handles connections and threads and is not concerned with transport, encoding, and so on. User code is mainly concerned with two things: the .thrift file, and this interface. Thrift implements server classes such as TSimpleServer (single-threaded), TThreadedServer (one thread per connection), and TThreadPoolServer (thread pool).
The following figure shows the basic structure of Thrift-generated code (C++).
In the figure, ServiceIf is a virtual interface class generated from the interface file (.thrift), and the user's concrete implementation goes in ServiceHandler. The various invocation styles are implemented in TServer. (See the example for details.)
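The split between the generated interface and the user's handler can be sketched as follows. This is hand-written illustration code, not generator output: the Calculator service, CalculatorIf and CalculatorHandler are hypothetical names. The output-parameter style for the string result follows my understanding of how generated C++ returns non-primitive values.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Roughly the shape of an interface for a hypothetical service:
//   service Calculator { i32 add(1: i32 a, 2: i32 b), string ping() }
class CalculatorIf {
public:
    virtual ~CalculatorIf() = default;
    virtual std::int32_t add(std::int32_t a, std::int32_t b) = 0;
    // Non-primitive results are returned through an output parameter.
    virtual void ping(std::string& _return) = 0;
};

// The user's concrete implementation lives in the handler.
class CalculatorHandler : public CalculatorIf {
public:
    std::int32_t add(std::int32_t a, std::int32_t b) override {
        ++calls_;
        return a + b;
    }
    void ping(std::string& _return) override {
        ++calls_;
        _return = "pong";
    }
    int calls() const { return calls_; }
private:
    int calls_ = 0;
};
```

The server side only ever holds a reference or pointer to CalculatorIf, so the same handler can be plugged into any of the server classes without change.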
2. Some considerations on Thrift's implementation
Target languages
Although there are many options, the most commonly used (and probably the best supported) are C++, Java, and Python.
Generated structs
Data members are public; there are no setters or getters. Using isset is recommended but not required: the system is robust enough to cope with unset fields, so no exception such as "FieldNotSetException" is involved. The read and write methods are also public, so users can call them outside of the built-in RPC.
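A minimal sketch of what such a struct looks like from the user's side, assuming a hypothetical User struct with two fields. To my knowledge the generated C++ names the flag member __isset; the sketch uses isset instead, because identifiers containing a double underscore are reserved in hand-written C++.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// One flag per field, recording whether the field has been set.
struct UserIsSet {
    bool id = false;
    bool name = false;
};

// Public data members, no getters or setters.
struct User {
    std::int32_t id = 0;
    std::string name;
    UserIsSet isset;  // hypothetical name; see the note above
};

// A receiver may consult the flag, or simply rely on the default value.
std::string displayName(const User& u) {
    return u.isset.name ? u.name : std::string("(anonymous)");
}
```

Reading an unset field is harmless: it just yields the default value, which is why no "field not set" exception is needed.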
RPC method identification: when RPC is implemented, a mapping from function name to function pointer is built, roughly as follows (the representation differs by language; in C++ it is a map):
std::map<std::string, function pointer> processMap_;
This speeds up method dispatch.
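The idea can be sketched with a map from method name to member-function pointer. EchoProcessor and its two methods are hypothetical, and the signatures are simplified to strings; a real processor would read its arguments from a protocol object and write the result back.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

class EchoProcessor {
    // Pointer to a member function that handles one RPC method.
    using ProcessFunction = std::string (EchoProcessor::*)(const std::string&);
    std::map<std::string, ProcessFunction> processMap_;

public:
    EchoProcessor() {
        processMap_["upper"] = &EchoProcessor::process_upper;
        processMap_["reverse"] = &EchoProcessor::process_reverse;
    }

    // Looks up the method by name and invokes it.
    // Returns false (leaving result untouched) for an unknown method.
    bool process(const std::string& method, const std::string& arg, std::string& result) {
        const auto it = processMap_.find(method);
        if (it == processMap_.end()) return false;
        result = (this->*(it->second))(arg);
        return true;
    }

private:
    std::string process_upper(const std::string& s) {
        std::string r = s;
        for (char& c : r)
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        return r;
    }
    std::string process_reverse(const std::string& s) {
        return std::string(s.rbegin(), s.rend());
    }
};
```

The lookup replaces a chain of string comparisons with a single map search, and adding a method only means adding one entry to the map.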
Multithreading
For the C++ implementation, the Thrift developers studied the thread and timer facilities of Boost and ACE during development. They did not want to introduce too many third-party dependencies, so the only required reference in Thrift is boost::shared_ptr. For portability or extra functionality, however, Boost's thread and timer libraries and the libraries they depend on are generally needed as well.
ThreadManager and TimerManager
The thread management class manages the thread pool. The timer management class can trigger Runnable objects at set times to start a task (which may or may not run on a separate thread).
Nonblocking operation
This requires libevent support.
Compiler (code generator)
The compiler is written in C++ and relies on lex/yacc. Code generation takes two passes: the first scans the included files and type definitions and builds the parse tree; the second inserts each type into the parse tree and generates code from it.
TFileTransport
This class (and its subclasses) can log request messages to a file. For performance, it buffers records before storing them to disk. The log file is divided into chunks of a fixed size, padding is added where needed, and a record cannot span chunks.
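The chunk rule can be illustrated with a small helper that decides where the next record starts. This is only a sketch of the arithmetic under the assumption that a record which would cross a chunk boundary is moved to the start of the next chunk, with the gap filled by padding. placeRecord is a hypothetical function and does not correspond to any TFileTransport method.

```cpp
#include <cassert>
#include <cstddef>

// Returns the file offset at which a record of recordSize bytes should start,
// given the current write offset and the chunk size. If the record does not
// fit in the rest of the current chunk, the start moves to the next chunk
// boundary and the skipped bytes are treated as padding.
std::size_t placeRecord(std::size_t offset, std::size_t recordSize, std::size_t chunkSize) {
    assert(chunkSize > 0 && recordSize <= chunkSize);
    const std::size_t usedInChunk = offset % chunkSize;
    const std::size_t remaining = chunkSize - usedInChunk;
    if (recordSize > remaining) {
        return offset + remaining;  // pad up to the next chunk boundary
    }
    return offset;
}
```

With 16-byte chunks, a 10-byte record at offset 6 exactly fills the chunk and stays put, while the same record at offset 10 is pushed to offset 16.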
(not finished)