Thrift practice-Thrift RPC service framework log Optimization

Source: Internet
Author: User

 

Preface:
As Facebook's open-source RPC framework, thrift uses the IDL intermediate language and uses the code generation engine to generate RPC framework server/client code in various mainstream languages. however, thrift's implementation is still relatively far from the actual production environment. This series will provide code interpretation and framework expansion for thrift, making it closer to the production environment. this article describes the importance of logs and the introduction of logids in the RPC service framework. logs not only contain a wealth of data (depending on whether it will be mined), but also the best way to track and troubleshoot online service problems.

Log Level
Log4j is the preferred logstore for the RPC service framework. The log levels are as follows:
1) trace the log level with the highest Granularity
2). debug indicates the log level that is helpful for debugging applications.
3). The granularity level of info highlights application running.
4). Warn indicates potential errors
5). Error indicates that an error has occurred, but does not affect the system to continue running.
6). A serious fatal error may cause abnormal Application Operation.
  The log level sequence is as follows: trace <debug <info <warn <error <fatal
  The specific output of the application depends on the log-level settings (including and above). Usually, the project should be released at the debug level (the log volume is large and it is easy to fill the disk ), after the system is stable, the info level is used.

RPC service log Requirements
Although the above log requirements can locate the problem, there are often the following problems:
  1). Many logs simply record the data snapshots that have been run at or to this point (code line.
  2). A service consists of multiple modules (each module consists of multiple nodes), and logs cannot be connected in tandem.
Good log design must meet
  1 ). take a complete RPC call as the unit (not a snapshot of an execution point, but a complete RPC callback process), and output a complete line of log records, including (time point, source, input parameters, output parameters, intermediate sub-processes, time consumed ).
  2). Introduce logid as the basis for connection between multiple modules.

RPC-level log Solution
Try the following navie method to implement

Public String echo (string MSG) {stringbuilder sb = new stringbuilder (); // *) record input parameter sb. append ("[Request: {MSG: MSG}]"); // *) access the cache service sb. append ("[Action: Access redis, consume 100 ms]"); // *). access the backend database sb. append ("[Action: Dao, consume 100 ms]"); // *). record returned result sb. append ("[response: {MSG}]"); logger.info (sb. tostring (); Return MSG ;}

Note: The echo function here represents an RPC service call interface and simplifies interaction between components. at the same time, stringbuilder is introduced to record the interaction process and time consumption. Finally, logger is used before function exit for unified log input.
However, this method has obvious drawbacks:
  1. Assume that the RPC service function has multiple exits.
  2. nested calls to functions. Process Information of nested subfunctions is required.
See the following code snippet:

Public Boolean verifysession () {// ************ I want to record the log (* ^__ ^ *) * **************} Public String echo (string MSG) {stringbuilder sb = new stringbuilder ();//*) call the sub-process verifysession (); // *) to record the input parameter sb. append ("[Request: {MSG: MSG}]"); // *) access the cache service if (keyvalueengine access fail) {// *********** where the log is recorded ************ throw new exception ();} sb. append ("[Action: Access redis, consume 100 ms]"); // *). access the backend database if (database access fail) {// *********** where the log is recorded ************ throw new exception ();} sb. append ("[Action: Dao, consume 100 ms]"); // *). record returned result sb. append ("[response: {MSG}]"); logger.info (sb. tostring (); Return MSG ;}

Note: To Call The subfunction verifysession, you must upload the stringbuilder object to record the relevant information. Multiple exception exits, you need to add the log input to (this is tedious and easy to forget ). this solution is easy to think of, but not the best one.
  It is undeniable that RPC calls are always in the same thread. Do smart readers have guessed the best solution.
Yes, that isThreadlocalWhich can solve the problem of sub-function calling? LetRPC service frameworkAs the outermost layer of a specific RPC call.

  The dynamic proxy class is used to intercept RPC handler interface calls.

Public class logproxyhandler <t> implements invocationhandler {private t instance; Public logproxyhandler (T instance) {This. instance = instance;} public object createproxy () {return proxy. newproxyinstance (instance. getclass (). getclassloader (), instance. getclass (). getinterfaces (), this) ;}@ override public object invoke (Object proxy, method, object [] ARGs) throws throwable {// *) Intercepts processing before calling the function, initialize loggerutility for threadlocal. beforeinvoke (); // ----- (1) Try {object res = method. invoke (instance, argS); // *) after the function returns successfully, it intercepts processing and outputs loggerutility in a centralized manner. returninvoke (); // ----- (2) return res;} catch (throwable e) {// *) When an exception occurs, the system intercepts the request, enter in the log set // ----- (3) loggerutility. throwableinvode ("[result = Exception: {% s}]", E. getmessage (); throw e ;}}}

Code comments:
    (1) The interception point beforeinvoke is used for the initial call of threadlocal, and the log cache is cleared.
    (2) The interception point returninvoke is used to generate a concentrated log output after the function returns successfully.
    (3) The interception point throwableinvoke is used to output logs in a centralized manner after an exception occurs.
  At the same time, in RPC service calls, the noticelog static function of loggerutility (simple cache intermediate log process) is used instead of the previous stringbuilder. append to record the intermediate sub-process.
The loggerutility code is as follows:

public class LoggerUtility {  private static final Logger rpcLogger = LoggerFactory.getLogger("rpc");  public static final ThreadLocal<StringBuilder> threadLocals = new ThreadLocal<StringBuilder>();  public static void beforeInvoke() {    StringBuilder sb = threadLocals.get();    if ( sb == null ) {      sb = new StringBuilder();      threadLocals.set(sb);    }    sb.delete(0, sb.length());  }  public static void returnInvoke() {    StringBuilder sb = threadLocals.get();    if ( sb != null ) {      rpcLogger.info(sb.toString());    }  }  public static void throwableInvode(String fmt, Object... args) {    StringBuilder sb = threadLocals.get();    if ( sb != null ) {      rpcLogger.info(sb.toString() + " " + String.format(fmt, args));    }  }  public static void noticeLog(String fmt, Object... args) {    StringBuilder sb = threadLocals.get();    if ( sb != null ) {      sb.append(String.format(fmt, args));    }  }}

The combination of the two perfectly solves the above RPC log problem, isn't it awesome.

Logid log Solution
The thrift framework does not have the concept of logid. It is difficult for us to modify the RPC protocol of thrift and add it (for example, Baidu's approach is to use logid as part of the RPC protocol ). the solution here is based on conventions. we adopt the following conventions. All RPC Request Parameters are encapsulated as a specific request object, and all response structures are encapsulated as a specific response object, the first attribute of each request object is logid.
For example, the following structure definition:

struct EchoRequest {  1: required i64 logid = 1001,  2: required string msg}struct EchoResponse {  1: required i32 status,  2: optional string msg}service EchoService {  EchoResponse echo(1: EchoRequest req);}

Note: In the request structure, the logid is defined in the request structure that needs to be added to the RPC.
I always think:Conventions are better than configurations, and conventions are better than frameworks.

Follow-up
Insert the log processing section in the middle, and describe the planned Service Release/subscription later. We will use zookeeper to build a simple system.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.