@ Zheng summary creation date: 2012/10
# Mindset
ASAP (As Soon As Possible) Principle
When a strange problem occurs in production, and you realize that the existing logs cannot pinpoint it and that it is hard to reproduce in your development environment, do not keep staring at the code, because: 1) the cause is not necessarily your code logic; it may be dirty data, legacy business data, the distributed environment, or another subsystem; 2) the online business is in an unstable state and cannot wait indefinitely while you investigate. At this point,
please go immediately to the call chain related to the problem and, in one pass:
- Print logs at function entry and exit, including the input and output parameters.
- In every catch (){......} block, print the stack trace together with the key variable values from the try block (so that you do not end up seeing the exception without knowing which variable values caused it).
- Print the input parameters at the entry of every interface that interacts with other modules.
In other words,
for solving online problems, logs, and plenty of them, are the ultimate answer! Do not be timid about how much you log. Our engineers tend to pick two functions, add logging, package and deploy, find nothing, pick a few more functions, redeploy, wait for the problem to recur, and observe again... As time passes, a small incident known only to customer service turns into a major accident known across the country. So, once more:
Print detailed logs at the entry and exit of every function along your call chain, layer by layer, deploy once, and wait for the problem to reappear!
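The three bullet points above can be sketched as follows. This is only an illustration: OrderService, parseQuantity, and the key=value message layout are hypothetical, and java.util.logging is used instead of log4j so that the example runs without extra dependencies. The same calls map directly onto log4j or SLF4J loggers.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical service, used only to illustrate the logging pattern.
final class OrderService {
    private static final Logger LOG = Logger.getLogger("OrderService");

    // Parses a quantity string and logs entry, exit, and failure details.
    static int parseQuantity(String orderId, String rawQuantity) {
        // Entry log: every input parameter is printed.
        LOG.info("parseQuantity enter orderId=" + orderId + " rawQuantity=" + rawQuantity);
        try {
            int quantity = Integer.parseInt(rawQuantity.trim());
            // Exit log: the output value is printed.
            LOG.info("parseQuantity exit orderId=" + orderId + " result=" + quantity);
            return quantity;
        } catch (RuntimeException e) {
            // Log the key variables together with the stack trace, so the
            // exception can be tied to the exact input that caused it.
            LOG.log(Level.SEVERE,
                    "parseQuantity failed orderId=" + orderId + " rawQuantity=" + rawQuantity, e);
            throw e;
        }
    }

    public static void main(String[] args) {
        parseQuantity("A-1001", " 3 ");
        try {
            parseQuantity("A-1002", "three");
        } catch (NumberFormatException expected) {
            // Already logged inside parseQuantity with its input values.
        }
    }
}
```

Because the failing input is in the same log entry as the stack trace, there is no need for a second deployment just to find out what value was passed in.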
What do we want to record?
1) The time required to complete an operation
This can be used to track down why the system responds slowly or quickly.
- The time spent processing an incoming request, accurate to milliseconds
- The time taken to execute a database query
- The time taken to read data from disk or another storage medium
- And so on.
2) Exceptions and stack traces
3) Sessions
To know who triggered a problem, it is essential to include a session identifier in the logs. It can be as simple as an IP address or as elaborate as a UUID, as long as it distinguishes one requester from another.
4) Version number
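As a rough sketch of how the duration, session identifier, and version number can end up in one log entry, consider the helper below. The class TimedCall, the field names (version, session, op, elapsedMs), and the version string are assumptions made for illustration; java.util.logging stands in for whichever framework you use.

```java
import java.util.concurrent.Callable;
import java.util.logging.Logger;

final class TimedCall {
    private static final Logger LOG = Logger.getLogger("TimedCall");
    // Hypothetical version string; a real build would inject it from build metadata.
    static final String APP_VERSION = "1.0.0";

    // Builds one single-line entry carrying version, session, operation, and duration.
    static String formatEntry(String sessionId, String operation, long elapsedMillis) {
        return "version=" + APP_VERSION + " session=" + sessionId
                + " op=" + operation + " elapsedMs=" + elapsedMillis;
    }

    // Runs the task and logs how long it took, whether it succeeded or failed.
    static <T> T timed(String sessionId, String operation, Callable<T> task) throws Exception {
        long start = System.nanoTime();
        try {
            return task.call();
        } finally {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
            LOG.info(formatEntry(sessionId, operation, elapsedMillis));
        }
    }

    public static void main(String[] args) throws Exception {
        // The IP address serves as a simple session identifier here.
        String rows = timed("203.0.113.7", "db.queryOrders", () -> {
            Thread.sleep(20); // stands in for a database query
            return "42 rows";
        });
        System.out.println(rows);
    }
}
```

Putting the timing in a finally block means slow failures are recorded too, which is often exactly the case you are hunting for.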
# Tools
Recommended Java logging frameworks
1) log4j: Our configuration is log4j.appender.CONSOLE.layout.ConversionPattern=[%-d{yyyy-MM-dd HH\:mm\:ss.SSS}] [%p] [%c] [%m]%n, where %p is the log priority, %c is the category name, %m is the message, and %n is the line separator.
2) logback: Ceki Gülcü, the creator of log4j, later released SLF4J + logback. As an alternative to commons-logging, SLF4J (Simple Logging Facade for Java) provides a simple, unified interface over various logging APIs, which lets end users choose the desired logging implementation at deployment time. Logback also performs better. It is said that "some key operations, such as determining whether to record a log statement, have significantly improved performance. This operation takes 3 nanoseconds in logback and 30 nanoseconds in log4j. Logback also creates a logger faster: 13 microseconds, versus 23 microseconds in log4j. More importantly, it takes only 94 nanoseconds to obtain an existing logger, while log4j needs 2234 nanoseconds, reducing the time to 1/23. Compared with java.util.logging (JUL), the performance improvement is also significant."
# Configuration
Do not grab a random log4j configuration file from the Internet. Make sure you understand every configuration item.
Since we write logs, we naturally expect them to answer questions such as "Has this problem been occurring for the past few days?" At that moment you do not want to discover that a misconfigured rollingPolicy lets you see only the last few hours of logs, or that the timestamps are not accurate to milliseconds. The log4j configuration used in the production environment of the main site will be posted later.
# Concept
Logs are extracted with grep: one entry per line!
We always want to use grep to process log files. This means:
a log entry should never span multiple lines unless you are printing a stack trace. We use grep to ask questions of the logs. For example:
- From which IP addresses did the customers who placed orders with the mobile phone number 13910****** come in the last three days?
- How many times in the last three days did customers visit the address ****?from=kfapi while the referrer was a search engine domain?
- In the last week, how long did each transaction executed by the order center take?
- Did the xxx interface actually send a request at the time in question? What parameters did we receive?
Make sure that your logs can answer questions like these.
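A minimal sketch of the one-entry-per-line rule follows. The helper names toSingleLine and grep, the sample entries, and the key=value fields are all hypothetical; the grep method only imitates what the command-line tool would do on the real file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class GrepFriendlyLog {
    // Escapes backslashes and line breaks so one log entry always stays on one line.
    static String toSingleLine(String message) {
        return message.replace("\\", "\\\\").replace("\r", "\\r").replace("\n", "\\n");
    }

    // A grep-like filter: keeps the lines that contain every given fragment.
    static List<String> grep(List<String> lines, String... fragments) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            boolean matchesAll = true;
            for (String fragment : fragments) {
                if (!line.contains(fragment)) {
                    matchesAll = false;
                    break;
                }
            }
            if (matchesAll) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up entries; the user-supplied note contains a line break.
        List<String> log = Arrays.asList(
                "[2012-10-01 10:00:00.123] [INFO] op=placeOrder session=u-1 ip=198.51.100.4 note="
                        + toSingleLine("ring bell\nleave at door"),
                "[2012-10-01 10:00:00.456] [INFO] op=viewItem session=u-2 ip=198.51.100.9");
        // "Which IP addresses placed orders?" becomes a simple text filter.
        for (String hit : grep(log, "op=placeOrder", "session=u-1")) {
            System.out.println(hit);
        }
    }
}
```

Escaping the line breaks, instead of dropping them, keeps the original text recoverable while each entry still matches as a whole.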
Write different areas of interest to different log files
When access and calls are extremely frequent, you will sometimes find that printing everything in your project into a single log file leaves you dizzy. The simplest example is Apache, whose access log and error log are separate. Similarly, you can separate quiet (occasional) events from noisy ones. For example, the open platform can write three log files: a connection log (connection setup and teardown, with the access parameters), a message log (the internal call chain), and a stacktrace log (exception stack traces).
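One possible way to get the three files is one named logger per area of interest, each with its own file handler. The sketch below uses java.util.logging so that it runs without dependencies; with log4j or logback the same split is normally done in the configuration file through separate appenders. The logger names, file names, and the temporary directory are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

final class SplitLogs {
    // Creates a named logger that writes only to its own file.
    static Logger fileLogger(String name, Path file) throws IOException {
        Logger logger = Logger.getLogger(name);
        logger.setUseParentHandlers(false); // keep entries out of the shared root log
        FileHandler handler = new FileHandler(file.toString(), true); // true = append
        handler.setFormatter(new SimpleFormatter());
        logger.addHandler(handler);
        return logger;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("split-logs");
        Logger connection = fileLogger("connection", dir.resolve("connection.log"));
        Logger message = fileLogger("message", dir.resolve("message.log"));
        Logger stacktrace = fileLogger("stacktrace", dir.resolve("stacktrace.log"));

        connection.info("open session=u-1 ip=198.51.100.4");
        message.info("session=u-1 call=orderCenter.create");
        stacktrace.log(Level.SEVERE, "session=u-1 create failed",
                new IllegalStateException("demo"));
        System.out.println("logs written under " + dir);
    }
}
```

Keeping the session identifier in all three files makes it possible to join them again with grep when one request has to be followed end to end.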
# Specific implementation
Accurate to at least milliseconds
Every log entry must contain a timestamp accurate to at least milliseconds. With second-level timestamps we once knew that the code had a bug caused by missing concurrency control, yet could only stare helplessly at logs accurate to the second. For Java, it is best to configure yyyy-MM-dd/HH:mm:ss.SSS.
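To see what the recommended pattern produces, here is a small sketch that formats two events falling within the same second. It uses java.time.format.DateTimeFormatter purely for demonstration; a logging framework applies the pattern through its own layout configuration. The class name and the sample times are made up.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class MillisTimestamp {
    // The pattern recommended in the text: date, '/', then time with milliseconds.
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd/HH:mm:ss.SSS");

    static String stamp(LocalDateTime time) {
        return FORMAT.format(time);
    }

    public static void main(String[] args) {
        // Two events 250 ms apart, both within the same second.
        LocalDateTime first = LocalDateTime.of(2012, 10, 1, 13, 5, 9, 7_000_000);
        LocalDateTime second = first.plusNanos(250_000_000L);
        System.out.println(stamp(first));  // 2012-10-01/13:05:09.007
        System.out.println(stamp(second)); // 2012-10-01/13:05:09.257
    }
}
```

With a seconds-only pattern both entries would carry the identical timestamp 13:05:09, and their order under concurrency could not be told from the log.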
Print a clear session ID whenever possible
Print a definite session identifier in every log entry. When many concurrent requests come in, you can filter by client on this field. For example, in the SI Nan logs we print the UUID stored in the browser cookie.
Check log4j's isDebugEnabled
If the message is a constant or otherwise simple string, if (log.isDebugEnabled()) is not required. If assembling the message consumes resources, wrap it in if (log.isDebugEnabled()).
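The guard can be illustrated as below. To stay runnable without log4j on the classpath, the sketch uses java.util.logging, where isLoggable(Level.FINE) plays the role of log4j's isDebugEnabled(). The class GuardedDebug, the method expensiveDump, and the call counter are hypothetical and exist only to make the effect visible.

```java
import java.util.Arrays;
import java.util.logging.Level;
import java.util.logging.Logger;

final class GuardedDebug {
    static final Logger LOG = Logger.getLogger("GuardedDebug");
    static int dumpCalls = 0; // counts how often the expensive message was built

    // Stands in for an expensive message assembly, such as serializing a large object.
    static String expensiveDump(int[] values) {
        dumpCalls++;
        return Arrays.toString(values);
    }

    static void process(int[] values) {
        LOG.fine("process called"); // constant string: no guard needed
        if (LOG.isLoggable(Level.FINE)) { // JUL counterpart of log.isDebugEnabled()
            LOG.fine("process values=" + expensiveDump(values));
        }
    }

    public static void main(String[] args) {
        process(new int[] {1, 2, 3});
        System.out.println("dumps with debug off: " + dumpCalls);
        LOG.setLevel(Level.FINE);
        process(new int[] {1, 2, 3});
        System.out.println("dumps with debug on: " + dumpCalls);
    }
}
```

Without the guard, the string concatenation and the call to expensiveDump would run on every request even though the resulting message is thrown away.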
Standardize performance data output where possible
This makes it easier for grep or Hadoop to extract and mine the performance data, which can then easily be turned into monitoring graphs. For example, the performance data format of the order center is:
Branchmark start time of the current node [duration of the current node, time consumed by the current node itself, proportion of time within the parent node]
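A line in this shape could be produced by a helper like the one below. The text gives only the order of the fields, so the units (milliseconds), the separators, the one-decimal percentage, and the names PerfLine and format are assumptions for illustration, not the order center's actual implementation.

```java
import java.util.Locale;

final class PerfLine {
    // Formats one entry: start time, then [total duration, self time, share of parent].
    static String format(String startTime, long durationMs, long selfMs, long parentDurationMs) {
        // Share of the parent node's duration, in percent; 0 when the parent took no time.
        double share = parentDurationMs == 0 ? 0.0 : 100.0 * durationMs / parentDurationMs;
        return String.format(Locale.ROOT, "Branchmark %s [%dms, %dms, %.1f%%]",
                startTime, durationMs, selfMs, share);
    }

    public static void main(String[] args) {
        // A node that ran 120 ms in total, 45 ms of it in its own code,
        // inside a parent node that ran 480 ms.
        System.out.println(format("2012-10-01/13:05:09.007", 120, 45, 480));
    }
}
```

Because every entry has the same fixed layout, a single grep for the prefix followed by a simple split is enough to feed the numbers into a chart.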
Where to place performance checkpoints
(1) the dao layer that accesses the database; (2) the ext layer that accesses external resources; (3) methods that access mq; (4) and so on. Performance monitoring logs must be added around every (external) component that you do not own, and around any part of your project that you consider a performance risk.
# Sample
A good startup log
It prints the application's version number, the client's session ID, and the execution duration of key steps.
A good stack trace log
This article first appeared in the bystanders-zheng yu 55 Best Practices series, link: http://www.cnblogs.com/zhengyun_ustc/archive/2012/12/15/logging_bp.html
Reference resources:
1. Sweet Potato: Best Practices for Logging. Original English article.
2. Julius Davies: Log4j Best Practices.
3. 10 reasons to switch to logback.
[PPT]
1) 55 Best Practices Series: MongoDB Best Practices
2) 55 Best Practices Series: Logging Best Practices