Introduction: Most Java programmers use some kind of tracing system to keep track of potential bugs and problems in code under development. However, multithreaded and multi-platform environments can generate large amounts of puzzling trace data. In this article, software engineer Daniel Would provides some tips to help you make sense of the trace data generated by a complex application. You will learn how to use the open source logging package log4j to generate log files that contain rich information. You will also see how to use standard UNIX shell commands to mine that data for the information you need.
In the article "Use a consistent trace system for easier debugging," Scott Clee shows you how to build tracing and logging into a custom class to provide a consistent tracing method across all applications. This approach is ideal for individual applications running on a single thread, but when things get more complicated, you'll soon find yourself facing a mass of trace data with no clue as to how to work out what actually happened.
Successful tracing requires two elements. The first is to output enough useful information. The second is to output it in a form that lets you get at the information you need. After all, you could be writing out a mass of data that is unreadable and incomprehensible. This article is therefore divided into two parts: one focuses on generating trace data, and the other on querying the resulting information in specific ways.
You have two choices for logging and tracing. You can, of course, use a home-grown technique, or you can use a ready-made support class (proprietary or free) that provides a wide range of functionality. One free implementation is log4j, which is well proven, fast, and versatile. It provides a comprehensive infrastructure and an easy-to-understand configuration. I have therefore chosen it as the basis for this example. In this article, I'll guide you through the use of log4j and apply it to a multithreaded, multi-platform working example.
Log4j is easy to set up: all you have to do is make the log4j JAR files available to each part of the application to start logging (see Downloading and installing log4j). Its use is covered in its own documentation, and several articles have examined its basic usage (see Resources for related links). In this article, I'll discuss things to consider when logging in a complex environment, and show how to use the functionality log4j provides in the context of a working example. I'll also show you how to mine the log information once you have collected it. Even with the best plans in the world, finding the information you need in a large log file can still be a time-consuming and tricky process.
Why is multithreading complicated?
If a program runs on multiple threads at the same time, the log output from each thread can become intertwined. Anyone who has tried to read the output of 15 interleaved threads knows what I'm talking about! In addition, code often runs on multiple machines in a distributed system. If you use the same tracing system for all components, you will have to reconcile multiple outputs from your system, and you run the risk that the machines' internal clocks are not synchronized. (This last problem is a real headache. Imagine reading through an entire trace, unaware that the timestamps on the client machine were 20 seconds earlier than the corresponding timestamps on the server!)
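To make the interleaving problem concrete, here is a minimal sketch in plain Java (no log4j). The class name InterleavedTrace, the worker count, and the messages are assumptions made up for illustration. Several threads write the same sequence of messages to one shared log, first without and then with a thread-name prefix. Tagging each line with the thread name is just one possible way to tell the threads apart.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative only: the class name, worker count, and messages are assumptions. */
public class InterleavedTrace {

    /** Starts the given number of workers, each adding the same messages to one shared log. */
    public static List<String> run(int workers, boolean tagWithThread) {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        String[] steps = {"making call to db", "got value", "done"};
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                for (String step : steps) {
                    String prefix = tagWithThread
                            ? "[" + Thread.currentThread().getName() + "] "
                            : "";
                    log.add(prefix + step);
                    Thread.yield(); // encourages, but does not guarantee, interleaving
                }
            }, "worker-" + i);
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // wait until every worker has finished writing
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println("Untagged:");
        run(3, false).forEach(System.out::println);
        System.out.println("Tagged with thread name:");
        run(3, true).forEach(System.out::println);
    }
}
```

The order of the lines varies from run to run. In the untagged output there is no way to tell which "done" belongs to which worker, whereas the tagged output can at least be split per thread after the fact.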
Listing 1 shows an example of the trace information generated by a single program that uses simple output, with a timestamp and a message on each line. It is very similar to the output you might see from a home-grown logging infrastructure.
Listing 1. Trace information for a simple application
(12:01:02) Starting
(12:01:02) initializing variables
(12:01:03) making call to db
(12:01:04) got value 'A'
(12:01:05) setting new value to db
(12:01:05) done
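Output of this kind could come from a tracer as small as the following sketch. It is not the code that produced Listing 1. The class name SimpleTrace and its methods are hypothetical, and the "(HH:mm:ss) message" layout is simply inferred from the listing.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical home-grown tracer; the class and method names are illustrative only. */
public class SimpleTrace {

    // Time-of-day layout inferred from Listing 1.
    private static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("HH:mm:ss");

    /** Formats one trace line as "(HH:mm:ss) message". */
    public static String format(LocalTime time, String message) {
        return "(" + STAMP.format(time) + ") " + message;
    }

    /** Writes a trace line stamped with the current wall-clock time. */
    public static void trace(String message) {
        System.out.println(format(LocalTime.now(), message));
    }

    public static void main(String[] args) {
        trace("Starting");
        trace("initializing variables");
        trace("making call to db"); // no real database call is made in this sketch
        trace("done");
    }
}
```

Each line carries only a time and a message. Nothing identifies the thread, class, or machine that wrote it, which is fine for a single-threaded program but leaves little to go on once several threads or hosts write to the same trace.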