Logging in multithreaded applications

Source: Internet
Author: User
Tags pow
No software is completely error-free, and application users may encounter unexpected results during the running of the program. To analyze and identify the causes of these problems, one of the methods that programmers use extensively is logging. In this article, you will learn how to use a circular buffer to efficiently log records through memory operations, rather than file operations. Choose the appropriate size for the buffer to ensure the dump-related messages, which will be helpful when debugging.
Introduction
"If there are two ways to write a program without errors, then only a third way is valid." "-alanj.perlis
In the lifetime of critical computer applications, logging is a very important activity, especially when the symptoms of the failure are not obvious. Logging provides detailed information about the application state before the failure, such as the value of the variable, the return value of the function, and so on. During a period of operation, a large amount of trace data is constantly generated and continuously written to a text file on disk. For effective logging, you need to use a lot of disk space, and in a multithreaded environment, the amount of disk space you need increases exponentially, because a large number of threads are recording their tracking information.
The two major problems with using regular files for logging are the availability of hard disk space and the slow disk I/O when writing data to a file. Continuous write operations to the disk can greatly degrade the performance of the program, causing it to run slowly. Typically, you can resolve space problems by using the log rotation policy, saving the log in several files, truncating and overwriting the file size when it reaches a predefined number of bytes.
To overcome space problems and minimize disk I/O, some programs can record their trace data in memory and dump the data only when requested. This circular, in-memory buffer is called a cyclic buffer. This paper discusses some common implementations of cyclic buffers, and puts forward some viewpoints on the mechanism of cyclic buffer activation in multithreaded programs.
Loop buffer
A circular buffer is a logging technique for applications that can store related data in memory instead of writing it to a file on disk every time. You can dump the data in memory to disk when you need it (for example, when a user requests to dump memory data into a file, when a program detects an error, or when the program crashes because of an illegal operation or received signal). Circular buffer logging consists of a fixed sized memory buffer that the process uses to log records. As the name suggests, the buffer is implemented in a circular fashion. When the buffer fills the data, it does not need to allocate more memory for the new data, but rather writes it from the beginning of the buffer, overwriting the previous content. Refer to the example in Figure 1.
Figure 1. Write operations on a circular buffer
  
Figure 1 shows the state of the buffer after the two entries are written to the loop buffer. After writing the first log entry (in blue), when the process attempts to write to the second entry (expressed in red), there is not enough space left in the buffer. The process writes data, reaches the end of the buffer, and then copies the remaining data to the beginning of the buffer, overwriting the previous log entries.
  
See resources for an example of a circular buffer implementation. By saving a read pointer, you can implement a read operation on the circular buffer and move the read and write pointers accordingly to ensure that the read pointer does not cross the write pointer during the read operation. To improve efficiency, some applications can save raw data, rather than formatted data, to the buffer. In this case, a parser is needed that can generate meaningful log messages based on these memory dumps.
The benefits of a loop buffer why use a circular buffer when you can simply write to a file. Because you covered the previous contents of the loop buffer, you will lose the previous data after you complete the operation. The circular buffer provides the following advantages over the traditional file logging mechanism.
Fast. Memory writes much faster than I/O operations on disk. Refresh the data only when needed.
Continuous logging can fill the system with space, causing other programs to run out of space and fail. In this case, you have two options, either to manually delete the log information or to implement the log rotation policy.
Once you have enabled logging, the process will continuously populate the space on your hard disk, regardless of whether you need it or not.
Sometimes you just need the relevant data before the program crashes, not the complete historical data for the process.
Some common debugging functions, such as printf, write, and so on, may change the behavior of a program in the case of multithreaded applications, making them difficult to debug. Using these functions can cause your application to hide certain errors that might appear in the ordinary. These functions are all revocable points and can cause a pending signal that the program does not expect in the thread environment. This is illustrated more clearly in the example (pseudocode) that is assumed in Listing 1 and in Listing 2 below.
Listing 1. No code to enable debugging
Pthread_setcanceltype (Pthread_cancel_deferred,null);
/*ishouldnotbecancelledinthebelowsection*/
var=5;
#ifdefDEBUG
Write (FD, "valueofvar=5n", 17);
#endif
Var=pow (var,2);
/*icanbecancellednow*/
Pthread_testcancel ();
Listing 2. Code with debugging enabled
Pthread_setcanceltype (Pthread_cancel_deferred,null);
/*ishouldnotbecancelledinthebelowsection*/
var=5;
#ifdefDEBUG
Write (FD, "valueofvar=5n", 17);
#endif
Var=pow (var,2);
/*icanbecancellednow*/
Pthread_testcancel ();
Using a loop buffer in a multithreaded program
Sometimes, you can use circular buffer logging when other traditional logging methods fail. This section describes some important aspects to consider when using circular buffers to enable logging in multithreaded applications.
Synchronization is always an integral part of multithreaded programs when accessing a public resource, and logging is no exception. Because each thread attempts to write to the global space, you must ensure that they are written to memory synchronously, otherwise the message is corrupted. Typically, each thread holds a lock before writing to the buffer, releasing the lock when the operation is complete. You can download an example of a loop buffer that uses locks to write to memory.
This approach has the following drawbacks: If your application contains several threads, and each thread is logging in detail, the overall performance of the process will be affected because the threads will spend most of their time getting and releasing the lock.
By enabling each thread to write data to its own block of memory, you can completely avoid synchronization problems. When a request is received from a user for dump data, each thread obtains a lock and dumps it to a central location. Because locks are acquired only when data is flushed to disk, performance is not significantly affected. In such cases, you may need an additional program to sort the log information (in the order of the thread ID or timestamp) to parse it. You can also choose to write only the message code in memory, not the full formatted message, and later use an external utility to parse the dump data and convert the message code to meaningful text.
Another way to avoid synchronization problems is to allocate a large chunk of global memory and divide it into smaller slots where each slot can be used by one thread for logging. Each thread is only able to read and write its own slot, not the entire buffer. When each thread attempts to write data for the first time, it tries to find an empty memory slot and marks it as busy. When a thread obtains a specific slot, the corresponding bit in the bitmap for tracking slot usage can be set to 1, and when the thread exits, the bit is reset to 0. You also need to maintain a global list of slot numbers currently in use and thread information for the threads that are using it.
To avoid a situation where a thread is dead, but does not set the corresponding bit in its slot bitmap to 0, you need a garbage collector thread that traverses the global list and polls at regular intervals based on the thread ID. It is responsible for releasing slots and modifying the global list. See the example shown in Listing 3 below.
   Listing 3. Sample pseudocode for garbage collector threads
Voidcheck_and_free (list*ptr) {
Intslotno,ret_val;
Locklist ();
while (PTR) {
if (((Ret_val=pthread_kill (ptr->thread_id,0)) ==esrch)) {
/*threadhasdied*/
slotno=ptr->slotno;
Free_slot (ptr->thread_id);
Mark_bitmap_free (SLOTNO);
}
ptr=ptr->next;
}
Unlocklist ();
Return
}
Typically, the thread ID is reused, so it is possible that a thread has died without releasing the corresponding slot, and that the thread ID is used again and assigned a new slot before the garbage collector releases the slot. For new threads, it is important to check the global list and reuse the same slot (if it was used by previous instances). Because the garbage collector thread and the writer thread may try to modify the global list at the same time, some kind of locking mechanism is also required.
When a user signals a process to dump circular buffer data, the thread that handles the signal will not allow other threads to change the contents of the buffer and dump the contents of the slot in the file. Listing 4 and Listing 5 show an example of writing data to a circular buffer and dumping its contents into a file. Once you receive a signal to dump the data, the is_dumping global variable is used to prevent other threads from changing the contents of the buffer.
   Listing 4: Example pseudocode for writing to the loop buffer slot "I"
Voidwrite_to_buffer (char*msg) {
Read_atomically (&is_dumping);
if (!is_dumping)
memcpy (Slot->ptr,msg,strlen (msg));
Return
}
   Listing 5. Sample Pseudocode for dumping cyclic buffer data
Voiddump_data (INTFD) {
Change_atomically_to_true (&is_dumping);
foriineach_used_slot{
Write_slot_data_to_file (Fd,slot);
}
Change_atomically_to_false (&is_dumping);
Return
}
Conclusion
The circular buffer makes logging more efficient by using memory operations instead of file operations. Select the appropriate size for the buffer to ensure that the dump-related messages are helpful when the procedure is analyzed afterwards. For those programs that keep logging on, the loop buffer is an ideal solution, and you don't need complete historical information for the program during debugging. This paper introduces the method of implementing loop buffer in multithreaded program and some contents needing attention.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.