Kernel log and printk Structure Analysis

Source: Internet
Author: User
Tags syslog

Article Source: http://blog.csdn.net/tangkegagalikaiwu/article/details/8572365

I. printk Overview

For Linux kernel developers, printk is no longer familiar with it. Most of the information displayed when the kernel is started is implemented by her, and most of the time it is used for Kernel Driver debugging. One reason why she is so widely used is that she is easy to use, and another important reason is her robustness. It is widely used and can be called almost anywhere in the kernel. You can call her in the interrupt context or process, or call her when holding a lock. You can also call her in the SMP system without using the chain during the call. This good adaptability comes from her design, a simple" Ring
Buffer
".
Note that "almost" can be used anywhere in the kernel. So where will there be a "problem "? That is, in the early stage of the system startup process, some areas before the terminal initialization can be used, but all information is cached in the simple ring of printk before the terminal and console are initialized.
Buffer (ring buffer), until the terminal and console are initialized, all cache information is output together.

If you want to debug the initial part of the startup process (such as setup_arch (), you can use the hardware devices that can work at this time (such as the serial port) to communicate with the outside world and use printk () early_printk () function. She has the ability to print data on the terminal at the early stage of the startup process, which is similar to prink, The difference is that(1) function name (2) Ability to work earlier (output information) (3) she has her own small cache (generally 512b) (4) One-time output to the hardware device, information is no longer retained in the form of ring buffer. However, this function cannot be implemented in some architectures, so this method lacks portability. (Most architectures, including x86 and arm ).
Therefore, unless output on the terminal at the early stage of startup, we think that printk () can work under any circumstances. This can be seen from the kernel startup code that the kernel version information will be printed through printk shortly after the start_kernel entry: printk (kern_notice
"% S", linux_banner );.

Ii. Use of printk
Printk () and printf () in the C library are used. Major differencesThat is, printk () specifies the log level.
(1) log level the kernel determines whether to print messages on the terminal (console) based on the log level: the kernel displays all messages with a lower level than a specific value on the terminal (console. However, all information is recorded in the "ring" of printk.
Buffer.
Printk has eight loglevels, which are defined in <Linux/kernel. h>:
# Define kern_emerg "<0>"/* system unavailable */# define kern_alert "<1>"/* immediate action required */# define kern_crit "<2> "/ * critical condition */# define kern_err "<3>"/* error condition */# define kern_warning "<4>"/* warning condition */# define kern_notice "<5> "/* normal situation, however, it is worth noting that */# define kern_info "<6>"/* Information-type message */# define kern_debug "<7>"/* debugging-level information * // * use the default Kernel log Level */# define kern_default "<D>"/** is marked as a "continuous" log print output line (only used after one * No \ n closed row ). it can only be used for initial core/arch Code * (otherwise, it is not SMP secure ). */# define kern_cont "<C>"

If no log level is specified during use, the kernel selects default_message_loglevel, which is defined in kernel/printk. C:

/* printk's without a loglevel use this.. */#define DEFAULT_MESSAGE_LOGLEVELCONFIG_DEFAULT_MESSAGE_LOGLEVEL
It can be seen that this level can be specified during Kernel configuration. This mechanism is available from 2.6.39. If you do not configure this level, the default value is <4>, that is, kern_warning. The kernel sets the most important record level kern_emerg as "<o>", and the irrelevant debug record level "kern_debug" as "<7> ".
The kernel uses these macros to specify the Log Level and the record level of the current terminal console_loglevel to determine whether to print data to the terminal. The following is an example:
printk(KERN_EMERG "log_level:%s\n", KERN_EMERG);
After the compilation and preprocessing are completed, the code in prepare is actually compiled into the following format:
printk( "<0>" "log_level:%s\n", KERN_EMERG);

To give a printk () What log level is entirely dependent on you. For formal and maintained messages, the log level should be given based on the nature of the information. However, there are two ways to temporarily add debugging information everywhere to solve a problem:
One option is to keep Default record level of the TerminalUnchanged. All debugging information Kern crit or lower levels are provided to ensure that the information is output. In another method, the Kern debug level is provided for all debugging information, and the default record level of the terminal is adjusted to 7. All debugging information can also be output. The two methods have their own advantages and disadvantages.
The default log level for Kernel adjustment is described here 《
Linux Device Driver Learning (2)-debugging technology, which can be changed through/proc/sys/kernel/printk, or C program calls syslog system calls. But now the glibc function interface is changed. Because syslog is widely used, the function name is changed to klogctl, so the code in the original blog cannot be used, the new lele_loglevel code is as follows:
#include <stdio.h>#include <stdlib.h>#include <string.h>#include <errno.h>//#define __LIBRARY__ /* _syscall3 and friends are only available through this *///#include <linux/unistd.h>/* define the system call, to override the library function *///_syscall3(int, syslog, int, type, char *, bufp, int, len);int main(int argc, char **argv){      int level;      if (argc == 2) {            level = atoi(argv[1]); /* the chosen console */      } else {            fprintf(stderr, "%s: need a single arg\n", argv[0]);            exit(1);      }      if (klogctl(8, NULL, level) < 0) {            fprintf(stderr, "%s: syslog(setlevel): %s\n",                  argv[0], strerror(errno));            exit(1);      }      exit(0);}

(2) related auxiliary macrosIf you determine the log level required by printk, it seems a little troublesome to add the above macros to it each time. Therefore, kernel hackers define some macros to facilitate the use of printk. These macros are also everywhere in kernel code:

# Ifndef pr_fmt # define pr_fmt (FMT) FMT # endif # define pr_emerg (FMT ,...) \ printk (kern_emerg pr_fmt (FMT), ##__ va_args _) # define pr_alert (FMT ,...) \ printk (kern_alert pr_fmt (FMT), ##__ va_args _) # define pr_crit (FMT ,...) \ printk (kern_crit pr_fmt (FMT), ##__ va_args _) # define pr_err (FMT ,...) \ printk (kern_err pr_fmt (FMT), ##__ va_args _) # define pr_warning (FMT ,...) \ printk (kern_warning pr_fmt (FMT), ##__ va_args _) # define pr_warn pr_warning # define pr_notice (FMT ,...) \ printk (kern_notice pr_fmt (FMT), ##__ va_args _) # define pr_info (FMT ,...) \ printk (kern_info pr_fmt (FMT), ##__ va_args _) # define pr_cont (FMT ,...) \ printk (kern_cont FMT, ##__ va_args _)/* pr_devel () Unless debug is defined () no code is generated */# ifdef debug # define pr_devel (FMT ,...) \ printk (kern_debug pr_fmt (FMT), ##__ va_args _) # else # define pr_devel (FMT ,...) \ no_printk (kern_debug pr_fmt (FMT), ##__ va_args _) # endif/* if you are writing a driver, use dev_dbg */# If defined (Debug) # define pr_debug (FMT ,...) \ printk (kern_debug pr_fmt (FMT), ##__ va_args _) # Elif defined (config_dynamic_debug)/* dynamic_pr_debug () uses pr_fmt () internally so we don't need it here */# define pr_debug (FMT ,...) \ dynamic_pr_debug (FMT, ##__ va_args _) # else # define pr_debug (FMT ,...) \ no_printk (kern_debug pr_fmt (FMT), ##__ va_args _) # endif

From the above Code, you should know how to use these macros. It is worth noting that the pr_devel and pr_debug macros generate the actual printk Code only after the debug is defined. This facilitates kernel development: these macros are used in the code, when debugging is complete, simply # UNDEF debug can eliminate the code used for debugging, without actually deleting the debugging output code.
(3) Output Rate Control

During debugging, some part of the printk may produce a large amount of output, which causes the system to fail to work normally and may cause system logsRing Buffer Overflow(The old information is quickly overwritten ). In particular, when using a slow console device (such as a serial port), excessive output can also slow down the system. In this way, it is difficult to find out where the system goes wrong. Therefore, you should be very careful: during normal operations, nothing should be printed, and the output should be an exception that requires attention, and be careful not to overhead it.

In some cases, it is best to set a flag Variable Representation: This information has been printed and will not be printed in the future.. For Print speedThe control kernel of has provided a ready-made macro:
#define printk_ratelimit() __printk_ratelimit(__func__)

This function should be called before you think that printing a message may contain a large number of duplicated messages. If this function returns a non-zero value, continue printing your message; otherwise, skip it. Typical calls are as follows:

if (printk_ratelimit())    printk(KERN_NOTICE "The printer is still on fire\n");

Printk_ratelimitYou can trace the number and time of messages sent to the console. When the output exceeds a limit,
Printk_ratelimit starts to return 0 to discard the message. You can modify the following parameters:

  1. /Proc/sys/Kern/printk_ratelimit)
  2. /Proc/sys/kernel/printk_ratelimit_burst (the maximum number of messages in the above cycle, if the number exceeds printk_ratelimit (), return 0)
To control the output of messages.
In <Linux/kernel. h> defines other macros, such as printk_ratelimited (FMT ,...) if you are interested, go to the file and check it.

(4) last special note: 1. Although printk is robust, you will know the source code, This function is very inefficient. : Only one byte is copied at a time during character copying, and calling the console output may cause interruptions. Therefore, if your driver performs a performance test or release after function debugging is complete, remember to minimize the printk output, so that only a small amount of information is output when an error occurs. Otherwise, the output of useless information to the console will affect the performance.I made such an idiot mistake when I first started to learn the driver. When I tested the performance of the CAN driver, printk sent a message to check the result and went down directly. 2. The temporary cache printk_buf of printk is only 1 K. All printk functions can only record <1 k of information to log buffer at a time, and printk uses the "Ring buffer"


Iii. kernel Implementation of printk
Linux printk kernel logs are often called Kernel ring BufferThis is because the cache Implementation of printk uses a simple ring
Buffer (ring buffer ). However, note that, Do not confuse with the kernel trace system ring BufferAlthough they are all for tracking and debugging, The Ring buffer Implementation of the trace system is more complete and complex, while the ring buffer used by printk is very simple, Actually, a character array is defined.:
static char __log_buf[__LOG_BUF_LEN];


A set of pointers are used for management:

/** When pointing to log_buf, there is no restrictions on log_buf_len. Therefore, they * must use a mask before being used as the subscript (remove the high positions above config_log_buf_shift) */static unsigned log_start; /* index in log_buf: The next character read by syslog () */static unsigned con_start;/* index in log_buf: The next character sent to the console */static unsigned log_end; /* index in log_buf: recently written character address + 1 */

The specific implementation Cu has already been written by a Boyou, so I will not be embarrassed here. I have reprinted the backup:Implementation Analysis of printk

Iv. user space access to kernel logs
User space access and control kernel logs have two interfaces: (1) Calling the kernel syslog system through the glibc klogctl function interface (2) using fs/proc/kmsg. the procfs interface exported from the C kernel module:/proc/kmsg file. In fact, they finally called the do_syslog function in/kernel/printk. C to implement access to _ log_buf and modification of related variables. However, it is worth noting that: When data is obtained from/proc/kmsg, the read data in _ log_buf will not be retained (that is, the log_start pointer will be modified ),
However, the syslog system calls log data and keeps the data to other processes.
Reading/proc files is the default practice of klogd. The dmesg command can be used to view the cached content and retain it. In fact, it returns all the content in _ log_buf to stdout, regardless of whether it has been read. Here I recommend that you use rtfsc-read the fucking source
Code, you can see that these codes are better than anything else. I will only introduce them here.

A program exists in the user space for recording system logs, Collectively referred to as" Syslog daemon". Most of the early and current embedded systems used klogd + syslogd combinations. Now most distributions use rsyslogd or syslogd-ng. I will not analyze these user space programs here. I am not good at it and may be clear about O & M. I only know the approximate call relationship between them.
Here I will use a figure to summarize the overall structure of the kernel printk and the log system:



We strongly recommend that you read kernel log: API
And implementation-logs from kernel to user space


Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.