---------------------------------
Linux Kernel Coding Style
---------------------------------
This short document describes the preferred coding style for the Linux kernel. Coding style is very personal, and I won't force my views on anybody, but these rules apply to any code that I have to be able to maintain, and I would prefer them for most other code too. Please at least consider the points made here.
First off, I suggest printing out a copy of the GNU coding standards and NOT reading it. Burn it; it's a great symbolic gesture.
Let's get started:
---------------------------------
Chapter 1: Indentation
---------------------------------
Tabs are 8 characters wide, and thus indentations are also 8 characters. There are heretic movements that try to make indentations 4 (or even 2!) characters deep, and that is akin to trying to define the value of pi to be 3.
Rationale: the whole idea behind indentation is to clearly define where a block of control starts and ends. Especially when you have been looking at your screen for 20 straight hours, you will find it a lot easier to see how the indentation works if you have large indentations.
Now, some people will claim that 8-character indentations make the code move too far to the right and make it hard to read on an 80-character terminal screen. The answer to that is that if you need more than 3 levels of indentation, you're screwed anyway, and should fix your program.
In short, 8-character indents make things easier to read, and have the added benefit of warning you when you are nesting your code too deep. Heed that warning.
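One common way to act on that warning is to replace nested conditions with early returns. The following is only a minimal sketch; struct record and record_value() are hypothetical names invented for this illustration.

```c
#include <stddef.h>

/* Hypothetical record type, used only for this illustration. */
struct record {
	int valid;
	int value;
};

/*
 * Returns the value of a valid, positive record, or -1 otherwise.
 * Each check exits early, so the body never nests past one level.
 */
int record_value(const struct record *rec)
{
	if (rec == NULL)
		return -1;
	if (!rec->valid)
		return -1;
	if (rec->value <= 0)
		return -1;
	return rec->value;
}
```

Written with nested if statements, the same logic would need three levels of indentation before reaching the final return.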
Don't put multiple statements on a single line unless you have something to hide:
	if (condition) do_this;
	  do_something_everytime;
Outside of comments, documentation, and Kconfig files, spaces are never used for indentation, and the above example is deliberately broken on both counts.
Finally, get a decent editor and don't leave whitespace at the end of lines.
---------------------------------
Chapter 2: Breaking long lines and strings
---------------------------------
Coding style is all about readability and maintainability using commonly available tools.
The limit on the length of lines is therefore 80 columns, and this is a hard limit.
Statements longer than 80 columns must be broken into sensible chunks. Descendants are always shorter than the parent and are placed substantially to the right. The same applies to function headers with a long argument list. Long strings are broken into shorter strings as well.
	void fun(int a, int b, int c)
	{
		if (condition)
			printk(KERN_WARNING "Warning this is a long printk with "
							"3 parameters a: %u b: %u "
							"c: %u\n", a, b, c);
		else
			next_statement;
	}
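Splitting a string this way works because the C compiler joins adjacent string literals into one. As a minimal userspace sketch, assuming snprintf() as a stand-in for printk() and a hypothetical helper named format_warning():

```c
#include <stdio.h>

/*
 * Hypothetical userspace helper: snprintf() stands in for printk().
 * The compiler concatenates the three literals into a single format string.
 */
int format_warning(char *buf, size_t size,
		   unsigned int a, unsigned int b, unsigned int c)
{
	return snprintf(buf, size, "Warning this is a long message with "
				   "3 parameters a: %u b: %u "
				   "c: %u\n", a, b, c);
}
```

Note that each fragment except the last ends with a space; forgetting it is the usual mistake when splitting a string.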
---------------------------------
Chapter 3: Placing braces
---------------------------------
The other issue that always comes up in C styling is the placement of braces. Unlike the indent size, there are few technical reasons to choose one placement strategy over another; it is mostly personal preference. The preferred way, as shown to us by the prophets Kernighan and Ritchie, is to put the opening brace last on the line and the closing brace first, like this:
	if (x is true) {
		we do y
	}
However, there is one special case, namely functions: they have the opening brace at the beginning of the next line, like this:
	int function(int x)
	{
		body of function
	}
Heretics all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that K&R are right. Besides, functions are special anyway (you can't nest them in C).
Note that the closing brace is on a line of its own, except when it is followed by a continuation of the same statement, that is, a "while" in a do-statement or an "else" in an if-statement, like this:
	do {
		body of do-loop
	} while (condition);
Or:
	if (x == y) {
		..
	} else if (x > y) {
		...
	} else {
		....
	}
Rationale: K&R. Also, note that this brace placement minimizes the number of empty (or almost empty) lines without any loss of readability. Since the supply of new lines on your screen is not a renewable resource (think 25-line terminal screens here), you have more empty lines to put comments on.
---------------------------------
Chapter 4: Naming
---------------------------------
C is a Spartan language, and so should your naming be. Unlike Modula-2 and Pascal programmers, C programmers do not use cute names like ThisVariableIsATemporaryCounter. A C programmer would call that variable "tmp", which is much easier to write and not in the least more difficult to understand.
However, while mixed-case names are frowned upon, descriptive names for global variables are a must. To call a global function "foo" is a shooting offense.
Global variables (to be used only if you _really_ need them) need to have descriptive names, as do global functions. If you have a function that counts the number of active users, you should call it "count_active_users()" or similar, _not_ "cntusr()".
Encoding the type of a function into its name (so-called Hungarian notation) is brain damaged: the compiler knows the types anyway and can check them, so it only confuses the programmer. No wonder Microsoft makes buggy programs.
Local variable names should be short and to the point. If you have some random integer loop counter, it should probably be called "i". Calling it "loop_counter" is unproductive if there is no chance of it being misunderstood. Similarly, "tmp" can be just about any type of variable that is used to hold a temporary value.
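A small sketch of both naming rules side by side. The struct user type and its "active" field are assumptions made up for this example; only the name count_active_users() comes from the text.

```c
#include <stddef.h>

/* Hypothetical user record, present only to make the example compile. */
struct user {
	int active;
};

/* A globally visible function gets a descriptive name. */
int count_active_users(const struct user *users, size_t nr_users)
{
	size_t i;	/* a plain loop counter gets a short name */
	int count = 0;

	for (i = 0; i < nr_users; i++) {
		if (users[i].active)
			count++;
	}
	return count;
}
```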
If you are afraid of mixing up your local variable names, you have another problem, which is called the function-growth-hormone-imbalance syndrome. See the next chapter.
---------------------------------
Chapter 5: Functions
---------------------------------
Functions should be short and sweet, and do just one thing. They should fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24, as we all know), and do one thing and do that well.
The maximum length of a function is inversely proportional to the complexity and indentation level of that function. So, if you have a conceptually simple function that is just one long (but simple) case statement, where you have to do lots of small things for a lot of different cases, it's OK to have a longer function.
However, if you have a complex function, and you suspect that a less-than-gifted first-year high-school student might not even understand what the function is all about, you should adhere to the maximum limits all the more closely. Use helper functions with descriptive names (you can ask the compiler to inline them if you think it's performance-critical, and it will probably do a better job of it than you would have done).
Another measure of the function is the number of local variables. They shouldn't exceed 10, or you're doing something wrong. Rethink the function, and split it into smaller pieces. A human brain can generally keep track of about 7 different things at once; anything more and it gets confused. You know you're brilliant, but maybe you'd like to understand what you did 2 weeks from now.
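As a rough illustration of helper functions with descriptive names, here is a sketch in which every function does one thing and the compiler is merely invited to inline the helpers. All names are hypothetical, and the rule that a sample is valid when it is non-negative is an assumption chosen for the example.

```c
#include <stddef.h>

/* Hypothetical helpers; each has a descriptive name and does one thing. */
static inline int is_valid_sample(int sample)
{
	return sample >= 0;
}

static inline size_t count_valid_samples(const int *samples, size_t len)
{
	size_t i;
	size_t nr = 0;

	for (i = 0; i < len; i++) {
		if (is_valid_sample(samples[i]))
			nr++;
	}
	return nr;
}

static inline long sum_valid_samples(const int *samples, size_t len)
{
	size_t i;
	long sum = 0;

	for (i = 0; i < len; i++) {
		if (is_valid_sample(samples[i]))
			sum += samples[i];
	}
	return sum;
}

/* Returns the integer average of the valid samples, or 0 if there are none. */
long average_valid_samples(const int *samples, size_t len)
{
	size_t nr = count_valid_samples(samples, len);

	if (nr == 0)
		return 0;
	return sum_valid_samples(samples, len) / (long)nr;
}
```

No function here needs more than two local variables, which keeps each one well under the limit suggested above.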
---------------------------------
Chapter 6: goto statements
---------------------------------
Albeit deprecated by some people, the equivalent of the goto statement is used frequently by compilers in the form of the unconditional jump instruction.
The goto statement comes in handy when a function exits from multiple locations and some common work such as cleanup has to be done.
The rationale is:
- unconditional statements are easier to understand and follow
- nesting is reduced
- errors caused by not updating individual exit points when making modifications are prevented
- it saves the compiler work to optimize redundant code away ;)
	int fun(int a)
	{
		int result = 0;
		char *buffer = kmalloc(SIZE, GFP_KERNEL);

		if (buffer == NULL)
			return -ENOMEM;

		if (condition1) {
			while (loop1) {
				...
			}
			result = 1;
			goto out;
		}
		...
	out:
		kfree(buffer);
		return result;
	}
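The same pattern can be tried outside the kernel. The sketch below is a userspace analogue only: malloc() and free() stand in for kmalloc() and kfree(), and copy_and_measure() is a hypothetical function invented for the illustration.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Userspace sketch: malloc()/free() stand in for kmalloc()/kfree().
 * Returns the length of the copied string, or -1 on failure.
 */
int copy_and_measure(const char *src, size_t max_len)
{
	int result = -1;
	size_t len;
	char *buffer;

	if (src == NULL)
		return -1;	/* nothing allocated yet, so no cleanup needed */

	len = strlen(src);
	buffer = malloc(len + 1);
	if (buffer == NULL)
		return -1;

	if (len > max_len)
		goto out;	/* error path shares the cleanup below */

	memcpy(buffer, src, len + 1);
	result = (int)strlen(buffer);
out:
	free(buffer);
	return result;
}
```

Every path that runs after the allocation leaves through the single "out" label, so adding another failure check later cannot forget to free the buffer.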
---------------------------------
Chapter 7: Commenting
---------------------------------
Comments are good, but there is also a danger of over-commenting. NEVER try to explain HOW your code works in a comment: it's much better to write the code so that the _working_ is obvious, and it's a waste of time to explain badly written code.
Generally, you want your comments to tell WHAT your code does, not HOW. Also, try to avoid putting comments inside a function body: if the function is so complex that you need to separately comment parts of it, you should probably go back to chapter 5 for a while. You can make small comments to note or warn about something particularly clever (or ugly), but try to avoid excess. Instead, put the comments at the head of the function, telling people what it does, and possibly WHY it does it.
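A short sketch of a comment placed at the head of a function, saying what it does and why, with nothing inside the body. The function clamp_percent() and the scenario in its comment are hypothetical.

```c
/*
 * clamp_percent - limit a value to the range 0..100
 *
 * Callers pass raw readings that may overshoot because of rounding;
 * clamping here keeps every caller from repeating the same check.
 */
int clamp_percent(int value)
{
	if (value < 0)
		return 0;
	if (value > 100)
		return 100;
	return value;
}
```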
---------------------------------
Chapter 8: You've made a mess of it
---------------------------------
That's OK, we all do. You've probably been told by your long-time Unix user helper that "GNU emacs" automatically formats the C sources for you, and you've noticed that yes, it does do that, but the defaults it uses are less than desirable (in fact, they are worse than random typing; a group of monkeys typing into GNU emacs would never make a good program).
So, you can either get rid of GNU emacs, or change it to use saner values. To do the latter, you can stick the following in your .emacs file:
	(defun linux-c-mode ()
	  "C mode with adjusted defaults for use with the Linux kernel."
	  (interactive)
	  (c-mode)
	  (c-set-style "K&R")
	  (setq c-basic-offset 8))
This will define the M-x linux-c-mode command. When hacking on a module, if you put the string -*- linux-c -*- somewhere on the first two lines, this mode will be automatically invoked. Also, you may want to add the following to your .emacs file if you want to have linux-c-mode switched on automatically when you edit source files under /usr/src/linux:
	(setq auto-mode-alist (cons '("/usr/src/linux.*/.*\\.[ch]$" . linux-c-mode)
				auto-mode-alist))
But even if you fail in getting emacs to do sane formatting, not everything is lost: use "indent".
Now, again, GNU indent has the same brain-dead settings that GNU emacs has, which is why you need to give it a few command line options. However, that's not too bad, because even the makers of GNU indent recognize the authority of K&R (the GNU people aren't evil, they are just severely misguided in this matter), so you just give indent the options "-kr -i8" (which stands for "K&R, 8-character indents").
"indent" has a lot of options, and especially when it comes to comment re-formatting you may want to take a look at the man page. But remember: "indent" is not a fix for bad programming.
---------------------------------
Chapter 9: Configuration files
---------------------------------
For configuration options (arch/xxx/Kconfig, and all the other Kconfig files), somewhat different indentation is used.
Help text is indented with 2 spaces.
	if CONFIG_EXPERIMENTAL
		tristate CONFIG_BOOM
		default n
		help
		  Apply nitroglycerine inside the keyboard (DANGEROUS)
		bool CONFIG_CHEER
		depends on CONFIG_BOOM
		default y
		help
		  Output nice messages when you explode
	endif
Generally, CONFIG_EXPERIMENTAL should surround all options not considered stable. All options that are known to trash data (experimental write support for file systems, for instance) should be denoted (DANGEROUS); other experimental options should be denoted (EXPERIMENTAL).
---------------------------------
Chapter 10: Data structures
---------------------------------
Data structures that are visible outside the single-threaded environment they are created and destroyed in should always have reference counts. In the kernel, garbage collection doesn't exist (and outside the kernel garbage collection is slow and inefficient), which means that you absolutely _have_ to reference count all your uses.
Reference counting means that you can avoid locking, and allows multiple users to have access to the data structure in parallel, without having to worry about the structure suddenly going away from under them just because they slept or did something else for a while.
Note that locking is _not_ a replacement for reference counting. Locking is used to keep data structures coherent, while reference counting is a memory management technique. Usually both are needed, and they are not to be confused with each other.
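A minimal userspace sketch of the get/put idea, assuming C11 atomics in place of the kernel's own primitives. The struct session type and its functions are hypothetical and are not taken from any kernel code.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; not a kernel structure. */
struct session {
	atomic_int refcount;
	int id;
};

struct session *session_create(int id)
{
	struct session *s = malloc(sizeof(*s));

	if (s == NULL)
		return NULL;
	atomic_init(&s->refcount, 1);	/* the creator holds the first reference */
	s->id = id;
	return s;
}

/* Takes an additional reference for a new user of the object. */
void session_get(struct session *s)
{
	atomic_fetch_add(&s->refcount, 1);
}

/* Drops one reference; returns 1 if this call freed the object. */
int session_put(struct session *s)
{
	if (atomic_fetch_sub(&s->refcount, 1) == 1) {
		free(s);
		return 1;
	}
	return 0;
}
```

The count only decides when the memory may be freed. Protecting the contents of the object against concurrent modification would still need a lock, which is the distinction drawn above.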
Many data structures can indeed have two levels of reference counting, when there are users of different "classes". The subclass count counts the number of subclass users, and decrements the global count just once when the subclass count goes to zero.
Examples of this kind of "multi-level reference counting" can be found in memory management ("struct mm_struct": mm_users and mm_count), and in filesystem code ("struct super_block": s_count and s_active).
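The two-level scheme can be sketched in userspace as follows. This is loosely modeled on the idea of a users count that collectively holds a single reference in a global count; struct space and its functions are hypothetical, use C11 atomics, and do not reproduce the kernel's implementation.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical object with two levels of reference counting. */
struct space {
	atomic_int users;	/* subclass count: active users */
	atomic_int count;	/* global count: all users together hold one */
	int torn_down;		/* set once the last user has gone */
};

struct space *space_create(void)
{
	struct space *sp = malloc(sizeof(*sp));

	if (sp == NULL)
		return NULL;
	atomic_init(&sp->users, 1);
	atomic_init(&sp->count, 1);
	sp->torn_down = 0;
	return sp;
}

/* Global level: pin the structure itself. */
void space_grab(struct space *sp)
{
	atomic_fetch_add(&sp->count, 1);
}

/* Global level: returns 1 if this call freed the structure. */
int space_drop(struct space *sp)
{
	if (atomic_fetch_sub(&sp->count, 1) != 1)
		return 0;
	free(sp);
	return 1;
}

/* Subclass level: register another user. */
void space_get_user(struct space *sp)
{
	atomic_fetch_add(&sp->users, 1);
}

/*
 * Subclass level: the last user tears down user-only state and then
 * drops the single global reference that all users shared.
 */
int space_put_user(struct space *sp)
{
	if (atomic_fetch_sub(&sp->users, 1) != 1)
		return 0;
	sp->torn_down = 1;
	return space_drop(sp);
}
```

A holder of a global reference can keep the structure alive after the last user has left, which is the point of separating the two counts.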
Remember: if another thread can find your data structure, and you don't have a reference count on it, you almost certainly have a bug.
---------------------------------
Chapter 11: Macros, enumerations, inline functions and RTL
---------------------------------
Names of macros defining constants and labels in enums are capitalized.
	#define CONSTANT 0x12345
Enums are preferred when defining several related constants.
CAPITALIZED macro names are appreciated, but macros resembling functions may be named in lower case.
Generally, inline functions are preferable to macros resembling functions.
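One practical reason to prefer inline functions is that a function-like macro may evaluate its arguments more than once. The sketch below makes this visible with a call counter; MAX_MACRO, max_int(), and the counter are hypothetical names used only for the illustration.

```c
/* Hypothetical call counter, used to observe how often an argument runs. */
static int calls;

static int next_value(void)
{
	return ++calls;
}

/* Function-like macro: the winning argument is evaluated a second time. */
#define MAX_MACRO(a, b)	((a) > (b) ? (a) : (b))

/* Inline function: each argument is evaluated exactly once. */
static inline int max_int(int a, int b)
{
	return a > b ? a : b;
}

/* Returns how many times next_value() ran when passed to the macro. */
int macro_evaluations(void)
{
	calls = 0;
	(void)MAX_MACRO(next_value(), 0);
	return calls;
}

/* Returns how many times next_value() ran when passed to the function. */
int inline_evaluations(void)
{
	calls = 0;
	(void)max_int(next_value(), 0);
	return calls;
}
```

Here macro_evaluations() reports two calls and inline_evaluations() reports one, and the inline version also gets its argument types checked by the compiler.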
Macros with multiple statements should be enclosed in a do-while block:
	#define macrofun(a, b, c)			\
		do {					\
			if (a == 5)			\
				do_this(b, c);		\
		} while (0)
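The do-while wrapper matters when the macro is used as the body of an if statement that has an else branch. A minimal sketch, with hypothetical counters and a hypothetical macro count_both():

```c
/* Hypothetical counters, present only so the macro has something to do. */
static int nr_this;
static int nr_that;

#define count_both()			\
	do {				\
		nr_this++;		\
		nr_that++;		\
	} while (0)

/*
 * The macro expands to a single statement, so it can sit between
 * "if" and "else" with its trailing semicolon.
 */
int update(int flag)
{
	if (flag)
		count_both();
	else
		nr_this = 0;
	return nr_this + nr_that;
}
```

If the macro body were a bare { } block, the semicolon after count_both() would end the if statement and the else would no longer compile; with no braces at all, only the first increment would be conditional.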
Things to avoid when using macros:
1) macros that affect control flow:
	#define FOO(x)					\
		do {					\
			if (blah(x) < 0)		\
				return -EBUGGERED;	\
		} while (0)
The above example is a _very_ bad idea. It looks like a function call, but it can make the calling function return early. Never break the flow of the calling function from inside a macro.
2) macros that depend on having a local variable with a magic name:
	#define FOO(val) bar(index, val)
This might look like a good thing, but it is confusing as hell when one reads the code, and it is prone to breakage from seemingly innocent changes.
3) macros with arguments that are used as l-values:
	FOO(x) = y;
This will bite you as soon as somebody turns FOO into an inline function.
4) forgetting about precedence: macros defining constants using expressions must enclose the expression in parentheses. Beware of similar issues with macros using parameters.
	#define CONSTANT 0x4000
	#define CONSTEXP (CONSTANT | 3)
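To see what the parentheses buy, the sketch below compares the definitions above with deliberately broken variants. BAD_CONSTEXP, DOUBLE, BAD_DOUBLE, and the four small functions are hypothetical and exist only for the comparison.

```c
#define CONSTANT	0x4000
#define CONSTEXP	(CONSTANT | 3)	/* parenthesized: safe in any expression */
#define BAD_CONSTEXP	CONSTANT | 3	/* hypothetical broken variant */

#define DOUBLE(x)	((x) * 2)	/* parameter and whole body parenthesized */
#define BAD_DOUBLE(x)	x * 2		/* hypothetical broken variant */

/* (0x4000 | 3) * 2 */
int good_scaled(void)
{
	return CONSTEXP * 2;
}

/* Expands to 0x4000 | 3 * 2, and "*" binds tighter than "|". */
int bad_scaled(void)
{
	return BAD_CONSTEXP * 2;
}

/* ((1 + 2) * 2) */
int good_double(void)
{
	return DOUBLE(1 + 2);
}

/* Expands to 1 + 2 * 2. */
int bad_double(void)
{
	return BAD_DOUBLE(1 + 2);
}
```

The parenthesized forms yield 0x8006 and 6, while the broken variants silently yield 0x4006 and 5.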
The cpp manual deals with macros exhaustively. The gcc internals manual also covers RTL, which is used frequently with assembly language in the kernel.
---------------------------------
Chapter 12: Printing kernel messages
---------------------------------
Kernel developers like to be seen as literate. Do mind the spelling of kernel messages to make a good impression. Do not use crippled words like "dont"; use "do not" or "don't" instead.
Kernel messages do not have to be terminated with a period.
Printing numbers in parentheses, such as (%d), adds no value and should be avoided.