[Linuxforum posting]
If you find any mistakes or omissions, please correct them.
======================================
GNU C Extensions in the Linux Kernel
======================================
GNU CC (gcc) is a powerful cross-platform C compiler that provides many
extensions to the C language. These extensions give strong support for
optimization, object code layout, and stricter checking. In this article,
the C language with the GNU extensions is called GNU C.
The Linux kernel code uses a large number of GNU C extensions, so much so
that the only compiler able to compile the Linux kernel is GNU CC. In the
past there were even times when a specific version of GNU CC was needed to
compile the kernel. This article is a summary of the GNU C extensions used
by the Linux kernel. When you meet syntax or semantics in the kernel source
that you do not understand, you can find a first answer here. For more
detail, see gcc.info. The examples in this article are taken from
Linux 2.4.18.
Statement expression
============
GNU C treats a compound statement enclosed in parentheses as an expression,
called a statement expression. It may appear wherever an expression is
allowed, so inside an expression you can use loops, local variables, and
other constructs that are otherwise only allowed in compound statements.
For example:
+++ include/linux/kernel.h
#define min_t(type,x,y) \
        ({ type __x = (x); type __y = (y); __x < __y ? __x : __y; })
+++ net/ipv4/tcp_output.c
int full_space = min_t(int, tp->window_clamp, tcp_full_space(sk));
The last statement of the compound statement should be an expression; its
value becomes the value of the statement expression.
The example defines a safe minimum macro. In standard C it is usually
defined as:
#define min(x,y) ((x) < (y) ? (x) : (y))
This definition evaluates x or y twice, so it gives a wrong result when an
argument has side effects. The statement expression version evaluates each
argument only once, which avoids the error. Statement expressions are
mostly used in macro definitions.
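The double evaluation described above can be sketched as follows. The names
MIN_UNSAFE, MIN_SAFE and next_value are made up for this illustration and are
not kernel code; the sketch assumes a compiler that accepts GNU C statement
expressions, such as gcc or clang.

```c
/* Hypothetical names for illustration only; not kernel code. */
static int counter;

/* Side effect: every call bumps the counter. */
int next_value(void)
{
        return ++counter;
}

/* Plain C macro: whichever argument is selected gets evaluated twice. */
#define MIN_UNSAFE(x, y) ((x) < (y) ? (x) : (y))

/* GNU C statement expression: each argument is evaluated exactly once. */
#define MIN_SAFE(type, x, y) \
        ({ type _a = (x); type _b = (y); _a < _b ? _a : _b; })

/* Returns how many times next_value() ran inside MIN_UNSAFE. */
int calls_with_unsafe_macro(void)
{
        counter = 0;
        (void) MIN_UNSAFE(next_value(), 100);
        return counter;
}

/* Returns how many times next_value() ran inside MIN_SAFE. */
int calls_with_safe_macro(void)
{
        counter = 0;
        (void) MIN_SAFE(int, next_value(), 100);
        return counter;
}
```

With these definitions the unsafe macro calls next_value() twice, while the
statement expression version calls it once.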
typeof
======
To use the macro defined in the previous section, you need to know the type
of the arguments. With typeof you can define a more general macro that does
not need to know the argument type in advance, for example:
+++ include/linux/kernel.h
141: #define min(x,y) ({ \
142:         const typeof(x) _x = (x); \
143:         const typeof(y) _y = (y); \
144:         (void) (&_x == &_y); \
145:         _x < _y ? _x : _y; })
Here typeof(x) stands for the type of the value x. Line 142 defines a local
variable _x with the same type as x and initializes it to x. Note that the
purpose of line 144 is to check that the arguments x and y have the same
type: comparing pointers to two different types makes the compiler issue a
warning. typeof can be used anywhere a type name can be used, and is mostly
used in macro definitions.
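As a further illustration of typeof, here is a sketch of a type-generic swap
macro. SWAP and the two helper functions are hypothetical names, not kernel
code. The spelling __typeof__ is used because gcc also accepts it in strict
ISO modes such as -std=c99, where the plain typeof keyword is not available.

```c
/* Hypothetical example; SWAP is not a kernel macro. */
#define SWAP(a, b) \
        do { __typeof__(a) _tmp = (a); (a) = (b); (b) = _tmp; } while (0)

/* Swaps two ints and returns the new first value minus the second. */
int swap_ints_demo(void)
{
        int x = 3, y = 7;

        SWAP(x, y);             /* _tmp is declared as int */
        return x - y;           /* 7 - 3 */
}

/* The same macro works for double without any change. */
double swap_doubles_demo(void)
{
        double p = 1.5, q = 4.0;

        SWAP(p, q);             /* _tmp is declared as double */
        return p - q;           /* 4.0 - 1.5 */
}
```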
Zero-length arrays
============
GNU C allows zero-length arrays. This is useful when defining the header
structure of a variable-length object. For example:
+++ include/linux/minix_fs.h
struct minix_dir_entry {
        __u16 inode;
        char name[0];
};
The last member of the structure is defined as a zero-length array, which
takes up no space in the structure. In standard C the array would have to
be defined with length 1, which makes computing the object size at
allocation time more complicated.
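The allocation pattern this enables might look like the sketch below. The
structure name_record and the function name_record_new are hypothetical and
only serve to show the size computation; they are not taken from the kernel.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable-length record; not a kernel structure. */
struct name_record {
        unsigned short id;
        char name[0];           /* takes no space in the structure */
};

/* Allocates the header and the name bytes in a single block. */
struct name_record *name_record_new(unsigned short id, const char *name)
{
        size_t len = strlen(name) + 1;          /* include the '\0' */
        struct name_record *rec = malloc(sizeof(*rec) + len);

        if (rec == NULL)
                return NULL;
        rec->id = id;
        memcpy(rec->name, name, len);
        return rec;
}
```

Because name adds nothing to sizeof(struct name_record), the allocation size
is simply the header size plus the payload length. ISO C99 offers a similar
facility, the flexible array member, written char name[];.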
Macros with variable arguments
============
In GNU C, macros can take a variable number of arguments, just like
functions, for example:
+++ include/linux/kernel.h
#define pr_debug(fmt,arg...) \
        printk(KERN_DEBUG fmt,##arg)
Here arg stands for the remaining arguments, which may be zero or more.
These arguments, together with the commas between them, form the value of
arg, which replaces arg when the macro is expanded. For example:
pr_debug("%s:%d",filename,line)
expands to
printk("<7>" "%s:%d", filename, line)
The reason for using ## is to handle the case where arg matches no
argument at all. The value of arg is then empty, and in this special case
the GNU C preprocessor discards the comma in front of ##, so that
pr_debug("success!\n")
expands to
printk("<7>" "success!\n")
Note that there is no comma at the end.
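Both cases above can be tried outside the kernel with a small sketch.
log_debug is a hypothetical stand-in for pr_debug that writes into a buffer
with snprintf instead of calling printk, so that the result can be inspected;
the "<7>" prefix is written out literally in place of KERN_DEBUG.

```c
#include <stdio.h>

/* Hypothetical stand-in for pr_debug; formats into a buffer. */
static char log_buf[128];

#define log_debug(fmt, arg...) \
        snprintf(log_buf, sizeof(log_buf), "<7>" fmt, ##arg)

/* With extra arguments: arg becomes "name, line". */
const char *log_with_args(const char *name, int line)
{
        log_debug("%s:%d", name, line);
        return log_buf;
}

/* Without extra arguments: ## removes the comma before the empty arg. */
const char *log_without_args(void)
{
        log_debug("success!\n");
        return log_buf;
}
```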
Labeled elements
==========
Standard C requires the initial values of an array or structure variable to
appear in a fixed order. In GNU C, the initial values may appear in any
order if you specify the array index or the structure field name. To
specify an array index, write '[INDEX] =' before the initial value; to
specify a range, use the form '[FIRST ... LAST] ='.
For example:
+++ arch/i386/kernel/irq.c
static unsigned long irq_affinity [NR_IRQS] = { [0 ... NR_IRQS-1] = ~0UL };
This initializes every element of the array to ~0UL and can be seen as a
shorthand form.
To specify a structure member, write 'FIELDNAME:' before the member's
value. For example:
+++ fs/ext2/file.c
struct file_operations ext2_file_operations = {
        llseek:         generic_file_llseek,
        read:           generic_file_read,
        write:          generic_file_write,
        ioctl:          ext2_ioctl,
        mmap:           generic_file_mmap,
        open:           generic_file_open,
        release:        ext2_release_file,
        fsync:          ext2_sync_file,
};
This initializes the member llseek of the structure ext2_file_operations to
generic_file_llseek, the member read to generic_file_read, and so on. I
think this is one of the best features among the GNU C extensions: when the
definition of the structure changes and the offsets of its members move,
this style of initialization still guarantees that the named members are
initialized correctly. Members that do not appear in the initializer get
the initial value 0.
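A small sketch of both forms follows. The names NR_SLOTS, slot_mask and
demo_ops are hypothetical. Note one deliberate difference from the kernel
excerpt: the structure members are written in the ISO C99 spelling
'.FIELDNAME =', which gcc accepts as well, rather than the older GNU
'FIELDNAME:' form; the range form '[FIRST ... LAST] =' remains a GNU
extension.

```c
#define NR_SLOTS 8

/* Range initializer: every element from 0 to NR_SLOTS-1 is ~0UL. */
static unsigned long slot_mask[NR_SLOTS] = { [0 ... NR_SLOTS-1] = ~0UL };

/* Hypothetical operations table; not the kernel's file_operations. */
struct demo_ops {
        int (*open)(void);
        int (*read)(void);
        int (*close)(void);
};

static int demo_open(void)  { return 1; }
static int demo_close(void) { return 3; }

/* Members are named, so their order does not matter; read is left out
 * and is therefore initialized to 0 (a null pointer). */
static struct demo_ops ops = {
        .close  = demo_close,
        .open   = demo_open,
};

/* Returns 1 if every slot holds ~0UL, otherwise 0. */
int all_slots_set(void)
{
        int i;

        for (i = 0; i < NR_SLOTS; i++)
                if (slot_mask[i] != ~0UL)
                        return 0;
        return 1;
}
```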
Case range
==========
GNU C allows a case label to specify a range of consecutive values.
For example:
+++ arch/i386/kernel/irq.c
case '0' ... '9': c -= '0'; break;
case 'a' ... 'f': c -= 'a'-10; break;
case 'A' ... 'F': c -= 'A'-10; break;
Here
case '0' ... '9':
is equivalent to
case '0': case '1': case '2': case '3': case '4':
case '5': case '6': case '7': case '8': case '9':
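Put into a complete function, the same idea might look like this sketch.
hex_digit_value is a hypothetical helper written for this article, modeled
on the excerpt above but returning the value instead of modifying c.

```c
/* Converts one hexadecimal digit to its value, or returns -1. */
int hex_digit_value(int c)
{
        switch (c) {
        case '0' ... '9':       /* keep the spaces around "..." */
                return c - '0';
        case 'a' ... 'f':
                return c - 'a' + 10;
        case 'A' ... 'F':
                return c - 'A' + 10;
        default:
                return -1;
        }
}
```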
Declaring special attributes
====================
GNU C allows you to declare special attributes of functions, variables, and
types, for manual code optimization and more careful code checking. To
specify an attribute for a declaration, write
__attribute__ (( ATTRIBUTE ))
after the declaration. ATTRIBUTE is the attribute specification; multiple
attributes are separated by commas. GNU C supports more than a dozen
attributes. The most commonly used ones are introduced here:
* noreturn
The noreturn attribute is used for functions and indicates that the
function never returns. This lets the compiler generate slightly better
code and, most importantly, removes unnecessary warnings, such as warnings
about uninitialized variables. For example:
+++ include/linux/kernel.h
# define ATTRIB_NORET __attribute__((noreturn)) ....
asmlinkage NORET_TYPE void do_exit(long error_code)
        ATTRIB_NORET;
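The effect on warnings can be sketched with a user-space example. fatal and
checked_div are hypothetical names; fatal ends the program with exit, which
is an assumption of this sketch and not how the kernel's do_exit works.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: prints a message and terminates the program. */
void fatal(const char *msg) __attribute__((noreturn));

void fatal(const char *msg)
{
        fprintf(stderr, "fatal: %s\n", msg);
        exit(EXIT_FAILURE);
}

/* Because fatal() is declared noreturn, the compiler knows that
 * 'result' is always assigned before the return statement is reached. */
int checked_div(int a, int b)
{
        int result;

        if (b != 0)
                result = a / b;
        else
                fatal("division by zero");
        return result;
}
```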
* format (ARCHETYPE, STRING-INDEX, FIRST-TO-CHECK)
The format attribute is used for functions and indicates that the function
takes printf-, scanf-, or strftime-style arguments. The most common mistake
with this kind of function is a mismatch between the format string and the
arguments. With the format attribute, the compiler checks the argument
types against the format string. For example:
+++ include/linux/kernel.h
asmlinkage int printk(const char * fmt, ...)
        __attribute__ ((format (printf, 1, 2)));
This says that the first argument is the format string and that the
arguments are checked against it starting with the second argument.
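The same attribute can be applied to your own printf-style wrappers. In the
sketch below, format_into is a hypothetical function; since its format
string is the third parameter, the attribute is written format(printf, 3, 4).

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical printf-style helper. Argument 3 is the format string
 * and the arguments to check start at position 4. */
int format_into(char *buf, size_t size, const char *fmt, ...)
        __attribute__((format(printf, 3, 4)));

int format_into(char *buf, size_t size, const char *fmt, ...)
{
        va_list ap;
        int n;

        va_start(ap, fmt);
        n = vsnprintf(buf, size, fmt, ap);      /* returns the length */
        va_end(ap);
        return n;
}
```

With this declaration, a call such as format_into(buf, sizeof(buf), "%d",
"text") should draw a format warning from gcc when warnings such as -Wall
are enabled.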
* unused
The unused attribute is used for functions and variables and indicates that
the function or variable may not be used. It keeps the compiler from
issuing a warning.
* section ("section-name")
The section attribute is used for functions and variables. Normally the
compiler places functions in the .text section and variables in the .data
or .bss section. The section attribute makes the compiler place a function
or variable in the specified section. For example:
+++ include/linux/init.h
#define __init          __attribute__ ((__section__ (".text.init")))
#define __exit          __attribute__ ((unused, __section__(".text.exit")))
#define __initdata      __attribute__ ((__section__ (".data.init")))
#define __exitdata      __attribute__ ((unused, __section__ (".data.exit")))
#define __initsetup     __attribute__ ((unused,__section__ (".setup.init")))
#define __init_call     __attribute__ ((unused,__section__ (".initcall.init")))
#define __exit_call     __attribute__ ((unused,__section__ (".exitcall.exit")))
The linker can place code or data from the same section together. This
technique is widely used in the Linux kernel. For example, the system
initialization code is placed in a separate section so that its memory can
be freed once initialization is complete.
* aligned (ALIGNMENT)
The aligned attribute is used for variables, structures, or union types and
specifies the alignment, in bytes, of a variable, structure field,
structure, or union. For example:
+++ include/asm-i386/processor.h
struct i387_fxsave_struct {
        unsigned short  cwd;
        unsigned short  swd;
        unsigned short  twd;
        unsigned short  fop;
        long    fip;
        long    fcs;
        long    foo;
......
} __attribute__ ((aligned (16)));
This says that variables of this structure type are aligned on 16-byte
boundaries. Normally the compiler chooses a suitable alignment by itself;
an explicit alignment is usually specified because of system restrictions,
optimization, or similar reasons.
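The effect can be checked with a short sketch. struct demo_fxsave and
fx_area are hypothetical names, and the structure is much smaller than the
kernel's i387_fxsave_struct.

```c
#include <stdint.h>

/* Hypothetical structure forced onto a 16-byte boundary. */
struct demo_fxsave {
        unsigned short cwd;
        unsigned short swd;
        long fip;
} __attribute__((aligned(16)));

static struct demo_fxsave fx_area;

/* Returns the address of fx_area modulo 16; 0 means it is aligned. */
unsigned int fx_area_misalignment(void)
{
        return (unsigned int) ((uintptr_t) &fx_area % 16);
}
```

The attribute also rounds the size of the structure up to a multiple of 16,
so that every element of an array of such structures stays aligned.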
* packed
The packed attribute is used for variables and types. When used for a
variable or structure field, it specifies the smallest possible alignment.
When used for an enumeration, structure, or union type, it specifies that
the type should use as little memory as possible. For example:
+++ include/asm-i386/desc.h
struct Xgt_desc_struct {
        unsigned short size;
        unsigned long address __attribute__((packed));
};
The field address is placed immediately after size. The packed attribute is
mostly used to define hardware-related structures, so that alignment does
not leave holes between the members.
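The difference in layout can be made visible with offsetof. The two
structures below are hypothetical; they use the fixed-width types uint16_t
and uint32_t so that the numbers are the same on 32-bit and 64-bit systems.
The offsets and sizes stated in the comments assume a common ABI such as
x86, where uint32_t is normally aligned on 4 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Natural layout: 2 bytes of padding follow 'size' so that 'address'
 * starts at offset 4; the structure is 8 bytes long. */
struct desc_padded {
        uint16_t size;
        uint32_t address;
};

/* Packed field: 'address' starts right after 'size' at offset 2;
 * the structure is 6 bytes long. */
struct desc_packed {
        uint16_t size;
        uint32_t address __attribute__((packed));
};
```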
Current function name
============
GNU CC predefines two identifiers that hold the name of the current
function: __FUNCTION__ holds the name as it appears in the source code, and
__PRETTY_FUNCTION__ holds the name in a language-specific form. In C
functions the two names are the same. In C++ functions,
__PRETTY_FUNCTION__ includes extra information such as the return type of
the function. The Linux kernel only uses __FUNCTION__.
+++ fs/ext2/super.c
void ext2_update_dynamic_rev(struct super_block *sb)
{
        struct ext2_super_block *es = EXT2_SB(sb)->s_es;

        if (le32_to_cpu(es->s_rev_level) > EXT2_GOOD_OLD_REV)
                return;

        ext2_warning(sb, __FUNCTION__,
                     "updating to rev %d because of new feature flag, "
                     "running e2fsck is recommended",
                     EXT2_DYNAMIC_REV);
Here __FUNCTION__ is replaced by the string "ext2_update_dynamic_rev".
Although __FUNCTION__ looks similar to __FILE__ in standard C, it is
actually replaced by the compiler, not by the preprocessor as __FILE__ is.
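A user-space sketch of the same pattern is shown below. demo_warning and
check_revision are hypothetical names modeled on the ext2 excerpt; the
message is formatted into a buffer so that it can be inspected.

```c
#include <stdio.h>

static char warn_buf[128];

/* Hypothetical warning helper in the style of ext2_warning(). */
static void demo_warning(const char *function, const char *msg)
{
        snprintf(warn_buf, sizeof(warn_buf), "warning (%s): %s",
                 function, msg);
}

/* __FUNCTION__ evaluates to the name of the enclosing function. */
const char *check_revision(void)
{
        demo_warning(__FUNCTION__, "updating revision");
        return warn_buf;
}
```

ISO C99 later standardized the same facility under the name __func__.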
Built-in functions
==========
GNU C provides a large number of built-in functions. Many of them are
built-in versions of standard C library functions, such as memcpy, and
behave the same as the corresponding C library functions; this article does
not discuss those. The names of the other built-in functions usually start
with __builtin.
* __builtin_return_address (LEVEL)
The built-in function __builtin_return_address returns the return address
of the current function or of one of its callers. The argument LEVEL gives
the number of frames to scan up the stack: 0 means the return address of
the current function, 1 means the return address of the caller of the
current function, and so on. For example:
+++ kernel/sched.c
printk(KERN_ERR "schedule_timeout: wrong timeout "
       "value %lx from %p\n", timeout,
       __builtin_return_address(0));
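Outside the kernel, the built-in can be tried with a very small sketch.
where_called_from is a hypothetical helper; the noinline attribute is added
on the assumption that the function should stay out of line, so that it has
a real caller to return to.

```c
/* Returns the address this function will return to, that is, a code
 * address inside whichever function called it. */
__attribute__((noinline)) void *where_called_from(void)
{
        return __builtin_return_address(0);
}
```

Printing the result with the %p conversion, as the kernel excerpt does,
shows an address that a debugger or symbol table can map back to the
calling function.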
* __builtin_constant_p(EXP)
The built-in function __builtin_constant_p determines whether a value is a
compile-time constant. If the value of the argument EXP is a constant, the
function returns 1; otherwise it returns 0. For example:
+++ include/asm-i386/bitops.h
#define test_bit(nr,addr) \
(__builtin_constant_p(nr) ? \
 constant_test_bit((nr),(addr)) : \
 variable_test_bit((nr),(addr)))
Many computations or operations have a more optimized implementation when
an argument is a constant. In GNU C, the method above compiles only the
constant version or only the non-constant version, depending on whether the
argument is a constant. The code stays general and still compiles to
optimal code when the argument is a constant.
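The dispatch can be observed with the sketch below. which_version is a
hypothetical macro that merely names the branch a test_bit-style macro would
take. The run-time case uses a volatile variable, an assumption made here so
that the compiler cannot work out the value at compile time.

```c
/* Hypothetical macro: names the version that a test_bit-style
 * dispatch would pick for the given argument. */
#define which_version(x) \
        (__builtin_constant_p(x) ? "constant" : "variable")

/* A literal is always a compile-time constant. */
const char *version_for_literal(void)
{
        return which_version(8);
}

/* A volatile object must be read at run time, so it is not treated
 * as a compile-time constant. */
const char *version_for_runtime_value(void)
{
        volatile int nr = 8;

        return which_version(nr);
}
```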
* __builtin_expect(EXP, C)
The built-in function __builtin_expect gives the compiler branch prediction
information. Its return value is the value of the integer expression EXP,
and the value of C must be a compile-time constant. For example:
+++ include/linux/compiler.h
#define likely(x)       __builtin_expect((x),1)
#define unlikely(x)     __builtin_expect((x),0)
+++ kernel/sched.c
564:         if (unlikely(in_interrupt())) {
565:                 printk("Scheduling in interrupt\n");
566:                 BUG();
567:         }
The meaning of this built-in function is that the expected value of EXP is
C. The compiler can use this information to arrange the order of the
statement blocks so that the program runs more efficiently in the expected
case. The example above says that being in interrupt context rarely
happens here, so the object code for lines 565-566 may be placed further
away to keep the frequently executed object code more compact.
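A user-space sketch of the same idiom follows. sum_non_negative is a
hypothetical function. The macros differ from the kernel excerpt in one
detail: !! is added so that any non-zero value is normalized to 1, which is
an assumption of this sketch and not part of the 2.4.18 definitions.

```c
/* Like the kernel macros, with !! added to normalize the value. */
#define likely(x)       __builtin_expect(!!(x), 1)
#define unlikely(x)     __builtin_expect(!!(x), 0)

/* Sums an array; a negative element is treated as a rare error. */
long sum_non_negative(const int *v, int n)
{
        long sum = 0;
        int i;

        for (i = 0; i < n; i++) {
                if (unlikely(v[i] < 0))
                        return -1;      /* rare path */
                sum += v[i];            /* common path */
        }
        return sum;
}
```

The hint only influences how the compiler lays out the code; the result of
the function is the same with or without it.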