GCC features in the Linux kernel

Last Update:2018-12-03 Source: Internet

Author: User


GCC and Linux are a great pair. Although they are independent pieces of software, Linux depends entirely on GCC to run on new architectures. Linux also exploits features of GCC, called extensions, for greater functionality and optimization. This article discusses some of the most important extensions and explains how they are used in the Linux kernel.

The current stable version of GCC (version 4.3.2) supports three versions of the C standard:

The original International Organization for Standardization (ISO) standard of the C language (ISO C89 or C90)
ISO C90 with Amendment 1
The current ISO C99 (the standard assumed throughout this article; GCC 4.3.2 itself defaults to gnu89, that is, C90 plus GNU extensions, unless -std selects otherwise)

Note: This article assumes the ISO C99 standard. If you specify a standard older than ISO C99, some of the extensions described here may not be available. You can select the standard that GCC uses with the -std command-line option. The GCC manual lists which extensions are supported under each version of the standard.

Available versions

This article focuses on the use of GCC extensions in the 2.6.27.1 Linux kernel with GCC 4.3.2. Each C extension is accompanied by a reference to the file in the Linux kernel source where you can find the example.

You can classify the available C extensions in several ways. This article divides them into two broad categories:

Functionality extensions provide new capabilities.
Optimization extensions help generate more efficient code.

Functionality extensions

Let's start with some of the GCC features that extend the standard C language.

Type discovery

GCC lets you identify a type through a reference to a variable. This kind of operation supports what is commonly called generic programming. Similar functionality can be found in many modern programming languages, such as C++, Ada, and Java. Linux uses typeof to build type-dependent operations such as min and max. Listing 1 shows how typeof is used to build a generic macro (see ./linux/include/linux/kernel.h).

Listing 1. Using typeof to build a generic macro

#define min(x, y) ({                            \
        typeof(x) _min1 = (x);                  \
        typeof(y) _min2 = (y);                  \
        (void) (&_min1 == &_min2);              \
        _min1 < _min2 ? _min1 : _min2; })
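To see why the kernel's min copies its arguments into typeof temporaries instead of using a plain conditional expression, the following sketch compares the two approaches. The names demo_min, naive_min, and the helper functions are hypothetical and do not come from the kernel. __typeof__ is the alternate spelling of typeof that GCC also accepts in strict ISO modes.

```c
#include <assert.h>

/* Hypothetical re-creation of the pattern in Listing 1 (not kernel code). */
#define demo_min(x, y) ({                       \
        __typeof__(x) _min1 = (x);              \
        __typeof__(y) _min2 = (y);              \
        (void) (&_min1 == &_min2);              \
        _min1 < _min2 ? _min1 : _min2; })

/* Naive macro for comparison: the smaller argument is evaluated twice. */
#define naive_min(x, y) ((x) < (y) ? (x) : (y))

/* Returns how many times the argument expression i++ was evaluated. */
int evals_with_demo_min(void)
{
        int i = 0;

        (void) demo_min(i++, 10);
        return i;
}

int evals_with_naive_min(void)
{
        int i = 0;

        (void) naive_min(i++, 10);
        return i;
}

/* Small self-check showing the difference between the two macros. */
void demo_min_selfcheck(void)
{
        assert(demo_min(3, 7) == 3);
        assert(evals_with_demo_min() == 1);     /* argument evaluated once */
        assert(evals_with_naive_min() == 2);    /* argument evaluated twice */
}
```

The pointer comparison in the macro has no run-time effect. Its purpose is to make GCC warn when the two arguments have different types.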

Range extensions

GCC supports ranges, which can be used in many areas of the C language. One of those areas is the case statement within a switch/case block. Without ranges, you would need a chain of if statements in a complex conditional structure to get the same result that Listing 2 (see ./linux/drivers/scsi/sd.c) expresses more concisely. Using switch/case also lets the compiler optimize by implementing the construct as a jump table.

Listing 2. Using ranges within case statements

static int sd_major(int major_idx)
{
        switch (major_idx) {
        case 0:
                return SCSI_DISK0_MAJOR;
        case 1 ... 7:
                return SCSI_DISK1_MAJOR + major_idx - 1;
        case 8 ... 15:
                return SCSI_DISK8_MAJOR + major_idx - 8;
        default:
                BUG();
                return 0;       /* shut up gcc */
        }
}
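Case ranges are not limited to kernel code. The sketch below applies the same syntax to character classification in an ordinary program. The enum and function names are hypothetical, and the code assumes an ASCII-compatible character set in which letters are contiguous. Note the spaces around the ... token: the GCC manual recommends them because a range written without spaces can be parsed incorrectly when the bounds are integer values.

```c
#include <assert.h>

enum demo_class { DEMO_DIGIT, DEMO_LOWER, DEMO_UPPER, DEMO_OTHER };

/* Hypothetical helper: classifies a character using GNU case ranges. */
enum demo_class demo_classify(int c)
{
        switch (c) {
        case '0' ... '9':
                return DEMO_DIGIT;
        case 'a' ... 'z':
                return DEMO_LOWER;
        case 'A' ... 'Z':
                return DEMO_UPPER;
        default:
                return DEMO_OTHER;
        }
}

/* Small self-check covering both ends of a range. */
void demo_classify_selfcheck(void)
{
        assert(demo_classify('0') == DEMO_DIGIT);
        assert(demo_classify('9') == DEMO_DIGIT);
        assert(demo_classify('q') == DEMO_LOWER);
        assert(demo_classify('#') == DEMO_OTHER);
}
```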

You can also use ranges for initialization, as shown below (see ./linux/arch/cris/arch-v32/kernel/smp.c). In this example, an array of spinlock_t is created with LOCK_COUNT elements. Each element of the array is initialized with the SPIN_LOCK_UNLOCKED value.

/* Vector of locks used for various atomic operations */
spinlock_t cris_atomic_locks[] = { [0 ... LOCK_COUNT - 1] = SPIN_LOCK_UNLOCKED};

Ranges also support more complex initializations. For example, the following code specifies initial values for several sub-ranges of an array.

int widths[] = { [0 ... 9] = 1, [10 ... 99] = 2, [100] = 3 };
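As a rough illustration of how such a table might be consumed, the sketch below wraps the widths initializer in a lookup function that returns the number of decimal digits for values from 0 to 100. The names demo_widths, demo_width, and demo_width_count are hypothetical.

```c
#include <assert.h>

#define DEMO_MAX 100

/* Same initializer as above: decimal digit count for each value 0..100. */
static const int demo_widths[] = { [0 ... 9] = 1, [10 ... 99] = 2, [DEMO_MAX] = 3 };

/* Hypothetical lookup helper; returns 0 for values outside the table. */
int demo_width(int n)
{
        if (n < 0 || n > DEMO_MAX)
                return 0;
        return demo_widths[n];
}

/* The array length is set by the largest index named in the initializer. */
int demo_width_count(void)
{
        return (int) (sizeof(demo_widths) / sizeof(demo_widths[0]));
}

/* Small self-check of the sub-range boundaries. */
void demo_width_selfcheck(void)
{
        assert(demo_width(9) == 1);
        assert(demo_width(10) == 2);
        assert(demo_width(100) == 3);
        assert(demo_width_count() == DEMO_MAX + 1);
}
```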

Zero-length arrays

In standard C, an array must be defined with at least one element. This requirement tends to complicate code design. GCC, however, supports the concept of zero-length arrays, which are particularly useful in structure definitions. The concept is similar to the flexible array member in ISO C99 but uses a different syntax.

The following example declares an array with zero members at the end of a structure (see ./linux/drivers/ieee1394/raw1394-private.h). This allows the array member to refer to the memory that immediately follows the structure instance, which is useful when you need a variable number of array elements.

struct iso_block_store {
        atomic_t refcount;
        size_t data_size;
        quadlet_t data[0];
};
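The usual way to use such a structure is to allocate the header and the trailing data in a single block. The following user-space sketch shows the idea. The structure demo_block and the function demo_block_new are hypothetical and only loosely modeled on iso_block_store.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header followed by a variable amount of payload. */
struct demo_block {
        size_t data_size;
        unsigned char data[0];  /* GNU zero-length array, adds no storage */
};

/* Allocates the header plus len payload bytes in one block and copies src. */
struct demo_block *demo_block_new(const void *src, size_t len)
{
        struct demo_block *b = malloc(sizeof(*b) + len);

        if (!b)
                return NULL;
        b->data_size = len;
        memcpy(b->data, src, len);
        return b;
}

/* Small self-check: the payload lives directly behind the header. */
void demo_block_selfcheck(void)
{
        struct demo_block *b = demo_block_new("abc", 4);

        assert(b != NULL);
        assert(b->data_size == 4);
        assert(strcmp((const char *) b->data, "abc") == 0);
        assert((unsigned char *) b + sizeof(*b) == b->data);
        free(b);
}
```

Because the array contributes nothing to sizeof, one call to free releases both the header and the payload.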

Determining the call address

In many cases, it is useful or necessary to determine the caller of a given function. GCC provides the built-in function __builtin_return_address for this purpose. This function is commonly used for debugging, but it has many other uses within the kernel.

As the following code shows, __builtin_return_address takes an argument called level. The argument defines the level of the call stack for which you want the return address. For example, if you specify a level of 0, you are requesting the return address of the current function. If you specify a level of 1, you are requesting the return address of the function that called the current function, and so on.

void * __builtin_return_address( unsigned int level );

In the following example (see ./linux/kernel/softirq.c), the local_bh_disable function disables soft interrupts on the local processor, which prevents softirqs, tasklets, and bottom halves from running on the current processor. The return address is captured with __builtin_return_address so that it can be used later for tracing.

void local_bh_disable(void)
{
        __local_bh_disable((unsigned long)__builtin_return_address(0));
}
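The same idea can be tried outside the kernel. The sketch below records the address of the most recent call site of a trace function. All names are hypothetical, and the kernel's real tracing code keeps richer records. The example sticks to level 0: the GCC manual requires the level to be a constant integer and warns that nonzero levels can have unpredictable effects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical trace state. */
static void *demo_last_caller;
static unsigned int demo_call_count;

/*
 * noinline keeps a real call frame, so level 0 refers to the place
 * this function was called from.
 */
__attribute__((noinline)) void demo_trace_point(void)
{
        demo_last_caller = __builtin_return_address(0);
        demo_call_count++;
}

void *demo_get_last_caller(void)
{
        return demo_last_caller;
}

unsigned int demo_get_call_count(void)
{
        return demo_call_count;
}

/* Small self-check: one more call is counted and a caller is recorded. */
void demo_trace_selfcheck(void)
{
        unsigned int before = demo_get_call_count();

        demo_trace_point();
        assert(demo_get_call_count() == before + 1);
        assert(demo_get_last_caller() != NULL);
}
```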

Constant detection

GCC provides a built-in function that you can use to determine at compile time whether a value is a constant. This is valuable information because it lets you construct expressions that can be optimized through constant folding. The __builtin_constant_p function is used to test for constants.

The prototype for __builtin_constant_p is shown below. Note that __builtin_constant_p cannot detect all constants, because GCC cannot easily prove that some values are constant.

int __builtin_constant_p( exp )

Linux uses constant detection quite frequently. In the example shown in Listing 3 (see ./linux/include/linux/log2.h), constant detection is used to optimize the roundup_pow_of_two macro. If the expression can be verified as a constant, a constant expression that is open to optimization is used. If the expression is not a constant, another macro function is called to round the value up to a power of two.

Listing 3. Using constant detection to optimize macro functions

#define roundup_pow_of_two(n)                   \
(                                               \
        __builtin_constant_p(n) ? (             \
                (n == 1) ? 1 :                  \
                (1UL << (ilog2((n) - 1) + 1))   \
                                   ) :          \
        __roundup_pow_of_two(n)                 \
)
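Listing 3 depends on kernel helpers (ilog2 and __roundup_pow_of_two) that are not available in user space. The following sketch reproduces the structure with stand-ins: the constant branch uses the GCC built-in __builtin_clzl in place of ilog2, and a simple loop stands in for the run-time helper. The names demo_roundup_pow2, demo_roundup_runtime, and DEMO_LONG_BITS are hypothetical, and both paths assume an argument of at least 1 whose rounded value still fits in an unsigned long.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_LONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Run-time fallback: smallest power of two that is >= n.
 * Assumes 1 <= n <= ULONG_MAX / 2 + 1; larger values would never terminate.
 */
unsigned long demo_roundup_runtime(unsigned long n)
{
        unsigned long p = 1;

        while (p < n)
                p <<= 1;
        return p;
}

/*
 * Hypothetical macro in the spirit of Listing 3. For a compile-time
 * constant the first branch can be folded to a constant; otherwise the
 * run-time helper is called. Both branches compute the same value.
 */
#define demo_roundup_pow2(n)                                            \
(                                                                       \
        __builtin_constant_p(n) ? (                                     \
                ((n) <= 1) ? 1UL :                                      \
                (1UL << (DEMO_LONG_BITS - __builtin_clzl((n) - 1)))     \
        ) :                                                             \
        demo_roundup_runtime(n)                                         \
)

/* Small self-check with a constant and a non-constant argument. */
void demo_roundup_selfcheck(void)
{
        volatile unsigned long v = 9;   /* volatile: never a constant */

        assert(demo_roundup_pow2(5) == 8);
        assert(demo_roundup_pow2(v) == 16);
}
```

Because the two branches agree on the result, the choice between them affects only how much work is left for run time.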

Function attributes

GCC provides a variety of function-level attributes that you can use to give the compiler more information to assist its optimization work. This section describes some of the attributes associated with functionality. The next section describes attributes that affect optimization.

As Listing 4 shows, the kernel hides the attributes behind other symbolic definitions. Keep these aliases in mind when reading source files that use the attributes (see ./linux/include/linux/compiler-gcc3.h).

Listing 4. Function attribute definitions

# define __inline__     __inline__      __attribute__((always_inline))
# define __deprecated           __attribute__((deprecated))
# define __attribute_used__     __attribute__((__used__))
# define __attribute_const__     __attribute__((__const__))
# define __must_check            __attribute__((warn_unused_result))

The definitions in Listing 4 cover some of the function attributes available in GCC. They are also some of the most useful function attributes in the Linux kernel. The following list explains how these attributes are best used:

always_inline tells GCC to inline the specified function regardless of whether optimization is enabled.
deprecated indicates that a function has been deprecated and should no longer be used. If you try to use a deprecated function, you receive a warning. You can also apply this attribute to types and variables to discourage their use.
__used__ tells the compiler that the function is used even if GCC finds no call to it. This is useful when C functions are called from assembly code.
__const__ tells the compiler that a function is stateless (that is, it uses only the arguments passed to it to produce the result it returns).
warn_unused_result makes the compiler check that every caller uses the result of the function. This ensures that callers examine the result and can handle errors appropriately.

The following are examples of these attributes in use in the Linux kernel. The deprecated example comes from the architecture-independent part of the kernel (./linux/kernel/resource.c), and the const example comes from the IA64 kernel source (./linux/arch/ia64/kernel/unwind.c).

int __deprecated __check_region(struct resource *parent, unsigned long start, unsigned long n)

static enum unw_register_index __attribute_const__ decode_abreg(unsigned char abreg, int memory)
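The kernel declarations above cannot be compiled on their own, so here is a small user-space sketch that applies the same attributes. The demo_ aliases and functions are hypothetical and mirror only the style of Listing 4. Compiling a call to demo_old_square, or a call to demo_parse_digit whose result is ignored, should produce a warning from GCC.

```c
#include <assert.h>

/* Hypothetical aliases in the style of Listing 4. */
#define demo_always_inline  inline __attribute__((always_inline))
#define demo_deprecated     __attribute__((deprecated))
#define demo_attr_const     __attribute__((__const__))
#define demo_must_check     __attribute__((warn_unused_result))

/* The result depends only on the argument, so repeated calls may be merged. */
demo_attr_const int demo_square(int x)
{
        return x * x;
}

/* Inlined even when optimization is turned off. */
static demo_always_inline int demo_twice(int x)
{
        return 2 * x;
}

/* Callers that ignore the return value get a warning. */
demo_must_check int demo_parse_digit(char c, int *out)
{
        if (c < '0' || c > '9')
                return -1;
        *out = c - '0';
        return 0;
}

/* Any call to this function produces a deprecation warning. */
demo_deprecated int demo_old_square(int x)
{
        return demo_square(x);
}

/* Small self-check that uses every result, as warn_unused_result expects. */
void demo_attr_selfcheck(void)
{
        int d = 0;
        int rc = demo_parse_digit('7', &d);

        assert(rc == 0 && d == 7);
        assert(demo_twice(demo_square(3)) == 18);
}
```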

Optimization extensions

Now let's look at some of the GCC features that help generate better machine code.

Branch prediction hints

One of the most widely used optimization techniques in the Linux kernel is __builtin_expect. When working with conditional code, developers often know which branch is the most likely and which is rarely taken. If the compiler knows this prediction information, it can generate the most efficient code around the branch that is most likely to execute.

As shown below, __builtin_expect is used through two macros called likely and unlikely (see ./linux/include/linux/compiler.h).

#define likely(x)       __builtin_expect(!!(x), 1)
#define unlikely(x)     __builtin_expect(!!(x), 0)

With __builtin_expect, the compiler can make instruction selection decisions that honor the provided prediction information. This keeps the code that is most likely to execute close to the condition. It can also improve caching and instruction pipelining.

For example, if a condition is marked "likely", the compiler can place the true portion of the code immediately after the branch (so that the branch is not taken). Reaching the false portion of the conditional through a branch instruction is not optimal, but that path is also unlikely to be taken. In this way, the code is optimal for the most likely case.

Listing 5 shows a function that uses both the likely and unlikely macros (see ./linux/net/core/datagram.c). The function predicts that the sum variable will be zero (the packet's checksum is valid) and that the ip_summed variable is not equal to CHECKSUM_HW.

Listing 5. Examples of likely and unlikely macros

unsigned int __skb_checksum_complete(struct sk_buff *skb)
{
        unsigned int sum;

        sum = (u16)csum_fold(skb_checksum(skb, 0, skb->len, skb->csum));
        if (likely(!sum)) {
                if (unlikely(skb->ip_summed == CHECKSUM_HW))
                        netdev_rx_csum_fault(skb->dev);
                skb->ip_summed = CHECKSUM_UNNECESSARY;
        }
        return sum;
}
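The same macros work unchanged in user-space code. The following sketch marks an error check as unlikely and a common case as likely. The function demo_sum_positive and the assumption that most entries are positive are illustrative only. The hints can change the layout of the generated code but never the result.

```c
#include <assert.h>
#include <stddef.h>

#define likely(x)       __builtin_expect(!!(x), 1)
#define unlikely(x)     __builtin_expect(!!(x), 0)

/* Hypothetical helper: sums the positive entries, or returns -1 on bad input. */
long demo_sum_positive(const int *v, size_t n)
{
        long sum = 0;
        size_t i;

        if (unlikely(v == NULL))        /* error path, expected to be rare */
                return -1;

        for (i = 0; i < n; i++) {
                if (likely(v[i] > 0))   /* assumed: most entries are positive */
                        sum += v[i];
        }
        return sum;
}

/* Small self-check: the hints do not alter the computed values. */
void demo_sum_selfcheck(void)
{
        int v[] = { 1, -2, 3 };

        assert(demo_sum_positive(v, 3) == 4);
        assert(demo_sum_positive(NULL, 0) == -1);
}
```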

Prefetching

Another important way to improve performance is to cache the data you need close to the processor. Caching can significantly reduce the time it takes to access data. Most modern processors have three classes of memory:

Level 1 cache, which commonly supports single-cycle access
Level 2 cache, which supports two-cycle access
System memory, which has longer access times

To minimize access latency and improve performance, it is best to have your data in the closest memory. Doing this by hand is called prefetching. GCC supports manual prefetching of data through a built-in function called __builtin_prefetch. You use this function to pull data into the cache shortly before it is needed. As shown below, __builtin_prefetch takes three arguments, of which the last two are optional:

The address of the data
The rw parameter, which indicates whether the data is being prefetched for a read (0, the default) or for a write (1)
The locality parameter, which indicates whether the data should be kept in the cache or purged after use (a value from 0 for no temporal locality to 3, the default, for high temporal locality)

void __builtin_prefetch( const void *addr, int rw, int locality );

Prefetching is used extensively in the Linux kernel, most often through macros and wrapper functions. Listing 6 is an example of a helper function that wraps the built-in function (see ./linux/include/linux/prefetch.h). The function implements a prefetching mechanism for streamed operations. Using it can usually reduce cache misses and stalls and so improve performance.

Listing 6. Range prefetch wrapper Functions

#ifndef ARCH_HAS_PREFETCH
#define prefetch(x) __builtin_prefetch(x)
#endif

static inline void prefetch_range(void *addr, size_t len)
{
#ifdef ARCH_HAS_PREFETCH
        char *cp;
        char *end = addr + len;

        for (cp = addr; cp < end; cp += PREFETCH_STRIDE)
                prefetch(cp);
#endif
}
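A common user-space pattern is to prefetch a fixed distance ahead while streaming through an array. The sketch below shows that pattern with all three arguments spelled out. The function name and the prefetch distance DEMO_AHEAD are hypothetical, and a suitable distance depends on the processor and the workload. Whether this helps at all should be measured, since hardware prefetchers often handle simple sequential access on their own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance, in elements; tune by measurement. */
#define DEMO_AHEAD 16

/* Sums an array while requesting elements a fixed distance ahead. */
long demo_sum_prefetch(const long *v, size_t n)
{
        long sum = 0;
        size_t i;

        for (i = 0; i < n; i++) {
                /* rw = 0: prefetch for reading; locality = 1: low reuse. */
                if (i + DEMO_AHEAD < n)
                        __builtin_prefetch(&v[i + DEMO_AHEAD], 0, 1);
                sum += v[i];
        }
        return sum;
}

/* Small self-check: prefetching is only a hint and leaves the result alone. */
void demo_prefetch_selfcheck(void)
{
        long v[100];
        size_t i;

        for (i = 0; i < 100; i++)
                v[i] = (long) i;
        assert(demo_sum_prefetch(v, 100) == 4950);
}
```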

Variable attributes

In addition to the function attributes discussed earlier in this article, GCC provides attributes for variables and type definitions. One of the most important is the aligned attribute, which is used to align objects in memory. Besides being important for performance, object alignment can be required by certain devices or hardware configurations. The aligned attribute takes a single argument that specifies the desired alignment in bytes.

The following example is used for software suspend (see ./linux/arch/i386/mm/init.c). It defines an object of PAGE_SIZE bytes that must be page aligned.

char __nosavedata swsusp_pg_dir[PAGE_SIZE]
        __attribute__ ((aligned (PAGE_SIZE)));

The example in Listing 7 illustrates two points about optimization:

The packed attribute packs the elements of a structure so that it occupies as little space as possible. This means that a char member occupies no more than one byte (8 bits), and bit fields are compressed down to single bits rather than consuming more storage.
The source applies both optimizations in a single __attribute__ specification, which lists multiple attributes separated by commas.

Listing 7. Structure packing and setting multiple attributes

static struct swsusp_header {
        char reserved[PAGE_SIZE - 20 - sizeof(swp_entry_t)];
        swp_entry_t image;
        char    orig_sig[10];
        char    sig[10];
} __attribute__((packed, aligned(PAGE_SIZE))) swsusp_header;
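The effect of these attributes is easy to observe with sizeof and offsetof. The following sketch uses hypothetical structures, not kernel types, to compare the default layout with the packed layout and with the combined packed and aligned form. The comparison with the unpacked structure assumes a platform on which uint32_t is aligned to more than one byte, which is the common case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default layout: padding after tag keeps value naturally aligned. */
struct demo_plain {
        char tag;
        uint32_t value;
};

/* packed removes the padding: 1 + 4 = 5 bytes. */
struct demo_packed {
        char tag;
        uint32_t value;
} __attribute__((packed));

/* Both attributes in one comma-separated list, as in Listing 7. */
struct demo_packed_aligned {
        char tag;
        uint32_t value;
} __attribute__((packed, aligned(16)));

/* A buffer whose start address is a multiple of 64 bytes. */
char demo_buf[128] __attribute__((aligned(64)));

/* Small self-check of the resulting sizes, offsets, and alignment. */
void demo_layout_selfcheck(void)
{
        assert(sizeof(struct demo_packed) == 5);
        assert(offsetof(struct demo_packed, value) == 1);
        assert(sizeof(struct demo_plain) > sizeof(struct demo_packed));
        assert(sizeof(struct demo_packed_aligned) == 16);
        assert((uintptr_t) demo_buf % 64 == 0);
}
```

Members of a packed structure can end up misaligned, which on some architectures makes access slower, so packing is usually reserved for layouts dictated by hardware or file formats.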

Conclusion

This article has covered only a few of the GCC features that are used in the Linux kernel. You can learn more about all of the available extensions for both C and C++ in the GNU GCC manual. In addition, although the Linux kernel makes great use of these extensions, you can use them in your own applications as well. As GCC continues to evolve, new extensions are sure to appear that will further improve the performance and extend the functionality of the Linux kernel.

