Deep Linux Kernel Architecture, Appendix A: Architecture-Related Knowledge

Last Update:2014-08-18 Source: Internet

Author: User

A.1 Overview
　　To make porting to new architectures easier, the kernel strictly separates architecture-dependent code from architecture-independent code. Processor-specific header files containing definitions and prototypes are stored in the include/asm-arch/ directory (for example, include/asm-arm/), while the C and assembler source implementations are kept in the arch/arch/ directory (for example, arch/arm/).

　　The build system also takes into account that generic code may need to rely on architecture-specific mechanisms. All processor-specific header files are located in include/asm-arch/. Once the kernel has been configured for a specific architecture, the symbolic link include/asm/ points to the directory that corresponds to that hardware. The kernel then reaches the architecture-specific header files through #include <asm/file.h>.

A.2 Data Types
　　The kernel distinguishes the following three basic kinds of data types.
　> Standard data types of the C language;
　> Data types with a fixed number of bits, such as u32 and s16;
　> Subsystem-specific types, such as pid_t and sector_t.

A.3 Alignment

　　Aligning data to specific memory addresses is necessary for efficient use of the processor caches and for good performance. In general, alignment means that the byte address of a datum is divisible by a given byte length. Aligning a data type to its own byte length is called natural alignment.

　　In some parts of the kernel it may be necessary to access non-aligned data. Every architecture must define two macros for this purpose.
　> get_unaligned(ptr): dereferences a pointer to a non-aligned memory location.
　> put_unaligned(val, ptr): writes the value val to the non-aligned memory location given by ptr.

　　When GCC lays out structs and unions in memory, it selects a suitable alignment automatically, so the programmer does not have to do this by hand.
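
The ideas in this section can be tried out in user space. The following is a minimal sketch, not the kernel's implementation: is_naturally_aligned, load_unaligned_u32 and store_unaligned_u32 are hypothetical names standing in for the natural-alignment rule and for get_unaligned/put_unaligned, and the portable memcpy approach is an assumption about how such an access could be written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: is addr a multiple of the type's byte length? */
static int is_naturally_aligned(uintptr_t addr, size_t type_size)
{
    return addr % type_size == 0;
}

/* User-space stand-in for get_unaligned() on a 32-bit value. */
static uint32_t load_unaligned_u32(const void *ptr)
{
    uint32_t val;
    memcpy(&val, ptr, sizeof(val));  /* byte-wise copy, no alignment needed */
    return val;
}

/* User-space stand-in for put_unaligned() on a 32-bit value. */
static void store_unaligned_u32(uint32_t val, void *ptr)
{
    memcpy(ptr, &val, sizeof(val));
}

/* The compiler inserts padding after 'tag' so that 'value' is aligned. */
struct padded {
    char     tag;
    uint32_t value;
};

static_assert(offsetof(struct padded, value) % _Alignof(uint32_t) == 0,
              "the compiler aligns struct members automatically");
```

Storing a value at buf + 1 of a byte buffer and loading it back from the same address returns the original value, even though that address is not a multiple of 4.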

A.4 Memory Pages
　> PAGE_SHIFT specifies the base-2 logarithm of the page length.
　> PAGE_SIZE specifies the length of a memory page in bytes.
　> PAGE_ALIGN(addr) aligns an arbitrary address to a page boundary.

　　Two standard operations on pages must also be implemented, usually with optimized assembly instructions.
　> clear_page(start) clears the page beginning at start by filling it with zero bytes.
　> copy_page(to, from) copies the page data at from to the location to.

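As a rough illustration of how these macros relate to each other, here is a user-space sketch. The value 12 for PAGE_SHIFT (4 KiB pages) is an assumption, PAGE_MASK is an auxiliary name used by this sketch, and the memset/memcpy bodies merely stand in for the assembly-optimized routines.

```c
#include <assert.h>
#include <string.h>

/* Assumed value: 4 KiB pages; the real value depends on the architecture. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Round addr up to the next page boundary (unchanged if already aligned). */
#define PAGE_ALIGN(addr) (((addr) + PAGE_SIZE - 1) & PAGE_MASK)

static_assert(PAGE_SIZE == 4096, "PAGE_SIZE is 2 to the power PAGE_SHIFT");

/* Plain C stand-in for the optimized clear_page(). */
static void clear_page(void *start)
{
    memset(start, 0, PAGE_SIZE);
}

/* Plain C stand-in for the optimized copy_page(). */
static void copy_page(void *to, const void *from)
{
    memcpy(to, from, PAGE_SIZE);
}
```

With these definitions, PAGE_ALIGN(0x1001UL) evaluates to 0x2000UL, while an address that is already page-aligned is left unchanged.
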
　　The PAGE_OFFSET macro specifies the position in the virtual address space at which the physical page frames are mapped. On most architectures this implicitly determines the size of the user address space, but that does not hold for every architecture. The TASK_SIZE constant must therefore be used to determine the size of user space.

/*
 * PAGE_OFFSET - the virtual address of the start of the kernel image
 * TASK_SIZE - the maximum size of a user space task.
 * TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area
 */
#define PAGE_OFFSET         UL(CONFIG_PAGE_OFFSET)
#define TASK_SIZE           (UL(CONFIG_PAGE_OFFSET) - UL(0x01000000))
#define TASK_UNMAPPED_BASE  (UL(CONFIG_PAGE_OFFSET) / 3)
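
To make the arithmetic concrete, the sketch below evaluates the three macros for an assumed CONFIG_PAGE_OFFSET of 0xC0000000 (a 3 GiB / 1 GiB split between user and kernel space). Both that value and the 32-bit UL() stand-in are assumptions made for this example, not taken from a particular kernel configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed configuration value: kernel mapped at 3 GiB. */
#define CONFIG_PAGE_OFFSET 0xC0000000
/* 32-bit stand-in for the kernel's UL() helper (an assumption). */
#define UL(x) ((uint32_t)(x))

#define PAGE_OFFSET         UL(CONFIG_PAGE_OFFSET)
#define TASK_SIZE           (UL(CONFIG_PAGE_OFFSET) - UL(0x01000000))
#define TASK_UNMAPPED_BASE  (UL(CONFIG_PAGE_OFFSET) / 3)

/* User space ends 16 MiB below the start of the kernel mapping. */
static_assert(TASK_SIZE == 0xBF000000u, "TASK_SIZE");
/* The mmap area starts at one third of PAGE_OFFSET, here 1 GiB. */
static_assert(TASK_UNMAPPED_BASE == 0x40000000u, "TASK_UNMAPPED_BASE");
```

Under this assumption user space ends at 0xBF000000, which is 16 MiB below PAGE_OFFSET, so TASK_SIZE and PAGE_OFFSET are not the same value.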

A.5 System Calls
　　The mechanism for making a system call is, in effect, a controlled switch from user space to kernel space, and it differs on every supported platform. The standard file unistd.h (arch/arm/include/asm/unistd.h on ARM) takes care of the tasks associated with system calls.

A.6 String Processing
　　Strings are processed everywhere in the kernel, so string processing is time-critical. Because many architectures provide specialized assembly instructions for these tasks, and because hand-optimized assembly code may be faster than compiler-generated code, every architecture defines its own string operations in <arch/arch/include/asm/string.h>. For example, on the ARM architecture:

#ifndef __ASM_ARM_STRING_H
#define __ASM_ARM_STRING_H

/*
 * We don't do inline string functions, since the
 * optimised inline asm versions are not small.
 */

#define __HAVE_ARCH_STRRCHR
extern char * strrchr(const char * s, int c);

#define __HAVE_ARCH_STRCHR
extern char * strchr(const char * s, int c);

#define __HAVE_ARCH_MEMCPY
extern void * memcpy(void *, const void *, __kernel_size_t);

#define __HAVE_ARCH_MEMMOVE
extern void * memmove(void *, const void *, __kernel_size_t);

#define __HAVE_ARCH_MEMCHR
extern void * memchr(const void *, int, __kernel_size_t);

#define __HAVE_ARCH_MEMSET
extern void * memset(void *, int, __kernel_size_t);

extern void __memzero(void *ptr, __kernel_size_t n);

#define memset(p,v,n)                                           \
    ({                                                          \
        void *__p = (p); size_t __n = n;                        \
        if ((__n) != 0) {                                       \
            if (__builtin_constant_p((v)) && (v) == 0)          \
                __memzero((__p),(__n));                         \
            else                                                \
                memset((__p),(v),(__n));                        \
        }                                                       \
        (__p);                                                  \
    })

#endif

　　All of these operations replace the functions of the same name from the user-space C standard library and perform the same tasks in the kernel. For every string operation that an architecture defines itself in optimized form, the corresponding __HAVE_ARCH_OPERATION macro must be defined. The ARM functions above are implemented in assembly in arch/arm/lib.
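
The way such a macro selects between an architecture-specific and a generic implementation can be sketched as follows. The names my_strchr, my_strrchr, HAVE_ARCH_MY_STRCHR and HAVE_ARCH_MY_STRRCHR are hypothetical and exist only in this example; the "optimized" routine is ordinary C standing in for assembly.

```c
#include <assert.h>
#include <stddef.h>

/* --- hypothetical "architecture header" --- */
/* The architecture announces that it supplies its own my_strchr(). */
#define HAVE_ARCH_MY_STRCHR
static char *my_strchr(const char *s, int c)
{
    /* Stand-in for a hand-optimized routine. */
    for (;; s++) {
        if (*s == (char)c)
            return (char *)s;
        if (*s == '\0')
            return NULL;
    }
}

/* --- generic code --- */
#ifndef HAVE_ARCH_MY_STRCHR
/* Skipped here, because the architecture provided its own version. */
static char *my_strchr(const char *s, int c)
{
    for (; *s != (char)c; s++)
        if (*s == '\0')
            return NULL;
    return (char *)s;
}
#endif

#ifndef HAVE_ARCH_MY_STRRCHR
/* No architecture version was announced, so the generic one is built. */
static char *my_strrchr(const char *s, int c)
{
    const char *last = NULL;
    do {
        if (*s == (char)c)
            last = s;
    } while (*s++);
    return (char *)last;
}
#endif

static void string_demo(void)
{
    const char *word = "kernel";
    assert(my_strchr(word, 'r') == word + 2);   /* first 'r' */
    assert(my_strrchr(word, 'e') == word + 4);  /* last 'e' */
}
```

Only one definition of each function is ever compiled, so callers use the same name regardless of which variant was chosen.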

A.7 Thread Representation

　　The running state of a thread is defined primarily by the contents of the processor registers. For processes that are not currently running, this data must be saved in suitable data structures so that it can be read back and restored to the appropriate registers when the scheduler activates the process again. The structures used for this purpose are defined in the following files.
　> ptrace.h defines the pt_regs structure, which holds all registers. An instance of pt_regs containing the register values is placed on the kernel stack when a process switches from user mode to kernel mode.
　> processor.h contains the thread_struct structure, which describes all other registers and all other process state information.
　> thread_info.h defines the thread_info structure, which contains all task_struct members that assembly code must access when entering and leaving kernel mode.

　　The definition of pt_regs on the ARM architecture:

/*
 * This struct defines the way the registers are stored on the
 * stack during a system call.  Note that sizeof(struct pt_regs)
 * has to be a multiple of 8.
 */
struct pt_regs {
    long uregs[18];
};

#define ARM_cpsr     uregs[16]
#define ARM_pc       uregs[15]
#define ARM_lr       uregs[14]
#define ARM_sp       uregs[13]
#define ARM_ip       uregs[12]
#define ARM_fp       uregs[11]
#define ARM_r10      uregs[10]
#define ARM_r9       uregs[9]
#define ARM_r8       uregs[8]
#define ARM_r7       uregs[7]
#define ARM_r6       uregs[6]
#define ARM_r5       uregs[5]
#define ARM_r4       uregs[4]
#define ARM_r3       uregs[3]
#define ARM_r2       uregs[2]
#define ARM_r1       uregs[1]
#define ARM_r0       uregs[0]
#define ARM_ORIG_r0  uregs[17]
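
The macros let code refer to array slots as if they were named struct members. A small user-space sketch of that trick, using only a subset of the macros; saved_pc is a hypothetical helper added for illustration.

```c
#include <assert.h>

struct pt_regs {
    long uregs[18];
};

/* Subset of the register macros, enough for the example. */
#define ARM_pc  uregs[15]
#define ARM_sp  uregs[13]
#define ARM_r0  uregs[0]

/* 18 longs give a size that is a multiple of 8 for 4- and 8-byte long. */
static_assert(sizeof(struct pt_regs) % 8 == 0, "size is a multiple of 8");

/* Hypothetical helper: read the saved program counter. */
static long saved_pc(const struct pt_regs *regs)
{
    return regs->ARM_pc;  /* expands to regs->uregs[15] */
}
```

Writing regs.ARM_r0 = 42 therefore stores into regs.uregs[0]; the preprocessor performs the substitution before the compiler sees a member name.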

　　The definition of thread_struct on the ARM architecture (for debugging purposes, machine instructions can be saved in opcode form together with their memory addresses):

union debug_insn {
    u32 arm;
    u16 thumb;
};

struct debug_entry {
    u32              address;
    union debug_insn insn;
};

struct debug_info {
    int                nsaved;
    struct debug_entry bp[2];
};

struct thread_struct {
    /* fault info */
    unsigned long     address;
    unsigned long     trap_no;
    unsigned long     error_code;
    /* debugging */
    struct debug_info debug;
};

A.8 Bit Operations and Byte Order
　　The architecture-specific bit operations of the kernel are defined in arch/arch/include/asm/bitops.h. The following are the bit operation definitions for the ARM architecture:

#ifndef __ARMEB__
/*
 * These are the little endian, atomic definitions.
 */
#define set_bit(nr,p)                 ATOMIC_BITOP_LE(set_bit,nr,p)
#define clear_bit(nr,p)               ATOMIC_BITOP_LE(clear_bit,nr,p)
#define change_bit(nr,p)              ATOMIC_BITOP_LE(change_bit,nr,p)
#define test_and_set_bit(nr,p)        ATOMIC_BITOP_LE(test_and_set_bit,nr,p)
#define test_and_clear_bit(nr,p)      ATOMIC_BITOP_LE(test_and_clear_bit,nr,p)
#define test_and_change_bit(nr,p)     ATOMIC_BITOP_LE(test_and_change_bit,nr,p)
#define find_first_zero_bit(p,sz)     _find_first_zero_bit_le(p,sz)
#define find_next_zero_bit(p,sz,off)  _find_next_zero_bit_le(p,sz,off)
#define find_first_bit(p,sz)          _find_first_bit_le(p,sz)
#define find_next_bit(p,sz,off)       _find_next_bit_le(p,sz,off)

#define WORD_BITOFF_TO_LE(x)          ((x))

#else
/*
 * These are the big endian, atomic definitions.
 */
#define set_bit(nr,p)                 ATOMIC_BITOP_BE(set_bit,nr,p)
#define clear_bit(nr,p)               ATOMIC_BITOP_BE(clear_bit,nr,p)
#define change_bit(nr,p)              ATOMIC_BITOP_BE(change_bit,nr,p)
#define test_and_set_bit(nr,p)        ATOMIC_BITOP_BE(test_and_set_bit,nr,p)
#define test_and_clear_bit(nr,p)      ATOMIC_BITOP_BE(test_and_clear_bit,nr,p)
#define test_and_change_bit(nr,p)     ATOMIC_BITOP_BE(test_and_change_bit,nr,p)
#define find_first_zero_bit(p,sz)     _find_first_zero_bit_be(p,sz)
#define find_next_zero_bit(p,sz,off)  _find_next_zero_bit_be(p,sz,off)
#define find_first_bit(p,sz)          _find_first_bit_be(p,sz)
#define find_next_bit(p,sz,off)       _find_next_bit_be(p,sz,off)

#define WORD_BITOFF_TO_LE(x)          ((x) ^ 0x18)

#endif
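
What these operations do can be shown with a plain C sketch. The versions below are deliberately non-atomic and single-threaded, and all sketch_ names are hypothetical; the kernel's versions are atomic, which this example does not attempt to reproduce.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Non-atomic sketch: set bit nr in the bitmap starting at p. */
static void sketch_set_bit(unsigned int nr, unsigned long *p)
{
    p[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

/* Non-atomic sketch: clear bit nr. */
static void sketch_clear_bit(unsigned int nr, unsigned long *p)
{
    p[nr / BITS_PER_LONG] &= ~(1UL << (nr % BITS_PER_LONG));
}

/* Return 1 if bit nr is set, 0 otherwise. */
static int sketch_test_bit(unsigned int nr, const unsigned long *p)
{
    return (int)((p[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL);
}

/* Set bit nr and report its previous value. */
static int sketch_test_and_set_bit(unsigned int nr, unsigned long *p)
{
    int old = sketch_test_bit(nr, p);
    sketch_set_bit(nr, p);
    return old;
}

/* Linear scan for the first zero bit; returns size if all bits are set. */
static unsigned int sketch_find_first_zero_bit(const unsigned long *p,
                                               unsigned int size)
{
    unsigned int nr;
    for (nr = 0; nr < size; nr++)
        if (!sketch_test_bit(nr, p))
            return nr;
    return size;
}

static void bitops_demo(void)
{
    unsigned long map[2] = { 0, 0 };
    assert(sketch_test_and_set_bit(3, map) == 0);  /* bit was clear */
    assert(sketch_test_and_set_bit(3, map) == 1);  /* bit is now set */
    assert(sketch_find_first_zero_bit(map, 8) == 0);
}
```

The test_and_ variants matter because they return the old value and modify the bit in one step; in the kernel that step is indivisible, which is what makes them usable as simple locks or allocation flags.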

　　The kernel provides the header files little_endian.h and big_endian.h. The version that matches the current processor is included by asm-arch/byteorder.h. The conversions for little-endian systems are shown below; the big-endian ones are analogous:

#define __constant_htonl(x) ((__force __be32)___constant_swab32((x)))
#define __constant_ntohl(x) ___constant_swab32((__force __be32)(x))
#define __constant_htons(x) ((__force __be16)___constant_swab16((x)))
#define __constant_ntohs(x) ___constant_swab16((__force __be16)(x))
#define __constant_cpu_to_le64(x) ((__force __le64)(__u64)(x))
#define __constant_le64_to_cpu(x) ((__force __u64)(__le64)(x))
#define __constant_cpu_to_le32(x) ((__force __le32)(__u32)(x))
#define __constant_le32_to_cpu(x) ((__force __u32)(__le32)(x))
#define __constant_cpu_to_le16(x) ((__force __le16)(__u16)(x))
#define __constant_le16_to_cpu(x) ((__force __u16)(__le16)(x))
#define __constant_cpu_to_be64(x) ((__force __be64)___constant_swab64((x)))
#define __constant_be64_to_cpu(x) ___constant_swab64((__force __u64)(__be64)(x))
#define __constant_cpu_to_be32(x) ((__force __be32)___constant_swab32((x)))
#define __constant_be32_to_cpu(x) ___constant_swab32((__force __u32)(__be32)(x))
#define __constant_cpu_to_be16(x) ((__force __be16)___constant_swab16((x)))
#define __constant_be16_to_cpu(x) ___constant_swab16((__force __u16)(__be16)(x))
#define __cpu_to_le64(x) ((__force __le64)(__u64)(x))
#define __le64_to_cpu(x) ((__force __u64)(__le64)(x))
#define __cpu_to_le32(x) ((__force __le32)(__u32)(x))
#define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
#define __cpu_to_le16(x) ((__force __le16)(__u16)(x))
#define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
#define __cpu_to_be64(x) ((__force __be64)__swab64((x)))
#define __be64_to_cpu(x) __swab64((__force __u64)(__be64)(x))
#define __cpu_to_be32(x) ((__force __be32)__swab32((x)))
#define __be32_to_cpu(x) __swab32((__force __u32)(__be32)(x))
#define __cpu_to_be16(x) ((__force __be16)__swab16((x)))
#define __be16_to_cpu(x) __swab16((__force __u16)(__be16)(x))
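
The pattern in this listing is that conversions to the CPU's own byte order are plain casts, while conversions to the other byte order swap bytes. The sketch below illustrates this in user space; the sketch_ functions and host_is_little_endian are hypothetical names, and the endianness test at run time is an assumption of this example, whereas the kernel selects the right header at build time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reverse the four bytes of a 32-bit value. */
static uint32_t sketch_swab32(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) <<  8) |
           ((x & 0x00ff0000u) >>  8) |
           ((x & 0xff000000u) >> 24);
}

/* Run-time probe of the host byte order (the kernel decides at build time). */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Swap only when the host order differs from big-endian. */
static uint32_t sketch_cpu_to_be32(uint32_t x)
{
    return host_is_little_endian() ? sketch_swab32(x) : x;
}

/* Swap only when the host order differs from little-endian. */
static uint32_t sketch_cpu_to_le32(uint32_t x)
{
    return host_is_little_endian() ? x : sketch_swab32(x);
}

static void byteorder_demo(void)
{
    assert(sketch_swab32(0x11223344u) == 0x44332211u);
    /* Swapping twice restores the original value. */
    assert(sketch_swab32(sketch_swab32(0xdeadbeefu)) == 0xdeadbeefu);
}
```

After sketch_cpu_to_be32(0x11223344u), the first byte in memory is 0x11 on any host, which is the property that network code relies on.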

A.9 Page Tables
　　To simplify memory management, the kernel provides a memory model that abstracts from the differences between architectures. Each port must supply functions for manipulating page tables and page table entries. Their declarations are in asm-arch/pgtable.h.

A.10 Miscellaneous
A.10.1 Checksum Calculation
　　Computing checksums for packets is central to communication over IP networks and can be time-consuming. Where possible, each architecture should compute checksums with hand-written assembly code. The relevant declarations are in asm-arch/checksum.h. Two functions are the most important:
　> unsigned short ip_fast_csum computes the checksum from the IP header and the header length.
　> csum_partial computes the checksum of a packet incrementally, from the individual fragments received one after another.

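The arithmetic behind these functions is the 16-bit one's complement sum used by IP. The sketch below shows it in portable C under several assumptions: the sketch_ names and signatures are hypothetical and do not match the kernel's prototypes, data is read as big-endian 16-bit words, every fragment except the last is assumed to have an even length, and the plain 32-bit accumulator is only suitable for small buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add the 16-bit big-endian words of buf to a running 32-bit sum. */
static uint32_t sketch_csum_partial(const uint8_t *buf, size_t len,
                                    uint32_t sum)
{
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)  /* odd trailing byte: treat it as padded with zero */
        sum += (uint32_t)buf[len - 1] << 8;
    return sum;
}

/* Fold the carries back in and return the one's complement as 16 bits. */
static uint16_t sketch_csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* ihl is the header length in 32-bit words, as in the IPv4 header field. */
static uint16_t sketch_ip_fast_csum(const uint8_t *iph, unsigned int ihl)
{
    return sketch_csum_fold(sketch_csum_partial(iph, (size_t)ihl * 4, 0));
}

static void checksum_demo(void)
{
    /* Sample IPv4 header with the checksum field (bytes 10-11) zeroed. */
    uint8_t hdr[20] = {
        0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00,
        0x40, 0x11, 0x00, 0x00, 0xc0, 0xa8, 0x00, 0x01,
        0xc0, 0xa8, 0x00, 0xc7
    };
    uint16_t csum = sketch_ip_fast_csum(hdr, 5);

    assert(csum == 0xb861);
    /* With the checksum filled in, verification yields zero. */
    hdr[10] = (uint8_t)(csum >> 8);
    hdr[11] = (uint8_t)(csum & 0xff);
    assert(sketch_ip_fast_csum(hdr, 5) == 0);
}
```

Because the running sum is passed in and returned, a packet can be summed piece by piece as its fragments arrive and folded only once at the end.
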
A.10.2 Context Switch
　　Once the scheduler has decided that the current process must give up the CPU so that another process can run, the hardware-dependent part of the context switch is carried out. For this purpose, every architecture must provide a switch_to function or a corresponding macro. Its prototype is shown below; the declaration is in asm-arch/system.h.

/*
 * switch_to(prev, next) should switch from task `prev' to `next'
 * `prev' will never be the same as `next'.  schedule() itself
 * contains the memory barrier to tell GCC not to cache `current'.
 */
extern struct task_struct *__switch_to(struct task_struct *, struct thread_info *, struct thread_info *);

#define switch_to(prev,next,last)                                       \
do {                                                                    \
    last = __switch_to(prev,task_thread_info(prev), task_thread_info(next)); \
} while (0)

A.10.3 Finding the Current Process
　　The current macro yields a pointer to the task_struct of the currently running process. Each architecture must declare the macro in arch/arch/include/asm/current.h. The pointer may be held in a dedicated processor register, which the current macro queries either directly or indirectly. Most architectures use that register to hold a pointer to the currently valid thread_info instance instead; because the thread_info structure contains a pointer to the task_struct of the associated process, current can be implemented by taking this detour. For example, the implementation on the ARM architecture:
arch/arm/include/asm/current.h:

static inline struct task_struct *get_current(void) __attribute_const__;

static inline struct task_struct *get_current(void)
{
    return current_thread_info()->task;
}

#define current (get_current())

arch/arm/include/asm/thread_info.h:

/*
 * how to get the thread information struct from C
 */
static inline struct thread_info *current_thread_info(void) __attribute_const__;

static inline struct thread_info *current_thread_info(void)
{
    register unsigned long sp asm ("sp");
    return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
}
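
The masking step works because the thread_info instance sits at the lowest address of the kernel stack and the stack is aligned to its own size. The following user-space sketch reproduces only the address arithmetic; the stack pointer is passed in as a parameter instead of being read from a register, THREAD_SIZE is assumed to be 8 KiB, and the reduced structures and the names stack_base, thread_info_from_sp and sketch_get_current are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: 8 KiB kernel stacks, aligned to 8 KiB. */
#define THREAD_SIZE 8192u

/* Heavily reduced, hypothetical versions of the kernel structures. */
struct task_struct { int pid; };
struct thread_info { struct task_struct *task; };

/* Clear the low bits of a stack address to get the base of the stack. */
static uintptr_t stack_base(uintptr_t sp)
{
    return sp & ~((uintptr_t)THREAD_SIZE - 1);
}

/* thread_info lives at the base of the stack that sp points into. */
static struct thread_info *thread_info_from_sp(uintptr_t sp)
{
    return (struct thread_info *)stack_base(sp);
}

/* The detour described in the text: stack pointer, thread_info, task. */
static struct task_struct *sketch_get_current(uintptr_t sp)
{
    return thread_info_from_sp(sp)->task;
}

static void current_demo(void)
{
    /* Any address inside the stack maps to the same 8 KiB-aligned base. */
    assert(stack_base(0xC1235F00u) == 0xC1234000u);
    assert(stack_base(0xC1234000u) == 0xC1234000u);
}
```

No lookup table or dedicated register is needed with this scheme: any code running on the kernel stack can recover the owning task from its stack pointer alone.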

This article is an English version of an article originally written in Chinese on aliyun.com.
