A.1 Overview
To make porting to new architectures easier, the kernel strictly separates architecture-specific code from architecture-independent code. For the processor-specific parts of the kernel, header files containing definitions and prototypes are stored in the include/asm-arch/ directory (for example, include/asm-arm/), while the C and assembler source implementations are stored in the arch/arch/ directory (for example, arch/arm/).
The build system also takes into account that generic code may need to rely on architecture-specific mechanisms. All processor-specific header files are located in include/asm-arch/. Once the kernel has been configured for a specific architecture, a symbolic link include/asm/ is created that points to the directory for that hardware. The kernel can then access architecture-specific header files through #include <asm/file.h>.
A.2 Data Types
The kernel distinguishes the following three basic data types.
> Standard C data types;
> Data types with a fixed number of bits, such as u32 and s16;
> Subsystem-specific types, such as pid_t and sector_t.
A.3 Alignment
Aligning data to specific memory addresses is necessary for the processor's caches to be used efficiently and for good performance. In general, a data type is aligned when its byte address is divisible by the type's byte length. Aligning a data type to its own byte length is called natural alignment.
In some parts of the kernel, it may be necessary to access non-aligned data. The architecture must define two macros for this purpose.
> get_unaligned(ptr): dereferences a pointer to a non-aligned memory location.
> put_unaligned(val, ptr): writes the value val to the non-aligned memory location specified by ptr.
When GCC lays out structs and unions in memory, it automatically selects the appropriate alignment, so the programmer does not need to do this manually.
A.4 Memory Page
Each architecture defines the following macros to describe memory pages.
> PAGE_SHIFT specifies the base-2 logarithm of the page length.
> PAGE_SIZE specifies the length of a memory page in bytes.
> PAGE_ALIGN(addr) aligns any address to a page boundary (rounding up).
Two standard operations on pages must also be implemented, usually with optimized assembly instructions.
> clear_page(start) clears the page beginning at start by filling it with zero bytes.
> copy_page(to, from) copies the page data at from to the location to.
The PAGE_OFFSET macro specifies the position in the virtual address space at which physical page frames are mapped. On most architectures this implicitly defines the length of the user address space, but this does not hold for all architectures. Therefore, the TASK_SIZE constant must be used to determine the length of user space. For example, on ARM:
    /*
     * PAGE_OFFSET - the virtual address of the start of the kernel image
     * TASK_SIZE - the maximum size of a user space task.
     * TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area
     */
    #define PAGE_OFFSET         UL(CONFIG_PAGE_OFFSET)
    #define TASK_SIZE           (UL(CONFIG_PAGE_OFFSET) - UL(0x01000000))
    #define TASK_UNMAPPED_BASE  (UL(CONFIG_PAGE_OFFSET) / 3)
A.5 System Calls
The mechanism for making a system call is essentially a controlled switch from user space to kernel space, and it differs on every supported platform.
The standard file unistd.h (arch/arm/include/asm/unistd.h on ARM) handles the tasks associated with system calls.
A.6 String Processing
Strings are processed throughout the kernel, so string handling is time-critical. Because many architectures provide specialized assembly instructions for these tasks, and because hand-optimized assembly code can be faster than compiler-generated code, every architecture may define its own versions of the various string operations in <arch/arch/include/asm/string.h>. For example, on the ARM architecture:
    #ifndef __ASM_ARM_STRING_H
    #define __ASM_ARM_STRING_H

    /*
     * We don't do inline string functions, since the
     * optimised inline asm versions are not small.
     */

    #define __HAVE_ARCH_STRRCHR
    extern char * strrchr(const char * s, int c);

    #define __HAVE_ARCH_STRCHR
    extern char * strchr(const char * s, int c);

    #define __HAVE_ARCH_MEMCPY
    extern void * memcpy(void *, const void *, __kernel_size_t);

    #define __HAVE_ARCH_MEMMOVE
    extern void * memmove(void *, const void *, __kernel_size_t);

    #define __HAVE_ARCH_MEMCHR
    extern void * memchr(const void *, int, __kernel_size_t);

    #define __HAVE_ARCH_MEMSET
    extern void * memset(void *, int, __kernel_size_t);

    extern void __memzero(void *ptr, __kernel_size_t n);

    #define memset(p,v,n)                                       \
        ({                                                      \
            void *__p = (p); size_t __n = n;                    \
            if ((__n) != 0) {                                   \
                if (__builtin_constant_p((v)) && (v) == 0)      \
                    __memzero((__p),(__n));                     \
                else                                            \
                    memset((__p),(v),(__n));                    \
            }                                                   \
            (__p);                                              \
        })

    #endif
All of these operations replace the functions of the same name from the user-space C standard library and perform the same tasks in the kernel. For each string operation that an architecture defines in optimized form, the corresponding __HAVE_ARCH_OPERATION macro must be defined. The ARM functions above are defined in arch/arm/lib and implemented in assembly.
A.7 Thread Representation
The running state of a thread is defined primarily by the contents of the processor registers. For a process that is not currently running, this data must be saved in suitable data structures so that it can be read back and restored to the appropriate registers when the scheduler activates the process again. The structures used for this purpose are defined in the following files.
> ptrace.h defines the pt_regs structure, which holds all registers. An instance of pt_regs containing the register values is placed on the kernel stack when the process switches from user mode to kernel mode.
> processor.h contains the thread_struct structure, which describes all other registers and all other process state information.
> thread_info.h defines the thread_info structure, which contains all task_struct members that the assembly code must access when entering and leaving kernel mode.
Definition of pt_regs on the ARM architecture:
    /*
     * This struct defines the way the registers are stored on the
     * stack during a system call.  Note that sizeof(struct pt_regs)
     * has to be a multiple of 8.
     */
    struct pt_regs {
        long uregs[18];
    };

    #define ARM_cpsr     uregs[16]
    #define ARM_pc       uregs[15]
    #define ARM_lr       uregs[14]
    #define ARM_sp       uregs[13]
    #define ARM_ip       uregs[12]
    #define ARM_fp       uregs[11]
    #define ARM_r10      uregs[10]
    #define ARM_r9       uregs[9]
    #define ARM_r8       uregs[8]
    #define ARM_r7       uregs[7]
    #define ARM_r6       uregs[6]
    #define ARM_r5       uregs[5]
    #define ARM_r4       uregs[4]
    #define ARM_r3       uregs[3]
    #define ARM_r2       uregs[2]
    #define ARM_r1       uregs[1]
    #define ARM_r0       uregs[0]
    #define ARM_ORIG_r0  uregs[17]
Definition of thread_struct on the ARM architecture (machine instructions can be stored in opcode form together with their memory addresses for debugging purposes):
    union debug_insn {
        u32 arm;
        u16 thumb;
    };

    struct debug_entry {
        u32               address;
        union debug_insn  insn;
    };

    struct debug_info {
        int                 nsaved;
        struct debug_entry  bp[2];
    };

    struct thread_struct {
        /* fault info */
        unsigned long      address;
        unsigned long      trap_no;
        unsigned long      error_code;
        /* debugging */
        struct debug_info  debug;
    };
A.8 Bit Operations and Byte Order
The architecture-specific bit operations of the kernel are defined in arch/arch/include/asm/bitops.h. The ARM definitions are as follows:
    #ifndef __ARMEB__
    /*
     * These are the little endian, atomic definitions.
     */
    #define set_bit(nr,p)                 ATOMIC_BITOP_LE(set_bit,nr,p)
    #define clear_bit(nr,p)               ATOMIC_BITOP_LE(clear_bit,nr,p)
    #define change_bit(nr,p)              ATOMIC_BITOP_LE(change_bit,nr,p)
    #define test_and_set_bit(nr,p)        ATOMIC_BITOP_LE(test_and_set_bit,nr,p)
    #define test_and_clear_bit(nr,p)      ATOMIC_BITOP_LE(test_and_clear_bit,nr,p)
    #define test_and_change_bit(nr,p)     ATOMIC_BITOP_LE(test_and_change_bit,nr,p)
    #define find_first_zero_bit(p,sz)     _find_first_zero_bit_le(p,sz)
    #define find_next_zero_bit(p,sz,off)  _find_next_zero_bit_le(p,sz,off)
    #define find_first_bit(p,sz)          _find_first_bit_le(p,sz)
    #define find_next_bit(p,sz,off)       _find_next_bit_le(p,sz,off)

    #define WORD_BITOFF_TO_LE(x)          ((x))

    #else
    /*
     * These are the big endian, atomic definitions.
     */
    #define set_bit(nr,p)                 ATOMIC_BITOP_BE(set_bit,nr,p)
    #define clear_bit(nr,p)               ATOMIC_BITOP_BE(clear_bit,nr,p)
    #define change_bit(nr,p)              ATOMIC_BITOP_BE(change_bit,nr,p)
    #define test_and_set_bit(nr,p)        ATOMIC_BITOP_BE(test_and_set_bit,nr,p)
    #define test_and_clear_bit(nr,p)      ATOMIC_BITOP_BE(test_and_clear_bit,nr,p)
    #define test_and_change_bit(nr,p)     ATOMIC_BITOP_BE(test_and_change_bit,nr,p)
    #define find_first_zero_bit(p,sz)     _find_first_zero_bit_be(p,sz)
    #define find_next_zero_bit(p,sz,off)  _find_next_zero_bit_be(p,sz,off)
    #define find_first_bit(p,sz)          _find_first_bit_be(p,sz)
    #define find_next_bit(p,sz,off)       _find_next_bit_be(p,sz,off)

    #define WORD_BITOFF_TO_LE(x)          ((x) ^ 0x18)

    #endif
The kernel provides the little_endian.h and big_endian.h header files. The version that applies to the current processor is included by asm-arch/byteorder.h. The conversions for the little-endian format are as follows; the big-endian ones are analogous:
    #define __constant_htonl(x) ((__force __be32)___constant_swab32((x)))
    #define __constant_ntohl(x) ___constant_swab32((__force __be32)(x))
    #define __constant_htons(x) ((__force __be16)___constant_swab16((x)))
    #define __constant_ntohs(x) ___constant_swab16((__force __be16)(x))
    #define __constant_cpu_to_le64(x) ((__force __le64)(__u64)(x))
    #define __constant_le64_to_cpu(x) ((__force __u64)(__le64)(x))
    #define __constant_cpu_to_le32(x) ((__force __le32)(__u32)(x))
    #define __constant_le32_to_cpu(x) ((__force __u32)(__le32)(x))
    #define __constant_cpu_to_le16(x) ((__force __le16)(__u16)(x))
    #define __constant_le16_to_cpu(x) ((__force __u16)(__le16)(x))
    #define __constant_cpu_to_be64(x) ((__force __be64)___constant_swab64((x)))
    #define __constant_be64_to_cpu(x) ___constant_swab64((__force __u64)(__be64)(x))
    #define __constant_cpu_to_be32(x) ((__force __be32)___constant_swab32((x)))
    #define __constant_be32_to_cpu(x) ___constant_swab32((__force __u32)(__be32)(x))
    #define __constant_cpu_to_be16(x) ((__force __be16)___constant_swab16((x)))
    #define __constant_be16_to_cpu(x) ___constant_swab16((__force __u16)(__be16)(x))
    #define __cpu_to_le64(x) ((__force __le64)(__u64)(x))
    #define __le64_to_cpu(x) ((__force __u64)(__le64)(x))
    #define __cpu_to_le32(x) ((__force __le32)(__u32)(x))
    #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
    #define __cpu_to_le16(x) ((__force __le16)(__u16)(x))
    #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
    #define __cpu_to_be64(x) ((__force __be64)__swab64((x)))
    #define __be64_to_cpu(x) __swab64((__force __u64)(__be64)(x))
    #define __cpu_to_be32(x) ((__force __be32)__swab32((x)))
    #define __be32_to_cpu(x) __swab32((__force __u32)(__be32)(x))
    #define __cpu_to_be16(x) ((__force __be16)__swab16((x)))
    #define __be16_to_cpu(x) __swab16((__force __u16)(__be16)(x))
A.9 Page Tables
To simplify memory management, the kernel provides a memory model that abstracts away the differences between architectures. Each port must provide functions to manipulate page tables and page table entries. These are declared in asm-arch/pgtable.h.
A.10 Miscellaneous
A.10.1 Checksum Calculation
Calculating checksums over packets is central to communication over IP networks and can be time-consuming. Where possible, each architecture should use hand-written assembly code to calculate checksums. The relevant declarations are in asm-arch/checksum.h. Two functions are the most important:
> unsigned short ip_fast_csum calculates the required checksum from the IP header and the header length.
> csum_partial calculates the checksum of a packet incrementally, from the individual fragments as they are received in sequence.
A.10.2 Context Switch
Once the scheduler has decided that the current process must give up the CPU so that another process can run, the hardware-dependent part of the context switch is carried out. For this purpose, every architecture must provide a switch_to function or a corresponding macro. It is declared in asm-arch/system.h, with the following prototype:
    /*
     * switch_to(prev, next) should switch from task `prev' to `next'
     * `prev' will never be the same as `next'.  schedule() itself
     * contains the memory barrier to tell GCC not to cache `current'.
     */
    extern struct task_struct *__switch_to(struct task_struct *, struct thread_info *, struct thread_info *);

    #define switch_to(prev,next,last)                                       \
    do {                                                                    \
        last = __switch_to(prev,task_thread_info(prev), task_thread_info(next));    \
    } while (0)
A.10.3 Finding the Current Process
The current macro is used to obtain a pointer to the task_struct of the currently running process. Each architecture must declare the macro in arch/arch/include/asm/current.h. The pointer is held in a separate processor register and can be queried, directly or indirectly, through the current macro. Most architectures use that register to hold a pointer to the currently valid thread_info instance; because the thread_info structure contains a pointer to the task_struct of the associated process, current can be implemented indirectly through it. For example, the ARM implementation:
arch/arm/include/asm/current.h:
    static inline struct task_struct *get_current(void) __attribute_const__;

    static inline struct task_struct *get_current(void)
    {
        return current_thread_info()->task;
    }

    #define current (get_current())
arch/arm/include/asm/thread_info.h:
    /*
     * how to get the thread information struct from C
     */
    static inline struct thread_info *current_thread_info(void) __attribute_const__;

    static inline struct thread_info *current_thread_info(void)
    {
        register unsigned long sp asm ("sp");
        return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
    }