Reading Notes on Linux Kernel Development (Linux Kernel Design and Implementation)

Source: Internet
Author: User


Chapter 3 Process Management


1. The fork system call returns twice from the kernel: once to the child process and once to the parent process.
2. The task_struct structure is allocated by the slab allocator. Before 2.6 it was stored at the end of the kernel stack. The task_structs of all processes are linked together into a doubly linked list.
3. In 2.6, the base of the kernel stack holds the thread_info structure, which contains a pointer to the task_struct.
4. The current macro finds the task_struct of the current process. x86 locates the thread_info structure first, while PPC has a dedicated register (r2) that stores the current task_struct.
5. The kernel stack is generally 8 KB.
6. The five process states: TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE (a process in this state cannot be killed, because it may be waiting for critical data or holding a semaphore), TASK_TRACED, and TASK_STOPPED (the process received SIGSTOP and is stopped, effectively paused, but it can be resumed).
7. Execution context is divided into "process context" and "interrupt context". During a system call the kernel executes code on behalf of a process: the current macro is valid and points to that process's task_struct, and the page table in use is the user process's page table. In interrupt context the kernel does not execute on behalf of any process; it runs an interrupt handler, so there is no process context.
8. A system call traps into the kernel through a software interrupt, which looks like interrupt context, but once inside the kernel it runs in process context.
9. In each process's task_struct there is a parent pointer to its parent process and a linked list of all its children; this structure forms the process tree of the whole system.
10. The kernel's dedicated structure for doubly linked lists is struct list_head.
11. Process creation is split into two steps: fork and exec. fork creates the process's structures; with copy-on-write, the parent and child share the process address space (page tables), and the child differs from the parent only in its PID, PPID, certain resources, and statistics (fields of task_struct). exec loads and runs the program code. With copy-on-write, a private copy of a page is created for the child only when the page needs to be written. The real overhead of fork is therefore copying the parent's page tables and allocating a task_struct for the child.
fork: fork() -> clone() -> do_fork() -> copy_process(), which:
a) allocates a kernel stack for the child, creates its thread_info, and creates a task_struct identical to the parent's;
b) changes the fields of thread_info and task_struct that must differ from the parent's;
c) sets the child's state to TASK_UNINTERRUPTIBLE;
d) allocates an available PID for the child (alloc_pid());
e) copies or shares the parent's open files, signal handlers, process address space, and so on;
f) sets the child's state to TASK_RUNNING;
g) returns a pointer to the child.
The kernel generally wakes the child first: if the parent ran first it might write to memory and trigger copy-on-write, whereas the child usually calls exec right away.
12. vfork guarantees that the parent blocks after the child is created, until the child calls exec or exits. Once fork gained copy-on-write, vfork has only one remaining benefit: it does not copy the parent's page tables for the child.
13. A thread in the Linux kernel is implemented as a process, except that the thread shares the process address space and signals with other threads. Creating a process with four threads creates four processes (four kernel stacks and four task_structs); these task_structs are simply told to share the same address space. Compare the creation calls:

    Thread creation:  clone(CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND, 0)
    Process creation: clone(SIGCHLD, 0)

The thread-creation call shows that a thread shares the parent's address space (CLONE_VM), open files (CLONE_FILES), filesystem information (CLONE_FS), and signal handlers (CLONE_SIGHAND).
14. Kernel threads exist because the kernel needs to perform some operations in the background. The main differences from ordinary threads: a kernel thread has no independent address space (the mm pointer in its task_struct is NULL) and shares the kernel page tables, running only in kernel mode; it is never scheduled into user mode. Like ordinary threads, kernel threads have states, are scheduled, and can be preempted. Examples are flush, ksoftirqd, and so on; ps -ef shows them with the CMD column wrapped in brackets. Creation: kthread_create() creates a kernel thread; wake_up_process() wakes the created thread; kthread_run() creates and runs one; kthread_stop() stops a kernel thread.
15. If the parent process exits before the child, the child must be given a new parent: either another process in the parent's thread group, or the init process.
16. A process terminates via exit(). After exit, the kernel keeps its task_struct until its parent calls wait() or waitpid() to reap it.

Chapter 4 Process Scheduling

1. Linux is a preemptive multitasking system.
2. The scheduler selects which process to run, and decides when to suspend a running process so that another can run; this suspension is called preemption.
3. The time a process can run before being preempted is called its timeslice; the timeslice is fixed and preset.
4. With yield(), a process can voluntarily give up the processor.
5. Scheduling algorithms: a) the O(1) scheduler is ideal under large server workloads, but poor in interactive scenarios; b) the CFS completely fair scheduler improves Linux's weakness in interactive scenarios.
6. I/O-bound and CPU-bound processes: the Linux scheduler usually prefers to schedule I/O-bound processes first, without neglecting CPU-bound ones.
7. Process priority. Linux expresses priority in two ways: a) the nice value, a standard Unix practice; in Linux the nice value represents the proportion of the timeslice, the higher the nice value the lower the priority, range -20 to 19; b) the real-time priority, range 0 to 99, higher meaning higher priority.
8. Too long a timeslice hurts I/O-bound processes; too short a timeslice spends more time on scheduling itself.
9. The CFS algorithm actually hands each process a proportion of the CPU, which is also weighted by the nice value.
Example: suppose the system has only two processes, a text editor (I/O-bound) and a video encoder (CPU-bound), started with the same nice value, so each is entitled to the same share of processor time, 50%. The editor actually consumes far less than its 50%, while the encoder consumes more than 50%. Whenever the editor needs to run, the scheduler notices that it has used much less CPU time than it deserves and runs it immediately; when the editor finishes and waits again, it keeps on consuming little CPU time. In this way the system responds to the text editor promptly. The main idea of CFS is to guarantee fair use of the processor: measure each process's CPU usage and dynamically adjust scheduling according to that usage. The minimum timeslice a process can run before being preempted is 1 ms. A question: compared with a newly started CPU-bound process, an I/O-bound process that has been running for a long time may see its scheduling slowed down; should a process that has run for too long have its accounting restarted?
10. Linux has several scheduler algorithms; processes are grouped into different scheduler classes, and schedule() picks the highest-priority process from the highest-priority scheduler class. The completely fair scheduler CFS is the class for normal processes, known in Linux as SCHED_NORMAL; there is also a real-time scheduler class.
11. When does the scheduler run in Linux?
a) In Linux, the need_resched flag indicates whether a reschedule needs to be executed.
b) The flag is set by scheduler_tick(), try_to_wake_up(), and so on. need_resched is stored in the process's thread_info, because accessing current is faster than accessing a global variable.
c) On return to user space or on return from an interrupt, the kernel also checks the need_resched flag; if it is set, the scheduler is invoked before execution continues.
d) When preemption occurs:
d.1) User preemption: d.1.1) when a system call returns to user space; d.1.2) when an interrupt handler returns to user space.
d.2) Kernel preemption: d.2.1) when an interrupt handler returns to kernel space and the kernel is once again preemptible; d.2.2) when kernel code becomes preemptible again. This implies the following: a task can be preempted only while it holds no locks; while it holds a lock the kernel cannot be preempted; when the lock is released and preempt_count drops to 0, preemption is safe again, and the need_resched flag is checked to decide whether to preempt. d.2.3) when the kernel explicitly calls schedule(); d.2.4) when a kernel task blocks.
Scheduler entry point: the schedule() function, whose job is to select the highest-priority process from the highest-priority scheduler class.
12. Sleeping and waking. A process that must wait changes its state to TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE, moves itself from the scheduler's red-black tree onto the appropriate wait queue, and calls schedule() to pick the next process to run. Sleeping:

    DEFINE_WAIT(wait);
    add_wait_queue(q, &wait);
    while (!condition) {
        prepare_to_wait(&q, &wait, TASK_INTERRUPTIBLE);
        if (signal_pending(current))
            /* handle the signal */;
        schedule();
    }
    finish_wait(&q, &wait);  /* remove ourselves from the wait queue */

Waking: the wake_up() function wakes all processes hanging on the wait queue, changes their state to TASK_RUNNING, and adds them back to the scheduler's red-black tree. If an awakened process has a higher priority than the current process, the need_resched flag is set. Note that spurious wake-ups can occur; the process may have been woken by a signal instead. Therefore the wait uses a while loop that rechecks the condition: if the condition is not met, the wake-up may have been spurious and the wait must continue.
13. Preemption and context switching. A context switch, switching from one process to another, is done by the context_switch() function, called from schedule(). It does two main things: switch_mm() switches the process's virtual address space; switch_to() switches the processor state, saving and restoring stack and register information and any other architecture-specific state.
14. The real-time scheduler provides two real-time policies: SCHED_FIFO and SCHED_RR. SCHED_FIFO: first in, first out; the task always runs until it releases the CPU or blocks; there is no timeslice. SCHED_RR: like SCHED_FIFO, but with a timeslice; the task runs until it exhausts the timeslice pre-allocated to it. (Open question: how is a process made real-time?)

Chapter 5 System Calls

1. What is a system call, and why introduce them? A system call is an intermediate layer between user processes and hardware devices. There are three reasons for introducing system calls: a) it provides users with a unified, abstract interface to hardware devices; b) this middle layer keeps users from operating hardware devices incorrectly; c)
each user process runs as an independent entity in its own virtual address space, and the system-call interface between the system and user processes preserves that illusion, much like installing multiple virtual machines on one piece of hardware.
2. APIs, POSIX, and the C library. POSIX is a set of general API interface standards; the C library implements, in user space, most of the APIs that POSIX stipulates. The call path is: application -> C library -> system call. Linux system calls are provided as part of the C library.
3. The famous saying in Unix interface design, "provide mechanism, not policy", means that system calls abstract the functions used to accomplish a specific purpose; how those functions are used is of no concern to the kernel and is left to applications and the C library. In fact any API benefits from this design: provide only the interfaces for specific tasks, and leave how the API is used to the user. Separating mechanism from policy simplifies development: the mechanism is "what functions are needed", the policy is "how those functions are used". The same API can then satisfy different needs.
4. The system call table. sys_call_table stores the handler functions for all system call numbers.
5. Trapping into the kernel: a) software interrupt number 128, int $0x80; b) the sysenter instruction, a newer x86 mechanism for entering system calls that is faster and more specialized.
6. System call return values and errno. Every system call has a return value, generally of type long: zero means success, negative means failure. The return value can also carry a function result, depending on the call's implementation; getpid(), for example, returns the PID. On failure the error number is stored in the global variable errno, and the error description can be obtained with perror(). (Open question: how can errno be used as a global variable on multiple cores?)
7. Passing system call parameters and return values. The system call number and the parameters must be passed to the kernel. The number is always passed in eax. With five or fewer parameters, registers are used (ebx, ecx, edx, esi, edi); with more than five, a single register holds a pointer to a user-space address containing all the parameters. The return value is passed back through eax.
8. Copying data between user space and kernel space: copy_to_user(dst, src, size) and copy_from_user(dst, src, size). A direct copy would actually work; these two functions mainly add checks on the pointer supplied by the user, so that user space cannot use a system call to operate on kernel-space addresses. Note that both copy_to_user and copy_from_user may block; this happens when the page involved has been swapped out to disk. The process then sleeps until it is woken up and continues executing, or the scheduler runs another process.
9. System calls must perform many checks, because the input comes from user space; incorrect user-space input must not corrupt kernel-space data. The capable() function performs permission checks.
10. System calls may sleep and may be preempted, which guarantees that most kernel interfaces can be used from within system calls.
11. Reentrancy. System call implementations must be reentrant: since system calls can be preempted, a new process may enter the same system call, and reentrancy ensures no error occurs.
12. Without C library support, a system call can be invoked directly. For example, using the open system call:

    #define NR_open 5
    _syscall3(long, open, const char *, filename, int, flags, int, mode)

_syscall3 is a macro: it sets up the registers and issues the trap instruction. The macro creates an open() function.
Its return value is long and it takes three parameters, so the call can then be used directly: long fd = open(filename, flags, mode);
13. It is best not to add new system calls; prefer the alternatives: a) for a device node, operations can go through custom ioctl commands; b) a semaphore of this kind is really a file descriptor, so ioctl works there too; c) use the /proc or sysfs file systems to interact with the kernel.

Chapter 6 Kernel Data Structures

1. Linked lists, queues, maps, binary trees.
2. The classic list_head circular doubly linked list:

    struct list_head {
        struct list_head *next;
        struct list_head *prev;
    };

The container_of macro and the list_entry macro find the first address of the structure in which a list_head is embedded. offsetof(type, member) yields the offset of member within the type structure and is used inside container_of:

    #define offsetof(type, member)  ((size_t) &((type *)0)->member)
    #define container_of(ptr, type, member) ({                       \
            const typeof(((type *)0)->member) *__mptr = (ptr);       \
            (type *)((char *)__mptr - offsetof(type, member)); })
    #define list_entry(ptr, type, member) \
            container_of(ptr, type, member)

INIT_LIST_HEAD(struct list_head *) initializes a list_head at run time; LIST_HEAD() defines and initializes a list head statically.
Operations:

    list_add(new, head);
    list_add_tail(new, head);
    list_del(ptr);
    list_del_init(ptr);          /* delete and reinitialize the list_head */
    list_move(list, head);       /* delete list from one list, add it to the head list */
    list_move_tail(list, head);
    list_empty(head);            /* is the list empty? */

Traversal:

    struct fox {
        int i;
        struct list_head list;   /* embedded list_head, not a pointer */
    };
    struct list_head *head = ...;
    struct fox *f;
    struct list_head *p;
    list_for_each(p, head) {     /* iterate over the list */
        f = list_entry(p, struct fox, list);
    }

Another macro, list_for_each_entry(pos, head, member), implements the same loop as the list_for_each block above. There is also reverse traversal, list_for_each_entry_reverse(pos, head, member), and a variant that is safe when deleting while traversing, list_for_each_entry_safe(pos, next, head, member), which adds a struct list_head pointer next to record the next node, plus list_for_each_entry_safe_reverse(pos, next, head, member). Concurrent operations on a list must be protected by a lock.
3. Queues: FIFO, the producer-consumer model. kfifo is the kernel's common implementation.
Creating a queue:

    struct kfifo fifo;
    int ret;
    int size = PAGE_SIZE;
    ret = kfifo_alloc(&fifo, size, GFP_KERNEL);  /* size must be a power of 2 */

or with a preallocated buffer:

    char buffer[PAGE_SIZE];
    kfifo_init(&fifo, &buffer, size);

Enqueue:

    unsigned int kfifo_in(struct kfifo *fifo, const void *from, unsigned int len);
    /* copy len bytes starting at "from" into the fifo */

Dequeue:

    unsigned int kfifo_out(struct kfifo *fifo, void *to, unsigned int len);
    /* copy len bytes out of the fifo */

For other operations, see the book.
4. Maps, similar to std::map. The supported operations are add, remove, and find. Linux implements a dedicated map-like structure that maps a unique UID to a pointer.
5. Binary trees: rbtree (worth reviewing the theory of red-black trees). Use it when there are many lookup operations; with few lookups, a linked list may serve better.

Chapter 7 Interrupts and Interrupt Handlers

1. Interrupt context, also called atomic context, cannot block.
2. After the kernel receives an interrupt, it must program the device registers to silence the interrupt; the device configuration space generally has interrupt bits. For level-triggered interrupts this is mandatory.
3. Top half and bottom half. Take a NIC as an example: the top half, entered in interrupt context right after the interrupt, must set the hardware registers and quickly copy the data into kernel space.
4. request_irq() registers an interrupt handler. The IRQF_SHARED flag marks the interrupt line as shared; the IRQF_DISABLED flag disables other interrupts while this one is handled. free_irq() releases the interrupt.
5. Interrupt context is unrelated to any process and cannot sleep. There are two kinds of stacks: each CPU has a separate interrupt stack, or the handler uses the kernel stack of the interrupted process. A process's kernel stack is generally two pages, 8 KB on 32-bit machines and 16 KB on 64-bit machines.
6. /proc/interrupts shows interrupt statistics.
7. Interrupt control. Disabling interrupts: local_irq_disable() and local_irq_enable(). These two functions have a defect: they cannot be nested. Hence two more functions, local_irq_save(flags) and local_irq_restore(flags), which save and restore the interrupt state (whether interrupts were already disabled or enabled). These functions serve as locks for critical sections: when softirq context and interrupt context share data, they are used for locking. local_irq_disable() disables all interrupts on the local CPU.
disable_irq(irq), disable_irq_nosync(irq), and enable_irq(irq) disable and enable a specific interrupt line. These calls can be nested, so the number of disable calls must be matched by the same number of enable calls. A shared interrupt line cannot be disabled this way, so these APIs are mainly used for older devices; PCIe devices are required to support sharing interrupt lines.
8. Determining whether we are in interrupt context: in_interrupt() returns nonzero when the kernel is executing an interrupt handler or a bottom half; in_irq() returns nonzero only when the kernel is executing an interrupt handler.
9. An interrupt handler runs on only one CPU at a time.

Chapter 8 Bottom Halves

1. What is the bottom half? The bottom half is the work that is less urgent than the interrupt handler's; it can be processed after the interrupt handler has completed the most urgent tasks.
2. The bottom half has several possible implementations. The top half can only be implemented by the interrupt handler, while the bottom half can be implemented by softirqs, tasklets, and work queues.
3. Softirqs. The system can register up to 32 softirqs; the kernel currently uses 9 in total. One softirq cannot preempt another; the only thing that can preempt a softirq is an interrupt handler. However, other softirqs, even of the same type, can run simultaneously on other processors. Generally the interrupt handler marks ("raises") its softirq before returning so that it is executed later. When are raised softirqs processed? a) on the return path from a hardware interrupt; b) in the ksoftirqd kernel thread; c) in code that explicitly checks for and executes pending softirqs, such as do_softirq() in the network subsystem. do_softirq() is the function that runs pending softirqs.
Currently only two subsystems use softirqs directly: the network subsystem and the SCSI subsystem; the tasklet, too, is a handler registered through the softirq machinery. Registering and raising a softirq handler:

    open_softirq(softirq_no, softirq_handler);
    raise_softirq(softirq_no);

Softirqs cannot sleep and can be preempted by interrupt handlers. If a softirq is raised again while it is executing, its handler can run on another CPU at the same time, which means data shared between softirq contexts should be protected by a lock; but taking a lock largely defeats the point of using a softirq, so softirqs generally use per-processor data (data belonging to only one processor). For this reason softirqs are rarely used directly as bottom halves; tasklets are generally used instead.
4. Tasklets. Tasklets are also implemented on top of softirqs; two softirqs are involved: HI_SOFTIRQ and TASKLET_SOFTIRQ. Tasklets use two per-processor data structures, tasklet_vec and tasklet_hi_vec. The tasklet machinery guarantees that only one tasklet of a given type executes at a time, while tasklets of different types can execute simultaneously; tasklets can therefore be used without thinking too hard about locking.
Tasklet usage. Declaration (name is the tasklet's identity):

    DECLARE_TASKLET(name, func, data);
    DECLARE_TASKLET_DISABLED(name, func, data);  /* declared disabled by default */

Handler signature:

    void tasklet_handler(unsigned long data);

Scheduling your own tasklet:

    tasklet_schedule(&my_tasklet);

Enabling and disabling:

    tasklet_disable(&my_tasklet);  /* if the tasklet is running, wait until it finishes before returning */
    tasklet_enable(&my_tasklet);

A tasklet always executes on the CPU that scheduled it, to make better use of the processor's cache.
A tasklet executes on only one CPU at a time; it is never executed on another CPU simultaneously (an interrupt handler may activate the same tasklet on another CPU while it is running, but if the tasklet is already executing on one CPU it will not be run again concurrently).
5. When softirqs are scheduled, and the ksoftirqd/n kernel threads. Because a softirq can re-raise itself, checking and executing softirqs immediately on every raise would let a continuous stream of softirqs consume far too much of the CPU. The opposite solution, never checking a re-raised softirq immediately but deferring it until the next interrupt handler returns, wastes latency. The compromise: ksoftirqd/n is a kernel thread on each CPU; when softirqs pile up unhandled, it is scheduled to run them while the CPU is otherwise idle.
6. Work queues hand the work over to a kernel thread for execution, so it always runs in process context; softirqs and tasklets may run in interrupt context or process context. Because it always runs in process context, work-queue code can be rescheduled and can even sleep. If your bottom half needs to be able to reschedule or sleep, use a work queue.
Worker threads: events/n. worker_thread() is the core function, which calls run_workqueue(). The structures involved are workqueue_struct, cpu_workqueue_struct, and work_struct.
Using a work queue:

    DECLARE_WORK(name, void (*func)(void *), void *data);                   /* create statically */
    INIT_WORK(struct work_struct *work, void (*func)(void *), void *data);  /* create dynamically */

Handler:

    void work_handler(void *data);

Scheduling:

    schedule_work(&work);
    schedule_delayed_work(&work, delay);

(Open question: each CPU has an events/n kernel thread, so when schedule_work(&work) is called, which CPU's events list does the work join?)
7. Disabling the bottom half: local_bh_disable() and local_bh_enable(). These two functions can be nested, that is, each disable must eventually be matched by an enable, because they use preempt_count to keep a reference count.

Chapter 9 Kernel Synchronization

1. Causes of concurrency: a) interrupts; b) softirqs and tasklets; c) kernel preemption, one kernel task preempting another; d) sleeping and synchronization with user space, a kernel task sleeping and thereby letting user processes run; e) SMP, multiple processors executing code in parallel. User-space programs generally need to consider only two of these: SMP parallelism and process preemption.
2. Both cases arise mainly because two processes (threads) access global or shared data at the same time. If two processes (threads) share no data, the code is certainly safe; if data must be shared, it must be locked. The goals are interrupt-safe code, SMP-safe code, and preemption-safe (reentrant) code.
3. Any shared data needs locking, so try not to share data. (A personal thought: sharing is worth considering around I/O-bound code, for example database access: the I/O forces process switches anyway, so multiple threads locking a single shared database connection is acceptable, and one connection can also act as a cache; with multiple database connections, caching becomes a problem.)
4. Remember: lock data, not code.
6. The cause of deadlock: multiple processes and multiple resources; each process holds part of the resources and requests the others; the result is that they all wait for each other.
7. Lock contention and lock granularity. A highly contended lock easily becomes a system bottleneck; fine-grained locks reduce lock contention.

Chapter 10 Kernel Synchronization Methods

1. atomic_t atomic operations.
2. Spinlocks:

    DEFINE_SPINLOCK(mr_lock);   /* static initialization */
    spin_lock_init(&mr_lock);   /* dynamic initialization */
    spin_lock(&mr_lock);
    ...
    spin_unlock(&mr_lock);

Note: when the data is also touched from interrupt context, use:

    spin_lock_irqsave(&mr_lock, flags);
    ...
    spin_unlock_irqrestore(&mr_lock, flags);

If you can be certain that interrupts were enabled before taking the lock, you can use this API instead:

    spin_lock_irq(&mr_lock);
    ...
    spin_unlock_irq(&mr_lock);

Non-blocking operations: spin_trylock(&mr_lock), spin_is_locked(&mr_lock).
Spinlocks and the bottom half: in the bottom half, use spin_lock_bh()/spin_unlock_bh(). Note: if the bottom half shares data with process context, use these; if the bottom half shares data with interrupt context, use spin_lock_irqsave/spin_unlock_irqrestore.
Reader-writer spinlocks: DEFINE_RWLOCK(lock); read_lock/read_unlock; write_lock/write_unlock.
3. Semaphores:

    DECLARE_MUTEX(name)       /* a semaphore with count 1 */
    sema_init(sem, count)
    init_MUTEX(sem)
    init_MUTEX_LOCKED(sem)

Acquiring a semaphore: down_interruptible() sets the process state to TASK_INTERRUPTIBLE and sleeps if the semaphore is unavailable; down_trylock() tries to take the semaphore without blocking; down() sleeps uninterruptibly (TASK_UNINTERRUPTIBLE) when the semaphore is unavailable; up() releases the semaphore and wakes a waiter. There are also reader-writer semaphores.
4. Mutexes, semaphores with a count of 1:

    DEFINE_MUTEX(mutex)
    mutex_init(&mutex)
    mutex_lock(&mutex)
    mutex_unlock(&mutex)

Comparisons. Semaphore vs mutex: generally use the mutex, unless the mutex cannot meet the requirement. Semaphore vs spinlock: interrupt context can only use spinlocks.
5. Completion variables follow an idea similar to semaphores, but they solve a simpler problem.
    init_completion(struct completion *);
    wait_for_completion()
    complete()

6. The Big Kernel Lock: lock_kernel()/unlock_kernel(); it should be used as little as possible.
7. Sequence locks are for a specific case: generally there are few writers and many readers, and the writer is favored so that it is never starved by readers.
8. Kernel preemption control:

    preempt_disable()
    preempt_enable()
    preempt_enable_no_resched()
    preempt_count()

9. Memory barriers. A memory barrier is inserted to guarantee read/write ordering; this matters mostly on SMP. When multiple CPUs operate on the same data, one CPU may have completed a write while another CPU still reads the old value, which often happens when multiple threads share data without locking.

    rmb();  /* read operations are not reordered across the rmb() */
    wmb();  /* write operations are not reordered across the wmb() */
    mb();   /* neither reads nor writes are reordered across the mb() */
    smp_rmb(); smp_wmb(); smp_mb();  /* SMP-only variants */

Chapter 11 Memory Management

1. Allocation and freeing of whole pages.
2. kmalloc and slab. kmalloc is implemented on top of the slab allocator. The slab allocator mainly solves the memory fragmentation caused by allocations of irregular sizes, and it also speeds up allocation and freeing. kmem_cache_create() creates a slab cache.
3. vmalloc. vmalloc and kmalloc are similar in that both allocate physical memory, but the memory kmalloc allocates is contiguous both physically and virtually, while vmalloc guarantees only virtually contiguous memory; the physical pages may be discontiguous. kmalloc also performs better, because vmalloc needs dedicated page-table entries while kmalloc uses the direct mapping and needs none. (TLB: the cache of virtual-to-physical address translations.) Therefore vmalloc is generally used only when a large block of memory must be allocated in the kernel,
For example, when a module is loaded. vfree(void *) frees memory allocated by vmalloc.

4. Static allocation on the kernel stack: the page size on 32-bit and 64-bit architectures is 4 KB and 8 KB respectively, and a kernel process stack is normally two pages, i.e. 8 KB or 16 KB. Do not place large amounts of data on the kernel stack, or it may overflow.

5. High memory mappings: pages allocated from high memory with alloc_pages() are returned as struct page * pointers. Because high memory has no direct logical-address mapping, page table entries must be created to map it. Permanent mapping: kmap()/kunmap() (may sleep). Temporary mapping (used where sleeping is not allowed): kmap_atomic()/kunmap_atomic() (backed by a set of reserved mapping addresses).

6. Which allocation function to use: normally kmalloc or __get_free_pages; to allocate from high memory, use alloc_pages() plus kmap(); if the allocation is large and need not be physically contiguous, use vmalloc; to repeatedly create and destroy many objects of the same size, use the slab allocator.

Chapter 13 Virtual File System

1. The VFS is the file system abstraction layer.

2. Four abstractions traditionally associated with Unix file systems: files, directory entries, index nodes (inodes, which store file metadata), and mount points. In addition, the file system's control information is kept in the superblock.

=== excerpt ===
Most UNIX file system types share a similar general structure, even if details vary. The central concepts are the superblock, the inode (index node), the data block, and the directory block. The superblock contains overall information about the file system on the disk or partition, such as its size (its exact contents depend on the file system). An inode contains all the information about a file except its name; the name is stored in a directory together with the inode number. A directory entry consists of a file name and the number of the inode representing that file. The inode holds the numbers of the data blocks used to store the file's contents. There is room for only a few data block numbers in the inode itself; if more are needed, space for pointers to data blocks is allocated dynamically. These dynamically allocated blocks are indirect blocks; the name indicates that, in order to locate a data block, one must first find the indirect block's number.
==========
3. The superblock records the file system's basic information; superblock operations are mainly creating, reading, updating, and deleting inodes.

4. inode: an inode stores all the information about a file or directory: access and modification times, reference count, uid, gid, file size, and permissions. An inode can represent a regular file, and also a pipe, a block device, a character device, and so on. inode operations: creating, reading, updating, and deleting files, changing permissions, truncate, mkdir, mknod, rename, etc. Through an inode, the corresponding dentry objects can be found.

5. dentry (directory entry) object: both directories and regular files are directory entry objects. They are mainly used to parse file paths, and the dentry structures maintain the whole directory tree. dentry operations: determining whether a directory object is valid, comparing file names, looking up files.

6. file object: stores the information of an open file: the file path, the file operations (file_operations), the file offset, and the page cache address. File operations: read/write/lseek/ioctl/fsync/open/close/mmap...
