Reading notes on "Linux Kernel Development" (Linux kernel design and implementation)

Source: Internet
Author: User
Tags: mutex, POSIX, semaphore, CPU usage

Chapter 3: Process Management


1. The fork system call returns twice from the kernel: once in the child process and once in the parent.

2. The task_struct structure is allocated by the slab allocator; before 2.6 it was placed at the bottom of the kernel stack. The task_structs of all processes are linked together into a doubly linked list.

3. In the 2.6 kernel, the bottom of the kernel stack holds a thread_info structure, which contains a pointer to the task_struct.

4. The current macro finds the task_struct of the currently running process: on x86 it first locates the thread_info structure from the stack pointer and follows its pointer; PowerPC instead keeps the current task_struct in a dedicated register (r2).

5. The kernel stack is generally 8 KB.

6. Five process states: TASK_RUNNING; TASK_INTERRUPTIBLE; TASK_UNINTERRUPTIBLE (the process cannot be killed, because it may be waiting for critical data, holding a semaphore, etc.); TASK_TRACED (being traced by another process, e.g. a debugger; exact semantics unclear to me); TASK_STOPPED (received SIGSTOP; the process is paused but can be resumed).

7. Execution context is divided into "process context" and "interrupt context". In process context the kernel executes code on behalf of a process (e.g. during a system call): the current macro is valid and points to that process's task_struct, and the page table in use is the user process's own. In interrupt context the kernel does not execute on behalf of any process; it runs an interrupt handler, no process is involved, and so there is no process context at that moment.

8. Note: although a system call enters the kernel through a software interrupt (trap), it executes in process context throughout, on behalf of the calling process; a trap is not the same as a hardware interrupt.
9. Every task_struct has a parent pointer to its parent process and a linked list of all its children; together these form the system's process tree.

10. The kernel's doubly linked lists use the dedicated struct list_head.

11. Process creation is split into two steps: fork and exec. fork() creates the process's structures; through copy-on-write, parent and child share the address space (page tables), and differ only in pid, ppid, certain resources, and statistics (task_struct fields). exec() then loads the program code and runs it. With copy-on-write, the child gets its own copy of a page of the address space only when it writes to it, so the real overhead of fork() is copying the parent's page tables and allocating a task_struct for the child. Call chain: fork() -> clone() -> do_fork() -> copy_process(). copy_process() does the following:
   a) allocates a kernel stack for the child, creates its thread_info, and makes a copy of the parent's task_struct;
   b) changes the thread_info/task_struct fields that distinguish the child from the parent;
   c) sets the child's state to TASK_UNINTERRUPTIBLE;
   d) allocates a free PID for the child (alloc_pid());
   e) copies or shares the parent's open files, signal handlers, process address space, etc.;
   f) sets the child's state to runnable;
   g) returns a pointer to the child.
   The kernel generally wakes the child first: if the parent ran first it would likely write and trigger copy-on-write, whereas the child typically calls exec() right away.
12. vfork(): the parent is blocked after creating the child until the child calls exec() or exits. vfork()'s advantage over copy-on-write fork() is that it does not copy the child's page tables at all.

13. Threads. A thread in the Linux kernel is just a process, except that it shares its address space, signals, and so on with other threads. Creating a process with four threads creates four processes (four kernel stacks and four task_structs), whose task_structs are simply marked as sharing the same address space. Compare the clone() calls behind thread and process creation:
    thread creation:  clone(CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND, 0)
    process creation: clone(SIGCHLD, 0)
    The flags show that a new thread shares the parent's address space (CLONE_VM), open files (CLONE_FILES), filesystem information (CLONE_FS), and signal handlers (CLONE_SIGHAND).

14. Kernel threads. The kernel needs to do some work in the background, hence kernel threads. The main difference from normal processes: a kernel thread has no separate address space (the mm pointer in its task_struct is NULL) and shares the kernel's page tables. Kernel threads run only in the kernel and are never scheduled into user space; otherwise they are like normal processes: they have states, are scheduled, and can be preempted. Examples: flush, ksoftirqd. `ps -ef` shows them with the command name in square brackets. Creation API: kthread_create() creates one; wake_up_process() wakes it; kthread_run() creates and runs it in one step; kthread_stop() stops it.
15. Orphan processes. If the parent exits before the child, the kernel reparents the child to another process in the parent's thread group, or failing that, to the init process.

16. Process exit is accomplished by exit(); the kernel retains the task_struct until the parent calls wait() or waitpid() to reap it.

Chapter 4: Process Scheduling

1. Linux is a preemptive multitasking system.

2. The scheduler selects which process to execute and decides when to suspend one so that others get a chance to run; that suspension is called preemption.

3. The time a process may run before being preempted is called its timeslice; traditionally the timeslice is fixed and preset.

4. A process can voluntarily give up the CPU with yield().

5. Scheduling algorithms: a) the O(1) scheduler is well suited to large server workloads but not to interactive ones; b) CFS, the Completely Fair Scheduler, improves Linux's interactive performance.

6. I/O-bound vs CPU-bound processes: the Linux scheduler tends to favor I/O-bound processes, without ignoring CPU-bound ones.

7. Process priority. Linux uses two ranges to represent priority: a) the nice value, the traditional Unix mechanism; in Linux (under CFS) it determines the process's proportion of CPU time; the higher the nice value, the lower the priority; range -20 to 19. b) Real-time priority, range 0 to 99; higher means higher priority.

8. If timeslices are too long, I/O-bound processes are served poorly; if too short, scheduling overhead consumes more of the CPU's time.
9. The CFS scheduling algorithm gives each process a proportion of the processor, and that proportion is weighted by the nice value. Example: suppose the system has only two processes, a text editor (I/O-bound) and a video encoder (CPU-bound), both initially at the same nice value, so each is entitled to the same share of processor time: 50%. The editor actually consumes far less than its 50%, while the encoder consumes more than 50%. So whenever the editor needs to run, the scheduler sees that it has received far less CPU time than it deserves and lets it preempt immediately; the editor handles the input and goes back to waiting, again consuming little CPU. In this way CFS keeps the system responsive to the editor. The core idea: track each process's actual CPU usage and dynamically adjust scheduling to be fair. CFS also enforces a floor (about 1 ms) on how long a process runs before it can be preempted. Question: compared with a freshly started CPU-bound process, can a long-running I/O-bound process end up slower to be scheduled, and is the accounting reset if a process runs too long?

10. Linux supports multiple scheduler classes; different processes belong to different classes, and schedule() picks the highest-priority runnable process from the highest-priority class. CFS is the scheduler class for normal processes, called SCHED_NORMAL in Linux; there is also a real-time scheduler class.

11. When does Linux run the scheduler?
    a)/b) Linux uses the need_resched flag to indicate that a reschedule is needed; the flag is set in scheduler_tick(), try_to_wake_up(), and elsewhere.
    The need_resched flag is stored in the process's thread_info, because access through current is faster than access to a global variable.
    c) On return to user space, or on return from an interrupt, the kernel also checks need_resched; if it is set, the scheduler is invoked before execution continues.
    d) Preemption points:
       d.1) User preemption: d.1.1) returning to user space from a system call; d.1.2) returning to user space from an interrupt handler.
       d.2) Kernel preemption: d.2.1) returning to kernel space from an interrupt handler; d.2.2) whenever kernel code becomes preemptible again. That means: a task holding locks cannot be preempted; when the locks are released and preempt_count drops to 0, the kernel may be safely preempted, and at that point need_resched is checked. d.2.3) kernel code explicitly calls schedule(); d.2.4) a kernel task blocks.
    Scheduler entry point: the schedule() function, whose role is to select the highest-priority process from the highest-priority scheduler class and dispatch it.

12. Sleep and wakeup. To wait for a condition, a process changes its state to TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE, removes itself from the scheduler's red-black tree, puts itself on the appropriate wait queue, and calls schedule() to run the next process. The canonical sleep pattern:

    DEFINE_WAIT(wait);
    add_wait_queue(q, &wait);
    while (!condition) {
        prepare_to_wait(&q, &wait, TASK_INTERRUPTIBLE);
        if (signal_pending(current))
            /* handle the signal */;
        schedule();
    }
    finish_wait(&q, &wait);   /* remove ourselves from the wait queue */

    Wakeup: wake_up() wakes all processes hanging on the wait queue, changes their state to TASK_RUNNING, and adds them back to the scheduler's red-black tree; if an awakened process has higher priority than the current one, need_resched is set. Note on wakeups: there are spurious wakeups, and a process may also be woken by receiving a signal. This is why the wait is a while loop that rechecks the condition: if the condition is not satisfied, the wakeup was spurious and the process must continue waiting.

13. Preemption and context switching. A context switch, from one process to another, is performed by context_switch(), called from schedule(). It does two main jobs: switch_mm() switches the process's virtual address space; switch_to() switches processor state: saving and restoring stack and register contents and any other architecture-specific state.

14. Real-time scheduling. Two real-time scheduling policies: SCHED_FIFO and SCHED_RR. SCHED_FIFO: first in, first out; a task runs until it releases the CPU or blocks; there is no timeslice. SCHED_RR: like SCHED_FIFO, but with a timeslice; the task is rescheduled after exhausting its pre-allocated timeslice. Question: how do you make a process real-time? (sched_setscheduler())

Chapter 5: System Calls

1. What is a system call, and why introduce system calls?
   A system call is an intermediate layer between user processes and hardware devices. There are three reasons for it: a) it provides user space with a single, uniform, abstract interface for dealing with hardware; b) as a middle layer, it prevents user code from operating hardware incorrectly; c) virtualization: user processes run as independent entities in their own virtual space, and a single interface layer between kernel and process supports that illusion, much like installing several virtual machines on one physical host.

2. API, POSIX, and the C library. POSIX is a standard defining a common set of APIs; the C library implements most of the APIs POSIX specifies. The user-space invocation path is: application -> C library -> system call. Linux system calls are also exposed as part of the C library.

3. The famous maxim of Unix interface design: "provide mechanism, not policy." The system call abstracts the functions needed to accomplish some definite purpose; how those functions are used is the concern of the application and the C library, and need not concern the kernel at all. In fact any good API has this property: the interface only accomplishes a specific task; how that API is used is up to the user. Separating mechanism from policy simplifies development: the mechanism is "what functionality is required", the policy is "how that functionality is used". This lets the same API accommodate different requirements.

4. The system call table. sys_call_table holds the handler function for every system call number.

5. Trapping into the kernel: a) via software interrupt number 128: int $0x80; b) via the sysenter instruction, x86's newer, faster, purpose-built way to enter a system call.
6. Return values and errno. Every system call has a return value, generally a long: zero (or non-negative) means success, negative means failure. Beyond success or failure, the return value can carry the call's result depending on the specific call, e.g. getpid() returns the PID. On error, the C library stores an error number in the errno global variable, which perror() can turn into a description. Question: how can errno work as a global variable on multicore? (On Linux, errno is thread-local.)

7. System call number and parameters. A call must pass the system call number, the parameters, and receive a return value. The number is always passed in eax. When there are five or fewer parameters, registers are used (ebx, ecx, edx, esi, edi); with more than five, a single register holds a pointer to a user-space block containing all the parameters. The return value is passed back in eax.

8. Copying data between user space and kernel space: copy_to_user(dst, src, size) and copy_from_user(dst, src, size). A direct copy would in fact work; these two functions mainly add usage checks on the user-supplied pointers, so user space cannot use a system call to reach kernel-space addresses. Note that copy_to_user() and copy_from_user() can block, which happens when the page has been swapped out to disk; the process then sleeps until the data is paged back in and it resumes, or the scheduler is invoked.

9. System calls must do a great deal of checking, because the input comes from user space and must not be allowed to corrupt kernel-state data; the capable() function performs permission checks.

10. System calls run in process context: they may sleep and can be preempted. Being allowed to sleep means system calls can use most of the kernel's interfaces.
11. Reentrancy. System calls must be implemented to be reentrant, because system calls can be preempted, after which a new process may invoke the same system call.

12. Using a system call without C library support, e.g. open:

    #define NR_open 5
    _syscall3(long, open, const char *, filename, int, flags, int, mode)

    _syscall3 is a macro that loads the registers and issues the trap instruction; it expands into an open() function returning long and taking three parameters, after which the system call can be invoked directly:

    long fd = open(filename, flags, mode);

    (In modern kernels and glibc the _syscallN macros are gone; syscall(2) serves the same purpose.)

13. It is best not to add new system calls, but to use alternatives instead: a) for device nodes, define custom ioctl commands; b) a resource such as a semaphore can be represented as a file descriptor and operated on through it; c) interact with the kernel through the /proc or sysfs filesystems.

Chapter 6: Kernel Data Structures

1. Linked lists, queues, maps, binary trees.

2. Linked list: the classic list_head circular doubly linked list:

    struct list_head {
        struct list_head *next;
        struct list_head *prev;
    };

    The container_of macro (and list_entry, built on it) conveniently recovers the address of the structure containing an embedded list_head. offsetof(type, member) yields the byte offset of member within the type:

    #define offsetof(type, member)  ((size_t)&((type *)0)->member)
    #define container_of(ptr, type, member) ({                        \
            const typeof(((type *)0)->member) *__mptr = (ptr);        \
            (type *)((char *)__mptr - offsetof(type, member)); })
    #define list_entry(ptr, type, member) \
            container_of(ptr, type, member)

    Initialization: INIT_LIST_HEAD(struct list_head *) at runtime; the LIST_HEAD() macro declares and initializes a static list head.
    Operations: list_add(new, head); list_add_tail(new, head); list_del(ptr); list_del_init(ptr) (delete and reinitialize the list_head); list_move(list, head) (delete from one list and add after head); list_move_tail(list, head); list_empty(head) (test whether the list is empty).
    Traversal:

    struct fox {
        int i;
        struct list_head list;
    };
    struct list_head *head = ...;
    struct fox *f;
    struct list_head *p;
    list_for_each(p, head) {                  /* walk the list */
        f = list_entry(p, struct fox, list);
    }

    The macro list_for_each_entry(pos, head, member) achieves the same in one step; list_for_each_entry_reverse(pos, head, member) traverses backwards. To delete while traversing, use list_for_each_entry_safe(pos, next, head, member), which takes one extra struct list_head pointer to record the next node; there is also list_for_each_entry_safe_reverse(pos, next, head, member). Concurrent operation on a list requires a lock.

3. Queue: FIFO, the producer/consumer model. kfifo is the kernel's generic implementation.
    Creating a queue:

    struct kfifo fifo;
    int ret;
    int size = PAGE_SIZE;                 /* size must be a power of 2 */
    ret = kfifo_alloc(&fifo, size, GFP_KERNEL);

    or with caller-provided storage:

    char buffer[PAGE_SIZE];
    kfifo_init(&fifo, buffer, size);

    Enqueue: unsigned int kfifo_in(struct kfifo *fifo, const void *from, unsigned int len); copies len bytes from `from` into the fifo.
    Dequeue: unsigned int kfifo_out(struct kfifo *fifo, void *to, unsigned int len); copies len bytes out into `to`. For other operations, see the book.

4. Maps are similar to std::map, with add, remove, and find operations. Linux implements a specialized map-like structure, idr: a mapping from a unique UID to a pointer.
5. Binary trees: rbtree (the red-black tree; worth studying the theory). Useful in many situations where lookup matters: when searches are frequent, a balanced tree beats a linked list.

Chapter 7: Interrupts and Interrupt Handlers

1. Interrupt context, also called atomic context, cannot block.

2. After the kernel receives an interrupt, the driver writes the device's registers to quiesce the interrupt; the device configuration space generally has an interrupt bit, and with level-triggered interrupts that bit must be set (acknowledged).

3. Top and bottom halves. Take a network card: the top half runs in interrupt context immediately after the interrupt, sets the hardware registers, and quickly copies the data into kernel space.

4. request_irq() registers an interrupt handler. The IRQF_SHARED flag: the interrupt line is shared; the IRQF_DISABLED flag: other interrupts are disabled while this handler runs. free_irq() releases the interrupt.

5. Interrupt context is process-independent and cannot sleep. Interrupt stacks: either each CPU has its own separate interrupt stack, or the handler uses the kernel stack of the interrupted process, which is two pages: 8 KB on a 32-bit machine, 16 KB on a 64-bit machine.

6. /proc/interrupts shows interrupt statistics.
7. Interrupt control. Disabling local interrupts: local_irq_disable() / local_irq_enable(). These two have a defect: they cannot be nested, hence the pair local_irq_save(flags) / local_irq_restore(flags), which save and restore the previous interrupt state (whether interrupts were already disabled or enabled). These functions act as critical-section locks: they serve as the lock when data is shared between softirq context and interrupt context. local_irq_disable() disables all local interrupts.
   Disabling a specific IRQ line: disable_irq(irq), disable_irq_nosync(irq), enable_irq(irq), synchronize_irq(irq). These can be nested, so enable must be called as many times as disable was. A shared interrupt line cannot be selectively disabled per device, so these APIs are mainly used with older devices; PCIe devices are required to support shared interrupt lines.

8. Testing for interrupt context: in_interrupt() returns nonzero while the kernel is executing an interrupt handler or a bottom half; in_irq() returns nonzero only while the kernel is executing an interrupt handler.

9. An interrupt handler runs on only one CPU at a time.

Chapter 8: Bottom Halves

1. What is the bottom half? It is the slightly less urgent work, deferred until after the interrupt handler has dealt with the most time-critical part.

2. The bottom half can be implemented in several ways. The top half is implemented only with interrupt handlers, while the bottom half can be implemented as: softirqs, tasklets, or work queues.

3. Softirqs. The system can register at most 32 softirqs; current kernels use 9 in total. One softirq cannot preempt another softirq; the only thing that can preempt a softirq is an interrupt handler. However, other softirqs, even of the same type, can run concurrently on other processors.
   Typically, the interrupt handler marks its softirq pending before it returns, so that it executes at a later time; this step is called raising the softirq. So when are pending softirqs executed? a) on return from an interrupt; b) in the ksoftirqd kernel thread; c) in code that explicitly checks for and runs pending softirqs, such as the network subsystem. do_softirq() is the function that runs the pending softirqs. Currently only two subsystems use softirqs directly: the network subsystem and the SCSI subsystem; tasklet handlers are also softirq handlers, implemented on top of softirqs. Registering and raising: open_softirq(softirq_no, softirq_handler); raise_softirq(softirq_no). Softirqs cannot sleep and can be preempted by interrupt handlers. If the same softirq is raised again while it is executing, its handler can run concurrently on another CPU, which would mean using locks to protect shared data; but if locking were needed, using softirqs would lose much of its point. So softirqs generally use per-processor data (data belonging to only one processor). For this reason softirqs see little direct use as bottom halves; tasklets are the usual choice.
4. Tasklets. Tasklets are implemented on softirqs; two softirqs serve tasklets: HI_SOFTIRQ and TASKLET_SOFTIRQ. Tasklets have two per-processor data structures: tasklet_vec and tasklet_hi_vec. The guarantee: only one tasklet of a given type runs at any moment, while tasklets of different types may run simultaneously, so with tasklets you need to worry much less about locking.
   Tasklet API:
   Declaration (name identifies the tasklet): DECLARE_TASKLET(name, func, data); DECLARE_TASKLET_DISABLED(name, func, data) declares one that is disabled by default.
   Handler signature: void tasklet_handler(unsigned long data);
   Scheduling your own tasklet: tasklet_schedule(&my_tasklet);
   Enable and disable: tasklet_disable(&my_tasklet) (if the tasklet is executing, waits for it to finish before returning); tasklet_enable(&my_tasklet).
   A tasklet always executes on the CPU that scheduled it, in the hope of better processor-cache use. A tasklet executing on one CPU will not execute on another at the same time: an interrupt handler may activate the tasklet on another CPU while it is running, but if the tasklet is found to be executing, it is not run on the other CPU.

5. Softirq pressure and the ksoftirqd/n kernel threads. A softirq can re-raise itself; if softirqs kept raising softirqs and the kernel checked and ran them immediately every time, softirq processing would consume far too much CPU. The opposite scheme, never running a self-raised softirq immediately but only after the next interrupt handler returns, wastes latency.
   The compromise is ksoftirqd/n, a kernel thread on each CPU: as long as there are unhandled softirqs and the CPU is idle, ksoftirqd is scheduled to execute them.

6. Work queues. A work queue hands the work to a kernel thread to execute, so it always runs in process context; softirqs and tasklets, by contrast, may run in interrupt context or (via ksoftirqd) in process context. Because it always runs in process context, work-queue code can be rescheduled and even sleep. If your bottom half needs to be able to reschedule or sleep, use a work queue.
   Worker threads: events/n. worker_thread() is the core function, which calls run_workqueue(). The relationships among workqueue_struct, cpu_workqueue_struct, and work_struct are diagrammed in the book.
   Using a work queue:
   DECLARE_WORK(name, void (*func)(void *), void *data);                      /* static creation */
   INIT_WORK(struct work_struct *work, void (*func)(void *), void *data);     /* dynamic creation */
   Handler: void work_handler(void *data);
   Scheduling the work: schedule_work(&work); schedule_delayed_work(&work, delay);
   Question: there is an events/n kernel thread on each CPU, so which CPU's events list does schedule_work(&work) put the work on? (It is queued for the CPU that called schedule_work().)

7. Disabling bottom halves: local_bh_disable() / local_bh_enable(). These can be nested, i.e. enable must be called as many times as disable, because they track a reference count in preempt_count.

Chapter 9: Kernel Synchronization
1. Causes of concurrency: a) interrupts; b) softirqs and tasklets; c) kernel preemption: one kernel task preempting another; d) sleeping and synchronization with user space: a kernel task blocks and a user process runs; e) SMP: multiple processors executing in parallel. User-space code generally needs to consider two of these: SMP and preemption. SMP CPUs may execute in truly parallel fashion; preemption requires code to be reentrant. In both cases the danger is two processes (threads) accessing global or shared data at the same time: if they share no data, the code is certainly safe; if data must be shared, it must be locked.

2. Interrupt-safe code, SMP-safe code, preempt-safe (reentrant) code.

3. Any shared data requires locking, so try not to share data. (Personal thought: sharing may be worth it around I/O-heavy code. For example, database access necessarily involves I/O; with one shared database connection, multiple threads must lock around it because of context switching, but a single connection can also serve as a cache. With one connection per thread, caching becomes a problem.)

4. Remember: lock the data, not the code.

6. Deadlock. Its causes: multiple processes and multiple resources; each process holds some of the resources while requesting more; the result is that they wait on each other.

7. Lock contention and granularity: a highly contended lock easily becomes a system bottleneck; finer-grained locks reduce contention.

Chapter 10: Kernel Synchronization Methods

1. atomic_t and atomic operations.
2. Spinlocks:
   DEFINE_SPINLOCK(mr_lock);            /* static initialization */
   spin_lock_init(&mr_lock);            /* dynamic initialization */
   spin_lock(&mr_lock); spin_unlock(&mr_lock);
   Note: when data is shared with interrupt context, use:
   spin_lock_irqsave(&mr_lock, flags); spin_unlock_irqrestore(&mr_lock, flags);
   If it is certain that interrupts were enabled before locking, the simpler API may be used: spin_lock_irq(&mr_lock); spin_unlock_irq(&mr_lock);
   Non-blocking operations: spin_trylock(&mr_lock); spin_is_locked(&mr_lock).
   Spinlocks and bottom halves: in the bottom half, use spin_lock_bh() / spin_unlock_bh(). Note: use these when the bottom half shares data with process context; if the bottom half shares data with interrupt context, spin_lock_irqsave()/spin_unlock_irqrestore() is required.
   Reader-writer spinlocks: DEFINE_RWLOCK(lock); read_lock()/read_unlock(); write_lock()/write_unlock().

3. Semaphores:
   DECLARE_MUTEX(name) (a count-1 semaphore); sema_init(sem, count); init_MUTEX(sem); init_MUTEX_LOCKED(sem).
   Acquiring the semaphore: down_interruptible(): if unavailable, sets the process state to TASK_INTERRUPTIBLE and sleeps; down_trylock(): non-blocking acquisition; down(): when unavailable, enters uninterruptible sleep (TASK_UNINTERRUPTIBLE); up(): releases and wakes a waiter. There are also reader-writer semaphores.

4. Mutexes: a sleeping lock equivalent to a count-1 semaphore:
   DEFINE_MUTEX(mutex); mutex_init(&mutex); mutex_lock(&mutex); mutex_unlock(&mutex).
   Comparisons: semaphore vs mutex: in general use the mutex, unless the mutex cannot meet the need. Semaphore/mutex vs spinlock: interrupt context can only use spinlocks.

5. Completion variables: the idea is similar to a semaphore, just a solution for a simpler problem: init_completion(struct completion *); wait_for_completion(); complete().

6. The Big Kernel Lock: lock_kernel() / unlock_kernel(); use it as little as possible.
7. Sequential locks (seqlocks): used in the specific case of very few writers and many readers.

8. Disabling preemption: preempt_disable(); preempt_enable(); preempt_enable_no_resched(); preempt_count().

9. Memory barriers. Because the compiler and the processor may reorder reads and writes, memory barriers are inserted to guarantee ordering. This matters chiefly on SMP: when multiple CPUs operate on the same data, one CPU may appear to have completed a write while another CPU still reads the old value; it typically shows up when threads share data without locking. rmb(): reads are not reordered across it; wmb(): writes are not reordered across it; mb(): neither reads nor writes are reordered across it; smp_rmb()/smp_wmb()/smp_mb() are the SMP-only variants.

Chapter 12: Memory Management

1. Allocating and freeing whole pages.

2. kmalloc and the slab allocator. kmalloc is implemented on top of the slab allocator. The slab allocator mainly addresses the memory fragmentation caused by allocating odd-sized chunks, and speeds up allocation and freeing. kmem_cache_create() creates a cache.

3. vmalloc. vmalloc and kmalloc are similar: both allocate physical memory. But memory allocated by kmalloc is both physically and virtually contiguous, while memory allocated by vmalloc is only virtually contiguous; its physical pages may be scattered. kmalloc performs better, because vmalloc must build dedicated page-table entries, whereas kmalloc's region is direct-mapped and needs no new page-table work (the TLB caches virtual-to-physical translations). So in the kernel vmalloc is generally used only for large allocations, for example when loading a module. vfree(void *) frees vmalloc-allocated memory.
4. Memory allocation on the kernel stack - with 4KB pages on 32-bit and 8KB pages on (some) 64-bit architectures, and a kernel stack of generally two pages per process, the stack is typically 8KB or 16KB. Do not put large amounts of data on the kernel stack; it overflows easily.
5. High-memory mappings - alloc_pages() can allocate from high memory and returns struct page pointers. Because high memory has no permanent logical-address mapping, a page-table entry has to be created to map it.
Permanent mappings: kmap() / kunmap() (may sleep)
Temporary mappings: kmap_atomic() / kunmap_atomic() (usable where sleeping is forbidden; implemented with a set of reserved mapping slots)
6. Choosing an allocation function:
- in general, use kmalloc() or __get_free_pages()
- to allocate from high memory, use alloc_pages() + kmap()
- for a large allocation that need not be physically contiguous, use vmalloc()
- for frequent allocation and release of same-sized objects, use the slab allocator

Chapter 13: The Virtual Filesystem
1. The VFS is the filesystem abstraction layer.
2. Unix traditionally has four filesystem-related abstractions: files, directory entries, inodes (which store file metadata), and mount points; in addition, filesystem control information is kept in the superblock.
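The Chapter 12 allocation rules above (kmalloc vs. vmalloc, and kmap for high memory) can be sketched in one kernel-module-style function. This is illustrative only (it builds only inside a kernel tree), and the sizes and the name alloc_examples are made up:

```c
#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/highmem.h>
#include <linux/gfp.h>

static int alloc_examples(void)
{
        void *small, *big, *vaddr;
        struct page *page;

        /* Small, physically contiguous buffer: kmalloc(). */
        small = kmalloc(512, GFP_KERNEL);
        if (!small)
                return -ENOMEM;

        /* Large buffer that only needs to be virtually contiguous:
         * vmalloc() - slower (new page-table entries, TLB pressure). */
        big = vmalloc(4 * 1024 * 1024);

        /* A page possibly from high memory has no permanent logical
         * address, so kmap() it before touching it. */
        page = alloc_pages(GFP_HIGHUSER, 0);
        if (page) {
                vaddr = kmap(page);       /* may sleep */
                /* ... access the page through vaddr ... */
                kunmap(page);
                __free_pages(page, 0);
        }

        vfree(big);                       /* vfree(NULL) is a no-op */
        kfree(small);
        return 0;
}
```

Each allocator is paired with its own release function: kfree() for kmalloc(), vfree() for vmalloc(), __free_pages() for alloc_pages().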
3. Superblock - records the basic information of a filesystem; superblock operations mainly create, read, update and delete inodes.
4. Inode - stores all metadata about a file or directory: timestamps, reference count, uid/gid, file size, permissions. An inode represents a file, which can be a regular file, a pipe, a block device, a character device, and so on. Inode operations: create/look up/delete files, change permissions, truncate, mkdir, mknod, rename, etc. Through the inode the corresponding dentry objects can be found.
5. Dentry (directory entry) object - directories and regular files alike have directory entry objects; dentries are mainly used when resolving file paths, and the dentry structures link together to form the file tree. Dentry operations: decide whether a dentry is still valid, compare file names, look up files, etc.
6. File object - holds the information about one open file: the file path, the file operations table (file_operations), the file offset, the address of its page cache. File operations: read/write/llseek/ioctl/fsync/open/release (close)/mmap ...
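To make the file object (item 6) concrete, this is roughly how a driver fills in a file_operations table. A sketch only, not buildable outside a kernel tree; the demo_* names are hypothetical:

```c
#include <linux/fs.h>
#include <linux/module.h>

static ssize_t demo_read(struct file *filp, char __user *buf,
                         size_t count, loff_t *ppos)
{
        /* *ppos is the per-open-file offset kept in the file object */
        return 0;                 /* pretend EOF */
}

static int demo_open(struct inode *inode, struct file *filp)
{
        /* inode = the file's metadata; filp = this open instance */
        return 0;
}

static int demo_release(struct inode *inode, struct file *filp)
{
        return 0;                 /* "close" appears as .release here */
}

static const struct file_operations demo_fops = {
        .owner   = THIS_MODULE,
        .read    = demo_read,
        .open    = demo_open,
        .release = demo_release,
        .llseek  = default_llseek, /* standard offset handling */
};
```

The same inode can be opened many times; each open gets its own file object (and offset) while sharing the one inode, which is exactly the inode/file split described above.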
