Note: Most of the articles in this category draw on the book "Deep analysis of the Linux kernel source code", along with other references such as "Complete analysis of the Linux kernel" and "Linux C programming One-stop learning". They are meant to clarify some conceptual issues in systems programming and network programming rather than to analyze the source code itself; I only skimmed the book, so interested readers should consult the original references. The book was published some time ago and analyzes kernel version 2.4.16, so some of the concepts described here may differ from the latest kernels.
The book has been made freely available online at http://www.kerneltravel.net.
I. Pipes

In Linux the pipe is a very frequently used communication mechanism. In essence a pipe is also a file, but it differs from ordinary files: a pipe overcomes two problems of using plain files for communication, as described below.

A pipe limits its own size. A pipe is in fact a fixed-size buffer; in Linux this buffer is one page, i.e. 4KB, so its size cannot grow unchecked the way a file's can. Using a single fixed-size buffer also brings problems: for example, the pipe may fill up while being written. When this happens, a write() call on the pipe blocks by default, waiting until some data has been read and enough space has been freed for the write() call to complete.

The reading process may also work faster than the writing process. When all of the currently available data has been read, the pipe becomes empty. When this happens, a subsequent read() call blocks by default, waiting for data to be written, which resolves the problem of the read() call returning end-of-file prematurely. Note that reading data from a pipe is a one-time operation: once data has been read, it is discarded from the pipe, freeing space for more data to be written.

(i) Pipe structure

In Linux the implementation of pipes does not use a dedicated data structure; instead it reuses the file structure of the filesystem and the VFS index node (inode). Two file structures are made to point to the same temporary VFS inode, which in turn points to a physical page, as shown in Figure 7.1. Figure 7.1 shows two file data structures that define different file-operation routine addresses: one holds the address of the routines that write data into the pipe, the other the address of the routines that read data from the pipe. In this way the user program's system calls remain ordinary file operations, while the kernel uses this abstraction mechanism to implement the pipe's special behaviour.

An ordinary pipe can only be shared between two processes with a common ancestor, and that ancestor must have created the pipe before they can use it. Note that data in a pipe is always read in the same order it was written, which also means the lseek() system call has no effect on a pipe.
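To make the blocking read/write behaviour concrete, here is a minimal user-level sketch (our own illustration, not from the book) of a parent and child communicating through a pipe; the message text is arbitrary:

C code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int fd[2];                      /* fd[0]: read end, fd[1]: write end */
    char buf[64];

    if (pipe(fd) == -1) {
        perror("pipe");
        exit(EXIT_FAILURE);
    }

    if (fork() == 0) {              /* child: reads from the pipe */
        close(fd[1]);               /* close the unused write end */
        ssize_t n = read(fd[0], buf, sizeof(buf) - 1); /* blocks while the pipe is empty */
        if (n > 0) {
            buf[n] = '\0';
            printf("child read: %s\n", buf);
        }
        close(fd[0]);
        _exit(0);
    }

    close(fd[0]);                   /* parent: writes to the pipe */
    const char *msg = "hello through the pipe";
    write(fd[1], msg, strlen(msg)); /* would block if the 4KB buffer were full */
    close(fd[1]);
    wait(NULL);
    return 0;
}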
II. Signals

(i) Representation of signals in the kernel

Interrupt response and handling both take place in kernel space. For signals, the response also happens in kernel space, but the execution of the signal handler happens in user space. So when is a signal detected and responded to? This usually happens in two cases: (1) when the current process enters kernel space because of a system call, interrupt, or exception, and is about to return from kernel space to user space; (2) when the current process is woken up after going to sleep in the kernel and returns to user space early because a pending signal was detected. When there is a signal to respond to, the processor follows the route shown in Figure 33.2.

When a user process enters the kernel through a system call, the CPU automatically pushes the contents shown onto the process's kernel stack:
After the system call has been processed, the do_signal() function is called to set up the signal frame. At this point the kernel stack is in the state shown in the left half of the figure (the system call has pushed some information onto the stack). Once a signal-handling function has been found, do_signal() first saves the EIP on the kernel stack, which holds the point where execution should resume, as OLD_EIP, and then replaces that EIP with the address of the signal handler. Next it takes the "original ESP" saved on the kernel stack (i.e. the user-mode stack pointer) and subtracts a certain value from it, in order to grow the user-mode stack, and then copies the contents of the kernel stack onto the user stack. This process is what sets up the frame. Two points are worth noting:
1. The EIP value is set to the address of the signal handler because, once the process returns to user mode, it must execute the signal handler; so EIP has to point at the signal handler rather than at the address that would originally have been executed.
2. The frame is copied from the kernel stack to the user stack because returning from kernel mode to user mode cleans up the kernel stack used by the call (much like returning from a function call). The kernel stack is too small to simply keep another frame on it (imagine nested signal handling), and we need the EAX (the system call's return value) and EIP so that the program can continue executing after the signal handler finishes; therefore they are copied to the user-mode stack for safekeeping.
When the process returns to user space, the signal handler is executed according to the EIP value on the kernel stack. So how does control return to the interrupted program after the handler finishes? After the signal handler has executed, the process calls the sigreturn() system call to re-enter the kernel and check whether there are other signals to handle. If not, the kernel does some cleanup work: it restores the previously saved frame to the kernel stack and restores EIP to OLD_EIP. It then returns to user space, and the program continues executing. At this point, the kernel has completed one (or several) passes of signal handling.
(By default, the signal handler is invoked on the normal process stack. It is possible to arrange for the signal handler to use an alternate stack; see sigaltstack(2) for a discussion of how to do this and when it might be useful.)
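From user space, all of this machinery is driven simply by installing a handler. Below is a minimal sketch (our own illustration, not from the book); the handler name and the choice of SIGINT are arbitrary:

C code:

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void handler(int sig)
{
    /* Runs on the user stack, via the frame that do_signal() set up;
       returning from it leads back into the kernel via sigreturn(). */
    (void)sig;
    write(STDOUT_FILENO, "caught SIGINT\n", 14);
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);  /* block no extra signals while the handler runs */
    sa.sa_flags = 0;           /* SA_ONSTACK would select a sigaltstack(2) stack */

    if (sigaction(SIGINT, &sa, NULL) == -1) {
        perror("sigaction");
        return 1;
    }
    for (;;)
        pause();               /* sleep until a signal arrives (e.g. Ctrl-C) */
}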
III. System V IPC mechanisms

To provide compatibility with other systems, Linux also supports the three System V inter-process communication mechanisms: messages, semaphores, and shared memory, and it implements them in similar ways. We refer to semaphores, messages, and shared memory collectively as System V IPC objects; each kind of object has the same type of interface: system calls. Just as every open file has a file descriptor, every object has a unique identifier; a process accesses an object by passing its identifier through a system call. As with access to files, access to these objects is checked against access permissions, which the creator of an object can set through system calls. In the Linux kernel, all System V IPC objects share a common data structure, the ipc_perm structure, which is the permission description of an IPC object. It is defined in linux/ipc.h as follows:

C code:
struct ipc_perm {
    key_t  key;   /* key */
    ushort uid;   /* effective user ID and effective group ID of the object's owner */
    ushort gid;
    ushort cuid;  /* effective user ID and effective group ID of the object's creator */
    ushort cgid;
    ushort mode;  /* access modes */
    ushort seq;   /* sequence number */
};
In this structure, the key deserves further explanation. Keys and identifiers refer to different things. The system supports two types of keys: public and private. If a key is public, then any process in the system can, after a permission check, find the identifier of the System V IPC object it names. If a key is private, its value is 0; each process can use the key value 0 to create an object reserved for its own private use. Note that System V IPC objects are always referenced by identifier, not by key; a short sketch of the two kinds of key is shown below.
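As an illustration (our own hedged sketch, not from the book), a process can derive a shared, public key with ftok(3), or request a private object with IPC_PRIVATE; the path and project ID below are arbitrary:

C code:

#include <stdio.h>
#include <sys/ipc.h>
#include <sys/sem.h>

int main(void)
{
    /* Public key: any process deriving the same key (same path and
       project ID) obtains the identifier of the same object, subject
       to the ipc_perm permission check. */
    key_t key = ftok("/tmp", 'S');                 /* arbitrary path + project ID */
    int pub_id = semget(key, 1, IPC_CREAT | 0666);

    /* Private key (value 0): IPC_PRIVATE always creates a fresh object
       for this process's own use; it can only be shared by handing the
       returned identifier to another process (e.g. a forked child). */
    int priv_id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    printf("public id: %d, private id: %d\n", pub_id, priv_id);
    return 0;
}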
(a) Semaphores

Semaphores in Linux are implemented through a series of data structures provided by the kernel. These structures live in kernel space, and analyzing them is the basis for fully understanding semaphores and for using semaphores to implement inter-process communication. The semaphore data structures (found in include/linux/sem.h) are given below.

C code:
(1) Data structure for each semaphore in the system (sem):

struct sem {
    int semval;              /* current value of the semaphore */
    unsigned short semzcnt;  /* # of processes waiting for the value to become zero */
    unsigned short semncnt;  /* # of processes waiting for the value to increase */
    int sempid;              /* PID of the last process to operate on the semaphore */
};

(2) Data structure representing a semaphore set in the system (semid_ds):

struct semid_ds {
    struct ipc_perm sem_perm;            /* IPC permissions */
    long sem_otime;                      /* time of the last semaphore operation (semop) */
    long sem_ctime;                      /* time of the last change to this structure */
    struct sem *sem_base;                /* pointer to the first semaphore in the array */
    struct sem_queue *sem_pending;       /* pending operations to be processed */
    struct sem_queue **sem_pending_last; /* last pending operation */
    struct sem_undo *undo;               /* undo requests on this array */
    ushort sem_nsems;                    /* number of semaphores in the set */
};

(3) Queue structure for each semaphore set in the system (sem_queue):

struct sem_queue {
    struct sem_queue *next;     /* next node in the queue */
    struct sem_queue **prev;    /* previous node in the queue, *(q->prev) == q */
    struct wait_queue *sleeper; /* the sleeping process */
    struct sem_undo *undo;      /* undo structure */
    int pid;                    /* PID of the requesting process */
    int status;                 /* completion status of the operation */
    struct semid_ds *sma;       /* the semaphore array being operated on */
    struct sembuf *sops;        /* array of pending operations */
    int nsops;                  /* number of operations */
};
Each operation passed to the semop() system call is described by a sembuf structure:

C code:

struct sembuf {
    ushort sem_num;  /* index of the semaphore within the set */
    short sem_op;    /* semaphore operation value (positive, negative, or 0) */
    short sem_flg;   /* operation flags: IPC_NOWAIT or SEM_UNDO */
};
If a process must be suspended, Linux saves the state of the semaphore operation and puts the current process on a wait queue. To do this, the kernel builds a sem_queue structure on the stack and fills it in. The new sem_queue structure is appended to the set's wait queue (using the sem_pending and sem_pending_last pointers), the current process is put on the sem_queue structure's wait queue (sleeper), and the scheduler is called to pick another process to run.

If a process modifies a semaphore and enters a critical section, but never exits the critical section because it crashes or is killed, the other processes suspended on the semaphore will never get a chance to run; this is known as deadlock. Linux avoids this problem by maintaining a list of adjustments (semadj) for a semaphore array. The basic idea is that, when these adjustments are applied, the semaphore's state is returned to what it was before the operations were carried out. A short usage sketch of these calls follows.
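As a user-level sketch (ours, not the book's) of these mechanisms: semget() creates a set, semop() performs the possibly blocking operations described above, and the SEM_UNDO flag asks the kernel to record the semadj adjustment so the operation is rolled back if the process dies:

C code:

#include <stdio.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux, the calling program must define this union for semctl(). */
union semun { int val; struct semid_ds *buf; unsigned short *array; };

int main(void)
{
    /* Create a private set containing one semaphore. */
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    /* Initialize semaphore 0 to 1 (a binary semaphore). */
    union semun arg = { .val = 1 };
    semctl(semid, 0, SETVAL, arg);

    /* P operation: subtract 1; blocks (queued via sem_pending) if the
       value would go negative. SEM_UNDO records an adjustment so the
       kernel releases the semaphore if we die while holding it. */
    struct sembuf p = { .sem_num = 0, .sem_op = -1, .sem_flg = SEM_UNDO };
    semop(semid, &p, 1);

    printf("in critical section\n");

    /* V operation: add 1, waking any process sleeping on the queue. */
    struct sembuf v = { .sem_num = 0, .sem_op = +1, .sem_flg = SEM_UNDO };
    semop(semid, &v, 1);

    semctl(semid, 0, IPC_RMID);  /* remove the set */
    return 0;
}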
(b) Message queues

A Linux message queue can be described as an internal linked list in the kernel's address space. Each message queue is uniquely identified by an IPC identifier. Linux maintains a msgque list of all message queues in the system; each pointer in the list points to a msgid_ds structure that completely describes one message queue.

C code:
(1) Message buffer (msgbuf):

/* message buffer used by the msgsnd and msgrcv system calls */
struct msgbuf {
    long mtype;     /* type of the message; must be positive */
    char mtext[1];  /* message body */
};

(2) Message structure (msg):

struct msg {
    struct msg *msg_next;  /* next message on the queue */
    long msg_type;         /* message type */
    char *msg_spot;        /* address of the message body */
    short msg_ts;          /* size of the message body */
};

(3) Message queue structure (msgid_ds):

/* one msgid_ds structure corresponds to each message queue in the system */
struct msgid_ds {
    struct ipc_perm msg_perm;
    struct msg *msg_first;  /* first message on the queue, i.e. the list head */
    struct msg *msg_last;   /* last message on the queue, i.e. the list tail */
    time_t msg_stime;       /* time of the last message sent to the queue */
    time_t msg_rtime;       /* time of the last message received from the queue */
    time_t msg_ctime;       /* time of the last change to the queue */
    ushort msg_cbytes;      /* total bytes of all messages on the queue */
    ushort msg_qnum;        /* number of messages currently on the queue */
    ushort msg_qbytes;      /* maximum number of bytes on the queue */
    ushort msg_lspid;       /* PID of the process that sent the last message */
    ushort msg_lrpid;       /* PID of the process that received the last message */
};
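A minimal user-level sketch (our own illustration) of sending and receiving through such a queue; the message type and text are arbitrary:

C code:

#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct my_msg {            /* user-defined msgbuf: long mtype + payload */
    long mtype;
    char mtext[64];
};

int main(void)
{
    int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);

    struct my_msg out = { .mtype = 1 };
    strcpy(out.mtext, "hello queue");
    msgsnd(qid, &out, strlen(out.mtext) + 1, 0);  /* size excludes mtype */

    struct my_msg in;
    msgrcv(qid, &in, sizeof(in.mtext), 1, 0);     /* receive a type-1 message */
    printf("received: %s\n", in.mtext);

    msgctl(qid, IPC_RMID, NULL);                  /* remove the queue */
    return 0;
}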
(c) Shared memory

As with message queues and semaphore sets, the kernel maintains a special data structure, shmid_ds, for each shared memory segment in its address space. It is defined in include/linux/shm.h as follows:

C code:
/* There is one shmid_ds data structure for each shared memory segment in the system. */
struct shmid_ds {
    struct ipc_perm shm_perm;  /* operation permissions */
    int shm_segsz;             /* size of the segment (in bytes) */
    time_t shm_atime;          /* time the last process attached the segment */
    time_t shm_dtime;          /* time the last process detached the segment */
    time_t shm_ctime;          /* time of the last change to this structure */
    unsigned short shm_cpid;   /* PID of the process that created the segment */
    unsigned short shm_lpid;   /* PID of the last process to operate on the segment */
    short shm_nattch;          /* number of processes currently attached to the segment */
    /* the following are private */
    unsigned short shm_npages; /* size of the segment (in pages) */
    unsigned long *shm_pages;  /* array of pointers to frames -> SHMMAX */
    struct vm_area_struct *attaches; /* descriptors of the shared segment's attachments */
};
Figure 7.4 shows the relationships between the shmid_ds data structure and the other data structures involved in shared memory.

A page fault is generated the first time a process accesses the shared virtual memory. Linux then finds the vm_area_struct structure that describes that memory, which contains the address of the handler for this kind of shared virtual memory. The shared memory page-fault handling code searches the shmid_ds page-table entries to see whether an entry exists for this page of shared virtual memory. If not, the system allocates a physical page and creates a page-table entry for it; the entry is added to the shmid_ds structure and also to the process's page tables. This means that when the next process tries to access this memory it will also take a page fault, and the shared memory fault-handling code will give that process the already-created physical page. Thus, the first process's access to a page of shared memory causes a new physical page to be created, while the other processes' accesses cause that page to be added to their address spaces.

When a process no longer wants to share the virtual memory, it uses a system call to remove the shared segment from its own virtual address area and updates its page tables. When the last process releases the shared segment, the system frees the physical pages assigned to it. While shared virtual memory is not locked into physical memory, it may also be swapped out to the swap area.
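A user-level sketch (ours, not from the book) of the attach/detach lifecycle this describes; the segment size and message are arbitrary:

C code:

#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    /* Create a one-page segment (the kernel builds its shmid_ds). */
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);

    if (fork() == 0) {                   /* child */
        char *p = shmat(shmid, NULL, 0); /* attach: adds a vm_area_struct */
        strcpy(p, "written by child");   /* first touch: page fault allocates the frame */
        shmdt(p);                        /* detach */
        _exit(0);
    }

    wait(NULL);
    char *p = shmat(shmid, NULL, 0);     /* parent attaches the same segment */
    printf("parent reads: %s\n", p);     /* fault maps the existing frame */
    shmdt(p);
    shmctl(shmid, IPC_RMID, NULL);       /* segment freed after the last detach */
    return 0;
}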
IV. POSIX IPC mechanisms

Semaphores: divided into named and unnamed (anonymous) semaphores. Named semaphores are typically used between processes that do not share memory (they are implemented in the kernel); unnamed semaphores can be used for communication between threads (stored in memory the threads share, such as a global variable) or between processes (stored in memory the processes share, such as a System V or POSIX shared memory segment).

Message queues, shared memory: similar to System V.

Mutex + anonymous semaphore: thread communication.
Mutex + condition variable: thread communication.
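As a sketch (our own) of an unnamed POSIX semaphore used between threads, stored in a global that both threads share:

C code:

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static sem_t sem;        /* unnamed semaphore in memory both threads share */
static int shared_data;

static void *producer(void *arg)
{
    (void)arg;
    shared_data = 42;    /* produce */
    sem_post(&sem);      /* signal the consumer */
    return NULL;
}

int main(void)
{
    sem_init(&sem, 0, 0);  /* pshared = 0: shared between threads of this process */

    pthread_t t;
    pthread_create(&t, NULL, producer, NULL);

    sem_wait(&sem);        /* block until the producer posts */
    printf("consumed: %d\n", shared_data);

    pthread_join(t, NULL);
    sem_destroy(&sem);
    return 0;
}

(Compile with -pthread. For inter-process use, the semaphore would instead live in a shared memory segment with pshared = 1, or be created by name with sem_open().)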