Implementation Mechanism of Linux Pipes
In Linux, the pipe is a very frequently used communication mechanism. In essence, a pipe is also a file, but it differs from an ordinary file. A pipe overcomes two problems that arise when files are used for communication:
· Limiting the size. A pipe is in fact a fixed-size buffer. In the kernel described here, this buffer is one page, that is, 4 KB (later kernels use a larger default), so its size cannot grow unchecked the way a file's can. Using a single fixed buffer brings its own problem: the pipe may become full while it is being written. When this happens, subsequent write() calls on the pipe block by default and wait until enough data has been read to free space for the write.
· The reading process may work faster than the writing process. Once all the data currently in the pipe has been read, the pipe is empty. When this happens, a subsequent read() call blocks by default and waits for data to be written, which avoids the problem of read() returning end-of-file prematurely.
Note: Reading from a pipe is a one-time operation. Once data has been read, it is discarded from the pipe, freeing space for more data to be written.
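The fixed-size buffer and the consume-on-read behavior can be observed from user space. The following is a minimal sketch in Python, whose os module wraps the underlying system calls. The helper names fill_pipe, drain and demo are hypothetical, and the write end is switched to non-blocking mode only so that a full pipe raises an error instead of putting the demonstration to sleep. The measured capacity depends on the kernel version, so no particular value is assumed.

```python
import os

CHUNK = b"x" * 1024  # small enough that each write is all-or-nothing


def fill_pipe(w):
    """Write to fd w until the pipe is full; return the bytes accepted."""
    os.set_blocking(w, False)  # a full pipe now raises instead of blocking
    total = 0
    try:
        while True:
            total += os.write(w, CHUNK)
    except BlockingIOError:
        # The buffer is full; a blocking write() would sleep at this point.
        pass
    return total


def drain(r, n):
    """Read exactly n bytes that are known to be waiting in the pipe."""
    got = 0
    while got < n:
        got += len(os.read(r, n - got))
    return got


def demo():
    r, w = os.pipe()
    try:
        capacity = fill_pipe(w)      # how much the empty pipe accepted
        drained = drain(r, capacity) # reading discards the data ...
        refilled = fill_pipe(w)      # ... which frees space for new writes
        return capacity, drained, refilled
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    capacity, drained, refilled = demo()
    print("capacity:", capacity, "drained:", drained, "refilled:", refilled)
```

Without the non-blocking flag, the write that finds the pipe full would simply block until a reader removed some data, which is the default behavior described above.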
1. Pipe Structure
In Linux, pipes are not implemented with a dedicated data structure. Instead, they reuse the file structure of the file system and the VFS inode (index node). Two file structures point to the same temporary VFS inode, and that inode in turn points to a physical page, as shown in Figure 7.1.
Figure 7.1 Pipe structure
Figure 7.1 contains two file data structures, but they hold the addresses of different file operation routines: one is the routine for writing data to the pipe, the other is the routine for reading data from it. As a result, the user program's system calls remain ordinary file operations, while the kernel uses this abstraction mechanism to implement the special behavior of pipes.
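The effect of the two file structures can be glimpsed from user space. The sketch below is illustrative only: inspect_pipe is a hypothetical helper, and the comparison of inode numbers assumes a Linux system, where both descriptors are expected to refer to the same temporary inode. It checks that both ends are reported as FIFOs and records what happens when the read end is asked to perform a write.

```python
import errno
import os
import stat


def inspect_pipe():
    """Collect a few facts about the two descriptors returned by pipe()."""
    r, w = os.pipe()
    try:
        st_r, st_w = os.fstat(r), os.fstat(w)
        info = {
            # Both descriptors describe a pipe (FIFO) object.
            "both_fifo": stat.S_ISFIFO(st_r.st_mode) and stat.S_ISFIFO(st_w.st_mode),
            # Assumption: on Linux the two ends share one inode.
            "same_inode": st_r.st_ino == st_w.st_ino,
        }
        try:
            os.write(r, b"x")  # the read end does not accept writes
            info["write_on_read_end"] = "ok"
        except OSError as exc:
            info["write_on_read_end"] = errno.errorcode.get(exc.errno, str(exc.errno))
        return info
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    for key, value in inspect_pipe().items():
        print(key, "=", value)
```

The program uses only ordinary file calls (fstat, write); the different behavior of the two ends comes from the kernel side, as the text describes.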
2. Reading and Writing a Pipe
The source code of the pipe implementation is in fs/pipe.c. Of the many functions in pipe.c, two are the most important: the pipe read function pipe_read() and the pipe write function pipe_write(). The write function writes data by copying bytes into the physical memory pointed to by the VFS inode, and the read function reads data by copying bytes out of that memory. The kernel must of course synchronize access to the pipe, and for this it uses locks, wait queues, and signals.
When a writing process writes to a pipe, it calls the standard library function write(). Using the file descriptor passed by the library function, the system finds the file structure for the file. The file structure specifies the address of the function that performs write operations (the write function), so the kernel calls that function to carry out the write. Before writing data into memory, the write function must check the information in the VFS inode, and it performs the actual memory copy only when both of the following conditions hold:
· The memory has enough space for all of the data to be written;
· The memory is not locked by a reading process.
If both conditions hold, the write function first locks the memory and then copies the data from the writing process's address space into it. Otherwise, the writing process sleeps on the wait queue of the VFS inode; the kernel then calls the scheduler, which selects another process to run. The writing process is now in an interruptible wait state. When the memory has enough space for the data, or when the memory is unlocked, the reading process wakes the writing process, which then receives the signal. Once the data has been written, the memory is unlocked and all reading processes sleeping on the inode are woken.
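The check, lock, copy and wake-up sequence described above can be imitated in user space. The following is a toy model and not kernel code: the ToyPipe class and the transfer helper are hypothetical, a threading.Condition stands in for both the lock and the wait queue, and the 4096-byte capacity mirrors the one-page buffer described in the text.

```python
import threading


class ToyPipe:
    """User-space analogy of the one-page pipe buffer (not kernel code)."""

    def __init__(self, capacity=4096):
        self.capacity = capacity
        self.buf = bytearray()
        # One Condition plays the role of both the lock and the wait queue.
        self.cond = threading.Condition()

    def write(self, data):
        if len(data) > self.capacity:
            raise ValueError("toy model only accepts writes that fit the buffer")
        with self.cond:                      # lock the buffer
            # Sleep until there is room for all of the data.
            while len(self.buf) + len(data) > self.capacity:
                self.cond.wait()
            self.buf += data                 # copy the writer's bytes in
            self.cond.notify_all()           # wake any sleeping readers
        return len(data)

    def read(self, n):
        with self.cond:
            while not self.buf:              # empty: sleep until data arrives
                self.cond.wait()
            out = bytes(self.buf[:n])
            del self.buf[:n]                 # reading consumes the data
            self.cond.notify_all()           # wake any sleeping writers
        return out


def transfer(total=64 * 1024, chunk=1024):
    """Push total bytes through a ToyPipe (total must be a multiple of chunk)."""
    pipe = ToyPipe()

    def writer():
        for _ in range(total // chunk):
            pipe.write(b"x" * chunk)

    t = threading.Thread(target=writer)
    t.start()
    received = 0
    while received < total:
        received += len(pipe.read(4096))
    t.join()
    return received


if __name__ == "__main__":
    print("bytes received:", transfer())
```

Because the total is much larger than the buffer, the writer thread must repeatedly sleep until the reader has freed space, which is the same interplay the kernel arranges between the writing and reading processes.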
Reading from a pipe works much like writing. However, depending on the mode in which the file or pipe was opened, a process may return an error immediately when there is no data or the memory is locked, instead of blocking. Otherwise, the process sleeps on the inode's wait queue and waits for a writing process to write data. When all processes have finished using the pipe, the pipe's inode is discarded and the shared data page is released.
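The difference between blocking and returning an error immediately can be seen with the non-blocking flag. This is a small sketch under stated assumptions: read_modes is a hypothetical helper, and os.set_blocking is used to set O_NONBLOCK on the read end. It also shows that once the write end has been closed and the pipe is empty, read() reports end-of-file rather than an error.

```python
import os


def read_modes():
    """Show non-blocking read results on an empty, filled and closed pipe."""
    r, w = os.pipe()
    results = {}
    try:
        os.set_blocking(r, False)  # sets O_NONBLOCK on the read end
        try:
            os.read(r, 16)
            results["empty"] = "data"
        except BlockingIOError:
            # No data yet: the call fails at once instead of sleeping.
            results["empty"] = "EAGAIN"

        os.write(w, b"hello")
        results["after_write"] = os.read(r, 16)

        os.close(w)  # no writer is left
        w = -1
        # An empty pipe with no writers yields end-of-file (empty bytes).
        results["after_close"] = os.read(r, 16)
        return results
    finally:
        os.close(r)
        if w != -1:
            os.close(w)


if __name__ == "__main__":
    print(read_modes())
```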
Because the pipe implementation involves many file operations, readers who first study the file system material and then read the code in pipe.c will find it not difficult to understand.
From: http://oss.org.cn/kernel-book/ch07/7.1.1.htm