Pipes and named pipes (FIFOs) in Linux: anonymous pipes and named pipes


The pipe is an important communication mechanism in Linux: it connects the output of one program directly to the input of another. When people speak of pipes they usually mean anonymous (unnamed) pipes. An anonymous pipe can only be used between related processes, and this is its biggest difference from a named pipe.

A named pipe is also called a FIFO (first in, first out). It can be created with the mkfifo() function.
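As a minimal sketch (the path /tmp/myfifo and the 0666 permission bits are only illustrative choices, not taken from the original text), creating a FIFO from a program could look like this:

#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>      /* mkfifo() */

int main(void)
{
    /* create a named pipe node in the file system; /tmp/myfifo is an example path */
    if (mkfifo("/tmp/myfifo", 0666) < 0 && errno != EEXIST) {
        perror("mkfifo");
        return 1;
    }
    printf("FIFO is ready\n");
    return 0;
}

The same node can also be created from the shell with the mkfifo (or mknod) command.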


FIFO

FIFO is another name for the named pipe. An ordinary pipe, as is easy to see, can only be used for communication within one family of processes: between parent and child, or between siblings. If you want to use a pipe in a wider setting, an ordinary pipe will not do, because it has no "name" (it is anonymous) and other processes cannot see it. This is where the named pipe comes in. It is also built on the VFS, and its corresponding file type is the FIFO file. You can create a FIFO node on disk with the mknod (or mkfifo) command. Note that this is the essential difference from the ordinary pipe: a pipe exists entirely in memory and leaves no trace on disk. When a process wants to communicate through a FIFO, it opens the file with the standard API and then performs read/write operations. The read/write implementation of a FIFO is the same as that of a pipe; the difference is that a FIFO has an open operation, whereas pipe() creates the file descriptors for communication directly in the system call. In addition, the open operation on a FIFO needs careful handling: if the writer opens the file first and there is no reader yet, communication cannot proceed, so the writer must sleep and wait for a reader to open the FIFO, and vice versa.
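The following is only a rough sketch of that open-then-read/write pattern, not code from the original article; the FIFO path is taken from the command line, the message text is made up, and error handling is kept minimal. By default, open() on a FIFO blocks until the other end has also been opened, which is exactly the "sleep and wait for the peer" behaviour described above:

/* fifo_demo.c - pass the FIFO path as argv[1] and "r" or "w" as argv[2] */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    char buf[128];
    ssize_t n;

    if (argc < 3) {
        fprintf(stderr, "usage: %s <fifo-path> r|w\n", argv[0]);
        return 1;
    }
    mkfifo(argv[1], 0666);                 /* harmless if the FIFO already exists */

    if (argv[2][0] == 'w') {
        int fd = open(argv[1], O_WRONLY);  /* blocks until a reader opens the FIFO */
        write(fd, "hello from the writer\n", 22);
        close(fd);
    } else {
        int fd = open(argv[1], O_RDONLY);  /* blocks until a writer opens the FIFO */
        n = read(fd, buf, sizeof(buf));
        if (n > 0)
            write(STDOUT_FILENO, buf, n);
        close(fd);
    }
    return 0;
}

Starting the reader in one terminal and the writer in another shows that whichever side opens the FIFO first simply waits for its peer.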


Implementation Mechanism of Linux Pipelines

In Linux, the pipe is a frequently used communication mechanism. In essence, a pipe is also a file, but it differs from an ordinary file in that it overcomes the following two problems of using ordinary files for communication:

· The pipe size is limited. A pipe is in fact a fixed-size buffer; in Linux the buffer is one page, i.e. 4 KB, so its size does not grow unchecked the way a file can. Using a single fixed buffer also raises a problem: the pipe may already be full when a process writes to it. When this happens, a subsequent write() call to the pipe blocks by default, waiting for some data to be read so that enough space is freed for the write() to complete (see the sketch after this list).

· The reading process may also work faster than the writing process. When all the data currently in the pipe has been read, the pipe becomes empty. When this happens, a subsequent read() call blocks by default, waiting for data to be written; this avoids the problem of read() simply returning end-of-file.

Note: reading data from a pipe is a destructive operation. Once the data has been read, it is discarded from the pipe and the space is released so that more data can be written.
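A small sketch of the write-side behaviour described in the first point above (not from the original article): if the write end is switched to non-blocking mode, write() returns -1 with errno set to EAGAIN instead of blocking once the buffer is full, so the program simply counts how many bytes fit. On current kernels the capacity is usually larger than one 4 KB page, but the principle is the same.

#include <stdio.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd[2];
    char byte = 'x';
    long total = 0;

    if (pipe(fd) < 0) {
        perror("pipe");
        return 1;
    }
    fcntl(fd[1], F_SETFL, O_NONBLOCK);   /* make the write end non-blocking */

    while (write(fd[1], &byte, 1) == 1)  /* keep writing single bytes ...      */
        total++;                         /* ... until the pipe buffer is full  */

    if (errno == EAGAIN)
        printf("pipe filled after %ld bytes\n", total);
    return 0;
}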

1. Pipe Structure

In Linux, the implementation of pipes does not use any special data structure; it relies on the file structure of the file system and the VFS inode. Two file structures point to the same temporary VFS inode, and that inode in turn points to a physical page.

2. Pipe Read/Write

The source code of the pipe implementation is in fs/pipe.c. Among the many functions there, two are especially important: the pipe read function pipe_read() and the pipe write function pipe_write(). The write function writes data by copying bytes into the physical memory pointed to by the VFS inode, and the read function reads data by copying bytes out of that physical memory. Of course, the kernel must use some mechanism to synchronize access to the pipe, so it uses locks, wait queues, and signals.

When a writing process writes data to a pipe, it uses the standard library function write(). From the file descriptor passed in, the system finds the file structure of the file. The file structure specifies the address of the function to be used for the write operation, so the kernel calls that function to carry out the write. Before writing data into memory, the write function must first check the information in the VFS inode; the actual memory copy takes place only when both of the following conditions are met:


· The memory has enough space to accommodate all the data to be written;

· The memory is not locked by a reading process.


If both of the preceding conditions are met, the write function first locks the memory and then copies the data from the writing process's address space into it. Otherwise, the writing process sleeps on the wait queue of the VFS inode; the kernel then calls the scheduler, which selects another process to run, and the writing process is left in an interruptible wait state. When the memory has enough space for the data, or when the memory is unlocked, the reading process wakes up the writing process, which receives the signal. After the data has been written into memory, the memory is unlocked and all reading processes sleeping on the inode are woken up.

Reading from a pipe is very similar to writing. However, depending on how the file or pipe was opened, a process may return an error immediately when there is no data or the memory is locked, instead of blocking; otherwise it sleeps on the inode's wait queue and waits for the writing process to write data. When all processes have finished with the pipe, its inode is discarded and the shared data page is released.
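As a sketch of those two read-side cases (not from the original article): requesting non-blocking behaviour with O_NONBLOCK via fcntl() is one way to get the "return immediately" behaviour mentioned above, and closing the last write end is what makes read() finally report end of file.

#include <stdio.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd[2];
    char buf[16];

    pipe(fd);
    fcntl(fd[0], F_SETFL, O_NONBLOCK);        /* make the read end non-blocking */

    /* pipe empty but the write end is still open: read() fails with EAGAIN instead of blocking */
    if (read(fd[0], buf, sizeof(buf)) < 0 && errno == EAGAIN)
        printf("no data yet: read() would have blocked\n");

    close(fd[1]);                             /* close the only write end */

    /* pipe empty and no writers left: read() now reports end of file */
    if (read(fd[0], buf, sizeof(buf)) == 0)
        printf("write end closed: read() returns 0 (EOF)\n");
    return 0;
}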

Because the implementation of pipes involves many file operations, the code in fs/pipe.c will not be difficult to follow once the reader has studied the material on the file system.

Creating and using a pipe in Linux is simpler, if only because it requires fewer parameters. To create the equivalent of the Windows pipe, Linux and UNIX use the following code snippet:

Create a Linux pipe
int fd1[2];

if (pipe(fd1))
{
    printf("pipe() failed: errno = %d", errno);
    return 1;
}

The Linux pipe does put a limit on how much a single write can transfer before blocking: the kernel buffer dedicated to each pipe has traditionally been exactly 4096 bytes, so a write of more than 4 KB blocks unless the reader drains the pipe. In practice this is not a real limitation, because the read and write operations are performed in different threads.
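On recent Linux kernels (2.6.35 and later) the per-pipe buffer size is no longer fixed at one page and can be queried, or even changed, with fcntl(). A small sketch, with the printed values depending on the kernel and its pipe-max-size setting:

#define _GNU_SOURCE          /* for F_GETPIPE_SZ / F_SETPIPE_SZ */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd[2];

    if (pipe(fd) < 0) {
        perror("pipe");
        return 1;
    }
    /* query the kernel buffer size of this pipe */
    printf("pipe buffer size: %d bytes\n", fcntl(fd[1], F_GETPIPE_SZ));

    /* ask for a larger buffer; the kernel may round the value up */
    fcntl(fd[1], F_SETPIPE_SZ, 1 << 20);
    printf("after F_SETPIPE_SZ: %d bytes\n", fcntl(fd[1], F_GETPIPE_SZ));
    return 0;
}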

Linux also supports named pipes. Early commenters suggested that, for fairness, I compare Linux named pipes with Windows named pipes, so I wrote another program that uses named pipes on Linux. I found that for Linux the results for named and unnamed pipes are no different.

Linux pipes are much faster than Windows 2000 named pipes, and Windows 2000 named pipes are much faster than Windows XP named pipes.


Example:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    int n, fd[2];            // fd is the array of file descriptors filled in by pipe()
    pid_t pid;
    char line[100];

    if (pipe(fd) < 0)        // create the pipe
        printf("pipe create error\n");

    if ((pid = fork()) < 0)  // create a new process with fork()
        printf("fork error\n");
    else if (pid > 0) {      // parent process: close the read end, then write "hello world" to the write end
        close(fd[0]);
        write(fd[1], "hello world\n", 12);
    }
    else {                   // child process: close the write end, then read the data from the read end
        close(fd[1]);
        n = read(fd[0], line, 100);
        write(STDOUT_FILENO, line, n);
    }
    exit(0);
}


Summary: pipes come in two kinds, unnamed pipes and named pipes. An unnamed pipe does not belong to any file system and exists only in memory, but it can be thought of as a special file: you can operate on it with the read() and write() functions used for ordinary files.
A named pipe, by contrast, has a name and a visible presence. To support it, Linux provides a special kind of file, the pipe file, which exists in the file system, so any process can access the named pipe at any time through its path and file name. However, only the node lives on disk; the file's data exists only in memory buffer pages, just as with an ordinary pipe.


The pipe is the most classic means of inter-process communication in Linux. It is typically used to combine commands in a terminal, for example "ls -l | wc -l". Its function is intuitive: it feeds the output of the previous process to the next process as input, which matches the everyday meaning of "pipeline".


Use pipes to implement "ls -l | wc -l"

The sample code in Scenario Analysis is well suited to understanding pipes. Suppose the process corresponding to the terminal is A, and wc and ls are two child processes, child_b and child_c, created in turn by A. Simplified, the code is as follows:
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main() {
    int pipefds[2], child_b, child_c;
    pipe(pipefds);
    if (!(child_b = fork())) {  // the first child is the "read" end (wc); it must close the "write" end
        close(pipefds[1]);
        close(0);
        dup2(pipefds[0], 0);    // duplicate pipefds[0] onto file descriptor 0, the standard input;
        close(pipefds[0]);      // close the original descriptor so only 0, 1 and 2 stay open across exec
        execl("/usr/bin/wc", "wc", "-l", (char *)NULL);
    }                           // after that, A and B can communicate through the pipe
    close(pipefds[0]);
    if (!(child_c = fork())) {  // the second child is the "write" end (ls); it must close the "read" end
        close(1);
        dup2(pipefds[1], 1);    // same as before, but for the standard output
        close(pipefds[1]);
        execl("/bin/ls", "ls", "-l", (char *)NULL);
    }                           // after that, B and C can communicate through the pipe
    close(pipefds[1]);
    wait4(child_b, NULL, 0, NULL);
    return 0;
}
