Reprinted: http://www.ibm.com/developerworks/cn/linux/l-pipebid/
Linux provides the popen and pclose functions (1) for creating and closing a pipe to communicate with another process. The interface is as follows:
    FILE *popen(const char *command, const char *mode);
    int pclose(FILE *stream);
Unfortunately, a pipe created by popen is one-way: the mode can only be "r" or "w", not a combination of the two. The caller can either write to the pipe or read from it, but cannot do both on the same pipe. In practice, reading and writing at the same time is often required. For example, we may want to send text to the sort tool and then read back the sorted result. popen cannot do this, so we need another solution.
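To illustrate the one-way restriction, here is a minimal sketch that opens a command in mode "r" and reads its first line of output. The helper name read_first_line is hypothetical and not part of the article's code. With this mode there is no way to also send input to the command through the same stream.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: run `command` through popen in read-only mode and
 * copy its first output line (without the newline) into buf.
 * Returns 0 on success, -1 on error. */
int read_first_line(const char *command, char *buf, int size)
{
    assert(command != NULL && buf != NULL && size > 0);

    FILE *fp = popen(command, "r");   /* "r": this stream can only be read */
    if (fp == NULL)
        return -1;

    char *got = fgets(buf, size, fp);
    if (pclose(fp) == -1 || got == NULL)
        return -1;

    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline, if any */
    return 0;
}
```

Calling read_first_line("echo hello", buf, sizeof buf) should leave "hello" in buf. Feeding data to the command would require a second popen call in mode "w", which starts a separate process.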
One solution is to use the pipe function (2) to create two unidirectional pipes. The code, without error checking, is as follows:
    int pipe_in[2], pipe_out[2];
    pid_t pid;
    pipe(pipe_in);                        // create the pipe the parent reads from
    pipe(pipe_out);                       // create the pipe the parent writes to
    if ((pid = fork()) == 0) {            // child process
        close(pipe_in[0]);                // close the read end of the parent's read pipe
        close(pipe_out[1]);               // close the write end of the parent's write pipe
        dup2(pipe_in[1], STDOUT_FILENO);  // the parent's read pipe becomes the child's standard output
        dup2(pipe_out[0], STDIN_FILENO);  // the parent's write pipe becomes the child's standard input
        close(pipe_in[1]);                // close the descriptor that has been duplicated
        close(pipe_out[0]);               // close the descriptor that has been duplicated
        /* execute the command using exec */
    } else {                              // parent process
        close(pipe_in[1]);                // close the write end of the read pipe
        close(pipe_out[0]);               // close the read end of the write pipe
        /* now write data to pipe_out[1] and read results from pipe_in[0] */
        close(pipe_out[1]);               // close the write pipe
        /* read the remaining data from pipe_in[0] */
        close(pipe_in[0]);                // close the read pipe
        /* use a wait function to wait for the child to exit and obtain its exit code */
    }
Of course, this code is hard to read (especially once error handling is added) and hard to wrap into popen/pclose-style functions that higher-level code can use conveniently. The reason is that, of the pair of file descriptors returned by pipe, the first can only be read from and the second can only be written to (at least on Linux). To read and write at the same time, we are stuck with two pipe calls and two file descriptors.
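The direction restriction can be checked with a small sketch. Both function names below are hypothetical. The second function assumes Linux behavior, where writing to the read end fails with EBADF; other systems may behave differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical demo: send msg through a pipe in the one direction it
 * supports (write to fd[1], read from fd[0]). msg is assumed to be short
 * enough to fit in the pipe buffer. Returns the byte count read, or -1. */
ssize_t pipe_roundtrip(const char *msg, char *buf, size_t size)
{
    int fd[2];
    ssize_t n = -1;

    assert(msg != NULL && buf != NULL && size > 0);
    if (pipe(fd) == -1)
        return -1;
    if (write(fd[1], msg, strlen(msg)) == (ssize_t)strlen(msg))
        n = read(fd[0], buf, size - 1);
    if (n >= 0)
        buf[n] = '\0';
    close(fd[0]);
    close(fd[1]);
    return n;
}

/* Returns 1 if writing to the read end is rejected with EBADF
 * (the behavior assumed here for Linux), 0 otherwise, -1 on setup error. */
int write_to_read_end_fails(void)
{
    int fd[2];

    if (pipe(fd) == -1)
        return -1;
    ssize_t n = write(fd[0], "x", 1);     /* wrong end on purpose */
    int failed = (n == -1 && errno == EBADF);
    close(fd[0]);
    close(fd[1]);
    return failed;
}
```

Because each descriptor works in one direction only, a full round trip to a child process needs a second pipe for the return path.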
A better solution
With pipe, this is the only approach. However, Linux implements the socketpair call, which originated in BSD (3), and it allows reading and writing on the same file descriptor (this call is now also part of the POSIX specification (4)). The system call creates a pair of connected, unnamed (UNIX domain) sockets. On Linux, this pair of sockets can be used just like the file descriptors returned by pipe; the only difference is that each of the two descriptors is both readable and writable.
This looks like a good way to implement a pipe for inter-process communication. Note, however, that to solve the sort problem mentioned earlier, we need to close the child process's standard input to tell it that all the data has been sent, and then read from the child's standard output until EOF. With two one-way pipes, each pipe can be closed independently, so there is no problem. With a two-way pipe, if we do not close it, we cannot tell the peer that all the data has been sent; but once we close it, we can no longer read the result from it. Unless this problem is solved, using socketpair is pointless.
Fortunately, the shutdown call (5) solves this problem. After all, the file descriptors produced by socketpair are a pair of sockets, so the standard socket operations, including shutdown, can be used on them. shutdown performs a half-close: it tells the peer that this process will send no more data, while this file descriptor can still be used to receive data from the peer. The code, without error checking, is as follows:
    int fd[2];
    pid_t pid;
    socketpair(AF_UNIX, SOCK_STREAM, 0, fd);  // create the two-way pipe
    if ((pid = fork()) == 0) {                // child process
        close(fd[0]);                         // close the parent's end of the pipe
        dup2(fd[1], STDOUT_FILENO);           // the child's end becomes its standard output
        dup2(fd[1], STDIN_FILENO);            // the child's end becomes its standard input
        close(fd[1]);                         // close the descriptor that has been duplicated
        /* execute the command using exec */
    } else {                                  // parent process
        close(fd[1]);                         // close the child's end of the pipe
        /* now read and write data on fd[0] */
        shutdown(fd[0], SHUT_WR);             // tell the peer that all data has been sent
        /* read the remaining data */
        close(fd[0]);                         // close the pipe
        /* use a wait function to wait for the child to exit and obtain its exit code */
    }
Clearly, this is much simpler than using two one-way pipes. I will wrap and improve it further on this basis.
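For readers who want something they can compile, the following is a hedged, self-contained sketch of the socketpair approach with basic error checks. The function name sort_via_socketpair is hypothetical. The sketch assumes that a sort command is on the PATH and that the input is small enough to fit in the socket buffer; larger inputs would need interleaved reads and writes to avoid a deadlock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: send `input` to sort over one socketpair and collect
 * the sorted text in out. Returns the number of bytes collected, or -1 on
 * error (including output that does not fit in out). */
ssize_t sort_via_socketpair(const char *input, char *out, size_t size)
{
    int fd[2];
    pid_t pid;

    assert(input != NULL && out != NULL && size > 0);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fd) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                           /* child process */
        close(fd[0]);                         /* parent's end */
        dup2(fd[1], STDIN_FILENO);
        dup2(fd[1], STDOUT_FILENO);
        close(fd[1]);
        execlp("sort", "sort", (char *)NULL);
        _exit(127);                           /* exec failed */
    }

    close(fd[1]);                             /* parent: close child's end */
    size_t len = strlen(input), total = 0;
    int ok = (write(fd[0], input, len) == (ssize_t)len);
    shutdown(fd[0], SHUT_WR);                 /* child now sees EOF on stdin */

    ssize_t n = 0;
    while (total < size - 1 &&
           (n = read(fd[0], out + total, size - 1 - total)) > 0)
        total += (size_t)n;                   /* read until EOF or out is full */
    out[total] = '\0';
    close(fd[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1 || !ok || n < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return (ssize_t)total;
}
```

The key step is the shutdown call: without it, sort would keep waiting for more input and the read loop would never see EOF.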
Encapsulation and implementation
However you look at it, using the method above directly is ugly and inconvenient. A program's maintainer wants to see the logic of the program, not the tedious details of how a task gets done. We need a good wrapper.
The wrapper could be written in C or C++. Here, following UNIX tradition, I provide a C wrapper modeled on the popen/pclose functions in the POSIX standard, to ensure maximum usability. The interface is as follows:
    FILE *dpopen(const char *command);
    int dpclose(FILE *stream);
    int dphalfclose(FILE *stream);
Note the following points about the interface:
- Like popen (and unlike pipe), dpopen returns a pointer to a FILE structure rather than a file descriptor. This means that functions such as fprintf can be used on it directly. The stream buffer caches data written to the pipe (unless you disable buffering with the setbuf function), so fflush must be called to make sure the data has actually been written to the pipe.
- Because dpopen returns a pipe that can be both read and written, the second parameter of popen, which selects reading or writing, is no longer needed.
- With a two-way pipe, we need to notify the peer that writing has finished. The dphalfclose function performs this operation.
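A half-close on a stdio stream could look roughly like the sketch below. This is not the article's dphalfclose source; the names half_close_stream and half_close_demo are hypothetical. It assumes the stream was created with fdopen over a socketpair descriptor. The demo keeps both ends in one process so that no fork is needed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sketch of a half-close: flush buffered output first, then shut down the
 * write direction of the socket under the stream. Returns 0 or -1. */
int half_close_stream(FILE *stream)
{
    assert(stream != NULL);
    if (fflush(stream) == EOF)
        return -1;
    return shutdown(fileno(stream), SHUT_WR);
}

/* Demo (msg assumed shorter than 128 bytes). Returns 1 if the peer received
 * msg followed by EOF and the stream could still read a reply afterwards. */
int half_close_demo(const char *msg)
{
    int sv[2];
    char buf[128], reply[8];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    FILE *fp = fdopen(sv[0], "r+");
    if (fp == NULL) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    fputs(msg, fp);                           /* still in the stdio buffer */
    int ok = (half_close_stream(fp) == 0);
    if (ok) {
        size_t total = 0;
        ssize_t n;
        while ((n = read(sv[1], buf + total, sizeof buf - 1 - total)) > 0)
            total += (size_t)n;               /* loop ends at EOF (n == 0) */
        buf[total] = '\0';
        ok = (n == 0 && strcmp(buf, msg) == 0);
        ok = ok && write(sv[1], "ok\n", 3) == 3;
        ok = ok && fgets(reply, sizeof reply, fp) != NULL
                && strcmp(reply, "ok\n") == 0;
    }
    fclose(fp);                               /* also closes sv[0] */
    close(sv[1]);
    return ok;
}
```

Flushing before the shutdown matters: data still sitting in the stream buffer when the write side is shut down could no longer be delivered.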
For the implementation itself, please read the program source code directly; it includes detailed comments and doxygen documentation comments (6). I will only add a few notes:
- The implementation uses a linked list to record the mapping between every file pointer opened by dpopen and the corresponding child process ID. Therefore, when many pipes opened by dpopen exist at the same time, dpclose (which has to search the linked list) becomes a little slower. In my opinion, this causes no problems in normal use. If it does become a problem in some special case, you can change the return type of dpopen and the parameter type of dpclose (less convenient to use, but easy to implement), or replace the linked list with a hash table or a balanced tree to speed up the search (the interface stays the same, but the implementation is more complicated).
- When the program is compiled with GCC's "-pthread" command-line option, the implementation enables POSIX thread support and uses a mutex to protect access to the linked list. It can therefore be used safely in a POSIX multi-threaded environment.
- Like popen (7), dpopen closes, in the child process created by fork, all pipes previously opened with dpopen.
- If the argument passed to dpclose is not a non-null value returned by dpopen, dpclose returns -1 and also sets errno to EBADF. For pclose, the POSIX specification leaves this case unspecified (8).
- The implementation uses no platform-specific features, which makes it easy to port to other POSIX platforms.
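The bookkeeping described in these notes can be sketched as follows. This is an illustration under stated assumptions, not the article's source: the names dp_node, dp_register, and dp_unregister are hypothetical, and the mutex is always enabled here, whereas the article's implementation enables it only when compiled with "-pthread".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* One list entry per open two-way pipe: stream -> child process ID. */
struct dp_node {
    FILE *stream;
    pid_t pid;
    struct dp_node *next;
};

static struct dp_node *dp_head = NULL;
static pthread_mutex_t dp_lock = PTHREAD_MUTEX_INITIALIZER;

/* Record that `stream` belongs to child `pid`. Returns 0 or -1. */
int dp_register(FILE *stream, pid_t pid)
{
    assert(stream != NULL);
    struct dp_node *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->stream = stream;
    node->pid = pid;
    pthread_mutex_lock(&dp_lock);
    node->next = dp_head;                     /* push at the front */
    dp_head = node;
    pthread_mutex_unlock(&dp_lock);
    return 0;
}

/* Remove the entry for `stream` and return its child PID. If the stream is
 * unknown, return -1 and set errno to EBADF. The search is linear. */
pid_t dp_unregister(FILE *stream)
{
    pid_t pid = -1;

    pthread_mutex_lock(&dp_lock);
    for (struct dp_node **pp = &dp_head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->stream == stream) {
            struct dp_node *found = *pp;
            *pp = found->next;                /* unlink the entry */
            pid = found->pid;
            free(found);
            break;
        }
    }
    pthread_mutex_unlock(&dp_lock);
    if (pid == -1)
        errno = EBADF;
    return pid;
}
```

A close function built on this would call dp_unregister first, and only then close the stream and wait for the returned PID. Swapping the list for a hash table would change these two functions without touching their callers.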
The following code shows a simple example of sending multiple lines of text to sort, then retrieving and displaying the results:
    #include <stdio.h>
    #include <stdlib.h>
    #include "dpopen.h"

    #define MAXLINE 80

    int main()
    {
        char line[MAXLINE];
        FILE *fp;
        fp = dpopen("sort");
        if (fp == NULL) {
            perror("dpopen error");
            exit(1);
        }
        fprintf(fp, "orange\n");
        fprintf(fp, "apple\n");
        fprintf(fp, "pear\n");
        if (dphalfclose(fp) < 0) {
            perror("dphalfclose error");
            exit(1);
        }
        for (;;) {
            if (fgets(line, MAXLINE, fp) == NULL)
                break;
            fputs(line, stdout);
        }
        dpclose(fp);
        return 0;
    }
Output result:
    apple
    orange
    pear
Summary
This article describes how to use the socketpair system call to implement a two-way inter-process communication pipe on Linux, and provides an implementation. The interface of this implementation is similar to the popen/pclose functions in the POSIX specification, so it is very easy to use. Because the implementation uses no platform-specific features, it can be ported with little or no modification to any POSIX system that supports the socketpair call.