dup and dup2 for file descriptor redirection on Linux

Source: Internet
Author: User
dup and dup2 are two very useful system calls. Both duplicate a file descriptor, and they are often used to redirect the stdin, stdout, and stderr of a process. Their prototypes are:
#include <unistd.h>
int dup(int oldfd);
int dup2(int oldfd, int targetfd);
The dup function copies a descriptor. Given an existing descriptor, it returns a new descriptor that is a copy of the one passed in. The two descriptors share the same underlying data structure. For example, if we call lseek on one of the descriptors, the file offset seen through the other descriptor changes in the same way. The following snippet shows how dup is called:
int fd1, fd2;
...
fd2 = dup(fd1);
Note that a descriptor created before calling fork behaves much like one created with dup: the child process also receives a copy of the descriptor.

The dup2 function is similar to dup, but it lets the caller specify both a valid source descriptor and the number of the target descriptor. When dup2 returns successfully, the target descriptor (the second parameter) has become a copy of the source descriptor (the first parameter). In other words, both file descriptors now refer to the same file, namely the file referred to by the first parameter. The following code illustrates this:


int oldfd;
oldfd = open("app_log", (O_RDWR | O_CREAT), 0644);
dup2(oldfd, 1);
close(oldfd);
In this example, we open a new file called "app_log" and receive a file descriptor, which we store in oldfd. (The open call and its flags require <fcntl.h>.) We then call dup2 with oldfd and 1 as parameters. This replaces the file descriptor numbered 1 (that is, stdout, because the descriptor number of standard output is 1) with a copy of the newly opened descriptor. Everything written to stdout now goes into the file named "app_log". Note that once dup2 has copied oldfd, we close oldfd immediately; this does not close the newly opened file, because file descriptor 1 still refers to it.
Next, we look at a more in-depth example. Recall the command-line pipeline mentioned earlier in this article, in which the standard output of the ls -1 command is connected to the standard input of the wc -l command. We will use a C program to show how this is implemented. The code is shown in Sample code 3 below.
In Sample code 3, the program first creates a pipe on line 9 and then splits into two processes: a child process (lines 13-16) and a parent process (lines 20-23). The child process first closes the stdout descriptor (line 13); dup2 would close it anyway, so this explicit close is optional. The child will provide the ls -1 functionality, but its output must not go to the original stdout; instead, it must go to the write end of the pipe we created. This is done with dup2: line 14 redirects stdout to the pipe (pfds[1]). Line 15 then closes the read end of the pipe (pfds[0]), which the child does not use. Finally, line 16 calls execlp to replace the child's process image with that of the ls -1 command. Once the command runs, all of its output is sent to the write end of the pipe.
Now let's look at the receiving end of the pipe. As the code shows, this role is taken by the parent process. It first closes the stdin descriptor (line 20), because it will not read input from a standard device such as the keyboard but from the output of another program. It then uses dup2 again (line 21) to make stdin the read end of the pipe, by making file descriptor 0 (the conventional stdin) a copy of pfds[0]. It closes the write end of the pipe (pfds[1], line 22) because that end is not used here. Finally, it calls execlp (line 23) to replace the parent's process image with that of the wc -l command, which takes the contents of the pipe as its input.
Sample code 3: implementing the command-line pipeline in C
 1: #include <stdio.h>
 2: #include <stdlib.h>
 3: #include <unistd.h>
 4:
 5: int main()
 6: {
 7:   int pfds[2];
 8:
 9:   if (pipe(pfds) == 0) {
10:
11:     if (fork() == 0) {
12:
13:       close(1);
14:       dup2(pfds[1], 1);
15:       close(pfds[0]);
16:       execlp("ls", "ls", "-1", (char *)NULL);
17:
18:     } else {
19:
20:       close(0);
21:       dup2(pfds[0], 0);
22:       close(pfds[1]);
23:       execlp("wc", "wc", "-l", (char *)NULL);
24:
25:     }
26:
27:   }
28:
29:   return 0;
30: }
In Sample code 3, pay special attention to the fact that the child process redirects its output to the write end of the pipe, and the parent process redirects its input to the read end of the pipe. This is a very useful technique in real application development.

1. Data structure of file descriptors in the kernel
Before describing dup/dup2, it helps to understand how file descriptors are represented in the kernel.
While a process runs, it opens files, and each open returns a file descriptor.
By default, three file descriptors exist (0, 1, 2): 0 is associated with the standard input of the process,
1 with its standard output, and 2 with its standard error output.
You can inspect the file descriptors of a process in the /proc/<process ID>/fd directory. The following figure illustrates the structure:

Process table entry
----------------------------
       fd flags  file pointer
      _______________________
fd 0: |________|____________| ------------> file table
fd 1: |________|____________|
fd 2: |________|____________|
fd 3: |________|____________|
      |       .......       |
      |_____________________|
Figure 1
The file table entry contains the file status flags, the current file offset, and the v-node pointer, which are not discussed in this article.
The key point is that, in the process table entry, each open file descriptor (fd flags) has its own pointer to a file table entry.
2. The dup/dup2 functions
APUE and the man pages describe what these two functions do in one simple sentence: they duplicate an existing file descriptor.
#include <unistd.h>
int dup(int oldfd);
int dup2(int oldfd, int newfd);
Let us analyze the process starting from Figure 1. When dup is called, the kernel creates a new file descriptor in the process table entry.
This descriptor is the lowest-numbered descriptor currently available, and it points to the same file table entry as oldfd.

Process table entry
----------------------------
       fd flags  file pointer
      _______________________
fd 0: |________|____________|                    _______
fd 1: |________|____________| -----------------> |       |
fd 2: |________|____________|                    | file  |
fd 3: |________|____________| -----------------> | table |
      |       .......       |                    |_______|
      |_____________________|
Figure 2
In Figure 2, oldfd is 1 and the lowest available file descriptor is 3, so the new descriptor 3 points to
the file table entry owned by descriptor 1.
The difference between dup2 and dup is that the newfd parameter lets you specify the value of the new descriptor. If newfd is already open,
it is closed first. If newfd equals oldfd, dup2 returns newfd without closing it. The new file descriptor returned by dup2
shares the same file table entry as oldfd.
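APUE explains this in another way. The call
dup(oldfd);
is equivalent to
fcntl(oldfd, F_DUPFD, 0);
and the call
dup2(oldfd, newfd);
is equivalent to
close(newfd);
fcntl(oldfd, F_DUPFD, newfd);
The second equivalence is not exact: dup2 is an atomic operation, whereas the alternative consists of two separate function calls.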
3. dup2 in CGI
Anyone who has written a CGI program knows that when the browser submits form data with the POST method, the CGI program reads the data from standard
input (stdin) and writes its response to standard output (stdout; in C, with the printf function). By our usual
understanding, printf output should appear on the terminal. It turns out that dup2 is used to redirect STDOUT_FILENO (a
macro defined in <unistd.h>, with value 1) to the connection socket:
dup2(connfd, STDOUT_FILENO); /* in practice pipes are also involved; that is not the focus of this article */
As stated in section 1, file descriptor 1 (STDOUT_FILENO) of a process is associated by default with standard output, stdout.
As far as the kernel is concerned, all open files are referred to by file descriptors, and the kernel does not know that streams
(such as stdin and stdout) exist. The data that printf writes to stdout therefore ends up being written to file
descriptor 1. The association of file descriptors 0, 1, and 2 with standard input, standard output, and standard error
is only a convention followed by the shell and many applications; it is not a feature of the kernel.
The following flow chart illustrates this (PS: it is not strictly a flow relationship, but it helps understanding):
printf -> stdout -> STDOUT_FILENO (1) -> terminal (tty)
The output of printf ends up on the terminal device because file descriptor 1 points to the current terminal. Conceptually, this can be understood as:
STDOUT_FILENO = open("/dev/tty", O_RDWR);
After dup2 is called, STDOUT_FILENO no longer points to the terminal device but to connfd,
so the output of printf is written to connfd. Neat, isn't it? :)
4. How to restore STDOUT_FILENO in a forked child process of a CGI program
If you have read this far, thank you for your patience. I know many people may find this a little complicated,
but a complex problem is just a collection of small problems, so it is enough to work through each small problem in turn. In section 3,
STDOUT_FILENO was redirected to the connfd socket. Sometimes we may want to fork a child process to run some scripts,
and some input and output is inevitable in these scripts. As we know, after fork
the child process inherits all of the parent's file descriptors, so the output of these scripts does not go
to the terminal device as expected but to connfd, which will obviously corrupt the output of the web page. So how do we
restore the association between STDOUT_FILENO and the terminal?
Method 1: save the original file descriptor before calling dup2, and restore it afterwards.
The code is as follows:
savefd = dup(STDOUT_FILENO); /* savefd now points to the terminal */
dup2(connfd, STDOUT_FILENO); /* STDOUT_FILENO (1) is redirected to connfd */
... /* handle some things */
dup2(savefd, STDOUT_FILENO); /* STDOUT_FILENO (1) is restored to what savefd points to */

Unfortunately, a CGI program cannot use this method, because the dup2 call is not made in the CGI program but in
the web server, and modifying the web server is not a good idea.
Method 2: go back to the source and open the current terminal again to restore STDOUT_FILENO.
The flow chart in section 3 showed how STDOUT_FILENO is associated with the terminal. We simply do the same thing again.
The code is as follows:
ttyfd = open("/dev/tty", O_RDWR);
dup2(ttyfd, STDOUT_FILENO);
close(ttyfd);
/dev/tty is the terminal on which the program runs; it should be obtained by some means. Practice has shown that this method
works, but it still feels slightly wrong to me. I cannot say why; perhaps some potential problems simply have not shown up yet.
From: http://www.douban.com/note/166217997/
