Linux file streams, buffers, descriptors, and their relationship to processes

Source: Internet
Author: User

 

The relationship between Linux (Unix) processes and files is complex. This tutorial works through it in detail, covering:

1. How file streams and descriptors tie multiple processes to one or several files in Linux, with some precautions.
2. How fork, vfork, and stream buffering affect file operations.

1. Linux file system structure

First, some background on the Linux file system.

Figure 1: Disk, partition, and file system

The structure in the figure is a logical view of how files are stored on disk; it has nothing to do with processes. One term needs explanation.

i-node: holds nearly all the long-lived information about a file or directory that is kept on disk. Examples are the file owner, the file type, the i-node number (which is also what a directory block stores for each entry), the major and minor device numbers, the link count, the access and modification times, the I/O block size, and the file size in bytes. Use the stat function (#include <sys/stat.h>) to obtain this i-node information.

2. Simple process and file relationships

Next, consider one process with several files, and several processes with one file. Leaving fork (parent and child processes) aside, this is fairly easy; only plain assignment causes a little trouble.

2.1 A process opens multiple files at the same time

Figure 2: Kernel data structures when one process has two different files open

The v-node holds little beyond the i-node, so the file table is the part worth a closer look. A file table entry is not stored on disk. It is an in-memory structure maintained by the kernel, and each open descriptor of a process points to one. A file table entry includes:

  File status flags: read, write, append, synchronous, non-blocking, and so on.
  Current file offset: the position at which the next read or write takes place.
  v-node pointer: the v-node holds the file type and pointers to the functions that operate on the file. This information is read from disk into memory when the file is opened.
  i-node: as described in the preceding section, reached through the v-node.

Use the fcntl function (#include <fcntl.h>) to read or change the file status flags in a file table entry.

2.2 Multiple unrelated processes open the same file

Figure 3: Kernel data structures when two processes have the same file open

Here each process has its own file table entry, so the file status flags and file offsets of the two processes are independent. Both entries point to the same v-node, however. This structure is easy to understand.

3. Copying file descriptors and streams

File descriptors and streams are often copied with a plain assignment statement. This section shows how assignment differs from dup.

The dup function copies a file descriptor at the kernel level:

Figure 4: Kernel data structures after dup (#include <unistd.h>) copies a file descriptor

After dup, the two descriptors refer to the same file table entry.

3.1 Difference between dup and assignment for file descriptors

The following program shows the difference.

Program description: open a file, then copy the descriptor once with dup and once with assignment. After each copy, print the original and copied descriptor values to see whether they are equal, then close both and check whether each close succeeds.

Program example:

    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Print a message and exit with a failure status. */
    void sys_err(const char *str)
    {
        puts(str);
        exit(EXIT_FAILURE);
    }

    int main(void)
    {
        int p, q;

        /* The parentheses matter: without them p receives the result of
         * the comparison, not the descriptor. */
        if ((p = open("c_fid.c", O_RDONLY)) == -1)
            sys_err("open error");
        q = dup(p);                     /* new descriptor, same file table entry */
        puts("dup:");
        printf("file p, q fd is: %d %d\n", q, p);   /* prints q first, then p */
        printf("close file p OK?: %d\n", close(p));
        printf("close file q OK?: %d\n", close(q));

        if ((p = open("c_fid.c", O_RDONLY)) == -1)
            sys_err("open error");
        q = p;                          /* same descriptor value, nothing new in the kernel */
        puts("=:");
        printf("file p, q fd is: %d %d\n", q, p);
        printf("close file p OK?: %d\n", close(p));
        printf("close file q OK?: %d\n", close(q));  /* fails: already closed */
        return 0;
    }

Program running result:

    dup:
    file p, q fd is: 4 3        // q = 4, p = 3: two different descriptors
    close file p OK?: 0
    close file q OK?: 0         // both closed successfully
    =:
    file p, q fd is: 3 3        // a simple copy of the value
    close file p OK?: 0
    close file q OK?: -1        // fails because the descriptor is already closed

This shows that dup creates a new descriptor, with its own entry in the process's descriptor table, while both descriptors share one file table entry, as in Figure 4. Closing one descriptor leaves the other usable, and the file table entry is not released until both are closed. Assignment is different. It only stores the same descriptor value in a second variable. Both variables name the same descriptor, and no new entry is created in the descriptor table.

3.2 Copying a standard I/O stream with assignment

Example:

    #include <stdio.h>
    #include <stdlib.h>

    /* Print a message and exit with a failure status. */
    void sys_err(const char *str)
    {
        puts(str);
        exit(EXIT_FAILURE);
    }

    int main(void)
    {
        FILE *p, *q;

        if ((p = fopen("c_fid.c", "r")) == NULL)
            sys_err("open error");
        q = p;                          /* both pointers name the same FILE object */
        printf("FILE p, q fd is: %d %d\n", fileno(q), fileno(p));
        printf("close file p OK?: %d\n", fclose(p));
        /* Deliberate error: q points to a stream that is already closed,
         * so this second fclose is undefined behavior. */
        printf("close file q OK?: %d\n", fclose(q));
        return 0;
    }

Program execution result:

    FILE p, q fd is: 3 3        // the two stream pointers share one file descriptor
    close file p OK?: 0
    *** glibc detected ***      // closing the stream twice crashes the program
    .............
4. Process and file relationships after fork, and the effect of stream buffering on file operations

4.1 fork

Figure 5: File sharing between parent and child processes after fork

After fork, the child inherits the files the parent has open. Parent and child refer to the same file table entries (the file table is maintained by the kernel and is not copied by fork, which is why the figure shows it shared). The two processes therefore share the file status flags, the file offset, and so on. If one process closes a descriptor with close, the descriptor stays valid in the other process and the file table entry is not released.

Note that when a file is read through C standard I/O, the process that runs first advances the shared file offset by everything it pulled into its stream buffer, not only by the bytes it actually consumed. Use setbuf and setvbuf (#include <stdio.h>) to set the buffering type and buffer size of a stream.

Program example:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid;
        FILE *p;
        int c;                          /* int, so that EOF can be told apart from a byte */
        long test = -2;

        /* The file must be larger than 4096 bytes. */
        if ((p = fopen("/media/lin_space/soft/netbeans-6.5-ml-cpp-linux.sh", "r")) == NULL) {
            puts("open error.");
            exit(EXIT_FAILURE);
        }
        if ((pid = fork()) < 0) {
            puts("fork error");
            exit(EXIT_FAILURE);
        } else if (pid == 0) {
            sleep(2);                   /* let the parent run and finish first */
            test = ftell(p);            /* returns the current offset */
            printf("\nchild-ftell is: %ld\n", test);
            if ((c = fgetc(p)) != EOF)
                printf("child-fgetc is: %c\n", c);
            else
                puts("child-fgetc error\n");
            test = ftell(p);
            printf("child-ftell is: %ld\n", test);
        } else {
            test = ftell(p);
            printf("\nparent-ftell is: %ld\n", test);
            if ((c = fgetc(p)) != EOF)
                printf("parent-fgetc is: %c\n", c);
            else
                puts("parent-fgetc error\n");
            test = ftell(p);
            printf("parent-ftell is: %ld\n", test);
        }
        printf("parent and child-close file OK?: %d\n", fclose(p));
        return 0;
    }

Execution result:

    parent-ftell is: 0
    parent-fgetc is: #
    parent-ftell is: 1
    parent and child-close file OK?: 0
    freec@freec-laptop:/media/lin_space/summa/apue/unit8$     // parent process has ended

    child-ftell is: 4096        // see the explanation below
    child-fgetc is: a
    child-ftell is: 4097
    parent and child-close file OK?: 0      // file closed successfully

The child's offset is 4096 because the stream is fully buffered with a 4096-byte buffer: the parent's single fgetc filled that buffer, which moved the shared file offset forward by 4096 bytes. This is the result the author observed; the exact offset may differ with another C library or buffer size.

4.2 vfork

With vfork, the child runs in the address space of the parent, but it still has its own process table entry (including the process ID, the descriptor flags, and the file pointers). The child gets a copy of the parent's file descriptors, so after one process ends, the descriptors of the other remain valid.

Standard I/O streams are a different matter under vfork. The stream objects live in the address space that parent and child share, so if the child ends without using _exit(), the stream buffers may be flushed and the streams closed as part of process termination, and the parent would see that too. Whether this happens depends on the specific implementation.
