Unix programming learning notes (3): kernel data structures for file I/O

Lien000034
2014-08-27

The kernel uses three data structures to represent open files: the file descriptor table, the file table, and the v-node table.

(1) Each process has an entry in the process table. That entry contains a table of open file descriptors, with one slot per descriptor. Associated with each file descriptor are:

(A) The file descriptor flags.

(B) A pointer to a file table entry.

(2) The kernel maintains a file table for all open files. Each file table entry contains:

(A) The file status flags (read, write, append, sync, non-blocking, and so on).

(B) The current file offset.

(C) A pointer to the v-node table entry for the file.

(3) Each open file (or device) has a v-node structure. The v-node contains the file type and pointers to the functions that perform the various operations on the file. The v-node also contains the file's i-node information, which is read from disk when the file is opened. The i-node information includes the file owner, the file length, the device the file resides on, and pointers to the file's actual data blocks on disk.
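
The relationship between the three tables can be sketched as a chain of C structures. This is only a simplified model for study: the type names, field names, and the helper `model_file_size` are hypothetical and do not match any real kernel's declarations, and fields such as the operation function pointers and data block pointers are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical model of a v-node table entry with its i-node information. */
struct vnode_model {
    int   v_type;   /* file type (regular file, directory, device, ...) */
    uid_t i_owner;  /* i-node information: file owner */
    off_t i_size;   /* i-node information: current file length */
};

/* Hypothetical model of a file table entry (one per open of the file). */
struct file_entry_model {
    int                 f_flags;   /* file status flags (read, write, append, ...) */
    off_t               f_offset;  /* current file offset */
    struct vnode_model *f_vnode;   /* pointer to the v-node table entry */
};

/* Hypothetical model of one slot in a process's file descriptor table. */
struct fd_entry_model {
    int                      fd_flags;  /* file descriptor flags */
    struct file_entry_model *fd_file;   /* pointer to a file table entry */
};

/* Follow the chain descriptor -> file table entry -> v-node to get the length. */
static off_t model_file_size(const struct fd_entry_model *fd)
{
    assert(fd != NULL && fd->fd_file != NULL && fd->fd_file->f_vnode != NULL);
    return fd->fd_file->f_vnode->i_size;
}
```

In this model the offset lives in the file table entry while the file length lives in the v-node, so two file table entries that point to the same v-node see the same length but keep separate offsets.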

Figure 1 shows the relationship between the three tables for a single process. The process has two different files open: one on standard input (file descriptor 0) and the other on standard output (file descriptor 1).

Figure 1: Kernel data structures for a process with two open files

Figure 2 shows the kernel data structures when two processes open the same file. Assume that the first process opens the file on file descriptor 3 and the second process opens it on file descriptor 4. Each process that opens the file gets its own file table entry, but there is only one v-node table entry for a given file.

Figure 2: Kernel data structures when two processes open the same file
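
The "one file table entry per open" rule can be observed from user space. The sketch below makes a simplifying assumption: instead of two processes, it calls open twice on the same file within one process, which also yields two separate file table entries. The function name `two_opens_offsets` and the scratch file path are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a 10-byte scratch file at `path`, open it twice, read 4 bytes
 * through the first descriptor only, and report both current offsets.
 * Returns 0 on success and -1 on error. The scratch file is removed.
 */
static int two_opens_offsets(const char *path, off_t *off1, off_t *off2)
{
    char buf[4];
    int fd, fd1, fd2, rc = -1;

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, "0123456789", 10) != 10) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);

    /* Two independent opens: two file table entries, one v-node. */
    fd1 = open(path, O_RDONLY);
    fd2 = open(path, O_RDONLY);
    if (fd1 >= 0 && fd2 >= 0 &&
        read(fd1, buf, sizeof buf) == (ssize_t)sizeof buf) {
        *off1 = lseek(fd1, 0, SEEK_CUR);  /* advanced by the read */
        *off2 = lseek(fd2, 0, SEEK_CUR);  /* untouched by the read on fd1 */
        rc = 0;
    }

    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    unlink(path);
    return rc;
}
```

Calling `two_opens_offsets("demo.tmp", &a, &b)` should leave `a` at 4 and `b` at 0, because reading through one descriptor moves only the offset stored in that descriptor's file table entry.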

With these kernel data structures in mind, the following points are easy to understand:

• Each process has its own current offset for the open file, because each process has its own file table entry.

• After each write completes, the current file offset in the file table entry is incremented by the number of bytes written. If this moves the current file offset past the current file length, the current file length in the i-node table entry is set to the current file offset.

• If lseek is used to position a file at its current end, the current file offset in the file table entry is set to the current file length in the i-node table entry. (This is different from opening the file with the O_APPEND flag. After lseek positions to the end of the file, the next write does not necessarily land at the end of the file: lseek and write are two separate calls rather than one atomic operation, so another process may extend the file in between.)

Consider the following code snippet:

    if (lseek(fd, 0L, SEEK_END) < 0) {
        printf("lseek error");
    }
    if (write(fd, buf, 100) < 100) {
        printf("write error");
    }

With a single process, this fragment appends data to the end of the file correctly. However, if several processes use this method to append to the same file at the same time, a problem can occur. Assume that two processes, A and B, both use the fragment to append to the same file, with the kernel data structures as shown in Figure 2. Process A calls lseek, which sets A's current offset to 1000 bytes (the current end of the file). The kernel then switches to process B. Process B calls lseek, which sets B's current offset to 1000 bytes (still the current end of the file), and then calls write to write 100 bytes, so the file length becomes 1100 bytes. The kernel then switches back to process A. Process A calls write, which writes starting at A's current file offset (1000 bytes) and therefore overwrites the data that process B just wrote.

The root of the problem is that positioning with lseek and then writing are two separate steps rather than a single atomic operation, so the kernel may switch to another process between them. UNIX provides the O_APPEND flag to handle this situation (see the description below).
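
The interleaving described above can be replayed deterministically, without relying on scheduler timing. The sketch below is an illustration under one assumption: the two processes are stood in for by two descriptors from two separate open calls in one process, so each has its own file table entry. The function name `replay_lseek_write_race` and the scratch file path are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Replay the A/B interleaving on a 1000-byte scratch file and return the
 * final file length, or -1 on error. The scratch file is removed.
 */
static off_t replay_lseek_write_race(const char *path)
{
    char init[1000], a[100], b[100];
    struct stat st;
    off_t size = -1;
    int fd_a, fd_b;

    memset(init, '.', sizeof init);
    memset(a, 'A', sizeof a);
    memset(b, 'B', sizeof b);

    fd_a = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);  /* stands in for A */
    fd_b = open(path, O_WRONLY);                          /* stands in for B */
    if (fd_a < 0 || fd_b < 0)
        goto out;
    if (write(fd_a, init, sizeof init) != (ssize_t)sizeof init)
        goto out;

    /* A seeks to the end (offset 1000) and is then "preempted". */
    if (lseek(fd_a, 0, SEEK_END) < 0)
        goto out;

    /* B seeks to the end (offset 1000) and writes 100 bytes: length 1100. */
    if (lseek(fd_b, 0, SEEK_END) < 0)
        goto out;
    if (write(fd_b, b, sizeof b) != (ssize_t)sizeof b)
        goto out;

    /* A resumes and writes at its stale offset 1000, overwriting B's data. */
    if (write(fd_a, a, sizeof a) != (ssize_t)sizeof a)
        goto out;

    if (fstat(fd_a, &st) == 0)
        size = st.st_size;
out:
    if (fd_a >= 0)
        close(fd_a);
    if (fd_b >= 0)
        close(fd_b);
    unlink(path);
    return size;
}
```

Two appends of 100 bytes each should have produced 1200 bytes, but the replay ends with a 1100-byte file because the second write landed on top of the first.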

• If a file is opened with the O_APPEND flag, the corresponding flag is set in the file status flags of the file table entry. Each time a write is performed on a file that has the append flag set, the current file offset in the file table entry is first set to the current file length from the i-node table entry. This forces every write to be appended to the current end of the file.

• The lseek function modifies only the current file offset in the file table entry and does not perform any I/O.
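
The last two points can be checked with a short sketch. As a simplifying assumption, two descriptors from separate open calls in one process stand in for two processes. The function names `replay_with_o_append` and `size_after_seek_only` and the scratch file paths are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Two O_APPEND descriptors write 100 bytes each to a 1000-byte scratch file.
 * Returns the final file length, or -1 on error. The file is removed.
 */
static off_t replay_with_o_append(const char *path)
{
    char init[1000], a[100], b[100];
    struct stat st;
    off_t size = -1;
    int fd_a, fd_b;

    memset(init, '.', sizeof init);
    memset(a, 'A', sizeof a);
    memset(b, 'B', sizeof b);

    fd_a = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0600);
    fd_b = open(path, O_WRONLY | O_APPEND);
    if (fd_a < 0 || fd_b < 0)
        goto out;
    if (write(fd_a, init, sizeof init) != (ssize_t)sizeof init)
        goto out;

    /* Give A a deliberately stale offset; O_APPEND makes write ignore it. */
    if (lseek(fd_a, 0, SEEK_SET) < 0)
        goto out;
    if (write(fd_b, b, sizeof b) != (ssize_t)sizeof b)  /* length: 1100 */
        goto out;
    if (write(fd_a, a, sizeof a) != (ssize_t)sizeof a)  /* length: 1200 */
        goto out;

    if (fstat(fd_a, &st) == 0)
        size = st.st_size;
out:
    if (fd_a >= 0)
        close(fd_a);
    if (fd_b >= 0)
        close(fd_b);
    unlink(path);
    return size;
}

/*
 * Create an empty scratch file, seek far past its end without writing,
 * and return the file length, or -1 on error. The file is removed.
 */
static off_t size_after_seek_only(const char *path)
{
    struct stat st;
    off_t size = -1;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (lseek(fd, 4096, SEEK_SET) == 4096 && fstat(fd, &st) == 0)
        size = st.st_size;  /* only the offset changed, not the file */
    close(fd);
    unlink(path);
    return size;
}
```

With O_APPEND both writes survive and the file ends at 1200 bytes, even though one descriptor had a stale offset. The seek-only case leaves the file length at 0, since lseek by itself changes only the offset recorded in the file table entry.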
