File I/O tutorial in Linux

Source: Internet
Author: User
This article introduces file I/O in Linux. For the kernel, all open files are referenced by file descriptors. Each process has a set of associated file descriptors. A file descriptor is a non-negative integer. When you open an existing file or create a new file, the kernel returns a file descriptor to the process.

Linux file I/O tutorial (1)

1. File descriptors
For the kernel, all open files are referenced by file descriptors. Each process has its own set of file descriptors, and each descriptor is a non-negative integer. When an existing file is opened or a new file is created, the kernel returns a file descriptor to the process. To read or write the file, pass the descriptor returned by open or creat as an argument to read or write.

By convention, three file descriptors are already open when a process starts:

0: standard input, STDIN_FILENO
1: standard output, STDOUT_FILENO
2: standard error, STDERR_FILENO

The symbolic constants on each row are defined by POSIX in <unistd.h>.

open function

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);

pathname is the name of the file to open or create.

flags specifies how the file is opened. Exactly one of the following access modes must be specified:
O_RDONLY, O_WRONLY, or O_RDWR, meaning read-only, write-only, and read-write respectively.

The flags argument can also include any combination of the following optional flags, combined with bitwise OR:
O_APPEND: append all written data to the end of the file.
O_CREAT: create the file if it does not exist. When this option is used, the third parameter, mode, is required and specifies the access permissions of the new file.
O_EXCL: if O_CREAT is also specified and the file already exists, open fails with an error. This can be used to test whether a file exists: if it does not, the file is created, and the test and the creation together form one atomic operation.
O_TRUNC: if the file exists and is successfully opened for write-only or read-write access, its length is truncated to 0.

The file descriptor returned by open is always the lowest-numbered descriptor that is not currently in use. Some applications rely on this to redirect standard input, standard output, or standard error. For example, if a program closes its standard output and then calls open, descriptor 1 is reused, so standard output is effectively redirected to another file or device.
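The lowest-unused-descriptor rule can be observed without touching the standard streams. The sketch below is only an illustration: the function name lowest_fd_is_reused is hypothetical, and it assumes a single-threaded process so that no other code opens files in between.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if a descriptor number that was just
 * closed is handed out again by the next open, 0 if not, -1 on error.
 * Assumes a single-threaded process. */
int lowest_fd_is_reused(const char *path)
{
    int first = open(path, O_RDONLY);
    if (first == -1)
        return -1;
    int second = open(path, O_RDONLY);   /* takes the next free number */
    if (second == -1) {
        close(first);
        return -1;
    }
    close(first);                        /* first is now the lowest free number */
    int again = open(path, O_RDONLY);
    int reused = (again == first);
    if (again != -1)
        close(again);
    close(second);
    return reused;
}
```

Redirection of standard output works the same way: after close(STDOUT_FILENO), the next successful open returns 1.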

POSIX also standardizes a creat call, which is equivalent to:
open(pathname, O_WRONLY | O_CREAT | O_TRUNC, mode);

close function
#include <unistd.h>
int close(int fd);

The close call ends the association between the file descriptor fd and its file. The descriptor is released and can be reused. close returns 0 on success and -1 on error.

When a file is closed, all record locks added to the file by the process will be released. When a process is terminated, the kernel automatically closes all open files.

lseek function
Each open file has a "current file offset" associated with it. When a file is opened, the offset is initialized to 0 unless the O_APPEND option is specified. lseek explicitly sets the offset of an open file.


#include <sys/types.h>
#include <unistd.h>
off_t lseek(int fd, off_t offset, int whence);

offset specifies the position, and whence determines how offset is interpreted. whence takes one of the following values:

SEEK_SET: the file offset is set to offset bytes from the beginning of the file.
SEEK_CUR: the file offset is set to its current value plus offset bytes.
SEEK_END: the file offset is set to the size of the file plus offset bytes.

On success, lseek returns the new offset, measured in bytes from the beginning of the file. On failure, it returns -1. The off_t type of the offset parameter is defined in <sys/types.h>.

The offset may be set beyond the end of the file. If data is then written at that position, the file contains a hole, which reads back as zero bytes and does not need to occupy storage on disk.
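A hole can be produced with a few calls. The following is a sketch under assumptions: the helper name make_file_with_hole and its parameters are invented for this example.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write one byte, move the offset 'gap' bytes
 * past the end, write one more byte, and return the resulting file
 * size as reported by fstat (-1 on error). */
off_t make_file_with_hole(const char *path, off_t gap)
{
    struct stat st;
    off_t size = -1;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    if (write(fd, "a", 1) == 1 &&
        lseek(fd, gap, SEEK_CUR) != (off_t)-1 &&  /* offset is now past EOF */
        write(fd, "b", 1) == 1 &&                 /* this write creates the hole */
        fstat(fd, &st) == 0)
        size = st.st_size;
    close(fd);
    return size;
}
```

With a gap of 4096 the reported size is 4098 bytes, although only two bytes of data were written. Whether the hole actually saves disk space depends on the file system.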

read function

#include <unistd.h>
ssize_t read(int fd, void *buf, size_t count);

read copies up to count bytes from the file associated with the file descriptor fd into buf. It returns the number of bytes read, which may be smaller than the number requested. A return value of 0 means that no data was read because the end of the file has been reached. A return value of -1 indicates an error.
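Because read may return fewer bytes than requested, callers that need a fixed amount of data usually loop. The helper below, read_full, is a hypothetical name used only for this sketch; retrying on EINTR is a common convention, not something the article prescribes.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read until count bytes have been
 * read or end of file is reached. Returns the number of bytes read
 * (possibly less than count at end of file), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    size_t done = 0;
    while (done < count) {
        ssize_t n = read(fd, (char *)buf + done, count - done);
        if (n == 0)
            break;                 /* end of file */
        if (n == -1) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```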

write function

#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t count);

write writes the first count bytes of the buffer buf to the file associated with the file descriptor fd. It returns the number of bytes actually written, which is usually equal to count; any other value normally indicates an error. Common causes of errors are a full disk or exceeding the file size limit of the process.
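On pipes, sockets, and some devices, write can also return a short count without an error, so robust code often wraps it in a loop. The helper name write_all is an assumption made for this sketch.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: call write repeatedly until all count bytes
 * have been written. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n == -1) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                    /* skip the part already written */
        count -= (size_t)n;
    }
    return 0;
}
```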

Example: create a file, write data to it, move the current offset, and read data back.

#include <sys/types.h>  /* listed first because it may affect other header files */
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *filename = "./file";
    char buf[100];
    char buf1[5];
    int fd;
    ssize_t n;

    printf("open a file to write\n");
    /* The assignment needs its own parentheses: == binds tighter than = */
    if ((fd = open(filename, O_RDWR | O_CREAT | O_TRUNC,
                   S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)) == -1)
    {
        perror("cannot open file");
        return 1;
    }
    printf("open file successfully!\n");
    printf("input a string: ");
    /* fgets replaces gets, which cannot limit the length of its input */
    if (fgets(buf, sizeof(buf), stdin) == NULL)
    {
        fprintf(stderr, "cannot read input\n");
        return 1;
    }
    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline */
    /* Write the string to the file */
    if (write(fd, buf, strlen(buf)) != (ssize_t)strlen(buf))
    {
        perror("cannot write to file");
        return 1;
    }
    close(fd);

    printf("open file to read.\n");
    if ((fd = open(filename, O_RDONLY)) == -1)
    {
        perror("cannot open the file");
        return 1;
    }
    /* Skip the first 3 bytes */
    if (lseek(fd, 3, SEEK_SET) == -1)
    {
        perror("lseek error");
        return 1;
    }
    /* Read up to 4 bytes from the file */
    if ((n = read(fd, buf1, 4)) == -1)
    {
        perror("read error");
        return 1;
    }
    buf1[n] = '\0';   /* read does not add a terminating null byte */
    printf("read from file is %s\n", buf1);
    close(fd);

    return 0;
}

Execution and output results:


root@jb51:~$ gcc -o io io.c
root@jb51:~$ ./io
open a file to write
open file successfully!
input a string: akxivbaslzkncxcasbxbwwvaidxbd
open file to read.
read from file is ivba

Linux file I/O tutorial (2)

This part covers more of file I/O in Linux: file sharing, atomic operations, and the related system calls.

1. File sharing
The kernel uses three data structures to represent an open file. The relationships among them determine the effect one process can have on another with regard to file sharing.
1) Every process has an entry in the process table. Each entry contains a table of open file descriptors, which can be thought of as a vector with one entry per descriptor. Associated with each file descriptor are:
a) the file descriptor flags
b) a pointer to a file table entry
2) The kernel maintains a file table for all open files. Each file table entry contains:
a) the file status flags (read, write, read/write, append, sync, and nonblocking)
b) the current file offset
c) a pointer to the v-node table entry for the file
3) Each open file (or device) has a v-node structure. The v-node contains information about the type of the file and pointers to functions that operate on the file. For most files, the v-node also contains the i-node for the file. The i-node contains the owner of the file, the size of the file, the device the file resides on, and pointers to where the actual data blocks of the file are located on disk.

(Figure: kernel data structures for open files)

If two independent processes have the same file open, the arrangement is as shown in figure 2. Assume that the first process has the file open on descriptor 3 and the second process has the same file open on descriptor 4. Each process that opens the file gets its own file table entry, but there is only one v-node table entry for a given file. One reason each process gets its own file table entry is so that each process has its own current offset for the file.
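The same independence of offsets can be observed inside a single process, because every call to open creates a new file table entry. The sketch below is illustrative only; the function name offsets_after_read is hypothetical, and it assumes the file holds at least 3 bytes.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: open the same file twice, read 3 bytes through
 * the first descriptor, and report the current offset of each
 * descriptor. Returns 0 on success, -1 on error. */
int offsets_after_read(const char *path, off_t *first, off_t *second)
{
    char tmp[3];
    int rc = -1;
    int fd1 = open(path, O_RDONLY);
    int fd2 = open(path, O_RDONLY);      /* separate file table entry */
    if (fd1 != -1 && fd2 != -1 &&
        read(fd1, tmp, sizeof tmp) == (ssize_t)sizeof tmp) {
        *first = lseek(fd1, 0, SEEK_CUR);    /* advanced by the read */
        *second = lseek(fd2, 0, SEEK_CUR);   /* not affected */
        rc = 0;
    }
    if (fd1 != -1)
        close(fd1);
    if (fd2 != -1)
        close(fd2);
    return rc;
}
```

After the call, the first offset is 3 and the second is still 0.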

With these data structures in mind, the operations described in part (1) can be explained in more detail:
1. After each write completes, the current file offset in the file table entry is incremented by the number of bytes written. If this makes the current offset exceed the current file size, the file size in the i-node is set to the current offset.
2. If a file is opened with O_APPEND, the corresponding flag is set in the file status flags of the file table entry. Before each write, the current offset is first set to the file size from the i-node.
3. When lseek positions a file at its end, the current offset in the file table entry is set to the current file size from the i-node.

It is possible for more than one file descriptor to point to the same file table entry, as happens with dup and after fork.
Multiple processes can read the same file without problems. However, when multiple processes write to the same file, unexpected results can occur. Atomic operations can be used to avoid this.

Atomic operations
In general, an atomic operation is an operation composed of multiple steps. If the operation is performed atomically, either all of the steps are performed or none of them is.
1. Appending to a file
Consider a process that appends data to the end of a file. Early UNIX systems did not support the O_APPEND option of open, so the program had to be written as follows:


if (lseek(fd, 0L, SEEK_END) < 0)   /* position to the end of the file */
    err_sys("lseek error");
if (write(fd, buf, 100) != 100)    /* and then write */
    err_sys("write error");

This works for a single process, but not necessarily for multiple processes. Suppose processes A and B both append to the same file. Each has opened the file without O_APPEND, as shown in figure 2. Assume that A calls lseek, which sets A's current offset to 1500, the current end of the file. The kernel then switches to process B, which calls lseek and also sets its current offset to 1500. B then calls write, which advances B's offset to 1600 and extends the file to 1600 bytes. The kernel then switches back to process A. When A calls write, the data is written starting at A's current offset of 1500, overwriting the data that B has just written.

The problem is that the logical operation "position to the end of the file, then write" requires two separate function calls. The solution is to make the two steps a single atomic operation, which is what the O_APPEND flag provides: before each write, the kernel sets the current offset to the end of the file.
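A minimal sketch of appending with O_APPEND instead of a separate lseek follows. The helper name append_line and the permission bits are assumptions made for this example.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: append one string to the file at path.
 * With O_APPEND the kernel moves the offset to the end of the file
 * as part of each write, so no lseek call is needed.
 * Returns 0 on success, -1 on error. */
int append_line(const char *path, const char *line)
{
    size_t len = strlen(line);
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    int rc = (write(fd, line, len) == (ssize_t)len) ? 0 : -1;
    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```

Several processes calling a function like this on the same file add their data one after another instead of overwriting each other.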

2. pread and pwrite functions
These functions seek and perform I/O as a single atomic operation. Neither call changes the current file offset.

#include <unistd.h>
ssize_t pread(int fd, void *buf, size_t count, off_t offset);
ssize_t pwrite(int fd, const void *buf, size_t count, off_t offset);
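The following sketch shows positioned I/O leaving the descriptor's own offset untouched. The function name positioned_io_demo and the sample data are invented for illustration.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: write "0123456789", overwrite two bytes at
 * offset 4 with pwrite, read three bytes at offset 3 with pread into
 * out, and return the descriptor's current offset afterwards
 * (-1 on error). */
off_t positioned_io_demo(const char *path, char out[3])
{
    off_t off = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    if (write(fd, "0123456789", 10) == 10 &&  /* offset becomes 10 */
        pwrite(fd, "AB", 2, 4) == 2 &&        /* file: 0123AB6789 */
        pread(fd, out, 3, 3) == 3)            /* out: '3', 'A', 'B' */
        off = lseek(fd, 0, SEEK_CUR);         /* still 10 */
    close(fd);
    return off;
}
```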

dup and dup2 functions

#include <unistd.h>
int dup(int oldfd);
int dup2(int oldfd, int newfd);

Both functions duplicate an existing file descriptor.

The new descriptor returned by dup is guaranteed to be the lowest-numbered available descriptor. With dup2, the newfd argument specifies the value of the new descriptor. If newfd is already open, it is closed first. If newfd equals oldfd, dup2 returns newfd without closing it.

Figure 3 shows this situation.

Assume that the process executes:

newfd = dup(1);

Assume that the next available descriptor is 3 when this call is made. Because both descriptors point to the same file table entry, they share the same file status flags and the same current file offset.
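The shared offset can be checked directly. In this sketch the function name offset_seen_through_dup is hypothetical, and a scratch file is used instead of descriptor 1.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: duplicate a descriptor, write 5 bytes through
 * the original, and return the current offset as seen through the
 * copy (-1 on error). */
off_t offset_seen_through_dup(const char *path)
{
    off_t off = -1;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    int copy = dup(fd);                      /* same file table entry */
    if (copy != -1) {
        if (write(fd, "hello", 5) == 5)
            off = lseek(copy, 0, SEEK_CUR);  /* offset via the copy */
        close(copy);
    }
    close(fd);
    return off;
}
```

The returned offset is 5, because the write through the original descriptor advanced the one offset that both descriptors share.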

sync, fsync, and fdatasync functions

#include <unistd.h>
void sync(void);
int fsync(int fd);
int fdatasync(int fd);

When data is written to a file, the kernel normally copies the data into a buffer first. When the buffer is full, it is placed on an output queue, and the actual I/O is performed once it reaches the head of the queue. This style of output is called delayed write. Delayed write reduces the number of disk reads and writes, but it also delays the updating of the file contents on disk. If the system fails, delayed writes can cause file updates to be lost. To ensure consistency between the file system on disk and the contents of the buffer cache, UNIX provides the sync, fsync, and fdatasync functions. sync applies to the whole system, while fsync and fdatasync apply only to the file referred to by fd; fdatasync flushes the data of the file but not necessarily all of its metadata.
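A typical use of fsync is to force a freshly written file to disk before reporting success. This is a sketch only; the helper name write_and_sync is an assumption, and real code may also need to sync the containing directory.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to path and wait until the
 * kernel reports that they have been transferred to the device.
 * Returns 0 on success, -1 on error. */
int write_and_sync(const char *path, const void *buf, size_t len)
{
    int rc = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    if (write(fd, buf, len) != (ssize_t)len)
        rc = -1;
    else if (fsync(fd) == -1)   /* flush delayed writes for this file */
        rc = -1;
    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```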

fcntl function

#include <unistd.h>
#include <fcntl.h>
int fcntl(int fd, int cmd, ... /* arg */);

fcntl changes the properties of a file that is already open. It is used for five purposes:

Duplicate an existing descriptor (cmd = F_DUPFD)
Get or set the file descriptor flags (cmd = F_GETFD or F_SETFD)
Get or set the file status flags (cmd = F_GETFL or F_SETFL)
Get or set asynchronous I/O ownership (cmd = F_GETOWN or F_SETOWN)
Get or set record locks (cmd = F_GETLK, F_SETLK, or F_SETLKW)

fcntl is often used to change the file status flags, for example to make a socket descriptor non-blocking by setting O_NONBLOCK.
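Setting O_NONBLOCK is usually done by reading the current flags and writing them back with the extra bit, so that the other flags are preserved. The helper name set_nonblocking is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to the file status flags of fd
 * while keeping the flags that are already set. Afterwards, a read
 * with no data available fails with errno EAGAIN instead of blocking.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}
```

The same function works for pipes and sockets, since it only touches the file status flags.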

ioctl function
#include <sys/ioctl.h>
int ioctl(int d, int request, ...);

ioctl provides an interface for controlling the behavior of devices and their descriptors and for configuring underlying services.
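As one example of an ioctl request, the sketch below asks how many bytes are waiting to be read. It assumes Linux, where the FIONREAD request is available through <sys/ioctl.h>; FIONREAD is not part of POSIX, and the helper name bytes_available is invented for this example.

```c
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper (Linux assumed): return the number of bytes
 * that can currently be read from fd without blocking, or -1 if the
 * FIONREAD request is not supported for this descriptor. */
int bytes_available(int fd)
{
    int n = 0;
    if (ioctl(fd, FIONREAD, &n) == -1)
        return -1;
    return n;
}
```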

/dev/fd
Opening the file /dev/fd/n is equivalent to duplicating descriptor n, provided that descriptor n is open.
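A small sketch of opening a descriptor through /dev/fd follows. It assumes that the system provides /dev/fd, as Linux does; the helper name open_via_dev_fd is hypothetical.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open /dev/fd/<fd> with the given flags and
 * return the new descriptor, or -1 on error. Assumes /dev/fd exists. */
int open_via_dev_fd(int fd, int flags)
{
    char path[32];
    snprintf(path, sizeof path, "/dev/fd/%d", fd);
    return open(path, flags);
}
```

This is mainly useful for passing an already open descriptor to a program that only accepts file names.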
