APUE Reading Notes: Chapter 3, File I/O

Source: Internet
Author: User

I read quickly today and finished Chapter 2 before I knew it, although I did not read it carefully. That chapter is actually quite important for program design, because its content determines the portability of a program.

Now, back to the topic of this chapter: file I/O.

Section 3.2 briefly introduces the concept of file descriptors. According to APUE, a file descriptor is a non-negative integer. When an existing file is opened or a new file is created, the kernel returns a file descriptor to the process. I also skimmed LKD and "Understanding the Linux Kernel", but neither says much about file descriptors, so I cannot claim a deep understanding of them. Instead, here is a blog post on the topic:

http://blog.csdn.net/cywosp/article/details/38965239

The Linux system also associates file descriptors 0, 1, and 2, respectively, with the process's standard input, standard output, and standard error. The following definitions can be used during programming:

    #define STDIN_FILENO  0   /* Standard input.  */
    #define STDOUT_FILENO 1   /* Standard output.  */
    #define STDERR_FILENO 2   /* Standard error output.  */

The above definitions are in /usr/include/unistd.h.

File descriptors range from 0 to OPEN_MAX-1. The "ulimit -n" command reports how many file descriptors the current shell, and the processes started from it, may have open; on my machine the result is 1024. It can be changed with "ulimit -n N", where N is the new maximum number of file descriptors. However, this change is only valid in the current terminal; after exiting, the open files limit returns to its default value. The limit can also be changed by editing files under /etc/, but I am not clear about how those files relate to each other, so I will not cover that here. That is how to change the maximum number of file descriptors for a shell; next, how to change the system-wide value. To see the current value first, run "sudo cat /proc/sys/fs/file-max"; on my machine the result is "402307". It can be changed with "echo 6553560 > /proc/sys/fs/file-max" or "sysctl -w fs.file-max=34166", but both settings are lost when the machine restarts, so editing the files under /etc is the once-and-for-all method.
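The per-process limit that "ulimit -n" reports can also be read from inside a program. The following is a minimal sketch, assuming a POSIX system; the helper names soft_fd_limit and print_fd_limits are made up for illustration and do not come from APUE.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: store the soft limit on open descriptors
 * (the value "ulimit -n" prints) in *out.
 * Returns 0 on success, -1 on error. */
int soft_fd_limit(rlim_t *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
        return -1;
    *out = rl.rlim_cur;   /* may be RLIM_INFINITY if there is no limit */
    return 0;
}

/* Hypothetical helper: print the limit as seen through two interfaces. */
void print_fd_limits(void)
{
    rlim_t soft;

    if (soft_fd_limit(&soft) == 0)
        printf("RLIMIT_NOFILE soft limit: %llu\n", (unsigned long long)soft);
    printf("sysconf(_SC_OPEN_MAX): %ld\n", sysconf(_SC_OPEN_MAX));
}
```

Calling print_fd_limits() from a shell where "ulimit -n" prints 1024 should report the same number from both interfaces, though I have only reasoned about this on Linux.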

The changes above are taken from this blog: "http://coolnull.com/2796.html".

All of the above is about OPEN_MAX. In <stdio.h> there is also "#define FOPEN_MAX 16" (actually defined in /usr/include/x86_64-linux-gnu/bits/stdio_lim.h).

Section 3.3 is where the programming really begins: opening or creating a file. In the header on my machine the functions are declared as follows:

    #include <fcntl.h>

    #ifndef __USE_FILE_OFFSET64
    extern int open (const char *__file, int __oflag, ...) __nonnull ((1));
    #else
    # ifdef __REDIRECT
    extern int __REDIRECT (open, (const char *__file, int __oflag, ...), open64)
         __nonnull ((1));
    # else
    #  define open open64
    # endif
    #endif
    #ifdef __USE_LARGEFILE64
    extern int open64 (const char *__file, int __oflag, ...) __nonnull ((1));
    #endif

    #ifdef __USE_ATFILE
    # ifndef __USE_FILE_OFFSET64
    extern int openat (int __fd, const char *__file, int __oflag, ...)
         __nonnull ((2));
    # else
    #  ifdef __REDIRECT
    extern int __REDIRECT (openat, (int __fd, const char *__file, int __oflag,
                                    ...), openat64) __nonnull ((2));
    #  else
    #   define openat openat64
    #  endif
    # endif
    # ifdef __USE_LARGEFILE64
    extern int openat64 (int __fd, const char *__file, int __oflag, ...)
         __nonnull ((2));
    # endif
    #endif

The header also declares the 64-bit variants open64 and openat64, which use 64-bit file offsets (large-file support). There is not much to say about the file name. As for the oflag options: the book says these constants are defined in /usr/include/fcntl.h, but that file actually just contains "#include <bits/fcntl.h>", and that file does not contain what we are looking for either. The definitions are in /usr/include/x86_64-linux-gnu/bits/fcntl-linux.h. I will not reproduce its contents, which include "O_RDONLY", "O_WRONLY", and "O_RDWR". The book also lists "O_EXEC" and "O_SEARCH", but I could not find them in my file, so on my system exactly one of "O_RDONLY", "O_WRONLY", and "O_RDWR" must be specified. Flags are combined with the "|" operator. The "..." parameter is the initial value of the file access permissions; this third argument is used only when O_CREAT appears in the second argument, and otherwise it can be omitted. Let's look at some examples:

The first example opens a file that does not exist. The source code is as follows:

    #include <stdio.h>
    #include <fcntl.h>
    #include <errno.h>

    int main(int argc, char *argv[])
    {
        int n;

        /* No O_CREAT here, so opening a missing file fails with ENOENT. */
        if ((n = open("./temp", O_RDWR)) < 0)
            perror(argv[0]);
        return 0;
    }

Running it produces an error:

    gcc -o test_opennotcreate test_opennotcreate.c
    ./test_opennotcreate
    ./test_opennotcreate: No such file or directory

Next, an experiment with creating a file (the other options can be studied as they come up). Here I deliberately create the file with read permission only, then open it in read-write mode and write some content to it:

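    #include <stdio.h>
    #include <string.h>
    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <errno.h>

    int main(int argc, char *argv[])
    {
        int fd, n;
        char str[] = "Hello,world";

        /* Create the file with owner read permission only, but open it read-write. */
        if ((fd = open("./temp", O_RDWR | O_CREAT, S_IRUSR)) < 0)
            perror(argv[0]);
        if ((n = write(fd, str, strlen(str))) < 0)
            perror(argv[0]);
        return 0;
    }

The results of running it are as follows: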

    gcc -o test_opencreate test_opencreate.c
    ./test_opencreate            (the content is written successfully)
    ./test_opencreate            (run the program again)
    ./test_opencreate: Permission denied
    ./test_opencreate: Bad file descriptor


The second open failed with a permission error, and I did not know whether that was due to the permission settings or because the O_CREAT flag cannot be used on a file that already exists. This morning I looked into it again: errno does have an EEXIST error ("file exists"), but the error my program reports is "Permission denied", so I think it is the permissions that I have not understood properly. Let's add user write permission and try:

    #include <stdio.h>
    #include <string.h>
    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        int fd, n;
        char str[] = "Hello,world";

        /* This time the file is created with owner read and write permission. */
        if ((fd = open("./temp", O_RDWR | O_CREAT, S_IRUSR | S_IWUSR)) < 0) {
            perror(argv[0]);
        }
        if ((n = write(fd, str, strlen(str))) < 0)
            perror(argv[0]);
        return 0;
    }

The results of the operation are as follows:

    ./test_opencreate
    ./test_opencreate            (running the program again works)
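On the EEXIST question raised earlier: as far as I know, open reports EEXIST only when O_CREAT is combined with O_EXCL. With O_CREAT alone, an existing file is simply opened and the mode argument is ignored. A minimal sketch to check this (the helper name create_exclusive is an illustrative assumption, not something from the book):

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create path only if it does not exist yet.
 * Returns 0 on success, or the errno value from open() on failure. */
int create_exclusive(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);

    if (fd < 0)
        return errno;          /* EEXIST if the file is already there */
    close(fd);
    return 0;
}
```

Calling create_exclusive twice on the same fresh path should return 0 the first time and EEXIST the second time.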

It appears that without write permission the existing file could not be opened a second time. (One question remained for me: why does adding write permission allow the file to be opened repeatedly? The likely explanation is that the mode argument only takes effect when the file is created; on later opens, the requested O_RDWR access is checked against the permissions the file already has.) The experiment also shows that every time the file is created or an existing file is opened, writing starts at the beginning of the file. So add the O_APPEND option and try again:

    if ((fd = open("./temp", O_RDWR | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR)) < 0) {
        perror(argv[0]);
    }

Now the content is appended to the end of the file. Add the O_TRUNC option as well and try it out:

    if ((fd = open("./temp", O_RDWR | O_CREAT | O_APPEND | O_TRUNC, S_IRUSR | S_IWUSR)) < 0) {
        perror(argv[0]);
    }

Here O_TRUNC takes effect first and truncates the file length to 0; the content is then written in O_APPEND mode, which at that point means at the beginning of the now empty file.

Here is my understanding of the difference between O_CREAT and O_TRUNC. With only O_CREAT, writing always starts from the beginning of the file; if the file already exists, the new data overwrites the original contents in place. With O_CREAT | O_TRUNC, if the file already exists and is successfully opened write-only or read-write, its length is truncated to 0, that is, all of the original contents are deleted.
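The effect of these flags can be observed through the file size. This is a small sketch under my own assumptions: the helper write_and_size and the idea of comparing sizes are mine, chosen only to make the three behaviors visible.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open path with O_WRONLY | O_CREAT plus extra_flags,
 * write text, and return the resulting file size (or -1 on error). */
off_t write_and_size(const char *path, int extra_flags, const char *text)
{
    struct stat st;
    size_t len = strlen(text);
    int fd = open(path, O_WRONLY | O_CREAT | extra_flags, S_IRUSR | S_IWUSR);

    if (fd < 0)
        return -1;
    if (write(fd, text, len) != (ssize_t)len || fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }
    close(fd);
    return st.st_size;
}
```

Starting from a missing file, writing "Hello,world" with no extra flag gives size 11. Writing "XY" again with no extra flag still gives 11, because only the first two bytes are overwritten. With O_APPEND the size grows, and with O_TRUNC it drops to the length of the new text.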

According to APUE, the difference between the open and openat functions lies in the fd parameter, and there are three possibilities.

    1. The path parameter specifies an absolute pathname. In this case the fd parameter is ignored and the openat function is equivalent to the open function.
    2. The path parameter specifies a relative pathname, and the fd parameter indicates the starting location in the file system from which the relative pathname is resolved. The fd parameter is obtained by opening the directory that the relative pathname is relative to.
    3. The path parameter specifies a relative pathname, and the fd parameter has the special value AT_FDCWD. In this case the pathname is resolved starting from the current working directory, and the openat function behaves like the open function.

At first glance the openat function looks rather pointless: a whole new function just to create a file under some directory. I am not very clear about what openat is really for, so I will simply pass on what APUE says:

    1. It allows a thread to open files using a pathname relative to some directory other than the current working directory. All threads in the same process share the same current working directory, so it is difficult to have several threads of one process working in different directories at the same time.
    2. It avoids time-of-check-to-time-of-use (TOCTTOU) errors. As far as I can see, though, openat does not seem to have much to do with TOCTTOU errors.

Section 3.4: the creat function. Its prototype is as follows:

    #include <fcntl.h>
    int creat(const char *path, mode_t mode);

The function is equivalent to:

    open(path, O_WRONLY | O_CREAT | O_TRUNC, mode);

Section 3.5: the close function, which closes an open file. Closing a file also releases all record locks that the process holds on the file. When a process terminates, the kernel automatically closes all of its open files.

Section 3.6: the lseek function. Read and write operations normally start at the current file offset and increase the offset by the number of bytes read or written. By default, the offset is set to 0 when a file is opened, unless O_APPEND is specified. lseek can be called to set the offset of an open file explicitly; its prototype is as follows:

    #include <unistd.h>
    off_t lseek(int fd, off_t offset, int whence);

Returns the new file offset on success, or -1 if an error occurs.

The whence parameter has three possible values:

SEEK_SET: sets the file offset to offset bytes from the beginning of the file.

SEEK_CUR: sets the file offset to its current value plus offset; offset can be positive or negative.

SEEK_END: sets the file offset to the file length plus offset; offset can be positive or negative.

APUE gives a method for determining the current file offset:

    off_t currpos;
    currpos = lseek(fd, 0, SEEK_CUR);

This sets the offset to the current value plus 0 and thereby returns the current file offset.

The same call can also be used to determine whether a file is capable of seeking. If the file descriptor refers to a pipe, FIFO (named pipe), or network socket, lseek returns -1 and errno is set to ESPIPE.

APUE gives an example that tests whether standard input can seek. The source code is as follows:

    #include <stdio.h>
    #include <unistd.h>
    #include <errno.h>

    int main(int argc, char *argv[])
    {
        if (lseek(STDIN_FILENO, 0, SEEK_CUR) == -1)
            perror(argv[0]);
        else
            printf("seek ok\n");
        return 0;
    }

The result is shown below: when the program is run directly, the offset of standard input cannot be set.

    ./test_lseek
    ./test_lseek: Illegal seek

Looking up the errno that corresponds to "Illegal seek" shows the following definition in "/usr/include/asm-generic/errno-base.h":

    #define ESPIPE 29 /* Illegal seek */

By the earlier description, this means standard input here cannot seek, just like a pipe or FIFO (when the program is run directly from a shell, standard input is a terminal device, which cannot seek either).
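The ESPIPE check can be wrapped in a small function and tried on a pipe. This is only a sketch; the name is_seekable and the three-way return value are my own choices.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd supports seeking, 0 if lseek fails
 * with ESPIPE (pipe, FIFO, socket), and -1 on any other error. */
int is_seekable(int fd)
{
    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return 1;
    return errno == ESPIPE ? 0 : -1;
}
```

Both ends of a descriptor pair created with pipe() should give 0, while a descriptor for a regular file should give 1.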

For regular files the offset must be non-negative, but some devices may allow negative offsets. Since a negative offset is possible, be careful when checking the return value of lseek: do not test whether it is less than 0; test whether it is equal to -1. lseek only records the current file offset in the kernel, where it is used by the next read or write operation. The file offset can be greater than the current length of the file; in that case the next write extends the file, and the bytes in between that were never written read back as 0. This part of the file is called a "file hole". A file hole does not need to occupy disk space.

Verify by experiment:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    char buf1[] = "abcdefghij";
    char buf2[] = "ABCDEFGHIJ";

    int main(int argc, char *argv[])
    {
        int fd;

        if ((fd = open("file.hole", O_RDWR | O_CREAT, S_IRUSR | S_IWUSR)) < 0)
            perror(argv[0]);
        if (write(fd, buf1, 10) != 10)
            perror(argv[0]);
        /* Seek past the end of the file; the next write leaves a hole. */
        if (lseek(fd, 16384, SEEK_SET) == -1)
            perror(argv[0]);
        if (write(fd, buf2, 10) != 10)
            perror(argv[0]);
        return 0;
    }

The results of the operation are as follows:

    gcc -o test_createhole test_createhole.c
    ./test_createhole
    od -c file.hole
    0000000   a   b   c   d   e   f   g   h   i   j  \0  \0  \0  \0  \0  \0
    0000020  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
    *
    0040000   A   B   C   D   E   F   G   H   I   J
    0040012
    ls -ls file.hole
    8 -rw------- 1 16394 June 14:53 file.hole

Next, look at the case of a file without a hole. The source code is as follows:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    char buf1[] = "abcdefghij";
    char buf2[] = "ABCDEFGHIJ";
    char buf3[] = "\0";

    int main(int argc, char *argv[])
    {
        int fd;
        int i;

        if ((fd = open("file.nohole", O_RDWR | O_CREAT, S_IRUSR | S_IWUSR)) < 0)
            perror(argv[0]);
        /* Fill the whole file with explicitly written zero bytes. */
        i = 0;
        while (i < 16394) {
            if (write(fd, buf3, 1) != 1)
                perror(argv[0]);
            i++;
        }
        if (lseek(fd, 0, SEEK_SET) == -1)
            perror(argv[0]);
        if (write(fd, buf1, 10) != 10)
            perror(argv[0]);
        if (lseek(fd, 16384, SEEK_SET) == -1)
            perror(argv[0]);
        if (write(fd, buf2, 10) != 10)
            perror(argv[0]);
        return 0;
    }

The results of the operation are as follows:

    gcc -o test_createnohole test_createnohole.c
    ./test_createnohole
    od -c file.nohole
    0000000   a   b   c   d   e   f   g   h   i   j  \0  \0  \0  \0  \0  \0
    0000020  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
    *
    0040000   A   B   C   D   E   F   G   H   I   J
    0040012

You can see that the contents of the two files are exactly the same. Now compare the two files:

    ls -ls file.hole file.nohole
     8 -rw------- 1 16394 June 14:53 file.hole
    20 -rw------- 1 16394 June 15:19 file.nohole

The two files have the same length, but the file with the hole actually occupies fewer disk blocks (8 versus 20).
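The same comparison can be made from a program with stat, which reports both the logical size and the number of allocated blocks. The sketch below is my own (make_hole_file and print_usage are hypothetical names). Note that st_blocks counts 512-byte units on Linux, so the numbers need not match the first column of "ls -ls", and the amount actually allocated depends on the file system.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: write 10 bytes, seek to offset skip, then write
 * 10 more bytes. Returns 0 on success, -1 on error. */
int make_hole_file(const char *path, off_t skip)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    int ok;

    if (fd < 0)
        return -1;
    ok = write(fd, "abcdefghij", 10) == 10
      && lseek(fd, skip, SEEK_SET) != (off_t)-1
      && write(fd, "ABCDEFGHIJ", 10) == 10;
    close(fd);
    return ok ? 0 : -1;
}

/* Hypothetical helper: print the logical size and the allocated blocks. */
int print_usage(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return -1;
    printf("%s: size=%lld bytes, st_blocks=%lld\n", path,
           (long long)st.st_size, (long long)st.st_blocks);
    return 0;
}
```

With skip set to 16384 the reported size is 16394 bytes, as in the experiment, while st_blocks shows how much of that is really allocated.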

Section 3.7: the read function. First, the function prototype:

    #include <unistd.h>
    ssize_t read(int fd, void *buf, size_t nbytes);

Return value: the read starts at the current offset and returns the number of bytes read, 0 if the end of the file has been reached, or -1 if an error occurs. Note that read may actually return fewer bytes than were requested.
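Because read may return fewer bytes than requested, callers that need a full buffer usually loop. The following is a common idiom rather than something taken from this chapter; the name read_full and the EINTR retry are my assumptions.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read until n bytes have arrived,
 * end of file is reached, or an error occurs.
 * Returns the number of bytes actually read, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t n)
{
    size_t done = 0;

    while (done < n) {
        ssize_t r = read(fd, (char *)buf + done, n - done);

        if (r == 0)
            break;                 /* end of file */
        if (r < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        done += (size_t)r;
    }
    return (ssize_t)done;
}
```

A return value smaller than n then means end of file was reached, not merely that one read call came back short.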

Section 3.8: the write function. Again, first the function prototype and return value:

    #include <unistd.h>
    ssize_t write(int fd, const void *buf, size_t nbytes);

Return value: the number of bytes written on success, or -1 if an error occurs. For a regular file, the write starts at the file's current offset. If the return value differs from the nbytes argument, an error has occurred and some data was not written. A common cause of write errors is that the disk is full or that the file size limit for the process has been exceeded. The file size limit here is the RLIMIT_FSIZE resource limit of the process, which caps the maximum size in bytes of a file that it can create.
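APUE treats a short count from write as an error. A defensive variant that some programs use instead, shown here purely as my own assumption and not as the book's approach, retries the remaining bytes until everything is written or write itself fails. The name write_all is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write exactly n bytes unless an error occurs.
 * Returns 0 on success, -1 on error (errno is set by write). */
int write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;

    while (n > 0) {
        ssize_t w = write(fd, p, n);

        if (w < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        p += w;                    /* skip the part already written */
        n -= (size_t)w;
    }
    return 0;
}
```

If the disk is full or RLIMIT_FSIZE is exceeded, the retry fails with -1 and errno tells which of the two it was.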

Section 3.9 mainly discusses I/O efficiency, so I will not go into it here.

Section 3.10 first introduces the data structures the kernel uses for I/O. Here is a blog post that covers them, with some diagrams that I will not copy: http://www.linuxidc.com/Linux/2015-01/111700.htm

Building on these data structures, the operations described earlier can be explained further:

    1. After each write completes, the current file offset in the file table entry is increased by the number of bytes written. If this causes the current file offset to exceed the current file length, the current file length is set to the current file offset, that is, the file is extended.
    2. If a file is opened with the O_APPEND flag, the current file offset is first set to the current file length before each write, so the data is always appended to the current end of the file.
    3. If lseek is used to position a file at its current end, the current file offset in the file table entry is set to the file length. This looks the same as the previous item, and for a single process it is, but with multiple processes the behavior differs. Suppose processes A and B both carry out "seek to the end of the file with lseek, then write" on the same file. Because of process scheduling, the steps may interleave. If A runs the whole sequence before B starts (or B before A), there is no problem. But suppose A first seeks to the end of the file, say offset 100; then B seeks to the end and is also positioned at offset 100; then A writes, and after A finishes B writes. Now there is a problem: B also writes starting at offset 100, overwriting what A wrote. If the file is opened with O_APPEND instead, the current file offset is set to the file length before every write, so the offset always points to the current end of the file and the overwrite cannot happen.
    4. The lseek function modifies only the current file offset in the file table entry; it does not perform any I/O.
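The interleaving in item 3 can be reproduced inside one process by opening the same file twice, since each open gets its own file table entry. This is a sketch under that assumption; two_writers is a made-up name, and the file starts out empty, so the shared "end of file" is offset 0 instead of 100.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: simulate writers A and B that open the same file
 * separately, both seek to the end, then A writes and B writes.
 * Returns the final file size, or -1 on error. */
off_t two_writers(const char *path, int use_append)
{
    int flags = O_WRONLY | (use_append ? O_APPEND : 0);
    struct stat st;
    int a, b, ok;

    /* Start from an empty file. */
    a = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (a < 0)
        return -1;
    close(a);

    a = open(path, flags);
    b = open(path, flags);
    if (a < 0 || b < 0) {
        if (a >= 0)
            close(a);
        if (b >= 0)
            close(b);
        return -1;
    }
    /* Both seek to the end before either one writes (the bad ordering). */
    ok = lseek(a, 0, SEEK_END) != (off_t)-1
      && lseek(b, 0, SEEK_END) != (off_t)-1
      && write(a, "AAA", 3) == 3
      && write(b, "BBB", 3) == 3
      && fstat(a, &st) == 0;
    close(a);
    close(b);
    return ok ? st.st_size : -1;
}
```

Without O_APPEND the second write lands on top of the first and the file ends up 3 bytes long; with O_APPEND both writes survive and the file is 6 bytes long.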

APUE also discusses the difference in scope between file descriptor flags and file status flags: the former apply only to a single descriptor in a single process, while the latter apply to all descriptors, in any process, that point to the given file table entry. (Different file descriptors in the same process may share a file table entry; in that case they share the file status flags, the current file offset, and so on.)
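One way to see this sharing is to duplicate a descriptor with dup (covered in Section 3.12) and compare it with a second, independent open of the same file. The helpers below are a sketch of my own; shares_offset and has_cloexec are hypothetical names.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd1 and fd2 appear to share one file
 * table entry (an offset change through fd1 is visible through fd2),
 * 0 if not, -1 on error. The offset of fd1 is restored afterwards. */
int shares_offset(int fd1, int fd2)
{
    off_t old = lseek(fd1, 0, SEEK_CUR);
    off_t before = lseek(fd2, 0, SEEK_CUR);
    off_t seen;

    if (old == (off_t)-1 || before == (off_t)-1)
        return -1;
    if (lseek(fd1, old + 7, SEEK_SET) == (off_t)-1)
        return -1;
    seen = lseek(fd2, 0, SEEK_CUR);
    lseek(fd1, old, SEEK_SET);
    return before == old && seen == old + 7;
}

/* Hypothetical helper: 1 if FD_CLOEXEC is set on fd, 0 if not, -1 on error. */
int has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    return flags == -1 ? -1 : (flags & FD_CLOEXEC) != 0;
}
```

For a descriptor and its dup copy, shares_offset should return 1, while setting FD_CLOEXEC on one of them leaves has_cloexec at 0 for the other. For two separate open calls on the same file, shares_offset should return 0.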

Section 3.11 introduces the concept of atomic operations; an example was already covered in the analysis of the previous section.

Section 3.12 describes the functions used to duplicate an existing file descriptor. Their prototypes are as follows:

    #include <unistd.h>
    int dup(int fd);
    int dup2(int fd, int fd2);

Return value: the new file descriptor on success, or -1 if an error occurs.

The new file descriptor returned by dup is guaranteed to be the lowest-numbered descriptor currently available; the fd parameter is the descriptor being copied, and the FD_CLOEXEC flag is not shared between the two descriptors. With dup2, the fd2 parameter specifies the value of the new descriptor; if fd2 is already open, it is closed first. If fd equals fd2, dup2 returns fd2 without closing it. Otherwise (fd is not equal to fd2), the FD_CLOEXEC file descriptor flag of fd2 is cleared, so fd2 remains open when the process calls exec. First, the meaning of FD_CLOEXEC: "close on exec, not on fork". That is, if FD_CLOEXEC is set on a descriptor, the descriptor is closed and can no longer be used in a program started with execl, but in a child process created with fork it is not closed and can still be used. However, if fd is copied with dup2, the FD_CLOEXEC flag of fd2 is cleared, and fd2 stays open when the process calls exec. Let's verify this with a simple experiment; the verification program below comes from this blog post: http://blog.csdn.net/ustc_dylan/article/details/6930189

Using what was just learned, verify it below. First, look at a problematic example:

    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/wait.h>

    int main(void)
    {
        int fd, pid, newfd;    /* newfd is left uninitialized: this is the problem */
        char buffer[20];

        fd = open("wo.txt", O_RDONLY);
        printf("%d\n", fd);
        int val = fcntl(fd, F_GETFD);
        val |= FD_CLOEXEC;
        fcntl(fd, F_SETFD, val);

        pid = fork();
        if (pid == 0) {
            /* Child process: the descriptor is not closed and can still be used. */
            char child_buf[2];
            memset(child_buf, 0, sizeof(child_buf));
            ssize_t bytes = read(fd, child_buf, sizeof(child_buf) - 1);
            printf("child, bytes:%ld,%s\n\n", bytes, child_buf);

            /* In the program run by execl, this descriptor is closed and unusable. */
            char fd_str[5];
            memset(fd_str, 0, sizeof(fd_str));
            printf("newfd = %d\n", newfd);
            if ((newfd = dup2(fd, newfd)) == -1)
                perror("dup2 error:");
            printf("%d\n", newfd);
            sprintf(fd_str, "%d", newfd);
            int ret = execl("./exe1", "exe1", fd_str, NULL);
            if (-1 == ret)
                perror("execl fail:");
        }
        waitpid(pid, NULL, 0);
        memset(buffer, 0, sizeof(buffer));
        ssize_t bytes = read(fd, buffer, sizeof(buffer) - 1);
        printf("parent, bytes:%ld,%s\n\n", bytes, buffer);
    }

The results of the operation are as follows:

    3
    child, bytes:1,t

    exe1: read fail:: Bad file descriptor
    parent, bytes:14,his is a test

newfd in the program above is not initialized; by coincidence it has the value 0 in the child process (one open question: it is 0 on every run, although an uninitialized local variable should hold an arbitrary value). Since the initial value of my newfd is 0, the rule for dup2, "if fd2 is already open, close it first", means file descriptor 0 is closed. dup2 then clears the FD_CLOEXEC flag on the new descriptor, so after the execl call the new program uses file descriptor 0, which is still open. The results of the operation are as follows:

    3
    child, bytes:1,t

    newfd = 0
    exe1: read 14,his is a test
    parent, bytes:0,

If newfd is set to 3, dup2 returns fd2 (here 3, which is already open), but because fd equals fd2 the FD_CLOEXEC flag is not cleared, so file descriptor 3 is closed when execl is called.

Change it to 4, and the program works again. If dup is used instead, the value of newfd is 4 and the program runs correctly.
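The dup2 behavior that decides these three cases can be checked without exec by reading the descriptor flags directly. This is a minimal sketch; the helper name cloexec_after_dup2 is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: duplicate fd onto newfd with dup2 and report whether
 * the result has FD_CLOEXEC set: 1 if set, 0 if clear, -1 on error. */
int cloexec_after_dup2(int fd, int newfd)
{
    int flags;

    if (dup2(fd, newfd) == -1)
        return -1;
    flags = fcntl(newfd, F_GETFD);
    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) != 0;
}
```

If FD_CLOEXEC is set on fd, the function should return 0 when newfd differs from fd (the flag is cleared on the copy) and 1 when newfd equals fd (nothing is changed), which matches the results for newfd values 4 and 3 above.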





