Six-Star Classic CSAPP Notes (10): System I/O


1. Unix I/O

The runtime systems of most languages provide higher-level abstractions for I/O. For example, ANSI C provides the standard I/O library, with buffered functions such as printf and scanf, and C++ overloads the << and >> operators for reading and writing. On Unix systems these higher-level functions are built on top of the Unix system I/O functions, so most of the time we do not need to call Unix I/O directly. Still, learning Unix I/O helps in understanding other system concepts, and when the higher-level functions are not enough (for example, for reading a file's metadata), it lets us implement what we need ourselves.

Unix/Linux models every I/O device uniformly and elegantly as a file, and exposes a simple, low-level API for it: Unix I/O. The API consists mainly of the following five functions:

    • open(): Asks the kernel to open a file so that the application can access the corresponding I/O device. The kernel returns a small non-negative integer called a descriptor. The kernel keeps a set of data structures for every open file (see the Linux kernel for details); you can think of the descriptor as the file's ID inside the kernel, and every subsequent operation on the file requires it.
    • lseek(): Changes the current file position k, a byte offset from the start of the file. k advances automatically whenever the file is read or written, and can also be set explicitly by calling lseek().
    • read(): Copies n > 0 bytes from the file, starting at position k, into memory, and then advances k by n. If k is at or beyond the file size when read() is called, the end-of-file (EOF) condition is triggered.
    • write(): Likewise copies n > 0 bytes from memory to the file, starting at position k, and then updates k.
    • close(): Tells the kernel to free the data structures it keeps for the file and to return the descriptor to the pool of available descriptors. When a process terminates, the kernel automatically closes all of its open files.

What exactly is EOF?
First, be clear that there is no such thing as an EOF character. EOF is a condition detected by the operating system kernel. An application learns that EOF has been reached when read() returns 0. For a disk file, EOF means that the position k has reached or passed the file size. For a network connection, EOF means that the peer has closed its end of the connection, which the other end then detects as EOF.

2. Opening and Closing Files

Let's look at the open() and close() functions in more detail:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

/* Returns: new file descriptor if OK, -1 on error */
int open(char *filename, int flags, mode_t mode);

#include <unistd.h>

/* Returns: zero if OK, -1 on error */
int close(int fd);

The flags argument specifies how the process intends to access the file. It is built by OR-ing together one access mode and any number of additional flags:

    • O_RDONLY: Read only
    • O_WRONLY: Write only
    • O_RDWR: Read and write
    • O_CREAT: Create the file if it does not exist
    • O_TRUNC: Truncate the file if it already exists
    • O_APPEND: Set position k to the end of the file before every write, i.e. append mode

The mode argument specifies the access permissions of a newly created file; it is only needed when O_CREAT is given and can be omitted otherwise. Note that each process has a umask, set by the umask() function, and a file created by open() ends up with the access permissions mode & ~umask. The permission bits are:

    • S_IRUSR / S_IWUSR / S_IXUSR: Read, write, and execute permission for the owner
    • S_IRGRP / S_IWGRP / S_IXGRP: Read, write, and execute permission for members of the owner's group
    • S_IROTH / S_IWOTH / S_IXOTH: Read, write, and execute permission for everyone else
3. Read and Write Data

read() and write() copy at most n bytes. A return value of -1 indicates an error, 0 indicates EOF (read() only), and a value greater than 0 is the number of bytes actually copied.

#include <unistd.h>

/* Returns: number of bytes read if OK, 0 on EOF, -1 on error */
ssize_t read(int fd, void *buf, size_t n);

/* Returns: number of bytes written if OK, -1 on error */
ssize_t write(int fd, const void *buf, size_t n);

What is the difference between ssize_t and size_t?
size_t is an unsigned integer type and ssize_t is its signed counterpart (on x86-64 Linux they are unsigned long and long; on 32-bit systems, unsigned int and int). Because read() and write() return -1 to signal an error, their return type has to be the signed ssize_t.

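#include <unistd.h>

int main(int argc, char const *argv[])
{
    char c;

    /* Copy standard input to standard output one byte at a time;
     * stop on EOF (0) or on error (-1) */
    while (read(STDIN_FILENO, &c, 1) > 0) {
        write(STDOUT_FILENO, &c, 1);
    }
    return 0;
}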

In the following situations, read() and write() may transfer fewer than n bytes (a short count):

    • Encountering EOF on a read: If we read in 50-byte chunks and only 20 bytes remain in the file, read() returns a short count of 20; the next read() then signals EOF by returning 0.
    • Reading from a terminal: Each read() transfers one line of text at a time.
    • Reading and writing sockets: Internal buffering limits and network delays can cause only part of the data to be transferred. The same happens when reading and writing Unix pipes and other IPC channels.

When reading from a disk file, by contrast, you will in practice never see a short count except at EOF, and you will never see a short count when writing to a disk file either.

4. File Metadata

You can retrieve a file's metadata either by filename or by descriptor.

#include <unistd.h>
#include <sys/stat.h>

/* Returns: 0 if OK, -1 on error */
int stat(const char *filename, struct stat *buf);
int fstat(int fd, struct stat *buf);

/* Metadata returned by the stat and fstat functions */
struct stat {
    dev_t         st_dev;      /* Device */
    ino_t         st_ino;      /* Inode */
    mode_t        st_mode;     /* Protection and file type */
    nlink_t       st_nlink;    /* Number of hard links */
    uid_t         st_uid;      /* User ID of owner */
    gid_t         st_gid;      /* Group ID of owner */
    dev_t         st_rdev;     /* Device type (if inode device) */
    off_t         st_size;     /* Total size, in bytes */
    unsigned long st_blksize;  /* Blocksize for filesystem I/O */
    unsigned long st_blocks;   /* Number of blocks allocated */
    time_t        st_atime;    /* Time of last access */
    time_t        st_mtime;    /* Time of last modification */
    time_t        st_ctime;    /* Time of last change */
};

Here is a small example that combines opening and closing a file with reading its metadata: it creates a new file, then uses fstat() to read the new file's access permissions and prints them in a format similar to that of the ls -l command:

#include <stdio.h>      /* printf, fprintf */
#include <stdlib.h>     /* malloc, free, exit */
#include <fcntl.h>      /* open */
#include <sys/types.h>  /* mode_t */
#include <sys/stat.h>   /* fstat, umask */
#include <unistd.h>     /* close */

mode_t getumask(void);
char *mode2str(mode_t mode);

int main(int argc, char const *argv[])
{
    int fd;
    struct stat st;
    char *str;

    /* 1. Print the default umask. A typical default value for the
     *    process umask is S_IWGRP | S_IWOTH (octal 022). */
    str = mode2str(getumask());
    printf("umask: %s\n", str);
    free(str);

    /* 2. Create a new file */
    if ((fd = open("foo.txt", O_WRONLY | O_CREAT, S_IRUSR | S_IWUSR)) == -1) {
        fprintf(stderr, "Create file failed\n");
        exit(1);
    }

    /* 3. Check that the new file's mode is mode & ~umask */
    if (fstat(fd, &st) == 0) {
        str = mode2str(st.st_mode);
        printf("file mode: %s\n", str);
        free(str);
    } else {
        printf("Get metadata failed\n");
    }

    /* 4. Close the file */
    close(fd);
    return 0;
}

/*
 * glibc documents a getumask(), but it is not portable, so we read the
 * umask by setting it to 0 and immediately restoring it.
 * Returns the current umask.
 */
mode_t getumask(void)
{
    mode_t mode = umask(0);
    umask(mode);
    return mode;
}

/*
 * Formats a mode the way `ls -l` does, e.g. "-rw-------".
 * Returns a malloc'ed string that the caller must free.
 */
char *mode2str(mode_t mode)
{
    char *str = malloc(11 * sizeof(char));  /* 10 characters + '\0' */
    int i = 0;

    if (str == NULL) {
        fprintf(stderr, "Out of memory\n");
        exit(1);
    }
    str[i++] = S_ISDIR(mode) ? 'd' : '-';
    str[i++] = mode & S_IRUSR ? 'r' : '-';
    str[i++] = mode & S_IWUSR ? 'w' : '-';
    str[i++] = mode & S_IXUSR ? 'x' : '-';
    str[i++] = mode & S_IRGRP ? 'r' : '-';
    str[i++] = mode & S_IWGRP ? 'w' : '-';
    str[i++] = mode & S_IXGRP ? 'x' : '-';
    str[i++] = mode & S_IROTH ? 'r' : '-';
    str[i++] = mode & S_IWOTH ? 'w' : '-';
    str[i++] = mode & S_IXOTH ? 'x' : '-';
    str[i] = '\0';
    return str;
}
5. File sharing

Unix offers several ways to share files. Before looking at them, we need to know which data structures the kernel maintains for open files, because it is these data structures that make the different kinds of sharing possible:

    • Descriptor table: Each process has its own descriptor table. Its entries are indexed by the process's open file descriptors, and each open entry points to an entry in the file table.
    • File table: The file table is shared by all processes. Each entry holds the current file position, a reference count of the descriptor entries that point to it (there may be several), and a pointer to an entry in the v-node table. When the reference count drops to 0, the kernel deletes the file table entry.
    • v-node table: The v-node table is also shared by all processes. Each entry holds most of the metadata found in the stat structure.

Let's look at the three most common cases.

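5.1 Not shared

Each descriptor refers to a different file through its own file table entry and its own v-node table entry.

5.2 The same file is opened multiple times

Each call to open() creates a new file table entry with its own file position, but all of these entries point to the same v-node table entry.

5.3 Child process inherits files opened by the parent process

After fork(), the child gets a copy of the parent's descriptor table, so parent and child share the same file table entries and therefore the same file positions.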

Here is a small example. open_twice() opens the same file twice and reads one character through each of the two descriptors; both reads return 'f', which shows that the two descriptors share a v-node table entry but not a file table entry. inherit_parent_fd() opens the file and forks a child process; the child reads one character first, and the parent waits for the child to exit and then reads one character itself. The parent reads 'o', which shows that the file table entry is shared: the child's read advanced the position k. (The example assumes that foo.txt starts with the characters "fo".)

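#include <stdio.h>      /* printf, perror, fflush */
#include <stdlib.h>     /* exit */
#include <fcntl.h>      /* open */
#include <unistd.h>     /* read, close, fork */
#include <sys/wait.h>   /* wait */

void open_twice(const char *filename);
void inherit_parent_fd(const char *filename);

int main(int argc, char const *argv[])
{
    open_twice("foo.txt");
    inherit_parent_fd("foo.txt");
    return 0;
}

void open_twice(const char *filename)
{
    int fd1, fd2;
    char c = '?';

    fd1 = open(filename, O_RDONLY);
    fd2 = open(filename, O_RDONLY);
    if (fd1 == -1 || fd2 == -1) {
        perror(filename);
        exit(1);
    }
    read(fd1, &c, 1);
    read(fd2, &c, 1);   /* fd2 has its own position, so it reads byte 0 again */
    printf("c = %c\n", c);
    close(fd1);
    close(fd2);
}

void inherit_parent_fd(const char *filename)
{
    int fd;
    char c = '?';

    fd = open(filename, O_RDONLY);
    if (fd == -1) {
        perror(filename);
        exit(1);
    }
    fflush(stdout);     /* keep buffered output from being duplicated in the child */
    if (fork() == 0) {
        read(fd, &c, 1);    /* the child advances the shared position */
        exit(0);
    }
    wait(NULL);
    read(fd, &c, 1);        /* the parent continues where the child stopped */
    printf("c = %c\n", c);
    close(fd);
}
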
6. Standard I/O

6.1 Standard I/O Library

ANSI C defines a set of higher-level I/O functions, the standard I/O library (part of libc), which provides an alternative to Unix I/O:

    • Opening and closing files: fopen() and fclose()
    • Reading and writing bytes: fread() and fwrite()
    • Reading and writing strings: fgets() and fputs()
    • Formatted I/O: scanf() and printf()

The standard I/O library models an open file as a stream. To the programmer, a stream is a pointer to a structure of type FILE. A stream of type FILE is essentially a file descriptor plus a buffer: the library maintains the buffer internally so that it does not have to make an expensive system call for every small read or write. Every ANSI C program also starts with three open streams, which correspond to standard input, standard output, and standard error:

#include <stdio.h>

extern FILE *stdin;     /* Standard input (descriptor 0) */
extern FILE *stdout;    /* Standard output (descriptor 1) */
extern FILE *stderr;    /* Standard error (descriptor 2) */

Most C programmers are advised to stick to the standard I/O library throughout their careers rather than calling the underlying Unix I/O directly. However, the standard I/O library has some restrictions that get in the way of full-duplex network communication:

    • Restriction 1: An input function cannot follow an output function without an intervening call to fflush(), fseek(), fsetpos(), or rewind(). This can be worked around by calling fflush() before every input operation.
    • Restriction 2: An output function cannot follow an input function without an intervening call to fseek(), fsetpos(), or rewind(), unless the input function encounters EOF. Because it is illegal to use lseek() on a socket, these functions cannot be used there; the only workaround is to open two streams on the same socket, one for reading and one for writing.
6.2 I/O redirection

The Unix shell provides I/O redirection operators, as in unix> ls > foo.txt. Redirection is implemented with the dup2() function. dup2() copies descriptor table entry oldfd over descriptor table entry newfd; if newfd was already open, dup2() closes it first and then copies. For example, stdout is descriptor 1 by default. Assuming foo.txt was opened as descriptor 4, dup2(4, 1) copies descriptor table entry 4 over entry 1, so stdout ends up pointing to the same file table entry as descriptor 4, the one for foo.txt.

#include <unistd.h>

/* Returns: nonnegative descriptor if OK, -1 on error */
int dup2(int oldfd, int newfd);
