System-Level I/O
10.1 Unix I/O
(1) A Unix file is a sequence of m bytes: B0, B1, ..., Bk, ..., Bm-1.
(2) All I/O devices, such as networks, disks, and terminals, are modeled as files, and all input and output is performed by reading and writing the corresponding files. This mapping gives the kernel a simple, low-level application interface known as Unix I/O, which lets all input and output be performed in a uniform way:
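-Opening files. An application announces that it wants to access an I/O device by asking the kernel to open the corresponding file. The kernel returns a small non-negative integer called a descriptor, which identifies the file in all subsequent operations on it. The kernel keeps track of all information about the open file; the application only needs to remember the descriptor.
-Every process created by a Unix shell starts with three open files: standard input (descriptor 0), standard output (descriptor 1), and standard error (descriptor 2). The header file <unistd.h> defines the constants STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO, which can be used in place of the explicit descriptor values.
-Changing the current file position. For each open file, the kernel maintains a file position k, initially 0. The file position is a byte offset from the beginning of the file. An application can set the current file position k explicitly by performing a seek operation.
-Reading and writing files. A read operation copies n > 0 bytes from the file to memory, starting at the current file position k, and then increments k by n. Given a file of size m bytes, performing a read when k >= m triggers a condition known as end-of-file (EOF), which the application can detect. There is no explicit "EOF character" at the end of a file.
Similarly, a write operation copies n > 0 bytes from memory to the file, starting at the current file position k, and then updates k.
-Closing files. When an application has finished accessing a file, it asks the kernel to close it. In response, the kernel frees the data structures it created when the file was opened and returns the descriptor to the pool of available descriptors. When a process terminates for any reason, the kernel closes all of its open files and frees their memory resources.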
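The open, seek, read/write, and close steps above can be sketched in a few lines of C. This is only an illustration: the helper name write_seek_read and the test string are hypothetical, and lseek is used here as the concrete seek call.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: writes "hello, world" to `path`, moves the file
 * position back to byte offset 7, and reads up to `n` bytes into `buf`.
 * Returns the number of bytes read, or -1 on error. */
ssize_t write_seek_read(const char *path, char *buf, size_t n)
{
    /* Open: the kernel hands back a small non-negative descriptor. */
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Write: the file position k moves from 0 to 12. */
    if (write(fd, "hello, world", 12) != 12) {
        close(fd);
        return -1;
    }

    /* Seek: set k = 7 explicitly. */
    if (lseek(fd, 7, SEEK_SET) < 0) {
        close(fd);
        return -1;
    }

    /* Read: copies at most n bytes starting at k, then advances k. */
    ssize_t nread = read(fd, buf, n);

    /* Close: the descriptor returns to the pool of available descriptors. */
    close(fd);
    return nread;
}
```

With a 16-byte buffer, only the 5 bytes "world" remain between offset 7 and the end of the file, so the read returns fewer bytes than requested.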
10.2 Opening and closing files
(1) The open function converts a filename to a file descriptor and returns the descriptor number. The returned descriptor is always the smallest descriptor that is not currently open in the process. The flags argument indicates how the process intends to access the file:
O_RDONLY: Reading only.
O_WRONLY: Writing only.
O_RDWR: Reading and writing.
The flags argument can also be ORed with one or more bit masks that provide additional instructions for writing:
O_CREAT: If the file does not exist, create a truncated (empty) version of it.
O_TRUNC: If the file already exists, truncate it.
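As a small sketch of how the access flag and the extra bit masks combine, the hypothetical helper below opens a file for writing, creating it if needed and truncating it otherwise. The helper name and the 0644 permission bits are assumptions made for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: creates `path` if it does not exist, truncates it
 * if it does, and opens it write-only.
 * Returns the new descriptor, or -1 on error. */
int open_for_overwrite(const char *path)
{
    /* One access mode (O_WRONLY) ORed with two extra bit masks.
     * 0644 = owner read/write, group and others read-only; it only
     * matters when the file is actually created. */
    return open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
}
```

Calling the helper, closing the descriptor, and calling it again typically returns the same number, because open always picks the smallest descriptor that is not currently open.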
(2)-A process opens an existing file or creates a new file by calling the open function.
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(char *filename, int flags, mode_t mode); // Returns: new file descriptor if OK, -1 on error
-Closing files
#include <unistd.h>
int close(int fd); // Returns: 0 if OK, -1 on error
-Reading files
ssize_t read(int fd, void *buf, size_t n); // Returns: number of bytes read if OK, 0 on EOF, -1 on error
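-Reliable reading
#include <errno.h>
#include <unistd.h>
ssize_t rio_readn(int fd, void *usrbuf, size_t n)
{
    size_t nleft = n;
    ssize_t nread;
    char *bufp = usrbuf;

    while (nleft > 0) {
        if ((nread = read(fd, bufp, nleft)) < 0) {
            if (errno == EINTR) /* Interrupted by signal handler return */
                nread = 0;      /* Call read() again */
            else
                return -1;      /* errno set by read() */
        }
        else if (nread == 0)
            break;              /* EOF */
        nleft -= nread;
        bufp += nread;
    }
    return (n - nleft);         /* Bytes actually read, >= 0 */
}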
-Writing files
#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t n); // Returns: number of bytes written if OK, -1 on error
-Reliable writing
ssize_t rio_writen(int fd, void *usrbuf, size_t n)
{
    size_t nleft = n;
    ssize_t nwritten;
    char *bufp = usrbuf;

    while (nleft > 0) {
        if ((nwritten = write(fd, bufp, nleft)) <= 0) {
            if (errno == EINTR) /* Interrupted by signal handler return */
                nwritten = 0;   /* Call write() again */
            else
                return -1;      /* errno set by write() */
        }
        nleft -= nwritten;
        bufp += nwritten;
    }
    return n;                   /* All n bytes were written */
}
10.3 Reading and writing files
(1) Applications perform input and output by calling the read and write functions, respectively.
#include <unistd.h>
ssize_t read(int fd, void *buf, size_t n); // Returns: number of bytes read if OK, 0 on EOF, -1 on error
ssize_t write(int fd, const void *buf, size_t n); // Returns: number of bytes written if OK, -1 on error
(2) The read function copies at most n bytes from the current file position of descriptor fd to memory location buf. A return value of -1 indicates an error, and a return value of 0 indicates EOF. Otherwise, the return value is the number of bytes actually transferred.
(3) The write function copies at most n bytes from memory location buf to the current file position of descriptor fd.
(4) In some situations, read and write transfer fewer bytes than the application requests. Such short counts do not indicate an error. They occur for the following reasons:
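-Encountering EOF on reads. Suppose we are about to read from a file that contains only 20 more bytes from the current file position, and we are reading in 50-byte chunks. The next read then returns a short count of 20, and the read after that signals EOF by returning a short count of 0.
-Reading text lines from a terminal. If the open file is associated with a terminal (a keyboard and display), each read transfers one text line at a time and returns a short count equal to the size of the text line.
-Reading and writing network sockets. If the open file corresponds to a network socket, internal buffering constraints and long network delays can cause read and write to return short counts. Short counts can also occur when calling read and write on a Unix pipe, an interprocess communication mechanism that is beyond the scope of this discussion.
In practice, you will not encounter short counts when reading from disk files except at EOF, and you will not encounter them when writing to disk files. However, to build robust network applications such as Web servers, you must handle short counts by calling read and write repeatedly until all requested bytes have been transferred.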
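The EOF case described above (20 bytes left, 50-byte chunks) can be reproduced with a small sketch. The helper name demo_short_count and the file contents are hypothetical, chosen only to match the numbers in the text.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: creates a 20-byte file at `path`, rewinds it, and
 * reads it twice in 50-byte chunks. The two read() return values are
 * stored in *first and *second.
 * Returns 0 if the setup succeeded, -1 otherwise. */
int demo_short_count(const char *path, ssize_t *first, ssize_t *second)
{
    char chunk[50];
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    if (write(fd, "01234567890123456789", 20) != 20 ||
        lseek(fd, 0, SEEK_SET) < 0) {
        close(fd);
        return -1;
    }

    *first = read(fd, chunk, sizeof(chunk));   /* Short count: only 20 bytes exist */
    *second = read(fd, chunk, sizeof(chunk));  /* File position is at the end: EOF */

    close(fd);
    return 0;
}
```

Neither return value is an error: the first call reports 20 and the second reports 0, which is exactly how the application detects EOF.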
10.4 Robust reading and writing with the Rio package
(1) The Rio package handles short counts automatically. Rio provides two kinds of functions:
-Unbuffered input and output functions. These functions transfer data directly between memory and a file, with no application-level buffering. They are especially useful for reading and writing binary data to and from networks.
-Buffered input functions. These functions let you efficiently read text lines and binary data from a file whose contents are cached in an application-level buffer, similar to the one provided for standard I/O functions such as printf. The buffered Rio input functions are thread-safe and can be interleaved on the same descriptor. For example, you can read some text lines from a descriptor and then read some binary data.
(2) Unbuffered input and output functions of Rio
By calling the rio_readn and rio_writen functions, applications can transfer data directly between memory and a file.
(3) Buffered input functions of Rio
A text line is a sequence of ASCII characters terminated by a newline character. On Unix systems, the newline character ('\n') is the same as the ASCII line feed character (LF) and has the numeric value 0x0a.
Suppose we want a program that counts the number of text lines in a text file. One approach is to use the read function to transfer one byte at a time from the file to user memory, checking each byte for the newline character. The disadvantage of this approach is that it is inefficient: it requires a trap to the kernel for every byte in the file.
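One way to avoid a kernel trap per byte is to read a large chunk at a time and scan it for newlines in user memory. The following is a minimal sketch of that idea, not the Rio implementation; the helper name count_lines and the 8192-byte chunk size are assumptions.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: counts the newline characters that can be read
 * from descriptor fd, using one read() call per 8192-byte chunk instead
 * of one per byte. Returns the line count, or -1 on error. */
long count_lines(int fd)
{
    char buf[8192];
    long lines = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n < 0) {
            if (errno == EINTR)   /* Interrupted by a signal: retry */
                continue;
            return -1;
        }
        if (n == 0)               /* EOF */
            break;
        for (ssize_t i = 0; i < n; i++)   /* Scan the chunk in user memory */
            if (buf[i] == '\n')
                lines++;
    }
    return lines;
}
```

Short counts need no special handling here, because the loop simply scans however many bytes each read returned.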
(4) The core of the Rio read routines: the rio_read function
static ssize_t rio_read(rio_t *rp, char *usrbuf, size_t n)
{
    int cnt;

    while (rp->rio_cnt <= 0) { /* Refill the internal buffer if it is empty */
        rp->rio_cnt = read(rp->rio_fd, rp->rio_buf, sizeof(rp->rio_buf));
        if (rp->rio_cnt < 0) {
            if (errno != EINTR) /* A real error, not an interrupted call */
                return -1;
        }
        else if (rp->rio_cnt == 0) /* EOF */
            return 0;
        else
            rp->rio_bufptr = rp->rio_buf; /* Reset the pointer to the start of the buffer */
    }

    /* Copy min(n, rp->rio_cnt) bytes from the internal buffer to the user buffer */
    cnt = n;
    if (rp->rio_cnt < n)
        cnt = rp->rio_cnt;
    memcpy(usrbuf, rp->rio_bufptr, cnt);
    rp->rio_bufptr += cnt;
    rp->rio_cnt -= cnt;
    return cnt;
}
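The notes above do not show the rio_t type or a line-reading routine built on rio_read. The sketch below fills those in along the lines of the textbook's Rio package as recalled here; the names RIO_BUFSIZE, rio_readinitb, and rio_readlineb and the 8192-byte buffer size are assumptions not confirmed by these notes, and rio_read is restated so that the block compiles on its own.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

#define RIO_BUFSIZE 8192   /* Assumed internal buffer size */

typedef struct {
    int rio_fd;                /* Descriptor for this internal buffer */
    int rio_cnt;               /* Unread bytes in the internal buffer */
    char *rio_bufptr;          /* Next unread byte in the internal buffer */
    char rio_buf[RIO_BUFSIZE]; /* Internal buffer */
} rio_t;

/* Associate descriptor fd with an empty read buffer. */
void rio_readinitb(rio_t *rp, int fd)
{
    rp->rio_fd = fd;
    rp->rio_cnt = 0;
    rp->rio_bufptr = rp->rio_buf;
}

/* Buffered read: same logic as rio_read in the text. */
static ssize_t rio_read(rio_t *rp, char *usrbuf, size_t n)
{
    while (rp->rio_cnt <= 0) {            /* Refill if the buffer is empty */
        rp->rio_cnt = read(rp->rio_fd, rp->rio_buf, sizeof(rp->rio_buf));
        if (rp->rio_cnt < 0) {
            if (errno != EINTR)
                return -1;
        } else if (rp->rio_cnt == 0) {
            return 0;                     /* EOF */
        } else {
            rp->rio_bufptr = rp->rio_buf;
        }
    }
    size_t cnt = n < (size_t)rp->rio_cnt ? n : (size_t)rp->rio_cnt;
    memcpy(usrbuf, rp->rio_bufptr, cnt);
    rp->rio_bufptr += cnt;
    rp->rio_cnt -= (int)cnt;
    return (ssize_t)cnt;
}

/* Read one text line of at most maxlen-1 bytes (maxlen >= 1 assumed) and
 * NUL-terminate it. Returns the number of bytes read, 0 on EOF with no
 * data, or -1 on error. */
ssize_t rio_readlineb(rio_t *rp, void *usrbuf, size_t maxlen)
{
    char c, *bufp = usrbuf;
    size_t n;

    for (n = 1; n < maxlen; n++) {
        ssize_t rc = rio_read(rp, &c, 1);  /* Usually served from the buffer */
        if (rc == 1) {
            *bufp++ = c;
            if (c == '\n') {
                n++;
                break;
            }
        } else if (rc == 0) {
            if (n == 1)
                return 0;                  /* EOF, no data read */
            break;                         /* EOF, some data was read */
        } else {
            return -1;                     /* Error */
        }
    }
    *bufp = '\0';
    return (ssize_t)(n - 1);
}
```

Although rio_readlineb asks for one byte at a time, almost all of those requests are satisfied from rio_buf, so the kernel is entered only when the buffer runs empty.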
10.5 Reading file metadata
(1) An application can retrieve information about a file (its metadata) by calling the stat and fstat functions.
#include <unistd.h>
#include <sys/stat.h>
int stat(const char *filename, struct stat *buf);
int fstat(int fd, struct stat *buf); // Both return: 0 if OK, -1 on error
(2) File types
-Regular file: contains binary or text data. Macro: S_ISREG()
-Directory file: contains information about other files. Macro: S_ISDIR()
-Socket: a file used to communicate with other processes across a network. Macro: S_ISSOCK()
10.6 Sharing Files
(1) The kernel represents open files with three related data structures:
-Descriptor table: each process has its own descriptor table, and each open descriptor entry points to an entry in the file table.
-File table: shared by all processes. Each entry includes the current file position, a reference count, and a pointer to an entry in the v-node table.
-v-node table: shared by all processes. Each entry contains most of the information in the stat structure.
(2) Three cases of open files
-Typical: each descriptor refers to a distinct file, with no sharing.
-Shared: multiple descriptors refer to the same file through different file table entries. (Key idea: each descriptor has its own file position, so reads on different descriptors can fetch data from different locations in the file.)
-Inherited: a child process inherits the files opened by its parent. After a call to fork, the child has its own copy of the parent's descriptor table, and parent and child share the same set of open file table entries, and therefore the same file position.
For figures, see p. 607.
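The shared and inherited cases can be told apart by watching the file position. The sketch below assumes a file whose contents are "foobar"; both helper names are hypothetical.

```c
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

/* Shared case: two open() calls create two file table entries, so each
 * descriptor has its own file position.
 * Returns the byte read through the second descriptor, or -1 on error. */
int read_via_second_open(const char *path)
{
    char c;
    int result = -1;
    int fd1 = open(path, O_RDONLY);
    int fd2 = open(path, O_RDONLY);

    if (fd1 >= 0 && fd2 >= 0 &&
        read(fd1, &c, 1) == 1 &&   /* Advances fd1's position only */
        read(fd2, &c, 1) == 1)     /* fd2 still starts at offset 0 */
        result = (unsigned char)c;

    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    return result;
}

/* Inherited case: after fork, parent and child share one file table
 * entry, so a read in the child moves the parent's position as well.
 * Returns the byte the parent reads after the child has read one byte,
 * or -1 on error. */
int read_after_child(const char *path)
{
    char c;
    int result = -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {                          /* Child */
        if (read(fd, &c, 1) != 1)            /* Moves the shared position to 1 */
            _exit(1);
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, NULL, 0) == pid &&
        read(fd, &c, 1) == 1)                /* Parent continues at offset 1 */
        result = (unsigned char)c;

    close(fd);
    return result;
}
```

For a file containing "foobar", the first helper yields 'f' because the second descriptor has an independent position, while the second helper yields 'o' because the child already consumed the first byte through the shared file table entry.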
10.7 I/O redirection
(1) The dup2 function copies descriptor table entry oldfd to descriptor table entry newfd, overwriting the previous contents of descriptor table entry newfd.
(2) If newfd was already open, dup2 closes newfd before it copies oldfd.
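A sketch of redirection with dup2: standard output is pointed at a file, one message is written, and the original standard output is restored. The helper name is hypothetical, and dup (which is not discussed in these notes) is used only to save the original descriptor.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: temporarily redirects standard output to `path`,
 * writes `len` bytes of `msg` to descriptor 1, then restores the original
 * standard output. Returns 0 on success, -1 on error. */
int write_via_redirected_stdout(const char *path, const char *msg, size_t len)
{
    int rc = -1;

    int saved = dup(STDOUT_FILENO);          /* Remember the real stdout */
    if (saved < 0)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        close(saved);
        return -1;
    }

    /* Descriptor 1 now refers to the same file table entry as fd. */
    if (dup2(fd, STDOUT_FILENO) >= 0) {
        if (write(STDOUT_FILENO, msg, len) == (ssize_t)len)
            rc = 0;
        if (dup2(saved, STDOUT_FILENO) < 0)  /* Put the real stdout back */
            rc = -1;
    }

    close(fd);
    close(saved);
    return rc;
}
```

The write call still names descriptor 1; only the descriptor table entry behind it changed, which is the same mechanism a shell can use for output redirection.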
10.8 Standard I/O
(1) In many ways, using the standard I/O library is similar to using unbuffered low-level I/O. You first open a file to establish an access path, and the return value of that operation is then passed as an argument to the other standard I/O library functions. In the standard I/O library, the counterpart of the low-level file descriptor is called a stream, which is implemented as a pointer to a FILE structure.
(2) Three file streams are opened automatically when a program starts: stdin, stdout, and stderr. They are declared in the stdio.h header file and represent standard input, standard output, and standard error, corresponding to the low-level file descriptors 0, 1, and 2.
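The link between a stream and its underlying descriptor can be inspected with the POSIX fileno function. A minimal sketch, assuming a hypothetical helper named stream_descriptor:

```c
#include <stdio.h>

/* Hypothetical helper: opens `path` as a stream, writes one line through
 * it, and returns the descriptor number that backed the stream.
 * Returns -1 on error. */
int stream_descriptor(const char *path)
{
    FILE *fp = fopen(path, "w");   /* Establish the access path */
    if (fp == NULL)
        return -1;

    int fd = fileno(fp);           /* The descriptor behind the stream */
    fprintf(fp, "hello\n");        /* Buffered by the library */

    if (fclose(fp) != 0)           /* Flushes the buffer and closes fd */
        return -1;
    return fd;
}
```

In a process started with the three standard streams open, fileno reports 0, 1, and 2 for stdin, stdout, and stderr, so a newly opened stream gets a larger descriptor.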
Problems encountered
-What is the difference between read and reliable read (rio_readn), and between write and reliable write (rio_writen)?
-What is the difference between ssize_t and size_t?
Resources
-Computer Systems: A Programmer's Perspective (Chinese edition)
-Summary of Unix I/O, on cnblogs: http://www.cnblogs.com/shangdahao/archive/2013/04/14/3019461.html
-Classmate Wu Ziyi's blog
Week 8 study summary, Fundamentals of Information Security System Design