Chapter 10. System-Level I/O
10.1 UNIX I/O
1: System level I/O
What is UNIX I/O?
All I/O devices, such as networks and disks, are modeled as files, and all input and output is performed by reading and writing the corresponding files. This mapping of devices to files allows the Unix kernel to export a simple, low-level application interface called Unix I/O.
2: Input/output (I/O) is the process of copying data between main memory and external devices such as disks, networks, and terminals.
Input copies data from an I/O device to main memory, and output copies data from main memory to an I/O device.
3: Opening files: An application announces that it wants to access an I/O device by asking the kernel to open the corresponding file. The kernel returns a small non-negative integer, called a descriptor, that identifies the file in all subsequent operations. The kernel keeps track of all information about the open file; the application only keeps track of the descriptor.
Each process created by a Unix shell starts with three open files: standard input (descriptor 0), standard output (descriptor 1), and standard error (descriptor 2). The header file <unistd.h> defines the constants STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO, which can be used instead of the explicit descriptor values.
4: Changing the current file position: For each open file, the kernel maintains a file position k, initially 0. The file position is a byte offset from the beginning of the file.
An application can explicitly set the current file position k by performing a seek operation.
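5: Reading and writing files: A read operation copies n > 0 bytes from a file to memory, starting at the current file position k and then incrementing k by n. Given a file of size M bytes, performing a read when k is greater than or equal to M triggers a condition known as end-of-file (EOF).
A write operation copies n > 0 bytes from memory to a file, starting at the current file position k and then updating k.
6: Closing files: When an application has finished accessing a file, it asks the kernel to close it. The kernel frees the data structures it created when the file was opened and restores the descriptor to the pool of available descriptors.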
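The open, seek, read, and close steps above can be sketched in C as follows. This is a minimal illustration, not code from the chapter: the helper name count_bytes_from and its parameters are made up for the example, and it uses the POSIX calls open, lseek, read, and close.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not from the notes): returns the number of bytes
 * between offset `start` and the end of the file, or -1 on error. */
long count_bytes_from(const char *path, off_t start)
{
    char buf[64];
    long total = 0;
    ssize_t n;

    int fd = open(path, O_RDONLY);            /* kernel hands back a descriptor */
    if (fd < 0)
        return -1;
    if (lseek(fd, start, SEEK_SET) < 0) {     /* seek: set file position k = start */
        close(fd);
        return -1;
    }
    while ((n = read(fd, buf, sizeof(buf))) > 0)  /* each read advances k by n */
        total += n;
    close(fd);                                /* descriptor goes back to the pool */
    return n < 0 ? -1 : total;                /* n == 0 means EOF was reached */
}
```

Seeking past the end of the file is allowed; the first read then sees k >= M and reports EOF immediately by returning 0.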
10.2 Opening and closing files
A process opens an existing file or creates a new file by calling the open function.
The flags argument indicates how the process intends to access the file. Its value is one of:
O_RDONLY: reading only.
O_WRONLY: writing only.
O_RDWR: reading and writing.
The flags argument can also be ORed with one or more bit masks that provide additional instructions:
O_CREAT: If the file does not exist, create an empty version of it.
O_TRUNC: If the file already exists, truncate it.
O_APPEND: Before each write operation, set the file position to the end of the file.
The mode argument specifies the access permission bits of a new file; each bit has a symbolic name. As part of its context, each process has a umask that is set by calling the umask function. When a process creates a new file by calling open with a mode argument, the file's access permission bits are set to mode & ~umask, that is, any bit set in the umask is cleared from the requested mode.
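The interaction between the mode argument and the umask can be checked with a short experiment. The sketch below is only illustrative: created_mode is a hypothetical helper, and it relies on fstat and unlink, which are real POSIX calls but are not discussed in this section.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` requesting `mode` while `mask` is the
 * process umask, and return the permission bits the new file received. */
int created_mode(const char *path, mode_t mode, mode_t mask)
{
    struct stat st;
    mode_t old = umask(mask);       /* install the mask, remember the old one */

    unlink(path);                   /* make sure open really creates the file */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, mode);
    umask(old);                     /* restore the caller's umask */
    if (fd < 0)
        return -1;

    int rc = fstat(fd, &st);        /* ask the kernel what the file ended up with */
    close(fd);
    unlink(path);                   /* clean up the scratch file */
    return rc < 0 ? -1 : (int)(st.st_mode & 0777);
}
```

For example, requesting mode 0666 under a umask of 022 yields 0666 & ~022 = 0644: the group and other write bits are cleared.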
10.3 Reading and writing files
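Applications perform input and output by calling the read and write functions, respectively.
Side note: size_t is an unsigned integer type and ssize_t is a signed one (unsigned int and int in the book's 32-bit setting). read returns ssize_t because it must be able to return -1 on error.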
In some cases, read and write transfer fewer bytes than the application requests. Such short counts do not indicate an error. Possible causes are:
Encountering EOF on reads. Suppose the file contains only 20 more bytes from the current file position and we read it in 50-byte chunks. The next read returns a short count of 20, and the read after that signals EOF by returning 0.
Reading text lines from a terminal. If the open file is associated with a terminal, each read transfers one text line at a time and returns a short count equal to the size of the line.
Reading and writing network sockets. If the open file corresponds to a network socket, internal buffering constraints and long network delays can cause read and write to return short counts.
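The EOF case from the list above is easy to reproduce. The following is a minimal sketch, assuming a regular file on a local disk; two_reads is a hypothetical helper introduced only for this illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: issue two read() calls of `want` bytes each
 * (want <= 128) and report both return values. Returns 0 on success. */
int two_reads(const char *path, size_t want, ssize_t *first, ssize_t *second)
{
    char buf[128];

    if (want > sizeof(buf))
        return -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    *first = read(fd, buf, want);    /* short count if fewer bytes remain */
    *second = read(fd, buf, want);   /* 0 once the position has reached EOF */
    close(fd);
    return 0;
}
```

With a 20-byte file and want set to 50, the first call reports 20 and the second reports 0, matching the scenario described above.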
10.4 Robust reading and writing with the Rio package
Rio provides two different kinds of functions:
Unbuffered input and output functions
Buffered input functions
10.4.1 Rio's unbuffered input and output functions
- The rio_readn function transfers up to n bytes from the current file position of descriptor fd to memory location usrbuf. Similarly, the rio_writen function transfers n bytes from location usrbuf to descriptor fd. rio_readn can return a short count only if it encounters EOF. rio_writen never returns a short count.
- Note: If rio_readn or rio_writen is interrupted by the return from an application signal handler, the function manually restarts the read or write.
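The behavior described above (loop until n bytes are transferred, stop early only at EOF, restart after a signal interrupts the call) can be sketched as follows. This is one plausible implementation written for these notes, not necessarily the package's exact source; the signatures are assumptions based on the descriptions of fd, usrbuf, and n.

```c
#include <errno.h>
#include <unistd.h>

/* Sketch: read up to n bytes; a short count is returned only at EOF. */
ssize_t rio_readn(int fd, void *usrbuf, size_t n)
{
    size_t nleft = n;
    char *bufp = usrbuf;

    while (nleft > 0) {
        ssize_t nread = read(fd, bufp, nleft);
        if (nread < 0) {
            if (errno == EINTR)     /* interrupted by a signal handler return */
                continue;           /* restart the read */
            return -1;              /* real error, errno set by read */
        }
        if (nread == 0)             /* EOF */
            break;
        nleft -= (size_t)nread;
        bufp += nread;
    }
    return (ssize_t)(n - nleft);
}

/* Sketch: write exactly n bytes; never returns a short count. */
ssize_t rio_writen(int fd, const void *usrbuf, size_t n)
{
    size_t nleft = n;
    const char *bufp = usrbuf;

    while (nleft > 0) {
        ssize_t nwritten = write(fd, bufp, nleft);
        if (nwritten <= 0) {
            if (nwritten < 0 && errno == EINTR)
                continue;           /* restart the write */
            return -1;
        }
        nleft -= (size_t)nwritten;
        bufp += nwritten;
    }
    return (ssize_t)n;
}
```

A caller that asks rio_readn for 50 bytes from a source holding only 10 gets 10 back once EOF is reached, and 0 on the following call.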
10.4.2 Rio's buffered input functions
A text line is a sequence of ASCII characters terminated by a newline character. On Unix systems, the newline character is '\n', the same as the ASCII line feed character (LF), with numeric value 0x0a. Suppose we want to write a program that counts the number of text lines in a text file.
One approach is to use the read function to transfer one byte at a time from the file to user memory, checking each byte for the newline character. The problem with this approach is that it is inefficient: reading each byte of the file requires a trap to the kernel.
A better approach is to call a wrapper function (rio_readlineb) that copies the text line from an internal buffer and automatically calls the read system call to refill the buffer whenever it becomes empty.
- In the buffered version, the rio_readinitb function is called once per open descriptor. It associates the descriptor fd with a read buffer of type rio_t at address rp.
- The rio_readlineb function reads the next text line (including the terminating newline character) from file rp, copies it to memory location usrbuf, and terminates the text line with a null character.
- The core of the Rio read routines is the rio_read function, which can be seen as a buffered version of the Unix read function. When rio_read is called with a request to read n bytes, there are rp->rio_cnt unread bytes in the read buffer. If the buffer is empty, it is refilled with a call to read. Receiving a short count from this read is not an error; it simply means the read buffer is only partially filled.
- Once the buffer is non-empty, rio_read copies the minimum of n and rp->rio_cnt bytes from the read buffer to the user buffer and returns the number of bytes copied.
- To an application, rio_read has the same semantics as the read system call. It returns -1 on error, 0 on EOF, and a short count if the number of requested bytes exceeds the number of unread bytes in the read buffer. The rio_readlineb function calls rio_read repeatedly. Each call returns one byte from the read buffer, which is then checked to see whether it is the terminating newline.
The rio_readlineb function reads at most maxlen-1 bytes, leaving room for the terminating null character. Text lines that exceed maxlen-1 bytes are truncated and terminated with a null character.
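The buffered routines described above can be sketched as follows. Only the names rio_t, rio_readinitb, rio_read, rio_readlineb, rp, usrbuf, maxlen, and rio_cnt come from the notes; the other struct fields (rio_fd, rio_bufptr, rio_buf), the buffer size, and the exact signatures are assumptions made for this illustration.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

#define RIO_BUFSIZE 8192            /* assumed internal buffer size */

typedef struct {
    int rio_fd;                     /* descriptor for this buffer (assumed field) */
    ssize_t rio_cnt;                /* unread bytes in the internal buffer */
    char *rio_bufptr;               /* next unread byte (assumed field) */
    char rio_buf[RIO_BUFSIZE];      /* internal buffer (assumed field) */
} rio_t;

/* Associate descriptor fd with an empty read buffer at rp. */
void rio_readinitb(rio_t *rp, int fd)
{
    rp->rio_fd = fd;
    rp->rio_cnt = 0;
    rp->rio_bufptr = rp->rio_buf;
}

/* Buffered read: copy min(n, rio_cnt) bytes out of the internal buffer. */
static ssize_t rio_read(rio_t *rp, char *usrbuf, size_t n)
{
    while (rp->rio_cnt <= 0) {      /* refill the buffer if it is empty */
        rp->rio_cnt = read(rp->rio_fd, rp->rio_buf, sizeof(rp->rio_buf));
        if (rp->rio_cnt < 0) {
            if (errno != EINTR)     /* retry only if a signal interrupted us */
                return -1;
        } else if (rp->rio_cnt == 0) {
            return 0;               /* EOF */
        } else {
            rp->rio_bufptr = rp->rio_buf;
        }
    }
    size_t cnt = n < (size_t)rp->rio_cnt ? n : (size_t)rp->rio_cnt;
    memcpy(usrbuf, rp->rio_bufptr, cnt);
    rp->rio_bufptr += cnt;
    rp->rio_cnt -= (ssize_t)cnt;
    return (ssize_t)cnt;
}

/* Read one text line of at most maxlen-1 bytes (maxlen >= 1),
 * then null-terminate it. */
ssize_t rio_readlineb(rio_t *rp, void *usrbuf, size_t maxlen)
{
    char c, *bufp = usrbuf;
    size_t n;

    for (n = 1; n < maxlen; n++) {
        ssize_t rc = rio_read(rp, &c, 1);
        if (rc == 1) {
            *bufp++ = c;
            if (c == '\n') {        /* end of line: count it and stop */
                n++;
                break;
            }
        } else if (rc == 0) {
            if (n == 1)
                return 0;           /* EOF, no data read */
            break;                  /* EOF, some data was read */
        } else {
            return -1;              /* error */
        }
    }
    *bufp = '\0';
    return (ssize_t)(n - 1);        /* bytes stored, excluding the null */
}
```

Although rio_readlineb asks for one byte at a time, almost all of those requests are served from the internal buffer, so the kernel is entered only when the buffer needs refilling.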
10.5 Reading file metadata
10.6 Sharing files
The kernel uses three related data structures to represent open files:
Descriptor table
File table
V-node table
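One observable consequence of this layout is that each call to open creates its own file table entry with its own file position, even when two calls name the same file. The sketch below illustrates this; read_via_two_opens is a hypothetical helper written for these notes.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open `path` twice, read one byte through each
 * descriptor, and store the bytes in *c1 and *c2. Returns 0 on success. */
int read_via_two_opens(const char *path, char *c1, char *c2)
{
    int fd1 = open(path, O_RDONLY);
    int fd2 = open(path, O_RDONLY);   /* separate file table entry */
    int ok = fd1 >= 0 && fd2 >= 0
          && read(fd1, c1, 1) == 1    /* advances only fd1's position */
          && read(fd2, c2, 1) == 1;   /* fd2 still starts at offset 0 */

    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    return ok ? 0 : -1;
}
```

Both reads return the first byte of the file, because the two descriptors refer to different file table entries that happen to point at the same v-node.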
10.7 I/O redirection
- Unix shells provide I/O redirection operators that allow users to associate standard input and output with disk files.
How I/O redirection works: one way is to use the dup2 function.
The dup2 function copies descriptor table entry oldfd to descriptor table entry newfd, overwriting the previous contents of entry newfd. If newfd was already open, dup2 closes newfd before it copies oldfd.
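As a rough illustration of redirection with dup2, the sketch below temporarily points descriptor 1 (standard output) at a disk file, writes to it, and then restores the original. The helper name write_stdout_to_file is hypothetical, and the use of dup to save the old descriptor is an addition for the example, not something described in the notes.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes of `msg` to standard output
 * while standard output is redirected to `path`. Returns 0 on success. */
int write_stdout_to_file(const char *path, const char *msg, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    int saved = dup(STDOUT_FILENO);          /* remember the original stdout */
    if (saved < 0) {
        close(fd);
        return -1;
    }
    if (dup2(fd, STDOUT_FILENO) < 0) {       /* entry 1 now copies entry fd */
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                               /* descriptor 1 keeps the file open */

    ssize_t n = write(STDOUT_FILENO, msg, len);

    dup2(saved, STDOUT_FILENO);              /* undo the redirection */
    close(saved);
    return n == (ssize_t)len ? 0 : -1;
}
```

The write call itself is unchanged; only the descriptor table entry it goes through has been swapped, which is the same mechanism a shell relies on for its redirection operators.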
10.8 Standard I/O
- ANSI C defines a set of higher-level input and output functions, called the standard I/O library, that gives programmers a higher-level alternative to Unix I/O. This library (libc) provides functions for opening and closing files (fopen and fclose), reading and writing bytes (fread and fwrite), reading and writing strings (fgets and fputs), and sophisticated formatted I/O (printf and scanf).
The standard I/O library models an open file as a stream. To the programmer, a stream is a pointer to a structure of type FILE. Every ANSI C program begins with three open streams, stdin, stdout, and stderr, which correspond to standard input, standard output, and standard error, respectively.
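As a small example of the stream interface, the sketch below counts newline-terminated lines in a file using fopen, fgets, and fclose. The function name count_lines and the 256-byte chunk size are arbitrary choices made for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count newline-terminated lines in `path`,
 * or return -1 if the file cannot be opened. */
long count_lines(const char *path)
{
    char buf[256];
    long lines = 0;

    FILE *fp = fopen(path, "r");    /* a stream is a pointer to a FILE */
    if (fp == NULL)
        return -1;

    /* fgets stops after a newline or when buf is full, so a long line
     * arrives in several chunks and only the last one ends in '\n'. */
    while (fgets(buf, sizeof(buf), fp) != NULL) {
        size_t len = strlen(buf);
        if (len > 0 && buf[len - 1] == '\n')
            lines++;
    }
    fclose(fp);
    return lines;
}
```

The library buffers the file contents internally, so this loop avoids a system call per line in much the same way the buffered Rio functions do.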
10.9 Putting it together: which I/O functions should I use?
This chapter discussed several I/O packages: Unix I/O, the standard I/O library, and Rio.
- The standard I/O stream is, in a sense, full-duplex, because the program can perform input and output on the same stream.
It is recommended that you do not use the standard I/O functions for input and output on network sockets. Use the robust Rio functions instead.
10.10 Summary
- Unix provides a small number of system-level functions that allow applications to open, close, read, and write files, fetch file metadata, and perform I/O redirection. Unix read and write operations are subject to short counts, which applications must anticipate and handle correctly. Instead of calling the Unix I/O functions directly, applications should use the Rio package, which deals with short counts automatically by repeatedly performing read and write operations until all of the requested data has been transferred.
- The Unix kernel uses three related data structures to represent open files. Entries in the descriptor table point to entries in the open file table, which in turn point to entries in the v-node table. Each process has its own descriptor table, while all processes share the same open file table and v-node table. Understanding the general organization of these structures clarifies both file sharing and I/O redirection.
- The standard I/O library is implemented on top of Unix I/O and provides a powerful set of higher-level I/O routines. For most applications, standard I/O is the simpler and preferred alternative to Unix I/O. However, because of some incompatibilities between standard I/O and network files, Unix I/O is more suitable than standard I/O for network applications.
Information Security System Design Foundation 9th Week study Summary