1. C standard library I/O buffer
The UNIX tradition is that everything is a file. The keyboard, display, serial port, disk, and other devices each have a special device file under the /dev directory. These device files can be opened, read, written, and closed just like regular files (files stored on disk), through the same function interfaces. User programs call C standard I/O library functions to read and write regular files or devices. These library functions send read and write requests to the kernel through system calls, and the kernel drives the disk or device to complete the I/O operation.
The C standard library allocates an I/O buffer for each open file to speed up reads and writes. The buffer can be reached through the file's FILE structure. Most read and write calls made by the user program are satisfied in the I/O buffer, and only a few requests need to be passed on to the kernel.
Take fgetc/fputc as an example. When a user program calls fgetc for the first time, fgetc may use a system call to read 1 KB from the kernel into the I/O buffer, return the first byte in the buffer to the user, and advance the read/write position to the second byte. Later calls to fgetc read directly from the I/O buffer without entering the kernel. Once the user has consumed all 1 KB and calls fgetc again, fgetc reads another 1 KB into the I/O buffer. In this scenario, the relationship between the user program, the C standard library, and the kernel resembles the relationship between the CPU, the cache, and memory described under "Memory Hierarchy". Why does the C standard library read ahead from the kernel into the I/O buffer? Because the user program is likely to need that data soon, and because the I/O buffer is in user space, so reading from it is much faster than entering the kernel for every byte.
On the other hand, when a user program calls fputc, the byte is usually just written to the I/O buffer, so fputc can return quickly. If the I/O buffer is full, fputc passes the buffered data to the kernel through a system call, and the kernel eventually writes the data to the disk or device. Sometimes the user program wants the data in the I/O buffer passed to the kernel immediately so that the kernel can write it to the device or disk. This is called a flush operation, and the corresponding library function is fflush. The fclose function also performs a flush before closing the file.
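The write-side behaviour described above can be observed with a small sketch. It assumes a POSIX system (it uses stat to ask the kernel for the file size), and the names kernel_file_size and fputc_flush_demo are made up for this illustration. It requests full buffering explicitly so that the result does not depend on the library's defaults.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Size of the file as the kernel currently sees it, or -1 on error. */
static long kernel_file_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

/*
 * Hypothetical helper: write n bytes with fputc and record the
 * kernel-visible file size before and after fflush.
 * Returns 0 on success, -1 on error.
 */
int fputc_flush_demo(const char *path, int n, long *before, long *after)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    /* Ask for full buffering explicitly instead of relying on the default. */
    if (setvbuf(fp, NULL, _IOFBF, BUFSIZ) != 0) {
        fclose(fp);
        return -1;
    }
    for (int i = 0; i < n; i++) {
        if (fputc('x', fp) == EOF) {   /* normally only touches the I/O buffer */
            fclose(fp);
            return -1;
        }
    }
    *before = kernel_file_size(path);  /* bytes the kernel has received so far */
    if (fflush(fp) == EOF) {           /* push the I/O buffer to the kernel */
        fclose(fp);
        return -1;
    }
    *after = kernel_file_size(path);
    return fclose(fp) == 0 ? 0 : -1;
}
```

Called with a small n such as 10, before should come back as 0 and after as 10, because the ten bytes sit in the user-space buffer until fflush hands them to the kernel.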
We know that the startup code calls the main function as follows: exit(main(argc, argv));
When main returns, the startup code calls exit. The exit function first closes every FILE * stream that is still open (flushing each one before it is closed), and then terminates the current process through the _exit system call.
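The difference between exit and _exit can be sketched as follows, assuming a POSIX system with fork. The function name size_after_child_exit is hypothetical. A child process writes to a file through stdio without flushing, then terminates with either exit or _exit, and the parent checks how many bytes reached the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that writes msg with fputs and then
 * terminates without calling fflush or fclose.
 * use_exit != 0: the child calls exit(), which flushes stdio buffers.
 * use_exit == 0: the child calls _exit(), which skips the stdio cleanup.
 * Returns the resulting file size, or -1 on error.
 */
long size_after_child_exit(const char *path, const char *msg, int use_exit)
{
    fflush(NULL);   /* keep the parent's pending output out of the child */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        FILE *fp = fopen(path, "w");
        if (fp == NULL)
            _exit(1);
        setvbuf(fp, NULL, _IOFBF, BUFSIZ);  /* full buffering, stated explicitly */
        fputs(msg, fp);                     /* data stays in the I/O buffer */
        if (use_exit)
            exit(0);                        /* flushes and closes fp first */
        _exit(0);                           /* buffered data is discarded */
    }

    if (waitpid(pid, NULL, 0) < 0)
        return -1;

    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}
```

With a five-byte message, the exit variant should leave a five-byte file, while the _exit variant should leave an empty one, since the buffered data never reaches the kernel.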
The C standard library has three types of I/O buffering: fully buffered, line buffered, and unbuffered. They behave differently when a user program calls a library function to write.
Fully buffered
Data is written to the kernel only when the buffer is full. Regular files are usually fully buffered.
Line buffered
Data is written to the kernel when the user program writes a newline, or when the buffer is full. Standard input and standard output are usually line buffered when they refer to a terminal device.
Unbuffered
Every library write call passes the data to the kernel through a system call. Standard error is usually unbuffered so that error messages produced by the user program reach the device as soon as possible.
Besides a full buffer and a newline, a line-buffered stream is also flushed automatically when:
the user program calls a library function to read from an unbuffered file, or
it reads from a line-buffered file and the read requires a system call to fetch data from the kernel.
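The three buffering types can be compared with setvbuf, the standard function for choosing a stream's buffering mode. The sketch below assumes a POSIX system for stat, and the helper name bytes_visible_after_write is made up for this example. It writes a string in a given mode and reports how many bytes the kernel has received before the stream is closed.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: open path, select a buffering mode (_IOFBF, _IOLBF
 * or _IONBF), write text with fputs, and return the file size the kernel
 * reports before the stream is closed. Returns -1 on error.
 */
long bytes_visible_after_write(const char *path, int mode, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(fp, NULL, mode, BUFSIZ) != 0) {
        fclose(fp);
        return -1;
    }
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    struct stat st;
    long size = stat(path, &st) == 0 ? (long)st.st_size : -1;
    fclose(fp);   /* flushes whatever is still buffered */
    return size;
}
```

Writing "ab" should show 0 bytes in the fully buffered and line-buffered modes and 2 bytes in the unbuffered mode. Writing "ab\n" should show 0 bytes when fully buffered and 3 bytes in the other two modes, because the newline triggers the flush of a line-buffered stream.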
If your program does not want to rely entirely on the automatic flush operation, you can call the fflush function to manually perform the flush operation.
#include <stdio.h>
int fflush(FILE *stream);
Return value: 0 on success; EOF on error, with errno set
The fflush function ensures that data is written back to the kernel so that it is not lost if the process terminates abnormally, for example fflush(stdout). As a special case, calling fflush(NULL) flushes the I/O buffers of all open files.
2. User program buffer
A buffer can be allocated on a function's stack, for example char buf[10];. With strcpy(buf, str);, the string pointed to by str may be longer than the array can hold and overrun it. This out-of-bounds write may not fail at the moment it happens. Instead, a segmentation fault occurs when the function returns, because the write has overwritten the return address saved in the stack frame and the function jumps to an invalid address. A block of memory such as buf, allocated by the caller and passed to a function for reading or writing, is usually called a buffer, and writing past its end is called a buffer overflow. If the only consequence were a segmentation fault, the problem would not be too serious. What is worse is that buffer overflow bugs are often exploited by attackers, who arrange for the function to return to a chosen address and execute chosen instructions. A carefully crafted attack can even start a shell and run arbitrary commands, so a buffer overflow in a program that runs with root privileges can have severe consequences.
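One common way to avoid the overflow described above is to pass the buffer size along with the buffer and never copy more than fits. The following is only a minimal sketch of that idea. The function name copy_bounded is hypothetical, and truncating the string is just one possible policy for input that is too long.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst without writing more than
 * dst_size bytes, always leaving dst NUL-terminated.
 * Returns 0 if src fit, 1 if it was truncated, -1 on invalid arguments.
 */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    size_t len = strlen(src);
    size_t n = len < dst_size ? len : dst_size - 1;  /* leave room for '\0' */

    memcpy(dst, src, n);
    dst[n] = '\0';
    return len >= dst_size ? 1 : 0;
}
```

With char buf[10];, a call such as copy_bounded(buf, sizeof buf, str) keeps every write inside the array, and the return value tells the caller whether the input had to be cut short.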
fgets/fputs illustrate the role of the I/O buffer. When using fgets/fputs, you also need to allocate buffers in your own program (buf1 and buf2 in the figure). Be careful to distinguish the user program buffer from the C standard library I/O buffer.
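A short sketch may help keep the two kinds of buffer apart. The function copy_lines is a made-up name, and for brevity it uses a single user buffer rather than the two in the figure. The array buf is the user program buffer, while the I/O buffers behind in and out are managed by the C standard library and never appear in the code.

```c
#include <stdio.h>

/*
 * Hypothetical helper: copy everything from in to out using fgets/fputs.
 * Returns the number of chunks copied (one per successful fgets call),
 * or -1 on a read or write error.
 */
long copy_lines(FILE *in, FILE *out)
{
    char buf[128];   /* user program buffer, allocated by the caller's code */
    long chunks = 0;

    /* fgets fills buf from the library's I/O buffer for in. */
    while (fgets(buf, sizeof buf, in) != NULL) {
        /* fputs copies buf into the library's I/O buffer for out. */
        if (fputs(buf, out) == EOF)
            return -1;
        chunks++;
    }
    return ferror(in) ? -1 : chunks;
}
```

A line longer than 127 characters is simply delivered in several chunks, because fgets never writes more than sizeof buf bytes into the user buffer.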
3. Kernel buffer
(1) Terminal buffering
The terminal device has an input and output queue buffer, as shown in
Take the input queue as an example. Characters typed on the keyboard are filtered by the line discipline and placed in the input queue, and the user program reads them from the queue in FIFO order. In general, when the input queue is full, further input characters are discarded and the system sounds the terminal bell. The terminal can be configured in echo mode, in which each character in the input queue is sent both to the user program and to the output queue. This is why a character typed on the command line can be read by the program and also appears on the screen.
Note that the description above applies when the user process (the shell is one such process) calls unbuffered I/O functions such as read/write. printf/scanf are implemented on top of read/write and add the C standard library I/O buffer. When a user program calls scanf to read keyboard input, the library calls read. In the terminal's default canonical mode, the typed characters are held in the kernel terminal input queue until a newline is entered. At that point read returns the line into the C standard library I/O buffer, and scanf parses it from there. When printf prints a string that contains a newline (standard output is line buffered on a terminal), the contents of the I/O buffer are immediately passed to the kernel output queue with write and appear on the screen. If the string does not contain a newline, it stays in the I/O buffer and, as discussed above, is flushed when the program exits.
(2) Although the write system call sits below the C standard library I/O buffer and is called an unbuffered I/O function, the kernel may keep its own I/O buffer beneath write. A write call therefore does not necessarily write to the file directly. It may only write to the kernel I/O buffer, and the fsync function can be used to synchronize the file to disk. Whether the data is in the file or in the kernel buffer makes no difference to a process. If process A and process B open the same file, data that process A has written to the kernel I/O buffer can be read by process B, because kernel space is shared by all processes. The I/O buffer of the C standard library does not have this property, because each process has its own independent user space.
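The difference in visibility can be sketched on a POSIX system as follows. The names readable_bytes and visibility_demo are hypothetical, and a second, independent file descriptor in the same process stands in for "process B" to keep the example short. Data held in the stdio buffer is invisible to the second descriptor, data handed to the kernel is visible at once, and fsync is only needed to force it to disk.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Count the bytes that a fresh, independent descriptor can read from path. */
static long readable_bytes(const char *path)
{
    char buf[256];
    long total = 0;
    ssize_t n = 0;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    close(fd);
    return n < 0 ? -1 : total;
}

/*
 * Hypothetical helper: write five bytes through stdio and record what an
 * independent reader sees before and after fflush.
 * Returns 0 on success, -1 on error.
 */
int visibility_demo(const char *path, long *after_fputs, long *after_fflush)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    if (setvbuf(fp, NULL, _IOFBF, BUFSIZ) != 0 || fputs("hello", fp) == EOF) {
        fclose(fp);
        return -1;
    }
    *after_fputs = readable_bytes(path);    /* data is still in user space */

    if (fflush(fp) == EOF) {                /* hand the data to the kernel */
        fclose(fp);
        return -1;
    }
    *after_fflush = readable_bytes(path);   /* visible through the kernel buffer */

    if (fsync(fileno(fp)) != 0) {           /* ask the kernel to write to disk */
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}
```

The reader should see 0 bytes after fputs and 5 bytes after fflush, even though fsync has not yet been called at that point.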
(3) To reduce the number of disk reads, the kernel caches the directory tree structure. This is called the dentry (directory entry) cache.
(4) The FIFO and UNIX domain socket IPC mechanisms are both identified by special files in the file system. A FIFO file has no data blocks on disk and only identifies a pipe in the kernel. Processes can open this file for reading and writing, which actually reads and writes the kernel pipe (the underlying reason is that the read and write functions referenced by this file's file structure differ from those of regular files), and this is how inter-process communication is achieved. UNIX domain sockets work on a similar principle. A special socket file is needed to identify the channel in the kernel, the file type s denotes a socket, and these files have no data blocks on disk either. UNIX domain sockets are currently among the most widely used IPC mechanisms.
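A FIFO can be tried out with a short sketch, assuming a POSIX system. The function name fifo_roundtrip is made up for this example, and the FIFO path is supplied by the caller. A child process opens the FIFO file for writing and sends a message, and the parent opens the same file for reading and receives it through the kernel pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a FIFO at path, let a child write msg into
 * it, and read the bytes back in the parent.
 * Returns the number of bytes received, or -1 on error.
 */
long fifo_roundtrip(const char *path, const char *msg, size_t len,
                    char *out, size_t out_size)
{
    unlink(path);                        /* remove a stale FIFO, ignore errors */
    if (mkfifo(path, 0600) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        int wfd = open(path, O_WRONLY);  /* blocks until a reader opens */
        if (wfd < 0)
            _exit(1);
        ssize_t w = write(wfd, msg, len);
        close(wfd);
        _exit(w == (ssize_t)len ? 0 : 1);
    }

    long total = -1;
    int rfd = open(path, O_RDONLY);      /* blocks until the writer opens */
    if (rfd < 0) {
        kill(pid, SIGKILL);              /* do not leave the child blocked */
    } else {
        ssize_t n = 0;
        total = 0;
        while ((size_t)total < out_size &&
               (n = read(rfd, out + total, out_size - (size_t)total)) > 0)
            total += n;
        if (n < 0)
            total = -1;
        close(rfd);
    }
    waitpid(pid, NULL, 0);
    unlink(path);                        /* the FIFO file is only a name */
    return total;
}
```

The message never touches a data block on disk. The file created by mkfifo only gives the two processes a common name through which to find the same pipe in the kernel.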
4. Stack overflow
Infinite recursion or a very large local array can crash the program (segmentation fault) once the stack space reserved by the operating system for the program is exhausted.
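As a rough sketch, assuming a POSIX system, the stack limit can be queried with getrlimit, and a large working array can be placed on the heap rather than the stack. The function names stack_soft_limit and sum_with_heap_array are hypothetical, and heap allocation is only one possible way to avoid exhausting the stack.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: return the soft stack size limit in bytes,
 * 0 if the limit is unlimited, or -1 on error.
 */
long long stack_soft_limit(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 0;
    return (long long)rl.rlim_cur;
}

/*
 * Hypothetical helper: fill an n-element array with 0..n-1 and return the
 * sum. The array is allocated with malloc, so its size is not bounded by
 * the stack limit. Returns -1 if the allocation fails.
 */
long long sum_with_heap_array(size_t n)
{
    if (n == 0)
        return 0;

    long long *a = malloc(n * sizeof *a);
    if (a == NULL)
        return -1;

    long long sum = 0;
    for (size_t i = 0; i < n; i++) {
        a[i] = (long long)i;
        sum += a[i];
    }
    free(a);
    return sum;
}
```

An array of two million long long values takes about 16 MB, which would exceed a typical default stack limit if declared as a local variable, but it is handled without trouble on the heap.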
Reference: Linux C Programming one-stop learning (open source books)