Unbuffered I/O and buffered I/O

Source: Internet
Author: User
1. Buffered I/O, that is, standard I/O
First, be clear that unbuffered I/O is "unbuffered" only relative to buffered I/O, that is, standard I/O.
It does not mean that reading and writing a disk with unbuffered I/O involves no buffer at all. In fact, the kernel uses its buffer cache
for the real disk reads and writes, but the buffer discussed here has nothing to do with that kernel buffer.
What is the purpose of buffered I/O?
Buffered I/O is designed to improve efficiency.
Keep the following relationship in mind:

buffered I/O library functions (fread, fwrite, etc., user space) <---- call ----> unbuffered I/O system calls (read, write, etc., kernel space) <-------> disk reads/writes
Buffered I/O library functions are implemented by calling the corresponding unbuffered I/O system calls. They do not read or write the disk directly.
So where does the efficiency gain come from?
Note that buffered I/O functions are library functions, while unbuffered I/O functions are system calls. Calling a library function is cheaper than making a system call.
Buffered I/O improves efficiency by making as few system calls as possible.
The basic method is to maintain a buffer in the user's process space. On the first read (library function), more data than requested is read (system call) from the kernel.
The next read (library function) takes its data from the buffer instead of issuing another read (system call).
Similarly, when writing, data is first written (library function) into the buffer. After several such writes, the data is passed to the kernel in one batch with a single write (system call).
fgets, fputs, fread, and fwrite in buffered I/O are the callers; read and write in unbuffered I/O are the callees.
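The read side of this idea can be sketched in a few lines. The code below is only an illustration under stated assumptions, not how glibc implements it: struct mini_reader, mini_init, mini_getc, and count_read_calls are hypothetical names, and the 4096-byte buffer size is an arbitrary choice. It hands out one byte per call but only issues a read system call when its user-space buffer is empty.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical miniature of a buffered reader; not glibc's implementation. */
struct mini_reader {
    int fd;
    char buf[4096];
    char *read_ptr;   /* next byte to hand to the caller */
    char *read_end;   /* one past the last byte fetched from the kernel */
    long syscalls;    /* number of read() system calls issued */
};

static void mini_init(struct mini_reader *r, int fd)
{
    r->fd = fd;
    r->read_ptr = r->read_end = r->buf;
    r->syscalls = 0;
}

/* Return the next byte, or -1 at end of file or on error. */
static int mini_getc(struct mini_reader *r)
{
    if (r->read_ptr == r->read_end) {   /* buffer exhausted: refill it */
        ssize_t n = read(r->fd, r->buf, sizeof r->buf);
        r->syscalls++;
        if (n <= 0)
            return -1;
        r->read_ptr = r->buf;
        r->read_end = r->buf + n;
    }
    return (unsigned char)*r->read_ptr++;
}

/* Read a 10000-byte temporary file byte by byte and return how many
   read() calls that took, or -1 on failure. */
static long count_read_calls(void)
{
    static const char block[1000];
    struct mini_reader r;
    long bytes = 0;
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;
    int fd = fileno(tmp);
    for (int i = 0; i < 10; i++) {
        if (write(fd, block, sizeof block) != (ssize_t)sizeof block) {
            fclose(tmp);
            return -1;
        }
    }
    lseek(fd, 0, SEEK_SET);
    mini_init(&r, fd);
    while (mini_getc(&r) != -1)
        bytes++;
    fclose(tmp);
    return bytes == 10000 ? r.syscalls : -1;
}
```

Calling count_read_calls() consumes 10000 bytes one at a time, yet the syscalls counter stays in the single digits, whereas an unbuffered loop would need one read per byte.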
The following is an example of using buffered I/O to read data:

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(void)
{
    char buf[5];
    FILE *myfile = stdin;

    // fgets stores at most 4 characters plus the terminating '\0'
    if (fgets(buf, 5, myfile) != NULL)
        fputs(buf, stdout);   // stdin cannot be written to, so echo to stdout

    return 0;
}
What does "buffer" in buffered I/O mean?
Where is this buffer?
What is FILE? How is the space allocated?

To answer these questions, we need to look at how FILE is defined and manipulated.
(Note that when writing programs you need not care how FILE is defined or manipulated, and it is best not to manipulate it directly.
It is done here only to illustrate buffered I/O.)
The following is the FILE definition provided by glibc. It is implementation-specific and is defined differently on other platforms.
struct _IO_FILE {
    int _flags;              /* High-order word is _IO_MAGIC; rest is flags. */
#define _IO_file_flags _flags
    /* The following pointers correspond to the C++ streambuf protocol. */
    /* Note: Tk uses the _IO_read_ptr and _IO_read_end fields directly. */
    char *_IO_read_ptr;      /* Current read pointer */
    char *_IO_read_end;      /* End of get area. */
    char *_IO_read_base;     /* Start of putback+get area. */
    char *_IO_write_base;    /* Start of put area. */
    char *_IO_write_ptr;     /* Current put pointer. */
    char *_IO_write_end;     /* End of put area. */
    char *_IO_buf_base;      /* Start of reserve area. */
    char *_IO_buf_end;       /* End of reserve area. */
    /* The following fields are used to support backing up and undo. */
    char *_IO_save_base;     /* Pointer to start of non-current get area. */
    char *_IO_backup_base;   /* Pointer to first valid character of backup area */
    char *_IO_save_end;      /* Pointer to end of non-current get area. */
    struct _IO_marker *_markers;
    struct _IO_FILE *_chain;
    int _fileno;
    /* ... remaining members omitted ... */
};
The preceding definition contains three important groups of fields:
1.
char *_IO_read_ptr;    /* Current read pointer */
char *_IO_read_end;    /* End of get area. */
char *_IO_read_base;   /* Start of putback+get area. */
2.
char *_IO_write_base;  /* Start of put area. */
char *_IO_write_ptr;   /* Current put pointer. */
char *_IO_write_end;   /* End of put area. */
3.
char *_IO_buf_base;    /* Start of reserve area. */
char *_IO_buf_end;     /* End of reserve area. */
Where:
_IO_read_base points to the start of the "read buffer"
_IO_read_end points to the end of the "read buffer"
_IO_read_end - _IO_read_base is the length of the "read buffer"

_IO_write_base points to the start of the "write buffer"
_IO_write_end points to the end of the "write buffer"
_IO_write_end - _IO_write_base is the length of the "write buffer"
_IO_buf_base points to the start of the "buffer"
_IO_buf_end points to the end of the "buffer"
_IO_buf_end - _IO_buf_base is the length of the "buffer"
The definition above seems to provide three buffers. In fact, _IO_read_base,
_IO_write_base, and _IO_buf_base all point to the same buffer.
This buffer has nothing to do with char buf[5] in the program above.
The standard I/O library allocates it during the first buffered I/O operation on the stream, and the library eventually frees it as well.
(Again, this is only the glibc implementation. Other implementations may differ; this caveat is not repeated below.)

Consider the following program (stdin is used as the example; it is line buffered):

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

// Print the glibc-private buffer pointers of a stream (glibc only).
static void show(const char *when, FILE *fp)
{
    printf("%s\n", when);
    printf("read buffer base %p\n", (void *)fp->_IO_read_base);
    printf("read buffer length %d\n", (int)(fp->_IO_read_end - fp->_IO_read_base));
    printf("write buffer base %p\n", (void *)fp->_IO_write_base);
    printf("write buffer length %d\n", (int)(fp->_IO_write_end - fp->_IO_write_base));
    printf("buf buffer base %p\n", (void *)fp->_IO_buf_base);
    printf("buf buffer length %d\n", (int)(fp->_IO_buf_end - fp->_IO_buf_base));
}

int main(void)
{
    char buf[5];
    FILE *myfile = stdin;

    show("before reading", myfile);
    printf("\n");
    if (fgets(buf, 5, myfile) != NULL)
        fputs(buf, stdout);   // stdin cannot be written to, so echo to stdout
    printf("\n");
    show("after reading", myfile);
    return 0;
}
We can see that before the read, myfile's buffer has not been allocated. After one read, the buffer has been allocated.
This buffer is neither a kernel buffer nor a buffer allocated by the user; it is a buffer in the user process's address space maintained by the standard I/O library.
(Of course, the user can also supply and manage this buffer. That is not discussed here.)
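The same observation can be made without touching the private fields of FILE. The sketch below assumes a C library that ships <stdio_ext.h> with __fbufsize() (glibc does; other libraries may not, or may report a size before any I/O has happened). buffer_sizes is a hypothetical helper name.

```c
#include <stdio.h>
#include <stdio_ext.h>   /* glibc extension: __fbufsize() */

/* Report the stream buffer size before and after the first buffered
   operation on a fresh temporary file. Returns 0 on success. */
static int buffer_sizes(size_t *before, size_t *after)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    *before = __fbufsize(fp);   /* no I/O has happened on fp yet */
    fputc('x', fp);             /* first buffered operation */
    *after = __fbufsize(fp);    /* the buffer exists by now */
    fclose(fp);
    return 0;
}
```

On glibc one would expect before to be 0 and after to be the default buffer size, which matches the lazy allocation described above; treat the exact numbers as implementation details.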

The example above only shows that the buffered I/O buffer exists. The following sections describe how buffered I/O works in three modes: full buffering, line buffering, and no buffering.

1.1. Full Buffering
The following is the original statement in APUE:
Full buffering: "actual I/O takes place when the standard I/O buffer is filled. Files residing on disk are normally fully buffered by the standard I/O library."
The phrase "actual I/O" in the book is easy to misread. It does not mean reading or writing the disk; it means making the read or write system call.
The following two examples illustrate this.

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(void)
{
    char buf[5];
    char *cur;
    FILE *myfile;

    myfile = fopen("bbb.txt", "r");
    if (myfile == NULL) {
        perror("bbb.txt");
        return 1;
    }
    printf("before reading, myfile->_IO_read_ptr: %d\n",
           (int)(myfile->_IO_read_ptr - myfile->_IO_read_base));
    fgets(buf, 5, myfile);   // reads only four characters
    cur = myfile->_IO_read_base;
    while (cur < myfile->_IO_read_end) {   // but the whole buffer has been filled
        printf("%c", *cur);
        cur++;
    }
    printf("\nafter reading, myfile->_IO_read_ptr: %d\n",
           (int)(myfile->_IO_read_ptr - myfile->_IO_read_base));
    fclose(myfile);
    return 0;
}
The file bbb.txt consists of many lines of "123456789".
In the example above, fgets(buf, 5, myfile); returns only four characters, yet the whole buffer is filled,
and _IO_read_ptr moves forward by just those four positions. On the next read,
as long as the number of characters requested does not exceed myfile->_IO_read_end - myfile->_IO_read_ptr,
the read system call is not needed again. The data is simply copied from myfile's buffer into
buf (starting at myfile->_IO_read_ptr).

For fully buffered reads:
_IO_read_base always points to the start of the buffer
_IO_read_end always points one past the last character that has been read from the kernel into the buffer
(with full buffering, a buffered I/O read tries to fill the whole buffer each time)
_IO_read_ptr always points one past the last character the user has already consumed from the buffer
When (_IO_read_end - _IO_read_base < _IO_buf_end - _IO_buf_base) && (_IO_read_ptr == _IO_read_end), the end of the file has been reached,
where _IO_buf_end - _IO_buf_base is the buffer length.
The general scenario is:
The first call to fgets (or a similar function) makes standard I/O call read to fill the buffer. The next fgets does not call read; it copies data straight from the buffer.
When the data left in the buffer is insufficient, read is called again. Throughout, _IO_read_ptr records which data in the buffer has been consumed
and which has not.
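A portable way to see the same read-ahead, without peeking into FILE, is to compare the position the program sees (ftell) with the file offset the kernel holds for the descriptor (lseek). This is a minimal sketch assuming a POSIX system; positions_after_fgets is a hypothetical helper, and the 1000-byte temporary file stands in for bbb.txt.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* fgets four characters from a 1000-byte file, then report the stream
   position and the kernel file offset. Returns 0 on success. */
static int positions_after_fgets(long *logical, long *kernel)
{
    char buf[5];
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    for (int i = 0; i < 100; i++)
        fputs("123456789\n", fp);   /* 1000 bytes in total */
    rewind(fp);                      /* flush, then go back to the start */

    if (fgets(buf, sizeof buf, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    *logical = ftell(fp);                             /* position seen by the program */
    *kernel = (long)lseek(fileno(fp), 0, SEEK_CUR);   /* position seen by the kernel */
    fclose(fp);
    return 0;
}
```

The stream position is 4, while the kernel offset is further along because standard I/O has already fetched more than the four characters that fgets returned.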

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(void)
{
    char buf[2048] = {0};
    int i;
    FILE *myfile;

    myfile = fopen("aaa.txt", "r+");   // "r+" requires aaa.txt to exist already
    if (myfile == NULL) {
        perror("aaa.txt");
        return 1;
    }

    i = 0;
    while (i < 2048) {
        fwrite(buf + i, 1, 512, myfile);
        i += 512;
        // Comment out the next statement and the data is written to aaa.txt
        myfile->_IO_write_ptr = myfile->_IO_write_base;
        printf("%p write buffer base\n", (void *)myfile->_IO_write_base);
        printf("%p buf buffer base\n", (void *)myfile->_IO_buf_base);
        printf("%p read buffer base\n", (void *)myfile->_IO_read_base);
        printf("%p write buffer ptr\n", (void *)myfile->_IO_write_ptr);
        printf("\n");
    }
    return 0;
}
The above is an example of a fully buffered write.
With full buffering, standard I/O calls the write system call only when it flushes automatically (for example, when the buffer is full) or when fflush is called manually.
In the example, fwrite(buf + i, 1, 512, myfile); only copies the 512 bytes starting at buf + i
into the buffer. Because the buffer is not full, standard I/O does not call write.
At that point, myfile->_IO_write_ptr = myfile->_IO_write_base; makes standard I/O believe
that no data has been written into the buffer, so write is never called and nothing is written to aaa.txt.
Comment out myfile->_IO_write_ptr = myfile->_IO_write_base; and compare the result.

For fully buffered writes:
_IO_write_base always points to the start of the buffer
_IO_write_end always points one past the last byte of the buffer
(with full buffering, buffered I/O always tries to wait until the buffer is full before calling write)
_IO_write_ptr always points one past the last character the user has written into the buffer
On a flush, the characters between _IO_write_base and _IO_write_ptr are passed to the kernel through the write system call.
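The effect of a fully buffered write can also be observed from outside the stream by asking the kernel for the file size before and after fflush. This is a sketch under assumptions: file_size and sizes_around_flush are hypothetical helpers, and a 4096-byte buffer is installed explicitly with setvbuf so that 512 bytes cannot fill it.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Size of the file behind fp as the kernel sees it, or -1 on error. */
static long file_size(FILE *fp)
{
    struct stat st;
    return fstat(fileno(fp), &st) == 0 ? (long)st.st_size : -1;
}

/* Write 512 bytes to a fully buffered stream and report the file size
   before and after fflush. Returns 0 on success. */
static int sizes_around_flush(long *before, long *after)
{
    static char iobuf[4096];   /* stream buffer, larger than the data written */
    static char data[512];
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    setvbuf(fp, iobuf, _IOFBF, sizeof iobuf);
    fwrite(data, 1, sizeof data, fp);   /* lands in iobuf only */
    *before = file_size(fp);
    fflush(fp);                         /* now write() is called */
    *after = file_size(fp);
    fclose(fp);
    return 0;
}
```

Before the flush the kernel still reports an empty file; after it the 512 bytes have arrived.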

1.2. Line Buffering
The following is the original statement in APUE:
Line buffering: "the standard I/O library performs I/O when a newline character is encountered on input or output."
As before, "performs I/O" in the book is misleading. It does not mean reading or writing the disk; it means making the read or write system call.

The following two examples illustrate this.
The first example also explains the problem raised in the following post:

http://bbs.chinaunix.net/viewthread.php?tid=954547

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    char buf[5];
    char buf2[10];

    fgets(buf, 5, stdin);   // type more than 5 characters on the first input line
    // Print what is left in the stream buffer: the whole line was read in at once,
    // not just the characters requested above (the buffer is not NUL-terminated).
    fwrite(stdin->_IO_read_ptr, 1, stdin->_IO_read_end - stdin->_IO_read_ptr, stdout);

    stdin->_IO_read_ptr = stdin->_IO_read_end;   // standard I/O now believes the buffer is empty and calls read again
    // comment out the statement above and compare the result

    printf("\n");
    puts(buf);

    fgets(buf2, 10, stdin);
    puts(buf2);

    return 0;
}
In the example above, fgets(buf, 5, stdin); asks for only four characters, yet the rest of the input line is also read into the buffer,
while _IO_read_ptr moves forward by only four positions. The next fgets call does not need the read system call again;
it only copies data from stdin's buffer into buf2 (starting at stdin->_IO_read_ptr).
stdin->_IO_read_ptr = stdin->_IO_read_end; makes standard I/O believe the buffer is empty,
so the second fgets has to call read again. Compare the behavior with and without that statement.

For line-buffered reads:
_IO_read_base always points to the start of the buffer
_IO_read_end always points one past the last character that has been read from the kernel into the buffer
_IO_read_ptr always points one past the last character the user has already consumed from the buffer
When (_IO_read_end - _IO_read_base < _IO_buf_end - _IO_buf_base) && (_IO_read_ptr == _IO_read_end), the end of the file has been reached,
where _IO_buf_end - _IO_buf_base is the buffer length.
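The "whole line is pulled into the buffer" behavior can be checked without modifying FILE fields by feeding a line through a pipe and then asking the kernel whether anything is still waiting. This is a sketch assuming a POSIX system; leftover_in_kernel is a hypothetical helper and the pipe stands in for the terminal.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Send "abcdefgh\n" through a pipe, fgets four characters, and return how
   many bytes are still queued in the kernel afterwards (-1 on failure).
   The second fgets shows where the rest of the line went. */
static long leftover_in_kernel(char first[5], char second[10])
{
    int fds[2];
    char raw[64];
    if (pipe(fds) != 0)
        return -1;
    FILE *in = fdopen(fds[0], "r");
    if (in == NULL || write(fds[1], "abcdefgh\n", 9) != 9)
        return -1;
    if (fgets(first, 5, in) == NULL)
        return -1;

    /* Look at the kernel side of the pipe without blocking. */
    int flags = fcntl(fds[0], F_GETFL);
    fcntl(fds[0], F_SETFL, flags | O_NONBLOCK);
    ssize_t n = read(fds[0], raw, sizeof raw);   /* -1 if nothing is queued */
    fcntl(fds[0], F_SETFL, flags);

    /* The rest of the line comes from the stream buffer. */
    if (fgets(second, 10, in) == NULL)
        return -1;
    fclose(in);
    close(fds[1]);
    return n < 0 ? 0 : (long)n;
}
```

After the first fgets the pipe is already empty, and the second fgets still returns the remainder of the line, so that remainder must have been sitting in the user-space buffer.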

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

// The last character must not be '\n'; if it is, standard I/O flushes automatically.
// This is an important difference between line buffering and full buffering.
char buf[5] = {'1', '2', '3', '4', '5'};

void writelog(FILE *ftmp)
{
    fprintf(ftmp, "%p write buffer base\n", (void *)stdout->_IO_write_base);
    fprintf(ftmp, "%p buf buffer base\n", (void *)stdout->_IO_buf_base);
    fprintf(ftmp, "%p read buffer base\n", (void *)stdout->_IO_read_base);
    fprintf(ftmp, "%p write buffer ptr\n", (void *)stdout->_IO_write_ptr);
    fprintf(ftmp, "\n");
}

int main(void)
{
    int i;
    FILE *ftmp;

    ftmp = fopen("ccc.txt", "w");
    if (ftmp == NULL) {
        perror("ccc.txt");
        return 1;
    }

    i = 0;
    while (i < 4) {
        fwrite(buf, 1, 5, stdout);
        i++;
        *stdout->_IO_write_ptr++ = '\n';   // try enabling this statement on its own to see the effect
        // getchar();   // getchar() makes standard I/O output the buffer
        // enable the statement below and nothing appears on the screen
        // stdout->_IO_write_ptr = stdout->_IO_write_base;
        writelog(ftmp);   // only to record how the buffer pointers change
    }
    fclose(ftmp);
    return 0;
}
This example writes the pointer fields of the FILE structure to the file ccc.txt.
Run it and take a look if you are interested.
The above is an example of a line-buffered write.
stdout->_IO_write_ptr = stdout->_IO_write_base; makes standard I/O believe
the buffer is empty, so there is no output.
Uncomment the commented-out statements in the program above to see the results.
With line buffering, any one of the following three conditions causes the buffer to be flushed immediately:
1. The buffer is full.
2. A newline is encountered. For example, change buf[4] in the example above to '\n'.
3. Data has to be fetched from the kernel again. For example, adding getchar() to the program above causes immediate output.


For line-buffered writes:
_IO_write_base always points to the start of the buffer
_IO_write_end always points to the start of the buffer
_IO_write_ptr always points one past the last character the user has written into the buffer
On a flush, the characters between _IO_write_base and _IO_write_ptr are passed to the kernel through the write system call.
1.3. No Buffering
With no buffering, standard I/O does not store characters in a buffer. A typical example is stderr.
"No buffering" here does not mean the buffer size is 0. There is in fact still a buffer; its size is 1.

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(void)
{
    fputs("stderr", stderr);
    // prints the size of stderr's buffer (glibc-private fields)
    printf("%d\n", (int)(stderr->_IO_buf_end - stderr->_IO_buf_base));
    return 0;
}
Every read/write operation on a stream without buffering will cause system calls.
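The three modes can be compared side by side with setvbuf, which is the standard way to choose a mode explicitly. The sketch below is an illustration under assumptions: bytes_reaching_kernel is a hypothetical helper, and a pipe is used only so that the kernel side can be inspected. It writes a string with fputs and reports how many bytes reached the kernel before any flush or close.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write text with fputs to a pipe-backed stream in the given buffering mode
   (_IOFBF, _IOLBF or _IONBF). Return the number of bytes that reached the
   kernel before the stream was flushed or closed, or -1 on setup failure. */
static long bytes_reaching_kernel(int mode, const char *text)
{
    static char iobuf[BUFSIZ];   /* stream buffer for the buffered modes */
    char sink[BUFSIZ];
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    fcntl(fds[0], F_SETFL, fcntl(fds[0], F_GETFL) | O_NONBLOCK);
    FILE *out = fdopen(fds[1], "w");
    if (out == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (mode == _IONBF)
        setvbuf(out, NULL, _IONBF, 0);
    else
        setvbuf(out, iobuf, mode, sizeof iobuf);

    fputs(text, out);
    ssize_t n = read(fds[0], sink, sizeof sink);   /* -1 if the pipe is empty */

    fclose(out);      /* flushes whatever was still buffered */
    close(fds[0]);
    return n < 0 ? 0 : (long)n;
}
```

With the same fputs call, "abc\n" stays in user space under _IOFBF, "abc" stays under _IOLBF until a newline follows, and under _IONBF the bytes arrive at once. The function used does not change; only the mode does.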
1.4. feof Problems
Countless posts on ChinaUnix (CU) have discussed feof. Here we look at it from the buffer's point of view.
For an empty file, why must you read first before feof can tell you that the end of the file has been reached?

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(void)
{
    char buf[5];
    char buf2[10];

    fgets(buf, sizeof(buf), stdin);   // the input must be longer than 4 characters to see the effect
    puts(buf);
    // comment out the following two statements alternately
    // stdin->_IO_read_end = stdin->_IO_read_ptr + sizeof(buf2) - 2;
    stdin->_IO_read_end = stdin->_IO_read_ptr + sizeof(buf2) - 1;

    fgets(buf2, sizeof(buf2), stdin);
    puts(buf2);

    if (feof(stdin))
        printf("End\n");
    return 0;
}
To run the program above, type more than 4 characters and end the input by pressing Ctrl+D twice (do not press Enter).
As the example shows, when
(_IO_read_end - _IO_read_base < _IO_buf_end - _IO_buf_base) && (_IO_read_ptr == _IO_read_end)
standard I/O considers the end of the file reached, and feof(stdin) is set,
where _IO_buf_end - _IO_buf_base is the buffer length.
In other words, standard I/O uses its buffer to decide whether the stream has ended.
This explains why, even for an empty file, standard I/O must read once before feof can be used to tell whether the file is empty.
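The consequence for everyday code is that feof only reports what a previous read has already discovered. A minimal sketch using an empty temporary file (feof_before_and_after is a hypothetical helper name):

```c
#include <stdio.h>

/* Record feof() on an empty stream before and after the first read attempt.
   Returns 0 if the read attempt reported EOF as expected. */
static int feof_before_and_after(int *before, int *after)
{
    FILE *fp = tmpfile();        /* an empty temporary file */
    if (fp == NULL)
        return -1;
    *before = feof(fp) != 0;     /* nothing has been read yet */
    int c = fgetc(fp);           /* the read attempt hits end of file */
    *after = feof(fp) != 0;
    fclose(fp);
    return c == EOF ? 0 : -1;
}
```

Even though the file is empty from the start, feof is false until a read has been attempted, which is why read loops should test the return value of the read function and consult feof only afterwards.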
1.5. Other Notes
Many newcomers mistakenly believe that fgets and fputs mean line buffering, fread and fwrite mean full buffering, fgetc and fputc mean no buffering,
and so on.
In fact this is not the case. The kind of buffering has nothing to do with the function used;
it depends on the type of file being read or written.
In the examples above, fgets and fputs were used with full buffering several times, and fread and fwrite were used with line buffering.
ISO C requires:
1. Standard input and standard output are fully buffered if and only if they do not refer to an interactive device.
2. Standard error is never fully buffered.
Many systems use the following defaults:
1. Standard error is always unbuffered.
2. Any other stream is line buffered if it refers to a terminal device; otherwise it is fully buffered.
