In-depth understanding of the C language: a summary of standard I/O (buffering, I/O functions, and related issues)

Source: Internet
Author: User

Unlike file I/O operations on file descriptors, standard I/O operations are performed on streams.

Stream:

There is a good explanation of streams in "Pointers on C":

ANSI C further abstracts the concept of I/O. As far as a C program is concerned, all I/O is simply a matter of moving bytes into or out of the program, so it is not surprising that such a sequence of bytes is called a stream. The program only needs to care about producing the correct output bytes and correctly interpreting the bytes read from input. The details of specific I/O devices are hidden from the programmer.

TCPL Appendix B.1 explains it this way:

A stream is a source or destination of data that may be associated with a disk or other peripheral.

Simply put, a stream is an abstraction of information. When processing files (whether text or binary), the C system does not distinguish between the types and treats them all as byte streams.
The start and end of an input or output stream are controlled only by the program, not by physical symbols (such as carriage returns).

The smallest unit of information in a stream is the bit, and the smallest unit that is transferred is the byte. The C standard library provides two types of streams: binary streams and text streams. A binary stream is a sequence of unprocessed bytes. A text stream is a sequence of text lines, each consisting of zero or more characters and ending with '\n'. Note that on UNIX there is no difference between the two.


When a program starts, the standard input, standard output, and standard error streams are opened automatically and, by default, connected to the terminal. These three streams are referenced through the file pointers stdin, stdout, and stderr, which are predefined in stdio.h. When a process terminates normally (by calling exit() directly or by returning from main), all open standard I/O streams are closed, and any stream with unwritten buffered data is flushed.

PS: in main(), return expr; is equivalent to exit(expr); exit() closes every open stream as if by calling fclose(), which flushes the corresponding buffer.

In Linux applications, file descriptors 0, 1, and 2 are normally associated with standard input, standard output, and standard error. For POSIX compliance, the magic numbers 0, 1, and 2 should be replaced with the symbolic constants STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO (defined in unistd.h).


Buffering:

(For details, see APUE 3.9 and 5.4; this section is an excerpt.)

The purpose of standard I/O buffering is to minimize the number of read and write calls (a system call costs more than an ordinary function call). The library also manages the buffer of each I/O stream automatically, so the application does not have to.

Standard I/O provides three types of buffering:
(1) Fully buffered. The actual I/O operation is performed only when the standard I/O buffer is full. Files stored on disk are usually fully buffered by the standard I/O library.
(2) Line buffered. The standard I/O library performs the I/O operation when a newline character is encountered in the input or output. This lets us output one character at a time (with the standard I/O function fputc) while the actual I/O happens only after a whole line has been written.
(3) Unbuffered. The standard I/O library does not buffer the characters. Writing a few characters to an unbuffered stream with a standard I/O function is equivalent to writing them to the associated open file with the write system call.


The standard error stream stderr is usually unbuffered so that error messages are displayed as soon as possible, whether or not they contain a newline.
ANSI C requires the following buffering characteristics:
(1) Standard input and standard output are fully buffered if and only if they do not refer to an interactive device.
(2) Standard error is never fully buffered.

However, this does not say whether standard input and standard output are unbuffered or line buffered when they do refer to an interactive device, nor whether standard error is unbuffered or line buffered.
SVR4 and 4.3+BSD use the following defaults:
- Standard error is always unbuffered.
- All other streams are line buffered if they refer to a terminal device; otherwise they are fully buffered.


You can change the buffering type with the following functions (APUE 5.4):

void setbuf(FILE *restrict fp, char *restrict buf);
int setvbuf(FILE *restrict fp, char *restrict buf, int mode, size_t size);

These functions must be called after the stream has been opened but before any other operation is performed on it.

The buf argument usually points to a buffer of length BUFSIZ. BUFSIZ is defined in stdio.h; you can print it to check its value.

stdio.h:
#ifndef BUFSIZ
# define BUFSIZ _IO_BUFSIZ

libio.h:
#define _IO_BUFSIZ _G_BUFSIZ

_G_config.h:
#define _G_BUFSIZ 8192

To disable buffering with setbuf, pass NULL as buf.

By default on Linux, standard input and standard output are line buffered when connected to a terminal, with a buffer size of 1024 bytes; when redirected to a regular file, they become fully buffered. (APUE 5.12, program 5.3, shows a way to inspect this I/O-related information.)


Forcibly flushing a stream:

int fflush(FILE *fp);

This sends all unwritten data of the stream to the kernel. If fp is NULL, all output streams are flushed.

Note that fflush(NULL) does not clear input buffers; see "About clearing the input buffer" below.


Common I/O functions:

Opening a stream:

FILE *fopen(const char *filename, const char *mode);
FILE *freopen(const char *filename, const char *mode, FILE *stream);
Use freopen for input/output redirection.

Single-character read/write:

int getc(FILE *fp);
int fgetc(FILE *fp);
int getchar(void);

getchar() is equivalent to getc(stdin). The difference between the first two functions is that getc may be implemented as a macro, whereas fgetc is guaranteed to be a function, so a call to fgetc may take slightly longer.

On error or at end of file, all three functions return EOF (a negative value, typically -1).

In most implementations, each FILE object maintains two indicators: an error indicator and an end-of-file indicator. The first two functions below report whether a stream has hit an error or reached end of file; the third clears both indicators:

int ferror(FILE *fp);
int feof(FILE *fp);
void clearerr(FILE *fp);

There is also a handy function that pushes a character back onto the stream:
int ungetc(int c, FILE *fp);
Note that EOF cannot be pushed back.

Similarly, the output functions:

int putc(int c, FILE *fp);
int fputc(int c, FILE *fp);
int putchar(int c);

putchar(c) is equivalent to putc(c, stdout), and putc may be implemented as a macro.

To avoid function call overhead, putchar and getchar are usually implemented as macros as well.


Line read/write functions:

char *fgets(char *str, int num, FILE *stream);
It reads at most num-1 characters and appends the terminator '\0'. Reading also stops at the end of a line, in which case the newline is stored in the buffer as well.

Another function:

char *gets(char *buf);

It is not recommended because it cannot limit the input length and is therefore open to buffer overflow.

Correspondingly, for output:

int fputs(const char *str, FILE *fp);
int puts(const char *str);
fputs() writes a null-terminated string to the specified stream; the terminating null character is not written.
puts() is safe, but it appends a newline to the output every time, which is inconvenient.

We therefore stick to one policy: always use fgets and fputs, and handle newlines ourselves.


For a comparison of the efficiency of these I/O functions, see APUE 5.9.


Formatted I/O:

Formatted output:

int printf(const char *format, ...);
int fprintf(FILE *fp, const char *format, ...);
int snprintf(char *buf, size_t n, const char *format, ...);

Note: a conversion specification has the form %[flags][fldwidth][precision][lenmodifier]convtype. The width and precision fields can each be given as *, in which case an int argument supplies the value.

printf implementation:

printf is one of the few variadic functions in the C library. It processes its argument list mainly through the macros in stdarg.h.

Simulating printf with the variadic function mechanism:

#include <stdarg.h>
#include <stdio.h>

/* A toy printf that understands only %d, %f, %c and %s. */
void simon_printf(const char *fmt, ...)
{
    const char *p = fmt;
    const char *s_tmp;
    int i_tmp;
    double f_tmp;
    va_list ap;

    va_start(ap, fmt);
    while (*p) {
        if (*p != '%') {
            putchar(*p++);
            continue;
        }
        switch (*++p) {
        case 'd':
            i_tmp = va_arg(ap, int);
            /* alternative: sprintf into a buffer, then write() it */
            printf("%d", i_tmp);
            break;
        case 'f':               /* float is promoted to double */
            f_tmp = va_arg(ap, double);
            printf("%f", f_tmp);
            break;
        case 'c':               /* char is promoted to int */
            i_tmp = va_arg(ap, int);
            printf("%c", i_tmp);
            break;
        case 's':
            for (s_tmp = va_arg(ap, const char *); *s_tmp; s_tmp++)
                printf("%c", *s_tmp);
            break;
        case '\0':              /* format ends with a lone '%' */
            va_end(ap);
            return;
        default:                /* unsupported conversion: echo the character */
            putchar(*p);
            break;
        }
        p++;
    }
    va_end(ap);
}

int main(void)
{
    int a = 1;
    float b = 2.0f;
    char c = 'a';
    const char *str = "test";

    simon_printf("This is a test Message:\n int: %d\n float: %f\n string: %s\n char: %c\n", a, b, str, c);
    return 0;
}


Formatted input:

int scanf(const char *format, ...);
int fscanf(FILE *fp, const char *format, ...);
int sscanf(const char *buf, const char *format, ...);

In scanf, * means assignment suppression: the matching input is read but skipped instead of being assigned to a variable.

scanf() also supports some regular-expression-like usage: [] denotes a set of input characters (a scan set), in which a hyphen can usually indicate a range. scanf() keeps consuming characters that belong to the set and stores them in the corresponding character array until it meets a character that is not in the set. The character ^ denotes the complement: when ^ is the first character of the scan set, the set consists of all characters other than those listed.

Using scanf() in this regular-expression style is generally not recommended: the patterns are complex and error-prone, and the compiler can do little to check them.

About clearing the input buffer:

1) fflush(stdin)

fflush is clearly defined only for output streams; used on an input stream like this, the result depends on the implementation.

If the given stream was open for writing (or if it was open for updating and the last I/O operation was an output operation) any unwritten data in its output buffer is written to the file.
If stream is a null pointer, all such streams are flushed.
In all other cases, the behavior depends on the specific library implementation. In some implementations, flushing a stream open for reading causes its input buffer to be cleared (but this is not portable expected behavior).
The stream remains open after this call.
When a file is closed, either because of a call to fclose or because the program terminates, all the buffers associated with it are automatically flushed.

In other words: if the stream is an output stream, or an update stream whose most recent operation was not an input, fflush writes any unwritten data to the file the stream refers to (for example, standard output stdout).
fflush(NULL) flushes all output streams and all update streams of the kind just described.
In every other case the behavior of fflush is undefined and depends on the implementation: some (such as VC6) support fflush(stdin) to clear the input buffer, whereas gcc does not.

2) setbuf(stdin, NULL);
setbuf(stdin, NULL); switches stdin from its default buffering to unbuffered.


3) int c;
while ((c = getchar()) != '\n' && c != EOF);
This code keeps calling getchar() to consume characters from the buffer until the character c is the newline '\n' or the end-of-file value EOF. This method reliably discards the rest of the current input line and is portable.


4) scanf("%*[^\n]%*c");
Here the "*" flag in the scanf conversion suppresses assignment, and "%[^set]" matches any sequence of characters that are not in the set. The scan set alone has one problem: the newline '\n' stays in the buffer and has to be discarded separately, which is what the trailing %*c does. Note that if the next character is already '\n', %*[^\n] matches nothing, scanf stops there, and %*c is never reached.

