Linux standard input and output

Source: Internet
Author: User
Tags goto stdin

1. Brief introduction

stdin, stdout, and stderr are the standard input, standard output, and standard error streams. They are declared as follows:

/* Standard streams.  */
extern FILE *stdin;   /* Standard input stream.  */
extern FILE *stdout;  /* Standard output stream.  */
extern FILE *stderr;  /* Standard error output stream.  */

As you can see, they are pointer variables defined by libc, but C89/C99 require them to be macros, hence the following lines:

/* C89/C99 say they're macros.  Make them happy.  */
#define stdin stdin
#define stdout stdout
#define stderr stderr

Many I/O functions, such as printf(), puts(), getchar(), and scanf(), take no file handle and operate on the standard streams instead. Consider the implementation of printf():

int printf(const char * __restrict format, ...)
{
	va_list arg;
	int rv;

	va_start(arg, format);
	rv = vfprintf(stdout, format, arg);
	va_end(arg);

	return rv;
}

printf() writes to stdout, as expected.
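
The same forwarding pattern works for any stream. The following is a minimal sketch, not libc code; the name stream_printf is a hypothetical helper introduced only for illustration.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: format to an explicit stream by forwarding the
 * variable argument list to vfprintf(), as the printf() above does
 * for stdout. */
int stream_printf(FILE *stream, const char *format, ...)
{
    va_list arg;
    int rv;

    va_start(arg, format);
    rv = vfprintf(stream, format, arg);
    va_end(arg);

    return rv; /* characters written, or a negative value on error */
}
```

Calling stream_printf(stderr, "%s\n", msg) would send the formatted text to stderr instead of stdout.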

2. Principles

    • Initialization process

Where are stdin, stdout, and stderr initialized? The following code is easy to find:

FILE *stdin = _stdio_streams;
FILE *stdout = _stdio_streams + 1;
FILE *stderr = _stdio_streams + 2;

That is, their values are fixed at compile time and do not need to be set at run time. Next, look at the definition of _stdio_streams:

static FILE _stdio_streams[] = {
	__STDIO_INIT_FILE_STRUCT(_stdio_streams[0], \
		__FLAG_LBF|__FLAG_READONLY, \
		0, \
		_stdio_streams + 1, \
		_fixed_buffers, \
		BUFSIZ),
	__STDIO_INIT_FILE_STRUCT(_stdio_streams[1], \
		__FLAG_LBF|__FLAG_WRITEONLY, \
		1, \
		_stdio_streams + 2, \
		_fixed_buffers + BUFSIZ, \
		BUFSIZ),
	__STDIO_INIT_FILE_STRUCT(_stdio_streams[2], \
		__FLAG_NBF|__FLAG_WRITEONLY, \
		2, \
		NULL, \
		NULL, \
		0)
};

Note in particular the values 0, 1, and 2, which are the file descriptors. FILE is a struct type, defined as follows:

struct __STDIO_FILE_STRUCT {
	unsigned short __modeflags;
	/* There could be a hole here, but modeflags is used most. */
	unsigned char __ungot[2];
	int __filedes;
#ifdef __STDIO_BUFFERS
	unsigned char *__bufstart; /* pointer to buffer */
	unsigned char *__bufend;   /* pointer to 1 past end of buffer */
	unsigned char *__bufpos;
	unsigned char *__bufread;  /* pointer to 1 past last buffered read char */

#ifdef __STDIO_GETC_MACRO
	unsigned char *__bufgetc_u; /* 1 past last readable by getc_unlocked */
#endif /* __STDIO_GETC_MACRO */
#ifdef __STDIO_PUTC_MACRO
	unsigned char *__bufputc_u; /* 1 past last writeable by putc_unlocked */
#endif /* __STDIO_PUTC_MACRO */

#endif /* __STDIO_BUFFERS */

......................
#if __STDIO_BUILTIN_BUF_SIZE > 0
	unsigned char __builtinbuf[__STDIO_BUILTIN_BUF_SIZE];
#endif /* __STDIO_BUILTIN_BUF_SIZE > 0 */
};

As you can see, the buffers used by _stdio_streams are fixed:

#ifdef __STDIO_BUFFERS
static unsigned char _fixed_buffers[2 * BUFSIZ];
#endif

BUFSIZ defaults to 4096. For files opened later with fopen(), the buffers are allocated with malloc().

Where are file descriptors 0, 1, and 2 opened? In general they are inherited from the parent process, which makes redirection and pipelines easy to implement: the parent saves descriptors 0, 1, and 2, uses dup to point 0, 1, and 2 at the new targets, starts the child process, and then restores the saved descriptors. libc also checks descriptors 0, 1, and 2 at startup:

__check_one_fd(STDIN_FILENO, O_RDONLY | O_NOFOLLOW);
__check_one_fd(STDOUT_FILENO, O_RDWR | O_NOFOLLOW);
__check_one_fd(STDERR_FILENO, O_RDWR | O_NOFOLLOW);

where __check_one_fd() is defined as:

static void __check_one_fd(int fd, int mode)
{
	/* Check if the specified fd is already open */
	if (fcntl(fd, F_GETFD) == -1)
	{
		/* The descriptor is probably not open, so try to use /dev/null */
		int nullfd = open(_PATH_DEVNULL, mode);
		/* /dev/null is major=1 minor=3.  Make absolutely certain
		 * that is in fact the device that we have opened and not
		 * some other weird file... [removed in uClibc] */
		if (nullfd != fd)
		{
			abort();
		}
	}
}

When descriptor 0, 1, or 2 is found not to be open, /dev/null is opened in its place.
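
The save, redirect, and restore sequence described above can be sketched with dup() and dup2(). This is only an illustration under stated assumptions: fd_is_open, with_stdout_redirected, and say_hello are hypothetical names, and a real shell would fork and exec a child at the point where fn() is called.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if fd refers to an open descriptor, 0 otherwise
 * (the same fcntl() probe that __check_one_fd() uses). */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}

/* Hypothetical sample payload that writes to stdout. */
void say_hello(void)
{
    puts("hello");
}

/* Hypothetical helper: run fn() with descriptor 1 pointing at path,
 * then restore the original descriptor 1. Returns 0 on success. */
int with_stdout_redirected(const char *path, void (*fn)(void))
{
    int saved, target;

    fflush(stdout);                  /* do not carry over pending output */
    saved = dup(STDOUT_FILENO);      /* save the current descriptor 1 */
    if (saved == -1)
        return -1;

    target = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (target == -1 || dup2(target, STDOUT_FILENO) == -1) {
        if (target != -1)
            close(target);
        close(saved);
        return -1;
    }
    close(target);                   /* descriptor 1 now refers to path */

    fn();                            /* a child started here would inherit it */
    fflush(stdout);

    dup2(saved, STDOUT_FILENO);      /* restore the saved descriptor 1 */
    close(saved);
    return 0;
}
```

For example, with_stdout_redirected("out.txt", say_hello) should leave the text in out.txt while later output goes to the original destination again.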

In addition, libc calls _stdio_init() to finish initializing _stdio_streams at run time, because some settings, such as the buffering type, cannot be determined at compile time:

void attribute_hidden _stdio_init(void)
{
#ifdef __STDIO_BUFFERS
	int old_errno = errno;
	/* stdin and stdout uses line buffering when connected to a tty. */
	if (!isatty(0))
		_stdio_streams[0].__modeflags ^= __FLAG_LBF;
	if (!isatty(1))
		_stdio_streams[1].__modeflags ^= __FLAG_LBF;
	__set_errno(old_errno);
#endif
#ifndef __UCLIBC__
	/* _stdio_term is done automatically when exiting if stdio is used.
	 * See misc/internals/__uClibc_main.c and and stdlib/atexit.c. */
	atexit(_stdio_term);
#endif
}

The buffering type depends on whether the descriptor is a TTY. isatty() decides this with ioctl(fd, TCGETS, &k_termios), since every TTY has a termios structure that holds its line discipline configuration.

    • Type of buffering

#define __FLAG_FBF 0x0000U /* must be 0 */
#define __FLAG_LBF 0x0100U
#define __FLAG_NBF 0x0200U /* (__FLAG_LBF << 1) */

These stand for fully buffered, line buffered, and unbuffered. Fully buffered: the real read or write happens only when the buffer is full or has insufficient space; this is typical for regular files. Line buffered: reads and writes happen one line at a time; this is typical for TTY devices. Unbuffered: no buffering at all, reads and writes go straight through; this is typical for stderr, where errors must be visible immediately.

    • Difference from open()

stdio is only a wrapper over the open() family of system calls with buffer management added in between; the actual reads and writes are still done through system calls. The benefit is that most user reads and writes touch only the buffer. Because system calls are slow, reducing how often they are made can greatly improve program performance.

    • Buffer management

Reads and writes are performed one character at a time. The following walks through buffer management on the read and write paths.
Read operation:

if (__STDIO_STREAM_BUFFER_RAVAIL(stream)) { /* Have buffered? */
	return __STDIO_STREAM_BUFFER_GET(stream);
}

If the buffer holds readable data, the character is returned straight from the buffer. Otherwise the buffer has no readable data left:

if (__STDIO_STREAM_BUFFER_SIZE(stream)) { /* Do we have a buffer? */
	__STDIO_STREAM_DISABLE_GETC(stream);
	if (__STDIO_FILL_READ_BUFFER(stream)) { /* Refill succeeded? */
		__STDIO_STREAM_ENABLE_GETC(stream); /* FBF or LBF */
		return __STDIO_STREAM_BUFFER_GET(stream);
	}
} else {
	unsigned char uc;
	if (__stdio_READ(stream, &uc, 1)) {
		return uc;
	}
}

__STDIO_FILL_READ_BUFFER() is called to fill the buffer:

#define __STDIO_FILL_READ_BUFFER(S) __stdio_rfill((S))

size_t __stdio_rfill(register FILE *__restrict stream)
{
.......
	rv = __stdio_READ(stream, stream->__bufstart,
			stream->__bufend - stream->__bufstart);
	stream->__bufpos = stream->__bufstart;
	stream->__bufread = stream->__bufstart + rv;
}

For a fully buffered stream, as much of the buffer as possible is filled; for a line-buffered stream, one line of data is read (how the TTY delivers a line is not covered here). As the user keeps reading, stream->__bufpos moves forward; when it equals stream->__bufread the buffer has been drained, and this function is called again to refill it.
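
The read path above can be modeled in a few lines. This is a toy sketch, not the libc implementation: struct toy_rstream, toy_ropen, and toy_getc are hypothetical names, and the 16-byte buffer is deliberately tiny so that refills are easy to count.

```c
#include <stddef.h>
#include <unistd.h>

#define TOY_BUFSIZ 16

/* Toy read stream; the field names echo the libc struct. */
struct toy_rstream {
    int fd;
    unsigned char buf[TOY_BUFSIZ];
    unsigned char *bufpos;   /* next byte to hand out */
    unsigned char *bufread;  /* one past the last byte read into buf */
    int refills;             /* how many read() system calls were made */
};

void toy_ropen(struct toy_rstream *s, int fd)
{
    s->fd = fd;
    s->bufpos = s->bufread = s->buf;
    s->refills = 0;
}

/* Return the next byte, or -1 on end of file or error. */
int toy_getc(struct toy_rstream *s)
{
    if (s->bufpos == s->bufread) {           /* buffer drained: refill */
        ssize_t rv = read(s->fd, s->buf, sizeof s->buf);
        s->refills++;
        if (rv <= 0)
            return -1;
        s->bufpos = s->buf;
        s->bufread = s->buf + rv;
    }
    return *s->bufpos++;                     /* served from the buffer */
}
```

Reading 20 bytes one at a time through toy_getc() costs two read() calls rather than twenty, which is the saving that buffering is meant to provide.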

Write operation:

if (__STDIO_STREAM_BUFFER_SIZE(stream)) { /* Do we have a buffer? */
	/* The buffer is full and/or the stream is line buffered. */
	if (!__STDIO_STREAM_BUFFER_WAVAIL(stream) /* Buffer full? */
		&& __STDIO_COMMIT_WRITE_BUFFER(stream) /* Commit failed! */
		) {
		goto BAD;
	}

	__STDIO_STREAM_BUFFER_ADD(stream, ((unsigned char) c));

	if (__STDIO_STREAM_IS_LBF(stream)) {
		if ((((unsigned char) c) == '\n')
			&& __STDIO_COMMIT_WRITE_BUFFER(stream)) {
			/* Commit failed! */
			__STDIO_STREAM_BUFFER_UNADD(stream); /* Undo the write! */
			goto BAD;
		}
	}
}

Before a character is written, the code checks whether the buffer still has room; if not, it commits the buffer with a write system call to empty it. Once there is room, the character is added to the buffer. Finally, if the stream is line buffered and the character is a newline, the buffer is committed with a write system call. A fully buffered stream does nothing at this point and delays the write as long as possible, until a later write finds no room left. __STDIO_COMMIT_WRITE_BUFFER() works as follows:

if ((bufsize = __STDIO_STREAM_BUFFER_WUSED(stream)) != 0) {
	stream->__bufpos = stream->__bufstart;
	__stdio_WRITE(stream, stream->__bufstart, bufsize);
}

stream->__bufpos is the current write position in the buffer; after the commit it equals stream->__bufstart, which means the buffer is empty.
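
The write path can be modeled the same way. The sketch below is a toy under stated assumptions, not libc code: struct toy_wstream, toy_wopen, toy_commit, and toy_putc are hypothetical names, the buffer is 8 bytes, and a short write is simply treated as a failure.

```c
#include <stddef.h>
#include <unistd.h>

#define TOY_WBUFSIZ 8

struct toy_wstream {
    int fd;
    int line_buffered;               /* non-zero: commit on '\n' */
    unsigned char buf[TOY_WBUFSIZ];
    unsigned char *bufpos;           /* next free slot in buf */
    int commits;                     /* how many write() calls were made */
};

void toy_wopen(struct toy_wstream *s, int fd, int line_buffered)
{
    s->fd = fd;
    s->line_buffered = line_buffered;
    s->bufpos = s->buf;
    s->commits = 0;
}

/* Write out whatever is buffered; returns 0 on success, -1 on failure. */
int toy_commit(struct toy_wstream *s)
{
    size_t used = (size_t)(s->bufpos - s->buf);

    if (used != 0) {
        s->commits++;
        if (write(s->fd, s->buf, used) != (ssize_t)used)
            return -1;
        s->bufpos = s->buf;          /* buffer is empty again */
    }
    return 0;
}

/* Buffer one character; returns it, or -1 if a commit failed. */
int toy_putc(struct toy_wstream *s, int c)
{
    if (s->bufpos == s->buf + TOY_WBUFSIZ && toy_commit(s) != 0)
        return -1;                   /* buffer full and commit failed */
    *s->bufpos++ = (unsigned char)c;
    if (s->line_buffered && c == '\n' && toy_commit(s) != 0) {
        s->bufpos--;                 /* undo the write */
        return -1;
    }
    return (unsigned char)c;
}
```

In fully buffered mode a newline does not trigger a write; the first write() happens only when a ninth character arrives and finds the 8-byte buffer full.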

    • What is ungot?

void scanf_buffer(void)
{
	int a, b;

	while (scanf("%d%d", &a, &b) != EOF)
		printf("%d%d\n", a, b);
}

This is a very common idiom and normally works. But what happens if the user mistypes, for example csdn 666666\n? Try it: the result is an infinite loop. The cause lies in the implementation of scanf(). scanf() takes a character from the buffer; %d requires a digit, the character does not match, so it is pushed back and scanf() returns with a matching failure. The buffer still holds csdn 666666\n, so on the next iteration scanf() finds data in the buffer and fails again immediately without waiting for user input. Hence the infinite loop.

This is the ungot mechanism: when scanf() takes a character that turns out not to match, it returns the character to the buffer. A user can also call ungetc() to push a single character back.
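
One way to avoid the infinite loop is to check how many items were converted and to throw away the rest of the offending line on a matching failure. This is a minimal sketch; count_pairs is a hypothetical name, and it reads from an arbitrary FILE * so that it can be tried on a file as well as on stdin.

```c
#include <stdio.h>

/* Read pairs of integers from in and return how many pairs were read.
 * On a matching failure the rest of the offending line is discarded,
 * so the same bad characters are not seen again on the next pass. */
int count_pairs(FILE *in)
{
    int a, b, rc, pairs = 0;

    while ((rc = fscanf(in, "%d%d", &a, &b)) != EOF) {
        if (rc == 2) {
            pairs++;
            continue;
        }
        /* Matching failure: skip everything up to the next newline.
         * The newline itself is skipped by the next %d as whitespace. */
        fscanf(in, "%*[^\n]");
    }
    return pairs;
}
```

The key point is comparing the return value with the expected item count (2 here) rather than only with EOF.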

The following comment in libc explains some of this:
/**********************************************************************/
/* Having ungotten characters implies the stream is reading.
 * The scheme used here treats the least significant 2 bits of
 * the stream's modeflags member as follows:
 *   0 0   Not currently reading.
 *   0 1   Reading, but no ungetc() or scanf() push back chars.
 *   1 0   Reading with one ungetc() char (ungot[1] is 1)
 *         or one scanf() pushed back char (ungot[1] is 0).
 *   1 1   Reading with both an ungetc() char and a scanf()
 *         pushed back char.  Note that this must be the result
 *         of a scanf() push back (in ungot[0]) _followed_ by
 *         an ungetc() call (in ungot[1]).
 *
 * Notes:
 *   scanf() can NOT use ungetc() to push back characters.
 *     (See section 7.19.6.2 of the C9X rationale -- WG14/N897.)
 */

if (__STDIO_STREAM_CAN_USE_BUFFER_GET(stream)
	&& (c != EOF)
	&& (stream->__bufpos > stream->__bufstart)
	&& (stream->__bufpos[-1] == ((unsigned char) c))
	) {
	--stream->__bufpos;
	__STDIO_STREAM_CLEAR_EOF(stream); /* Must clear end-of-file flag. */
}
else if (c != EOF) {
	__STDIO_STREAM_DISABLE_GETC(stream);

	/* Flag this as a user ungot, as scanf does the necessary fixup. */
	stream->__ungot[1] = 1;
	stream->__ungot[(++stream->__modeflags) & 1] = c;

	__STDIO_STREAM_CLEAR_EOF(stream); /* Must clear end-of-file flag. */
}

If the pushed-back character is the one that was just read, stream->__bufpos is simply decremented, which makes heavy getc()/ungetc() use much cheaper. If the pushed-back character is not the last one read from the buffer, that is, the user calls ungetc() with some other character, the second branch runs: __STDIO_STREAM_DISABLE_GETC(stream) makes the next getc() read from the ungot slot first, where the ungot slot is stream->__ungot[2]. How many characters can be pushed back in a row? In theory only one, because scanf() needs only one, but judging from the code above more than one is possible.

Meaning of the low bits of stream->__modeflags:

bit:   high ...  3      2    1      0   (low)
flag:            Error  EOF  Ungot  Reading

The states below list bits 2, 1, and 0 (EOF, Ungot, Reading).

0 0 1: reading, no ungot characters.

After one character is pushed back, the value becomes:

0 1 0:

stream->__ungot[1] = 1 means stream->__ungot[0] holds a character pushed back by ungetc()

stream->__ungot[1] = 0 means stream->__ungot[0] holds a character pushed back by scanf()

After one more character is pushed back, the value becomes:

0 1 1:

stream->__ungot[0] holds the character pushed back by scanf()

stream->__ungot[1] holds the character pushed back by ungetc()

So pushing back two characters in a row is fine. But what happens if one more character is pushed back after that? The value becomes:

1 0 0:

stream->__ungot[0] now holds the character from this ungetc(), overwriting the earlier pushed-back character, and the __FLAG_UNGOT flag is cleared, so a subsequent getc() cannot read the pushed-back characters. Part of the getc() code is as follows:

if (stream->__modeflags & __FLAG_UNGOT) { /* Use ungots first. */
	unsigned char uc = stream->__ungot[(stream->__modeflags--) & 1];
	stream->__ungot[1] = 0;
	__STDIO_STREAM_VALIDATE(stream);
	return uc;
}

The final __STDIO_STREAM_CLEAR_EOF(stream) call clears the EOF flag, so pushing back several characters in a row does not overflow a buffer or crash; the pushed-back characters are simply lost and the program logic may break. For portability, do not call ungetc() more than once in a row.
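
A portable use of ungetc() therefore pushes back at most one character between reads. The following sketch shows the usual "peek" idiom; peek_char is a hypothetical helper name.

```c
#include <stdio.h>

/* Look at the next character without consuming it.
 * Uses exactly one ungetc(), the only amount the C standard guarantees. */
int peek_char(FILE *in)
{
    int c = getc(in);

    if (c != EOF)
        ungetc(c, in);
    return c;
}
```

Because the pushed-back character is the one just read, an implementation like the one above can take the cheap path and simply step its buffer position back by one.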

    • Lock protection

If the application is single-threaded, it can use the unlocked versions of the interfaces directly. BusyBox is a typical example:

/* Busybox does not use threads, we can speed up stdio. */
#ifdef HAVE_UNLOCKED_STDIO
# undef getc
# define getc(stream) getc_unlocked(stream)
# undef getchar
# define getchar() getchar_unlocked()
# undef putc
# define putc(c, stream) putc_unlocked(c, stream)
# undef putchar
# define putchar(c) putchar_unlocked(c)
# undef fgetc
# define fgetc(stream) getc_unlocked(stream)
# undef fputc
# define fputc(c, stream) putc_unlocked(c, stream)
#endif
/* Above functions are required by POSIX.1-2008, below ones are extensions */
#ifdef HAVE_UNLOCKED_LINE_OPS
# undef fgets
# define fgets(s, n, stream) fgets_unlocked(s, n, stream)
# undef fputs
# define fputs(s, stream) fputs_unlocked(s, stream)
#endif
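
In a multi-threaded program the unlocked calls can still be used safely if the stream lock is held around them. The sketch below uses the POSIX flockfile()/funlockfile() pair; count_lines is a hypothetical name.

```c
#include <stdio.h>

/* Count newline characters. The stream lock is taken once for the whole
 * loop instead of once per character. */
long count_lines(FILE *in)
{
    long lines = 0;
    int c;

    flockfile(in);
    while ((c = getc_unlocked(in)) != EOF)
        if (c == '\n')
            lines++;
    funlockfile(in);
    return lines;
}
```

This keeps the per-character cost close to the unlocked macros while other threads are still kept out of the stream for the duration of the loop.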

    • Read/write automatic conversion

If stream->__modeflags has neither the read-only nor the write-only flag set and libc is configured to support automatic read/write transitions, the programmer does not need to care about switching between reading and writing. If libc does not support automatic transitions, keep the following rule in mind:

/* C99: Output shall not be directly followed by input without an
   intervening call to the fflush function or to a file positioning
   function (fseek, fsetpos, or rewind). */

Refer to the _trans2r.c and _trans2w.c files for details.
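
Code that follows the C99 rule works whether or not libc converts automatically. The sketch below writes a line to an update-mode stream and reads it back with a file positioning call in between; write_then_read is a hypothetical helper name.

```c
#include <stdio.h>

/* Write a line to an update-mode stream, then read it back from the
 * start. Returns 0 on success; the line lands in out (outsz bytes). */
int write_then_read(FILE *rw, const char *line, char *out, int outsz)
{
    if (fputs(line, rw) == EOF)
        return -1;
    /* Required between output and input: flush or reposition. */
    if (fseek(rw, 0L, SEEK_SET) != 0)
        return -1;
    return fgets(out, outsz, rw) ? 0 : -1;
}
```

The stream passed in must have been opened for update, for example with mode "w+" or via tmpfile().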

    • Narrow & Wide Reading

This mainly concerns wide characters. If wchar is not supported, streams are always read in narrow mode. Narrow mode works in units of single characters (char) and wide mode in units of wide characters (wchar_t). Note that once a stream's orientation is set it cannot be changed unless the stream is closed and reopened:

if (!(stream->__modeflags & oflag)) {
	if (stream->__modeflags & (__FLAG_NARROW|__FLAG_WIDE)) {
		__UNDEFINED_OR_NONPORTABLE;
		goto DO_EBADF;
	}
	stream->__modeflags |= oflag;
}
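
The standard fwide() function exposes the same rule to applications. The sketch below assumes a libc with wide-character support; orientation_after_byte_io is a hypothetical name.

```c
#include <stdio.h>
#include <wchar.h>

/* Perform one byte-oriented write on a fresh stream, then ask for wide
 * orientation. Returns the orientation reported afterwards:
 * negative = byte oriented, positive = wide oriented, 0 = none yet. */
int orientation_after_byte_io(FILE *stream)
{
    fputs("x", stream);          /* first operation fixes byte orientation */
    return fwide(stream, 1);     /* the request cannot change it any more */
}
```

On a freshly opened stream fwide(stream, 0) reports 0; after the byte write the request for wide orientation is expected to be ignored and a negative value returned.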

3. Things to note

    • scanf usage

The following blog post has a large number of usage examples worth borrowing, although some of its statements are not correct: http://blog.csdn.net/kobesdu/article/details/39051399. Pay particular attention to the infinite loop that scanf() can cause, which was analyzed in the ungot section above.

    • Clearing the buffer with fflush()

For output, calling fflush() performs the write immediately and empties the buffer. For input, however, the fflush() code in the libc version I looked at contains the following:

if (__STDIO_STREAM_IS_WRITING(stream)) {
	if (!__STDIO_COMMIT_WRITE_BUFFER(stream)) {
		__STDIO_STREAM_DISABLE_PUTC(stream);
		__STDIO_STREAM_CLEAR_WRITING(stream);
	} else {
		retval = EOF;
	}
}

__STDIO_STREAM_IS_WRITING() checks whether the stream is currently writing, and only then is the buffer committed; an input stream never takes this path. The C standard leaves fflush() on an input stream undefined, so for portability do not use fflush() to discard buffered input; use another method instead.
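
One portable alternative is to read and drop characters up to the end of the line. This is a minimal sketch; discard_line is a hypothetical helper name.

```c
#include <stdio.h>

/* Discard the remainder of the current input line, including the '\n'.
 * Returns the number of characters thrown away. */
int discard_line(FILE *in)
{
    int c, dropped = 0;

    while ((c = getc(in)) != EOF) {
        dropped++;
        if (c == '\n')
            break;
    }
    return dropped;
}
```

Calling discard_line(stdin) where fflush(stdin) might otherwise be tempting relies only on standard getc() behavior.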

Concluding remarks: this topic is complex and my time and energy are limited, so many things are not described clearly here. I will add more when time permits.

http://blog.csdn.net/whuzm08/article/details/73793688
