The I/O functions described in the previous article, "Advanced programming----file descriptors for UNIX environments", operate on file descriptors. The standard I/O library, by contrast, operates on streams. When we open or create a file with the standard I/O library, we associate a stream with that file.
I. Streams and the FILE object
When a stream is opened, the standard I/O function fopen returns a pointer to a FILE object. This object is typically a structure that contains all the information the I/O library needs to manage the stream: the file descriptor used for the actual I/O, a pointer to the stream's buffer, the size of the buffer, the number of characters currently in the buffer, an error flag, and so on.
The application does not need to examine the FILE object. To reference a stream, you pass the FILE pointer as an argument to each standard I/O function. Following Advanced Programming in the UNIX Environment, we refer to a pointer to a FILE object (type FILE *) as a file pointer.
II. Standard I/O library buffering (need to understand)
The purpose of standard I/O buffering is to call read and write as few times as possible, which speeds up reading and writing files. Unfortunately, buffering is also the most confusing aspect of the standard I/O library. Before explaining the buffering mechanism in detail, it is important to understand why buffering makes file operations more efficient.
The user program calls standard I/O library functions to read and write files. These functions pass the read and write requests to the kernel through system calls, and the kernel ultimately drives the disk or device to perform the I/O. To speed this up, the standard I/O library allocates an I/O buffer for each open file, reachable through the file's FILE structure. Most of the user's read and write calls are served from the I/O buffer, and only a few requests are passed on to the kernel.
Take fgetc/fputc as an example. When the user program calls fgetc for the first time to read one byte, fgetc may enter the kernel through a system call, read 1 KB into the I/O buffer, return the first byte in the buffer to the user, and advance the read/write position to the second byte in the buffer. Subsequent fgetc calls read directly from the I/O buffer without entering the kernel. Once the user has consumed this 1 KB, the next fgetc call enters the kernel again to read another 1 KB into the buffer. The library reads ahead in the expectation that the program will use the data. Because the I/O buffer lives in user space, reading from it is much faster than asking the kernel for each byte.
Conversely, a call to fputc usually just writes into the I/O buffer, so fputc can return quickly. When the I/O buffer is full, fputc passes its contents to the kernel through a system call, and the kernel eventually writes the data back to disk. Sometimes the user program wants the data in the I/O buffer sent to the kernel immediately so that the kernel can write it to the device. This is called a flush operation, and the corresponding library function is fflush. fclose also flushes the stream before closing the file.
Figure 1 uses fgets/fputs to illustrate the role of the I/O buffers. A program that calls fgets/fputs also allocates buffers of its own (buf1 and buf2 in the figure); note the distinction between the user program's buffers and the C standard library's I/O buffers.
Figure 1: I/O buffers
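As a rough illustration of the flush operation described above, the sketch below writes three bytes with fputc and uses the POSIX stat call to check how large the file looks to the kernel before and after fflush. The helper names kernel_size and flush_demo are made up for this example. The expectation that nothing is visible before the flush assumes the stream is fully buffered, which is typical for regular files but not guaranteed.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: size of the file as the kernel currently sees it, or -1. */
static long kernel_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1L;
}

/* Hypothetical helper: writes 3 bytes and records the kernel-visible file
 * size before and after fflush. Returns 0 on success, -1 on error. */
int flush_demo(const char *path, long *before, long *after)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    fputc('a', fp);
    fputc('b', fp);
    fputc('c', fp);              /* bytes normally sit in the user-space buffer */
    *before = kernel_size(path);

    if (fflush(fp) == EOF) {     /* hand the buffered bytes to the kernel */
        fclose(fp);
        return -1;
    }
    *after = kernel_size(path);

    return fclose(fp) == 0 ? 0 : -1;
}
```

On a typical Linux system, before comes back as 0 and after as 3: the bytes only reach the kernel once fflush (or fclose) is called.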
The standard I/O library provides three types of buffering:
1) Fully buffered: data is written back to the kernel when the buffer is full. Regular files are usually fully buffered.
2) Line buffered: if the user program writes a newline character, the line is written back to the kernel; the data is also written back when the buffer is full. Standard input and standard output are usually line buffered when they refer to a terminal device.
Line buffering has two caveats:
First, the line buffer has a fixed length (generally 1 KB by default), so once the buffer is full the library performs I/O even if no newline has been written. The examples below show this.
Second, whenever input is requested through the standard I/O library from (a) an unbuffered stream or (b) a line-buffered stream (when the data has to be fetched from the kernel), all line-buffered output streams are flushed.
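Example 01.c
#include <stdio.h>
int main(void)
{
    printf("Hello World");   /* no newline: the text stays in the line buffer */
    while (1)
        ;                    /* spin forever, so the buffer is never flushed at exit */
    return 0;
}
Compile and run this, and the terminal shows no output. If you remove the while (1) loop, Hello World is printed on the terminal, because the buffer is flushed when the program exits.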
Example 02.c
#include <stdio.h>
int main(void)
{
    printf("Hello world\n"); /* the newline flushes the line buffer */
    while (1)
        ;
    return 0;
}
Compile and run this, and the terminal prints Hello world.
Example 03.c
#include <stdio.h>
int main(void)
{
    printf("Hello world ... Hello World"); /* "..." stands for 1024 - 11*2 bytes of filler */
    while (1)
        ;
    return 0;
}
Compile and run this, and the terminal prints Hello world ... Hello World. Together, the three examples show that a line buffer has a fixed length: the library performs I/O when a newline is written to the buffer or when the buffer fills up.
3) Unbuffered: each call to a library write function goes to the kernel through a system call immediately. Standard error is usually unbuffered so that error messages produced by the program reach the device as soon as possible.
For any stream, if we do not like these system defaults, we can change the buffering type by calling one of the following two functions:
----------------------------------------------------------------
void setbuf(FILE *fp, char *buf);
or
int setvbuf(FILE *fp, char *buf, int mode, size_t size);
setvbuf returns: 0 if successful, nonzero if an error occurs
----------------------------------------------------------------
Figure 2 describes the options of the setbuf and setvbuf functions. Clearly setvbuf is the more powerful of the two.
Figure 2: Options of the setbuf and setvbuf functions
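To make the three buffering types more concrete, here is a minimal sketch that selects a mode with setvbuf and then checks, via the POSIX stat call, how many bytes have reached the kernel before the stream is closed. The helper name bytes_visible_before_close is hypothetical, and the results described afterwards are what a typical Linux C library does rather than something the C standard promises.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: writes text to path using the given buffering mode
 * (_IOFBF, _IOLBF or _IONBF) and returns how many bytes had reached the
 * kernel BEFORE fclose, or -1 on error. */
long bytes_visible_before_close(const char *path, int mode, const char *text)
{
    static char buf[BUFSIZ];     /* the buffer must outlive the stream */
    struct stat st;
    long visible = -1;
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    /* setvbuf must be called before any other operation on the stream;
     * for _IONBF the buffer and size arguments are ignored. */
    if (setvbuf(fp, buf, mode, sizeof buf) != 0) {
        fclose(fp);
        return -1;
    }
    if (fputs(text, fp) != EOF && stat(path, &st) == 0)
        visible = (long)st.st_size;
    fclose(fp);
    return visible;
}
```

With _IONBF every byte is visible immediately, with _IOLBF a string ending in '\n' is visible as soon as it is written, and with _IOFBF a short string stays in the buffer until the stream is flushed or closed.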
III. Standard I/O library functions
1. Opening and closing I/O streams
The following three functions open a standard I/O stream:
----------------------------------------------------------------
FILE *fopen(const char *pathname, const char *type);
FILE *freopen(const char *pathname, const char *type, FILE *fp);
FILE *fdopen(int filedes, const char *type);
All three return: the file pointer if successful, NULL if an error occurs
----------------------------------------------------------------
The differences between the three functions are:
(1) fopen opens the file whose path name is given by pathname.
(2) freopen opens the specified file (named by pathname) on a specific stream (indicated by fp); if the stream is already open, it is closed first. This function is typically used to open a specified file as one of the predefined streams: standard input, standard output, or standard error.
(3) fdopen takes an existing file descriptor (obtained, for example, from open, dup, dup2, fcntl, or pipe) and associates a standard I/O stream with that descriptor.
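The relationship between a descriptor and a stream in point (3) can be sketched as follows: the file is opened with open, wrapped in a stream with fdopen, and later read back through a stream obtained from fopen. The helper names write_via_fdopen and read_first_line are invented for this illustration, and the feature-test macro is only there so that fdopen is declared under strict compiler modes.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: opens path with open(2), wraps the descriptor in a
 * stream with fdopen, and writes msg through the stream.
 * Returns 0 on success, -1 on error. */
int write_via_fdopen(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    FILE *fp = fdopen(fd, "w");  /* the mode must be compatible with the open flags */
    if (fp == NULL) {
        close(fd);               /* no stream was created, so close the fd ourselves */
        return -1;
    }
    if (fputs(msg, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;   /* fclose also closes the descriptor */
}

/* Hypothetical helper: reads the first line of path using fopen and fgets. */
int read_first_line(const char *path, char *out, int size)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    char *ok = fgets(out, size, fp);
    fclose(fp);
    return ok != NULL ? 0 : -1;
}
```

Note that once fdopen succeeds the stream owns the descriptor, so the code calls fclose and never close on that path.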
The following function closes a standard I/O stream:
----------------------------------------------------------------
int fclose(FILE *fp);
Return value: 0 on success, EOF on error
----------------------------------------------------------------
2. Reading and writing I/O streams
1) Byte-oriented I/O functions
----------------------------------------------------------------
int getc(FILE *stream);
int fgetc(FILE *stream);
int getchar(void);
Return value: the byte read on success; EOF on error or at end of file
----------------------------------------------------------------
The first and third may be implemented as macros rather than as true functions, for instance in terms of fgetc:
#define getc(_stream) fgetc(_stream)
#define getchar() fgetc(stdin)
- fgetc, by contrast, is guaranteed to be a function, so it can be passed as an argument to another function.
- On success, fgetc returns the byte it read. The byte is an unsigned char, but because the return type in the prototype is int, the byte is converted to int before being returned. Why is the return type int? Because fgetc returns EOF, that is -1, on error or at end of file. As a 32-bit int, -1 is 0xFFFFFFFF, whereas a byte 0xFF converted from unsigned char to int is 0x000000FF. Only an int return value can tell the two cases apart; if the return type were unsigned char, a return value of 0xFF could mean either EOF or the byte 0xFF. So if you need to save the return value of fgetc, store it in an int variable: with unsigned char c = fgetc(fp), the value of c cannot distinguish EOF from a 0xFF byte. Note that fgetc returning EOF merely signals that the end of the file has been reached; it does not mean that every file ends with a byte called EOF (as the analysis above shows, EOF is not a byte).
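The EOF versus 0xFF problem is easy to reproduce. The sketch below counts the bytes in a stream twice: once keeping the fgetc result in an int, and once narrowing it to a signed char. Both function names are made up for the example, and the early stop in the second function assumes a platform where EOF is -1 and the conversion to signed char wraps around, as on common Linux compilers.

```c
#include <stdio.h>

/* Correct: keep the result of fgetc in an int so EOF stays distinct from 0xFF. */
long count_bytes(FILE *fp)
{
    long n = 0;
    int c;
    while ((c = fgetc(fp)) != EOF)
        n++;
    return n;
}

/* Buggy on purpose: after narrowing to signed char, the byte 0xFF compares
 * equal to EOF on typical platforms, so the loop stops too early. */
long count_bytes_buggy(FILE *fp)
{
    long n = 0;
    signed char c;
    while ((c = (signed char)fgetc(fp)) != EOF)
        n++;
    return n;
}
```

For a file holding the three bytes 0x41 0xFF 0x42, count_bytes reports 3, while count_bytes_buggy stops at the 0xFF byte and reports 1.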
----------------------------------------------------------------
int putc(int c, FILE *stream);
int fputc(int c, FILE *stream);
int putchar(int c);
Return value: c on success, EOF on error
----------------------------------------------------------------
As before, the first and third may be macros rather than true functions, implemented in terms of fputc.
2) String-oriented I/O functions
----------------------------------------------------------------
char *fgets(char *s, int size, FILE *stream);
char *gets(char *s);
Return value: s on success; NULL on error or at end of file
----------------------------------------------------------------
Both functions take the address of a buffer in which the string read is placed. gets reads from standard input; fgets reads from the specified stream.
- gets should not be used: it has no way to know the size of the buffer and can overflow it. It exists only for compatibility with older programs, and new code should never call it.
- Now for fgets. The parameter s is the address of the buffer and size is its length. fgets reads one line, terminated by '\n', from the file referred to by stream into the buffer s (including the '\n') and appends a '\0' after it to form a complete string. If a line in the file is too long, fgets reads size-1 characters without reaching the '\n', stores those size-1 characters plus a '\0' in the buffer, and leaves the rest of the line to be read by the next fgets call. If a call reaches the end of the file after reading some characters, the characters read are stored in the buffer with a terminating '\0' and returned; the next call returns NULL, which is how end of file is detected. Note that for fgets '\n' is a special character but '\0' is not: a '\0' read from the file is treated as an ordinary character. If the file contains a '\0' character (a 0x00 byte), there is no way to tell whether a '\0' in the buffer was read from the file or is the terminator added by fgets. fgets is therefore suitable only for reading text files, not binary files, and a text file should consist of visible characters and contain no '\0'. Binary files can be read with fread.
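The behavior for over-long lines and for a final line without '\n' can be seen in a small line counter. This is only a sketch: the function name count_lines and the deliberately tiny 16-byte buffer are choices made for the example, and it assumes the input is a text file without '\0' bytes, as discussed above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the lines in fp using a deliberately small
 * buffer, so long lines arrive in several fgets calls. */
long count_lines(FILE *fp)
{
    char buf[16];
    long lines = 0;
    int in_line = 0;             /* nonzero while inside an unfinished line */

    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);
        if (len > 0 && buf[len - 1] == '\n') {
            lines++;             /* this chunk completed a line */
            in_line = 0;
        } else {
            in_line = 1;         /* buffer filled up, or EOF without '\n' */
        }
    }
    if (in_line)
        lines++;                 /* the last line had no trailing '\n' */
    return lines;
}
```

A chunk that does not end in '\n' means either that the line continues in the next call or that the file ended without a final newline; the loop only counts a line once its end has been seen.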
----------------------------------------------------------------
int fputs(const char *s, FILE *stream);
int puts(const char *s);
Return value: a non-negative integer on success, EOF on error
----------------------------------------------------------------
The buffer s holds a string terminated by '\0'. fputs writes the string to the stream but does not write the terminating '\0'. Unlike fgets, fputs does not care about '\n': the string may or may not contain one. puts writes the string s to standard output (not including the terminating '\0') and then automatically writes a '\n' to standard output.
3) Binary I/O functions
- As mentioned above, the string-oriented I/O functions are not suitable for binary files. We could of course process a binary file with fgetc and fputc, but then we would have to loop over the whole file byte by byte, which is clearly less efficient. So the standard I/O library provides the following two functions for binary files:
----------------------------------------------------------------
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
Return value: the number of records read or written. On success this equals nmemb; on error or at end of file it is less than nmemb and may be 0.
----------------------------------------------------------------
The basic problem with binary I/O is that it can only be relied on to read data that was written on the same system. The reasons are:
(1) Within a structure, the offset of a given member can differ between compilers and systems (because of different alignment requirements). Indeed, some compilers have an option that lets structures be tightly packed (saving storage, possibly at the cost of performance) or strictly aligned (so that members are easy to access at run time). This means that even on a single system the binary layout of a structure can vary with the compiler options.
(2) The binary formats used to store multibyte integers and floating-point values can differ between machine architectures.
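A same-system round trip of structures through fwrite and fread might look like the sketch below. The struct point type and the two helper names are invented for the example; the file produced this way has exactly the portability limits listed above, because it simply dumps the in-memory layout.

```c
#include <stdio.h>

/* Hypothetical record type used only for this example. */
struct point {
    int x;
    double y;
};

/* Writes n records; returns the number of records actually written. */
size_t save_points(FILE *fp, const struct point *pts, size_t n)
{
    return fwrite(pts, sizeof pts[0], n, fp);
}

/* Reads up to n records; returns the number actually read, which is smaller
 * than n at end of file or on error (use feof/ferror to tell which). */
size_t load_points(FILE *fp, struct point *pts, size_t n)
{
    return fread(pts, sizeof pts[0], n, fp);
}
```

Asking load_points for more records than the file contains returns the smaller count, and feof(fp) is then true.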
3. Positioning I/O streams
There are two ways to position a standard I/O stream:
(1) ftell and fseek. These two functions have existed since Version 7, but they assume that a file position can be stored in a long integer.
(2) fgetpos and fsetpos. These two functions were introduced by ANSI C. They use an abstract data type, fpos_t, to record a file position. On non-UNIX systems this type can be made as large as necessary to record a file position, so applications that need to be ported to non-UNIX systems should use fgetpos and fsetpos.
----------------------------------------------------------------
int fseek(FILE *stream, long offset, int whence);
Return value: 0 on success; -1 on error, with errno set
long ftell(FILE *stream);
Return value: the current read/write position on success; -1 on error, with errno set
void rewind(FILE *stream);
Moves the read/write position to the beginning of the file
----------------------------------------------------------------
The whence and offset parameters of fseek together determine where the read/write position is moved. whence takes one of the following values:
SEEK_SET
Move to offset bytes from the beginning of the file
SEEK_CUR
Move offset bytes from the current position
SEEK_END
Move to offset bytes from the end of the file
offset can be positive or negative: a negative value moves toward the beginning of the file and a positive value moves toward the end. If a move would go before the beginning of the file, an error is returned. If a move goes past the end of the file, the file grows the next time you write, and the bytes between the old end of the file and the new read/write position read as 0.
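A common use of fseek and ftell together is finding the size of a file. The sketch below does so and then restores the original position; the helper name file_size is hypothetical, and measuring a file with SEEK_END is assumed to work as it does on Linux and other UNIX systems (the C standard does not require it for every stream).

```c
#include <stdio.h>

/* Hypothetical helper: returns the size in bytes of the file behind fp,
 * or -1 on error, and restores the original read/write position. */
long file_size(FILE *fp)
{
    long saved = ftell(fp);              /* remember where we are */
    if (saved < 0)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)    /* jump to the end of the file */
        return -1;
    long size = ftell(fp);               /* the position at the end is the size */
    if (fseek(fp, saved, SEEK_SET) != 0) /* go back */
        return -1;
    return size;
}
```

The same helper also shows the hole behavior: after writing 10 bytes, seeking 5 bytes past the end and writing one more byte, file_size reports 16, and the skipped bytes read back as 0.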
----------------------------------------------------------------
int fgetpos(FILE *fp, fpos_t *pos);
int fsetpos(FILE *fp, const fpos_t *pos);
Both return: 0 if successful, nonzero if an error occurs
----------------------------------------------------------------
fgetpos stores the current value of the file position indicator in the object pointed to by pos. A later call to fsetpos with that value repositions the stream to that location.
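One way to use this pair is to look ahead at the next line without consuming it. The following is a minimal sketch; the helper name peek_line is an assumption made for the example.

```c
#include <stdio.h>

/* Hypothetical helper: reads the next line into buf without consuming it,
 * by saving the position with fgetpos and restoring it with fsetpos.
 * Returns 0 on success, -1 on error or at end of file. */
int peek_line(FILE *fp, char *buf, int size)
{
    fpos_t pos;
    if (fgetpos(fp, &pos) != 0)
        return -1;
    char *line = fgets(buf, size, fp);
    if (fsetpos(fp, &pos) != 0)   /* go back to where we started */
        return -1;
    return line != NULL ? 0 : -1;
}
```

After peek_line returns, the next fgets call on the same stream reads the same line again.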
4. Formatted I/O stream functions
- Formatted output functions:
----------------------------------------------------------------
int printf(const char *format, ...);
int fprintf(FILE *stream, const char *format, ...);
int sprintf(char *str, const char *format, ...);
int snprintf(char *str, size_t size, const char *format, ...);
int vprintf(const char *format, va_list ap);
int vfprintf(FILE *stream, const char *format, va_list ap);
int vsprintf(char *str, const char *format, va_list ap);
int vsnprintf(char *str, size_t size, const char *format, va_list ap);
Return value: the number of bytes output on success (not counting the terminating '\0' of a string); a negative value on error
----------------------------------------------------------------
- Formatted input functions:
----------------------------------------------------------------
int scanf(const char *format, ...);
int fscanf(FILE *stream, const char *format, ...);
int sscanf(const char *str, const char *format, ...);
#include <stdarg.h>
int vscanf(const char *format, va_list ap);
int vsscanf(const char *str, const char *format, va_list ap);
int vfscanf(FILE *stream, const char *format, va_list ap);
Return value: the number of arguments successfully matched and assigned, which may be fewer than the number supplied and is 0 if nothing matches; EOF if an error occurs or the end of the file or string is reached before any conversion (errno is set on error)
----------------------------------------------------------------
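The different return values of the scanf family are easiest to see with sscanf, since it needs no input file. The wrapper below, whose name parse_pair is made up for this example, simply forwards the return value of sscanf.

```c
#include <stdio.h>

/* Hypothetical helper: tries to parse two integers from s.
 * Returns what sscanf returns: the number of items assigned (0, 1 or 2),
 * or EOF if the input ends before the first conversion. */
int parse_pair(const char *s, int *a, int *b)
{
    return sscanf(s, "%d %d", a, b);
}
```

With the input "10 20" the result is 2, with "7 x" it is 1 (only the first item matched), with "x" it is 0, and with an empty string it is EOF.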
One small note on printf: adding # after the % makes the value print with a leading 0 (for octal) or 0x (for hexadecimal). For example, the statement printf("%#x", 1) prints 0x1 on the terminal.
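The effect of the # flag can be checked without a terminal by formatting into a buffer with snprintf. The helper name format_alt is an assumption for this sketch.

```c
#include <stdio.h>

/* Hypothetical helper: formats v in hexadecimal and in octal with the '#'
 * flag. Returns the length snprintf reports, which is >= size when the
 * output had to be truncated. */
int format_alt(char *buf, size_t size, unsigned int v)
{
    return snprintf(buf, size, "%#x %#o", v, v);
}
```

For v = 255 the buffer receives "0xff 0377" and the return value is 9. With a 4-byte buffer the return value is still 9, but only "0xf" and the terminating '\0' are stored, which is how truncation can be detected.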
5. Temporary file functions
Programs often create temporary files, for example to hold the intermediate results of a computation or to keep a backup before a critical operation.
The standard I/O library provides two functions for working with temporary files:
----------------------------------------------------------------
char *tmpnam(char *ptr);
Returns: a pointer to a unique path name
FILE *tmpfile(void);
Returns: the file pointer if successful, NULL if an error occurs
----------------------------------------------------------------
- tmpnam returns a valid file name that is not the same as that of any existing file. Each call produces a different name, but it can be called at most TMP_MAX times in a process (TMP_MAX is defined in stdio.h). If ptr is not NULL, it is assumed to point to an array of at least L_tmpnam characters (L_tmpnam is defined in stdio.h); the generated name is stored there and ptr is returned. If ptr is NULL, the generated name is stored in a static area, which is overwritten by the next call.
- tmpfile creates a temporary binary file (type wb+) that is automatically removed when it is closed or when the program terminates.
Note that tmpnam only generates a name; it neither creates nor opens the file. If we want to use the name, we must create and open the file as quickly as possible to reduce the risk that another program creates a file with the same name in the meantime. tmpfile, by contrast, both creates the file and opens it for reading and writing.
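Because tmpfile creates and opens the file in one step, it avoids the window between generating a name and opening the file. A minimal sketch of using it as scratch storage follows; the helper name scratch_roundtrip is made up for the example.

```c
#include <stdio.h>

/* Hypothetical helper: stores text in an anonymous temporary file and reads
 * the first line back into out. Returns 0 on success, -1 on error. */
int scratch_roundtrip(const char *text, char *out, int size)
{
    FILE *fp = tmpfile();        /* created and opened in "wb+" mode */
    if (fp == NULL)
        return -1;

    int rc = -1;
    if (fputs(text, fp) != EOF) {
        rewind(fp);              /* reposition before switching from writing to reading */
        if (fgets(out, size, fp) != NULL)
            rc = 0;
    }
    fclose(fp);                  /* the temporary file is removed here */
    return rc;
}
```

The caller never sees a file name, and nothing is left behind on disk after fclose.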
Linux Standard I/O (i)