[APUE] Standard I/O Library (I)
1. Streams and FILE objects
Unbuffered system I/O is centered on file descriptors: when a file is opened, a file descriptor is returned, and that descriptor is used for all subsequent operations. In the standard I/O library, by contrast, operations are centered on streams.
When a stream is opened, the standard I/O function fopen returns a pointer to a FILE object. This object is usually a structure that contains all the information the library needs to manage the stream: the file descriptor used for the actual I/O, a pointer to the stream's buffer, the size of the buffer, the number of characters currently in the buffer, an error flag, and so on.
We refer to a pointer to a FILE object (type FILE *) as a file pointer.
2. Buffering
The standard I/O library provides buffering to minimize the number of read and write calls. It offers three types of buffering:
(1) Full buffering. In this case, actual I/O takes place only when the standard I/O buffer is filled. Files that reside on disk are normally fully buffered by the standard I/O library. The buffer is usually obtained by one of the standard I/O functions calling malloc the first time I/O is performed on the stream.
The term flush describes the writing of a standard I/O buffer. A buffer can be flushed automatically by the standard I/O routines (for example, when the buffer fills), or a stream can be flushed explicitly by calling fflush. In a UNIX environment, flush has two meanings. For the standard I/O library, it means writing out the contents of a buffer, which may be only partially filled. For the terminal driver, it means discarding the data already stored in a buffer.
(2) Line buffering. In this case, the standard I/O library performs I/O when a newline character is encountered on input or output. This lets us output one character at a time (with the standard I/O function fputc), knowing that actual I/O takes place only when a complete line has been written. Line buffering is typically used when a stream refers to a terminal (for example, standard input and standard output).
Line buffering comes with two caveats. First, the buffer that the standard I/O library uses to collect each line has a fixed size, so I/O may take place when the buffer fills, even if no newline has been written yet. Second, whenever input is requested through the standard I/O library from either (a) an unbuffered stream or (b) a line-buffered stream (that requires data to be requested from the kernel), all line-buffered output streams are flushed. The reason for the qualifier on (b) is that the requested data may already be in the buffer, in which case no data needs to be read from the kernel. Input from an unbuffered stream, item (a), obviously always requires data to be obtained from the kernel.
(3) Unbuffered. The standard I/O library does not buffer the characters. The standard error stream is normally unbuffered so that error messages are displayed as quickly as possible.
ANSI C requires the following buffering characteristics:
(1) Standard input and standard output are fully buffered if and only if they do not refer to an interactive device.
(2) Standard error is never fully buffered.
If the default buffering of a stream is not suitable, either of the following two functions can change it:
#include <stdio.h>
void setbuf(FILE *fp, char *buf);
int setvbuf(FILE *fp, char *buf, int mode, size_t size);
These functions must be called after the stream has been opened and before any other operation is performed on the stream.
setbuf turns buffering on or off. To enable buffering, buf must point to a buffer of length BUFSIZ (a constant defined in <stdio.h>). Normally the stream is then fully buffered, but some systems may set line buffering if the stream is associated with a terminal device. To disable buffering, set buf to NULL.
setvbuf specifies exactly which type of buffering is wanted. This is done with the mode argument:
_IOFBF  fully buffered
_IOLBF  line buffered
_IONBF  unbuffered
If an unbuffered stream is specified, the buf and size arguments are ignored. If full or line buffering is specified, buf and size can optionally specify a buffer and its length. If the stream is buffered and buf is NULL, the standard I/O library automatically allocates a buffer of an appropriate size for the stream. The appropriate size is usually the value of the st_blksize member of struct stat. If the system cannot determine this value for the stream (for example, when the stream refers to a device or a pipe), a buffer of length BUFSIZ is allocated.
The following table summarizes the actions of these two functions and their options.
If a standard I/O buffer is allocated as an automatic variable within a function, the stream must be closed before the function returns. In addition, SVR4 uses part of the buffer for its own bookkeeping, so the number of data bytes that can actually be stored in the buffer is less than size. In general, it is best to let the system choose the buffer size and allocate the buffer automatically. In that case, the standard I/O library releases the buffer when the stream is closed.
A stream can be flushed at any time by calling fflush:
#include <stdio.h>
int fflush(FILE *fp);
This function passes any unwritten data for the stream to the kernel. If fp is NULL, all output streams are flushed.
3. Opening a stream
#include <stdio.h>
FILE *fopen(const char *pathname, const char *type);
FILE *freopen(const char *pathname, const char *type, FILE *fp);
FILE *fdopen(int filedes, const char *type);
Return value: a file pointer on success, NULL on failure
The three functions differ as follows:
(1) fopen opens the file named by pathname.
(2) freopen opens the file named by pathname on a specific stream (specified by fp). If the stream is already open, it is closed first. This function is typically used to open a file as one of the predefined streams: standard input, standard output, or standard error.
(3) fdopen takes an existing file descriptor (which may have been obtained from open, dup, dup2, fcntl, or pipe) and associates a standard I/O stream with it. This function is often used with descriptors returned by the functions that create pipes and network communication channels. Because these special types of files cannot be opened with the standard I/O function fopen, we call the device-specific function to obtain a descriptor and then associate that descriptor with a stream using fdopen.
The type argument specifies how the stream is to be opened for reading and writing. ANSI C specifies 15 possible values for the type argument:
Including the character b in the type lets the standard I/O system distinguish between text files and binary files. Because the UNIX kernel does not distinguish between these two kinds of files, specifying b as part of the type has no effect on UNIX.
For fdopen, the meaning of the type argument differs slightly. Because the descriptor has already been opened, opening for writing does not truncate the file. Likewise, opening for append cannot create the file, because a descriptor that refers to a file implies that the file already exists.
When a file is opened with the append type, each write takes place at the current end of the file. If multiple processes open the same file with the standard I/O append mode, the data from each process is written to the file correctly.
When a file is opened for both reading and writing (a + in the type), the following restrictions apply:
Output cannot be directly followed by input without an intervening fflush, fseek, fsetpos, or rewind.
Input cannot be directly followed by output without an intervening fseek, fsetpos, or rewind, or an input operation that encounters end of file.
The following table shows six different ways to open a stream:
When a new file is created by specifying a type of w or a, there is no way to specify its access permission bits. POSIX.1 requires that a file created this way have the following permission bits:
S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH
By default, a stream is fully buffered when it is opened, unless it refers to a terminal device, in which case it is line buffered.
An open stream is closed by calling fclose:
#include <stdio.h>
int fclose(FILE *fp);
Return value: 0 on success, EOF on error
Any buffered output data is flushed before the file is closed. Any buffered input data is discarded. If the standard I/O library automatically allocated a buffer for the stream, that buffer is released.
When a process terminates normally (either by calling exit directly or by returning from main), all standard I/O streams with unwritten buffered data are flushed, and all open standard I/O streams are closed.
4. Reading and writing a stream
Once a stream is open, there are three types of unformatted I/O to choose from for reading and writing it:
(1) Character-at-a-time I/O. One character is read or written at a time. If the stream is buffered, the standard I/O functions handle all the buffering.
(2) Line-at-a-time I/O. A line is read or written at a time with fgets and fputs. Each line is terminated by a newline character, and when calling fgets we must specify the maximum line length we can handle.
(3) Direct I/O. This type of I/O is supported by the fread and fwrite functions. Each I/O operation reads or writes some number of objects, each of a specified size. These two functions are often used to read or write a structure per operation from or to a binary file.
The term direct I/O comes from the ANSI C standard. It is also known as binary I/O, object-at-a-time I/O, record-oriented I/O, or structure-oriented I/O.
4.1 Input functions
The following three functions read one character at a time:
#include <stdio.h>
int getc(FILE *fp);
int fgetc(FILE *fp);
int getchar(void);
Return value: the next character on success, EOF on end of file or error
getchar() is equivalent to getc(stdin). The difference between the first two functions is that getc can be implemented as a macro, whereas fgetc cannot. (On most UNIX systems, <stdio.h> defines getc with something like #define getc(fp) ..., so getc is a macro rather than a function.) This means:
(1) The argument to getc should not be an expression with side effects.
(2) Because fgetc is guaranteed to be a function, we can take its address. This allows the address of fgetc to be passed as an argument to another function.
(3) Calls to fgetc probably take longer than calls to getc, because calling a function usually takes more time than expanding a macro.
All three functions return the next character as an unsigned char converted to an int. The return type is int so that every possible character value can be returned along with a negative value that signals an error or end of file. The constant EOF in <stdio.h> is required to be negative, and its value is often -1. Consequently, the return value of these functions must not be stored in a character variable and then compared with EOF.
These functions return the same value whether an error occurs or end of file is reached. To distinguish between the two, we must call either ferror or feof.
#include <stdio.h>
int ferror(FILE *fp);
int feof(FILE *fp);
Return value: nonzero (true) if the condition is true, 0 (false) otherwise
void clearerr(FILE *fp);
In most implementations, two flags are maintained for each stream in the FILE object: an error flag and an end-of-file flag. Calling clearerr clears both.
After reading from a stream, we can push characters back onto it by calling ungetc.
#include <stdio.h>
int ungetc(int c, FILE *fp);
Characters that are pushed back are returned by subsequent reads on the stream in the reverse order of their pushing. The character pushed back does not have to be the one that was just read. EOF cannot be pushed back. However, when end of file has been reached, a character can still be pushed back: the next read returns that character, and the read after that returns EOF. This works because a successful call to ungetc clears the end-of-file indicator for the stream.
Pushback is often used when reading an input stream and breaking the input into words or tokens. Sometimes we need to peek at the next character to decide how to handle the current one. It is then convenient to push back the character just peeked at, so that the next call to getc returns it.
4.2 Output functions
#include <stdio.h>
int putc(int c, FILE *fp);
int fputc(int c, FILE *fp);
int putchar(int c);
Return value: c on success, EOF on error
putchar(c) is equivalent to putc(c, stdout).
5. Line-at-a-time I/O
The following two functions read a line at a time:
#include <stdio.h>
char *fgets(char *buf, int n, FILE *fp);
char *gets(char *buf);
gets reads from standard input, whereas fgets reads from the specified stream. With fgets, the size of the buffer is specified as n. The function reads up to and including the next newline, but never more than n-1 characters, into the buffer. The buffer is always terminated with a null byte. If the line, including the terminating newline, is longer than n-1 characters, only a partial line is returned, still null terminated, and the next call to fgets continues reading the line.
gets should not be used: because the caller cannot specify the buffer size, a line longer than the buffer overflows it. Another difference from fgets is that gets does not store the newline in the buffer.
#include <stdio.h>
int fputs(const char *str, FILE *fp);
int puts(const char *str);
Return value: a non-negative value on success, EOF on error
fputs writes a null-terminated string to the specified stream. The terminating null byte is not written. This need not be line-at-a-time output, because the string is not required to contain a newline as its last non-null character. Usually it does, but that is not required.
puts writes a null-terminated string to standard output, without writing the null byte, and then writes a newline character to standard output.