Linux File Operations (2)

Source: Internet
Author: User
Tags symlink
Standard I/O Library
The standard I/O library and its header file, stdio.h, provide a versatile interface to the underlying I/O system calls. The library is part of ANSI standard C, whereas the system calls we met earlier are not. It provides many sophisticated functions for formatting output and scanning input, and it also takes care of the buffering that devices require.
In many ways, we use this library in the same way that we use low-level file descriptors. We need to open a file to establish an access path. This returns a value that is passed as a parameter to the other I/O library functions. The equivalent of the low-level file descriptor is called a stream, and it is implemented as a pointer to a structure, a FILE *.
When a program starts, three file streams are opened automatically: stdin, stdout, and stderr. They are declared in stdio.h and represent the standard input, standard output, and standard error output, which correspond to the low-level file descriptors 0, 1, and 2 respectively.
The following sections cover these functions:
fopen, fclose
fread, fwrite
fflush
fseek
fgetc, getc, getchar
fputc, putc, putchar
fgets, gets
printf, fprintf, sprintf
scanf, fscanf, sscanf
The fopen library function is the analog of the low-level open system call. We use it mainly for files and terminal input and output. Where we need explicit control over devices, we are better off with the low-level system calls, because they avoid potentially undesirable side effects of the library, such as input/output buffering.
The syntax is as follows:
#include <stdio.h>
FILE *fopen(const char *filename, const char *mode);
fopen opens the file named by the filename parameter and associates a stream with it. The mode parameter specifies how the file is to be opened. It is one of the following strings:
"r" or "rb": open for reading only
"w" or "wb": open for writing, truncating the file to zero length
"a" or "ab": open for writing, appending to the end of the file
"r+" or "rb+" or "r+b": open for update (reading and writing)
"w+" or "wb+" or "w+b": open for update, truncating the file to zero length
"a+" or "ab+" or "a+b": open for update, appending to the end of the file
The b indicates that the file is a binary file rather than a text file.
Note that, unlike MS-DOS, UNIX and Linux make no distinction between text files and binary files. They treat all files the same way, effectively as binary files. Note also that the mode parameter must be a string, not a character: always use "r", never 'r'.
If successful, fopen returns a non-null FILE * pointer. If it fails, it returns NULL, which is defined in stdio.h.
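The NULL check described above can be sketched as follows. This is a minimal illustration rather than a definitive pattern: the helper name open_or_report is hypothetical, and the use of errno with strerror assumes a POSIX system, where fopen sets errno on failure.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open a file and report any failure on stderr.
 * Returns the stream, or NULL if fopen failed. */
FILE *open_or_report(const char *filename, const char *mode)
{
    FILE *stream = fopen(filename, mode);
    if (stream == NULL) {
        /* Assumes POSIX behavior: fopen sets errno when it fails. */
        fprintf(stderr, "cannot open %s: %s\n", filename, strerror(errno));
    }
    return stream;
}
```

A caller would write something like FILE *in = open_or_report("file.in", "r"); and stop if the result is NULL.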
The fread library function reads data from a file stream. Data is read from the stream into the data buffer pointed to by ptr. Both fread and fwrite deal with data records. These are specified by a record size, size, and a count, nitems, of records to transfer. If successful, the function returns the number of records (not the number of bytes) actually read into the data buffer. At the end of a file, fewer than nitems records may be returned, including zero.
The syntax is as follows:
#include <stdio.h>
size_t fread(void *ptr, size_t size, size_t nitems, FILE *stream);
As with all standard I/O functions that write into a buffer, it is the programmer's responsibility to allocate the space for the data and to check for errors.
The fwrite library function has a similar interface to fread. It takes data records from the specified data buffer and writes them to the output stream. It returns the number of records successfully written.
The syntax is as follows:
#include <stdio.h>
size_t fwrite(const void *ptr, size_t size, size_t nitems, FILE *stream);
Note that fread and fwrite are not recommended for use with structured data. Part of the problem is that files written with fwrite may not be portable between different machine architectures.
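To make the record-based interface concrete, here is a small sketch that writes an array of int records and reads it back on the same machine, which avoids the portability concern mentioned above. The function name round_trip is hypothetical, and the stream is assumed to be open for update (for example, one returned by tmpfile).

```c
#include <stdio.h>

/* Hypothetical helper: write `count` int records to a stream opened for
 * update, rewind it, and read the records back into dst.
 * Returns the number of records read (not bytes), or 0 if the write
 * was incomplete. */
size_t round_trip(FILE *stream, const int *src, int *dst, size_t count)
{
    size_t written = fwrite(src, sizeof src[0], count, stream);
    if (written != count)
        return 0;
    rewind(stream);  /* a positioning call is needed between writing and reading */
    return fread(dst, sizeof dst[0], count, stream);
}
```

Both calls count in records of sizeof(int) bytes, so a return value of 4 means four integers were transferred.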
The fclose library function closes the specified stream, causing any unwritten data to be written. It is important to use fclose because the stdio library buffers data. If the program needs to be sure that the data has been completely written, it should call fclose. When a program ends normally, fclose is called automatically on all streams that are still open, but in that case we have no opportunity to check for errors reported by fclose. As with file descriptors, the number of available streams is limited. The actual limit is FOPEN_MAX, which is defined in stdio.h and is always at least 8.
The syntax is as follows:
#include <stdio.h>
int fclose(FILE *stream);
The fflush library function causes all outstanding data on a file stream to be written immediately. For example, we can use it to ensure that an interactive prompt has been sent to a terminal before any attempt to read a response. It is also useful for making sure that important data has been written out to the file before continuing. When debugging a program, we can use it to confirm that the program is really writing data and is not hung. Note that an implicit flush is performed when fclose is called, so there is no need to call fflush before fclose.
The syntax is as follows:
#include <stdio.h>
int fflush(FILE *stream);
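The prompt-then-read use of fflush might look like the sketch below. The helper prompt_line is hypothetical, and it takes the input and output streams as parameters (rather than using stdin and stdout directly) only so that it is easy to try out.

```c
#include <stdio.h>

/* Hypothetical helper: print a prompt on `out`, flush it so the user
 * actually sees it, then read one line from `in` into buf.
 * Returns buf on success, NULL on flush failure, end of file, or error. */
char *prompt_line(FILE *in, FILE *out, const char *prompt, char *buf, int size)
{
    fputs(prompt, out);
    if (fflush(out) == EOF)
        return NULL;
    return fgets(buf, size, in);
}
```

In an interactive program the call would typically be prompt_line(stdin, stdout, "Continue? ", answer, sizeof answer).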
The fseek function is the file stream equivalent of the lseek system call. It sets the position in the stream for the next read or write. The meaning and values of the offset and whence parameters are the same as those given earlier for lseek. However, where lseek returns an off_t, fseek returns an integer: 0 if it succeeds, and -1 if it fails, with errno set to indicate the error. The two interfaces therefore differ in their return conventions.
The syntax is as follows:
#include <stdio.h>
int fseek(FILE *stream, long int offset, int whence);
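As an illustration of fseek, the following sketch measures the size of an open stream by seeking to its end and then restoring the original position. The function name stream_size is hypothetical. The sketch relies on ftell, listed later among the other stream functions, and assumes a POSIX system where seeking to SEEK_END on a regular file is meaningful.

```c
#include <stdio.h>

/* Hypothetical helper: return the size of an open stream in bytes,
 * or -1 on error. The current position is restored before returning. */
long stream_size(FILE *stream)
{
    long saved = ftell(stream);          /* remember where we were */
    if (saved == -1L)
        return -1L;
    if (fseek(stream, 0L, SEEK_END) != 0)
        return -1L;
    long size = ftell(stream);           /* offset of the end of the file */
    if (fseek(stream, saved, SEEK_SET) != 0)
        return -1L;
    return size;
}
```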
fgetc, getc, getchar
The fgetc function returns the next byte from a file stream as a character. When it reaches the end of the file or an error occurs, it returns EOF. We must use ferror or feof to distinguish the two cases.
The syntax is as follows:
#include <stdio.h>
int fgetc(FILE *stream);
int getc(FILE *stream);
int getchar(void);
The getc function is equivalent to fgetc, except that it may be implemented as a macro. In that case the stream argument may be evaluated more than once, so it must not have side effects (for example, it should not modify variables). Also, we cannot rely on being able to take the address of getc for use as a function pointer.
The getchar function is equivalent to getc(stdin) and reads the next character from the standard input.
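The point about telling end of file apart from an error can be sketched like this. The function name count_bytes is hypothetical; the key details are that the result of fgetc is stored in an int, not a char, and that ferror is consulted once EOF has been returned.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes remaining in a stream.
 * Returns the count, or -1 if reading stopped because of an error. */
long count_bytes(FILE *stream)
{
    long n = 0;
    int c;  /* int rather than char, so EOF is distinguishable from any byte */

    while ((c = fgetc(stream)) != EOF)
        n++;
    if (ferror(stream))
        return -1;   /* EOF was returned because of a read error */
    return n;        /* EOF was returned because the file ended */
}
```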
fputc, putc, putchar
The fputc function writes a character to an output file stream. It returns the value it has written, or EOF on failure.
The syntax is as follows:
#include <stdio.h>
int fputc(int c, FILE *stream);
int putc(int c, FILE *stream);
int putchar(int c);
As with fgetc and getc, the putc function is equivalent to fputc but may be implemented as a macro. The GNU C compiler does this, and we can see the definitions in the stdio.h header file.
The putchar function is equivalent to putc(c, stdout) and writes a single character to the standard output. Note that putchar takes, and getchar returns, characters as ints rather than chars. This allows the end-of-file indicator (EOF) to take the value -1, which lies outside the range of character codes.
fgets, gets
The fgets function reads a string from an input file stream. It writes characters to the string pointed to by s until a newline is encountered, n-1 characters have been transferred, or the end of the file is reached, whichever occurs first. Any newline encountered is transferred to the receiving string, and a terminating null byte, \0, is added. At most n-1 characters are transferred in any one call, because the null byte must be added to end the string, making up the n bytes.
The syntax is as follows:
#include <stdio.h>
char *fgets(char *s, int n, FILE *stream);
char *gets(char *s);
When it completes successfully, fgets returns a pointer to the string s. If the stream is at the end of the file, it sets the EOF indicator for the stream and returns a null pointer. If a read error occurs, fgets returns a null pointer and sets errno to indicate the type of error.
The gets function is similar to fgets, except that it reads from the standard input and discards any newline it encounters. It adds a terminating null byte to the receiving string.
Note that gets does not limit the number of characters that can be transferred, so it can overrun its transfer buffer. We should therefore avoid gets and use fgets instead.
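A bounded line reader built on fgets might look like the following sketch. The name read_line is hypothetical. Because fgets keeps the newline, the helper strips it with strcspn; a line longer than the buffer is simply returned in pieces by successive calls.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf (at most size-1 characters)
 * and remove the trailing newline, if any.
 * Returns 1 on success, 0 at end of file or on error. */
int read_line(FILE *stream, char *buf, int size)
{
    if (fgets(buf, size, stream) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';  /* cut the string at the first newline */
    return 1;
}
```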
Formatted Input and Output
A number of library functions produce output in a controlled fashion. Anyone with some C programming experience will be familiar with them. They include printf and related functions for writing values to a file stream, and scanf and related functions for reading values from a file stream.
printf, fprintf, sprintf
The printf family of functions formats and outputs a variable number of arguments of different types. The way each argument is represented in the output stream is controlled by the format parameter. This is a string containing ordinary characters to be printed and codes, called conversion specifiers, that indicate how and where the remaining arguments are to be printed.
The syntax is as follows:
#include <stdio.h>
int printf(const char *format, ...);
int sprintf(char *s, const char *format, ...);
int fprintf(FILE *stream, const char *format, ...);
The printf function writes its output to the standard output. The fprintf function writes its output to a specified stream. The sprintf function writes its output, followed by a terminating null character, into the string s passed as a parameter. This string must be large enough to hold all of the output. There are other members of the printf family that deal with their arguments in different ways; see the printf manual page for details.
Ordinary characters are passed unchanged into the output. Conversion specifiers cause printf to fetch and format the additional arguments passed to it. They always start with a % character. Here is a simple example:
printf("Some numbers: %d, %d, and %d\n", 1, 2, 3);
This produces the following on the standard output:
Some numbers: 1, 2, and 3
To print a % character, we must use %%, so that it is not confused with a conversion specifier.
Here are some of the most commonly used conversion specifiers:
%d, %i: print an integer in decimal
%o, %x: print an integer in octal, hexadecimal
%c: print a character
%s: print a string
%f: print a floating-point number in fixed-point notation
%e: print a floating-point number in exponent notation
%g: print a floating-point number in a general format
It is very important that the number and types of the arguments passed to printf match the conversion specifiers in the format string. An optional size qualifier indicates the type of an integer argument: h, as in %hd, for a short int, or l, as in %ld, for a long int. Some compilers can check printf arguments, but they are not infallible. With the GNU compiler, gcc, the -Wformat option enables this check.
char initial = 'A';
char *surname = "Matthew";
double age = 14.5;
printf("Hello Miss %c %s, aged %g\n", initial, surname, age);
This example produces the following output:
Hello Miss A Matthew, aged 14.5
Field specifiers give us more control over how items are printed. They extend the conversion specifiers to include control over the spacing of the output. A common use is to set the number of decimal places for a floating-point number or the spacing around a string.
A field specifier is given as a number immediately after the % character of a conversion specifier. The following table contains examples of conversion specifiers and the resulting output. The vertical bars mark the edges of the printed field.
Format  Argument      |Output    |
%10s    "Hello"       |     Hello|
%-10s   "Hello"       |Hello     |
%10d    1234          |      1234|
%-10d   1234          |1234      |
%010d   1234          |0000001234|
%10.4f  12.34         |   12.3400|
%*s     10, "Hello"   |     Hello|
All of these examples are printed in a field 10 characters wide. Note that a negative field width means that the item is left-aligned within the field. A variable field width is indicated with an asterisk (*); in this case, the next argument supplies the width. A leading zero means that the item is padded with leading zeros. According to the POSIX specification, printf does not truncate fields; instead it expands the field to fit. So if we print an item that is longer than the specified field, the field grows.
See the following table:
Format  Argument           |Output         |
%10s    "hellotherepeeps"  |hellotherepeeps|
The printf functions return an integer, the number of characters written. This does not include the terminating null in the case of sprintf. If an error occurs, a negative value is returned and errno is set.
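The field widths and the return value described above can be tried out with a small sketch. The helper format_row is hypothetical. It uses snprintf, a bounded variant of sprintf from C99 that the text above does not cover, so that the output can never overrun the buffer.

```c
#include <stdio.h>

/* Hypothetical helper: format one table row into buf.
 * The name is left-aligned in a field of 10 characters and the value is
 * right-aligned in a field of 8 characters with 2 decimal places.
 * Returns the number of characters the full row needs, excluding the
 * terminating null (the snprintf convention). */
int format_row(char *buf, size_t size, const char *name, double value)
{
    return snprintf(buf, size, "|%-10s|%8.2f|", name, value);
}
```

With the arguments "Hello" and 12.34, the row is 21 characters long. With a 15-character name the first field grows instead of truncating, and the row becomes 26 characters long.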
scanf, fscanf, sscanf
The scanf family of functions works in a way similar to the printf group, except that these functions read items from a stream and place the values into variables at the addresses passed to them as pointer parameters. They use a format string to control the input conversions in the same way, and many of the conversion specifiers are the same.
The syntax is as follows:
#include <stdio.h>
int scanf(const char *format, ...);
int fscanf(FILE *stream, const char *format, ...);
int sscanf(const char *s, const char *format, ...);
It is very important that the variables used to hold the values read by the scanf functions are of the correct type and match the format string precisely. If they do not, memory may be corrupted and the program may crash. There will be no compile errors; if we are lucky, we may get a warning.
The format string for scanf and its relatives contains both ordinary characters and conversion specifiers, as for printf. However, the ordinary characters specify characters that must be present in the input.
Here is a simple example:
int num;
scanf("Hello %d", &num);
This call to scanf succeeds only if the next five characters on the standard input are Hello. Then, if the next characters form a recognizable decimal number, the number is read and its value is assigned to the variable num. A space in the format string tells scanf to skip any whitespace (spaces, tabs, or newlines) in the input between conversion specifiers. This means that the call succeeds and stores 1234 in num for either of the following inputs:
Hello 1234
Hello1234
Whitespace is usually also skipped in the input when a conversion begins. This means that a format string of %d keeps reading the input, skipping over spaces and newlines, until a sequence of digits is found. If the expected characters are not present, the conversion fails and scanf returns.
This can lead to problems if we are not careful. If a program loops while trying to read integers and the input contains characters that are not digits, the offending characters are never consumed and the loop never ends.
Other conversion specifiers are as follows:
%d: read a decimal integer
%o, %x: read an octal, hexadecimal integer
%f, %e, %g: read a floating-point number
%c: read a character (whitespace is not skipped)
%s: read a string
%[]: read a set of characters
%%: read a literal % character
As with printf, scanf conversion specifiers may also have a field width to limit the amount of input consumed. A size specifier (h for short, l for long) indicates whether the receiving argument is shorter or longer than the default. This means that %hd indicates a short int, %ld a long int, and %lg a double-precision floating-point number.
A specifier beginning with an asterisk indicates that the item is to be ignored. The input is not stored, so no variable is needed to receive it.
We use %c to read a single character from the input. It does not skip leading whitespace.
We use %s to read strings, but we must take care. It skips leading whitespace but stops at the first whitespace character that occurs in the string, so it is better suited to reading a word than a general string. Without a field width, there is also no limit on the length of the string it might read, so the receiving string must be large enough to hold the longest string in the input stream. It is better to use a field width, or to use fgets and sscanf in combination to read a line of input and then scan it. This helps prevent buffer overflows that a malicious user could exploit.
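The fgets and sscanf combination recommended above could be sketched as follows. The function name read_int and the 128-byte line buffer are assumptions made for illustration. Because a whole line is consumed on every call, a line that does not contain a number cannot cause the endless loop described earlier.

```c
#include <stdio.h>

/* Hypothetical helper: read one line from the stream and parse a decimal
 * integer from it.
 * Returns 1 if a number was parsed, 0 if the line did not start with a
 * number, and -1 at end of file or on a read error. */
int read_int(FILE *stream, int *value)
{
    char line[128];  /* assumed maximum line length for this sketch */

    if (fgets(line, sizeof line, stream) == NULL)
        return -1;
    /* sscanf returns the number of items converted */
    return sscanf(line, "%d", value) == 1 ? 1 : 0;
}
```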
We use the %[] specifier to read a string composed of characters from a set. The format string %[A-Z] reads a string of uppercase letters. If the first character in the set is a caret (^), the specifier reads a string composed of characters that are not in the set. So, to read a string that contains spaces but stops at the first comma, we can use the format string %[^,].
Given the following input line:
Hello, 1234, 5.678, X, string to the end of the line
this call to scanf correctly scans four items:
char s[256];
int n;
float f;
char c;
scanf("Hello,%d,%g, %c, %[^\n]", &n, &f, &c, s);
The scanf functions return the number of items successfully read. This is zero if the first item fails. If the end of the input is reached before the first item is matched, EOF is returned. If a read error occurs on the file stream, the stream error flag is set and the error variable errno is set to indicate the type of error.
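The same format string can be exercised with sscanf, which makes the return value easy to inspect. This is only a sketch: the function name parse_record is hypothetical, and the width of 255 in the last specifier is an addition that assumes the caller passes a buffer of at least 256 bytes.

```c
#include <stdio.h>

/* Hypothetical helper: parse a line of the form
 *   Hello, <int>, <float>, <char>, <rest of line>
 * `rest` must point to at least 256 bytes.
 * Returns the number of items converted (4 when the whole line matches). */
int parse_record(const char *line, int *n, float *f, char *c, char *rest)
{
    /* The width 255 leaves room for the terminating null byte in rest. */
    return sscanf(line, "Hello,%d,%g, %c, %255[^\n]", n, f, c, rest);
}
```

A caller should compare the result with 4 before using any of the values, since a smaller result means that some variables were left unassigned.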
In general, scanf and its relatives are not highly regarded, for three reasons:
Traditionally, the implementations have been buggy.
They are inflexible to use.
They can lead to code in which it is difficult to work out what is being parsed.
As an alternative, we can use other functions, such as fread or fgets, to read input lines.
Other Stream Functions
There are a number of other stdio library functions that use either a stream parameter or the standard streams stdin, stdout, and stderr:
fgetpos: get the current position in a file stream
fsetpos: set the current position in a file stream
ftell: return the current file offset in a stream
rewind: reset the file position in a stream
freopen: reuse a file stream
setvbuf: set the buffering scheme for a stream
remove: equivalent to unlink, unless the path parameter is a directory, in which case it is equivalent to rmdir
These library functions are documented in section 3 of the manual pages.
We can now use the file stream functions to re-implement the file copy program, this time using library functions. Let's look at the example program copy_stdio.c.
This program is very similar to the earlier versions, but the character-by-character copy is now done with calls to functions declared in stdio.h:
#include <stdio.h>
#include <stdlib.h>
int main()
{
    int c;  /* int, so that EOF can be told apart from any character */
    FILE *in, *out;
    in = fopen("file.in", "r");
    out = fopen("file.out", "w");
    /* stop if either file could not be opened */
    if (in == NULL || out == NULL)
        exit(1);
    while ((c = fgetc(in)) != EOF)
        fputc(c, out);
    exit(0);
}
Running this program as before, we get the following output:
$ TIMEFORMAT="" time ./copy_stdio
0.29user 0.02system 0:00.35elapsed 87%CPU
This time the program runs in 0.35 seconds. That is not as fast as the low-level block version, but it is a great deal better than the low-level version that copies one character at a time. This is because the stdio library maintains an internal buffer within the FILE structure, and the low-level system calls are made only when the buffer fills. We can experiment with line-by-line and block copying using stdio to see how they compare with the versions tested here.
Stream Errors
To indicate an error, many stdio library functions return an out-of-range value, such as a null pointer or the constant EOF. In these cases, the error is indicated in the external variable errno:
#include <errno.h>
extern int errno;
Note that many functions may change the value of errno. Its value is valid only when a function has failed. We should inspect it immediately after a function has indicated failure, and copy it into another variable before using it, because printing functions such as fprintf might alter errno themselves.
We can also interrogate the state of a file stream to determine whether an error has occurred or the end of the file has been reached.
#include <stdio.h>
int ferror(FILE *stream);
int feof(FILE *stream);
void clearerr(FILE *stream);
The ferror function tests the error indicator for a stream and returns nonzero if it is set, and zero otherwise. The feof function tests the end-of-file indicator for a stream and returns nonzero if it is set, and zero otherwise. We can use it like this:
if (feof(some_stream))
    /* We're at the end */
The clearerr function clears the end-of-file and error indicators for the stream to which stream points. It has no return value, and no errors are defined. We can use it to recover from error conditions on streams. One example might be to resume writing to a stream after a disk-full error has been resolved.
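The three status functions can be combined as in the sketch below, which reads a stream to its end, notes why reading stopped, and then clears the indicators. The function name drain_and_reset is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: read and discard everything left in a stream,
 * then clear its end-of-file and error indicators.
 * Returns 1 if reading stopped at end of file, 0 if it stopped because
 * of a read error. */
int drain_and_reset(FILE *stream)
{
    while (fgetc(stream) != EOF)
        ;  /* discard the data */

    int ended_cleanly = feof(stream) && !ferror(stream);
    clearerr(stream);  /* both indicators are now reset */
    return ended_cleanly;
}
```

After the call, feof and ferror both report zero for the stream until another operation sets them again.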
Streams and File Descriptors
Each file stream is associated with a low-level file descriptor. We can mix low-level input and output operations with higher-level stream operations, but this is generally unwise, because the effects of buffering can be difficult to predict.
#include <stdio.h>
int fileno(FILE *stream);
FILE *fdopen(int fildes, const char *mode);
We can determine which low-level file descriptor is being used for a file stream by calling the fileno function. It returns the file descriptor for the given stream, or -1 on failure. This function is useful when we need low-level access to an open stream, for example, to call fstat on it.
We can create a new file stream based on an already opened file descriptor by calling the fdopen function. Essentially, this function provides stdio buffers around an already open file descriptor.
The fdopen function operates in the same way as fopen, but takes a low-level file descriptor instead of a filename. This is useful if we have used open to create a file, perhaps to get fine control over the permissions, but want to use a stream for writing to it. The mode parameter is the same as for fopen and must be compatible with the access modes established when the file was originally opened. fdopen returns the new file stream, or NULL on failure.
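The open-then-fdopen scenario described above might be sketched as follows. The function name create_private is hypothetical, and the permission bits (owner read and write only) are just an example. The sketch assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create (or truncate) a file that only its owner
 * can read and write, and return a stdio stream for writing to it.
 * Returns NULL on failure. */
FILE *create_private(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return NULL;

    /* "w" is compatible with the O_WRONLY access mode used above */
    FILE *stream = fdopen(fd, "w");
    if (stream == NULL)
        close(fd);  /* fdopen failed, so the descriptor is still ours to close */
    return stream;
}
```

Once fdopen succeeds, calling fclose on the stream also closes the underlying descriptor, and fileno(stream) returns that descriptor for calls such as fstat.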
File and Directory Maintenance
The standard libraries and system calls provide complete control over the creation and maintenance of files and directories.
We can change the permissions on a file or directory with the chmod system call. This forms the basis of the chmod shell program.
The syntax is as follows:
#include <sys/stat.h>
int chmod(const char *path, mode_t mode);
The file specified by path is changed to have the permissions given by mode. The mode is specified as in the open system call: a bitwise OR of the required permissions. Unless the program has been given appropriate privileges, only the owner of the file or a superuser can change its permissions.
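As a brief illustration of building the mode from a bitwise OR of permission flags, the sketch below makes a file read-only for its owner, group, and others. The function name make_read_only is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: set a file's permissions to read-only for owner,
 * group, and others (mode 0444).
 * Returns 0 on success, -1 on failure after printing the reason. */
int make_read_only(const char *path)
{
    if (chmod(path, S_IRUSR | S_IRGRP | S_IROTH) == -1) {
        perror(path);  /* chmod sets errno on failure */
        return -1;
    }
    return 0;
}
```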
A superuser can change the owner of a file using the chown system call.
The syntax is as follows:
#include <unistd.h>
int chown(const char *path, uid_t owner, gid_t group);
The call uses the numeric values of the user and group IDs (which can be obtained with getuid and getgid) and a system-defined constant to determine who is allowed to change file ownership. The owner and group of a file are changed if the appropriate privileges are set.
unlink, link, symlink
We can remove a file using unlink.
The unlink system call removes the directory entry for a file and decrements the file's link count. It returns 0 if the unlinking was successful and -1 on an error. For this call to work, we must have write and execute permissions in the directory that holds the file's directory entry.
The syntax is as follows:
#include <unistd.h>
int unlink(const char *path);
int link(const char *path1, const char *path2);
int symlink(const char *path1, const char *path2);
If the link count reaches zero and no process has the file open, the file is deleted. In fact, the directory entry is always removed immediately, but the file's space is not reclaimed until the last process that has it open closes it. The rm program uses this call. Additional links to a file are normally created with the ln program. We can create new links to a file from within a program by using the link system call.
The link system call creates a new link to an existing file, path1. The new directory entry is specified by path2. We can create symbolic links in a similar fashion with symlink. Note that a symbolic link to a file does not prevent the file from being deleted in the way that a hard link does.
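The difference between hard and symbolic links can be observed through the link count reported by stat. The sketch below is only an illustration: the helper names link_count and add_names are hypothetical, and the stat call and its st_nlink field are POSIX facilities not otherwise covered in this section.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return the number of hard links to path,
 * or -1 if stat fails (for example, because the file no longer exists). */
long link_count(const char *path)
{
    struct stat st;
    if (stat(path, &st) == -1)
        return -1;
    return (long)st.st_nlink;
}

/* Hypothetical helper: give an existing file a second name (a hard link)
 * and a symbolic link. Returns 0 on success, -1 on failure. */
int add_names(const char *file, const char *hard, const char *soft)
{
    if (link(file, hard) == -1) {
        perror("link");
        return -1;
    }
    if (symlink(file, soft) == -1) {
        perror("symlink");
        return -1;
    }
    return 0;
}
```

After add_names, the link count of the file is 2, because only the hard link is counted. If the original name is then removed with unlink, the data stays reachable through the hard link, while the symbolic link is left pointing at a name that no longer exists.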