Analysis of C language file operations (5)

Source: Internet
Author: User


Anyone who writes C should be familiar with the symbol EOF (End Of File), the end-of-file indicator. It is often misunderstood, however, which leads to frequent errors in code, especially when testing whether a read has reached the end of a file.

1. What is EOF?

View the EOF definition in VC:

#define EOF (-1)

EOF is just the integer constant -1. Many people think that an EOF marker is stored at the end of the file, which is wrong: no such marker exists in the file. Then how should the following program be explained?

    char ch;
    while((ch=fgetc(fp))!=EOF)
    {
        printf("%c\n",ch);
    }

Books use this code to detect the end of a text file and describe it as stopping the loop when the EOF character is read. This understanding is incorrect.

Let's take a look at the fgetc prototype:

int fgetc(FILE *fp);

fgetc reads one byte at a time, treats that byte as an unsigned value, and converts it to int for the return value. Whatever data is read from the file, the return value therefore cannot be negative. For example, when 0xFA is read, it is unsigned, so the high bits of the int are filled with 0 (the reason is similar to sign extension in assembly language, discussed later), and the returned value is 0x000000FA, never a negative number. Only when no data is left to read at the end of the file does fgetc return EOF, that is, -1, an int constant whose 32-bit representation is 0xFFFFFFFF.

The code above has a serious limitation: it can only detect the end of a text file and cannot reliably handle a binary file. Normally a text file does not contain the byte 0xFF, so the test works. In a binary file, however, a byte may well be 0xFF. Once that byte is stored in the char variable ch (char being signed here, as in VC), ch compares equal to -1 and the loop stops even though the end of the file has not been reached.

Is there a solution? You can define ch as int type.

Next, compare how the following program and the one above behave during execution.

    int ch;
    while((ch=fgetc(fp))!=EOF)
    {
        printf("%c\n",ch);
    }

Assume that the data read in the file is 0xFA.

With ch declared as char, execution proceeds as follows:

fgetc stores 0xFA in an int (call it a), so a is 0x000000FA. When a is returned and assigned to ch, ch is of type char and has only 8 bits, so only the low 8 bits of a are kept. ch is then 0xFA, and because ch is treated as signed, its value is negative.

With ch declared as int, execution proceeds as follows:

fgetc stores 0xFA in an int (call it a), so a is 0x000000FA. When a is returned and assigned to ch, ch is of type int, so ch is 0x000000FA, a positive number. The two programs produce completely different results.

Next, consider what happens if the byte read is 0xFF (and the end of the file has not been reached).

If ch is char, the return value 0x000000FF is truncated to its low 8 bits, so ch is -1 and the loop wrongly concludes that the end of the file has been reached.

If ch is int, the return value 0x000000FF is stored unchanged, so ch is not -1 and the byte is not mistaken for the end of the file.

(Of course, with ch declared as int, ch equals EOF only when the read actually fails.)

Therefore, the feof function is used in many cases.

2. feof

The feof function prototype is

int feof(FILE *fp);

If the end-of-file indicator for the stream is set, feof returns a non-zero value; otherwise it returns 0.

Check the definition of the feof function in VC:

#define _IOEOF 0x0010

#define feof(_stream) ((_stream)->_flag & _IOEOF)

The feof macro tests whether the end-of-file bit (_IOEOF) is set in the stream's _flag field.

Let's take a look at this program:

#include<stdio.h>
#include<stdlib.h>

int main(void)
{
    FILE *fp;
    int ch;

    if((fp=fopen("test.txt","w+"))==NULL)
    {
        printf("can not open file\n");
        exit(0);
    }
    /* write the six bytes 65..70 ('A'..'F') */
    for(ch=65;ch<=70;ch++)
    {
        fputc(ch,fp);
    }
    rewind(fp);
    /* feof is tested before each read, so the loop runs one extra time */
    while(feof(fp)==0)
    {
        ch=fgetc(fp);
        printf("%0X\n",ch);
    }
    fclose(fp);
    return 0;
}

The execution result is:

41
42
43
44
45
46
FFFFFFFF
Press any key to continue

Why is an extra FFFFFFFF printed at the end? Weren't only the values 65 to 70 written to the file?

Let's take a look at the description of the feof function in the C++ Reference (a good website that describes all the C++ library functions, http://www.cplusplus.com/reference):

Checks whether the End-of-File indicator associated with stream is set, returning a value different from zero if it is.
This indicator is generally set by a previous operation on stream that reached the End-of-File.

The description shows that the end-of-file indicator (the _IOEOF bit in _flag described above) is set only when the stream has already reached the end of the file and another read operation is then attempted.

Therefore, in the program above, the end-of-file indicator is not set after the last byte is read. It is set only when the position pointer is at the end and another read is attempted. No data is available at that point, so fgetc returns -1, and one extra FFFFFFFF appears in the output. Only after that failed read is the bit in _flag set, and feof can then detect that the end of the file has been reached.

You can solve this problem as follows:

    ch=fgetc(fp);
    while(feof(fp)==0)
    {
        printf("%0X\n",ch);
        ch=fgetc(fp);
    }

This way, the extra FFFFFFFF is no longer printed.

The sign extension from assembly language mentioned above is really a matter of data type conversion in C. A brief description follows:

Sign extension only occurs when a value of a narrower type is assigned to a variable of a wider type. When a value of a wider type is assigned to a narrower type, only the low-order bits are kept.

Let's look at a program:

#include<stdio.h>

int main(void)
{
    unsigned char ch1=0XFF;
    char ch2=0XFF;
    char ch3=0X73;
    int a=ch1;
    int b=ch2;
    int c=ch3;
    printf("%d\n%d\n%d\n",a,b,c);
    return 0;
}


The execution result is:

255
-1
115

The reason is that ch1, ch2, and ch3 are all char-type variables that occupy one byte each. The difference is that ch1 is unsigned. When ch1 is assigned to a, it is treated as unsigned data, so the high bits of a are filled with 0. ch2 and ch3 are signed (char is signed in this environment), so the fill depends on the highest bit: if it is 0, the value is positive and the high bits are filled with 0; if it is 1, the high bits are filled with 1.

As the output shows, the highest bit of ch2 is 1, so the high bits of b are filled with 1 and b is 0xFFFFFFFF. The highest bit of ch3 is 0, so the high bits of c are filled with 0 and c is 0x00000073.

Author: Hai Zi

