Before discussing file operations in C, let's first look at some file-related concepts.
1. Text files and binary files
Definition of a text file: a computer file, stored in a computer system, that consists of lines of characters. An end-of-file marker is usually placed after the last line. A text file can store only valid character information; it cannot store images, sounds, or other kinds of data. In the narrow sense, binary files are the files other than text files, such as image files, sound files, and doc files.
In fact, both text files and binary files as defined above are stored in binary form on a computer, so there is no essential difference between them. In the broad sense, therefore, every file is a binary file. Why is data stored in binary when what we see is text, images, and other information? This comes down to computer hardware: the components in a computer are built from transistors, which have only two stable states, so the binary digits 0 and 1 can represent those states. Different combinations of many transistor states present different information to us. The following uses the representation of Chinese characters in a computer as an example.
2. Representation of Chinese characters in computers
To process Chinese characters, a computer must encode them in a binary form it can recognize. There are three kinds of Chinese character encoding, each with a different function: input code, inner code, and font code.
Input code: to enter Chinese characters directly on a standard Western keyboard, encoding rules must be defined, such as pinyin codes (pinyin input methods) and numeric codes (numeric input methods).
Inner code: the representation of a Chinese character inside the computer, that is, its binary form. Generally, two bytes are used to represent one Chinese character, and the most significant bit of each byte is set to 1 (so each byte is negative when read as a signed char). For example, the Chinese character "我" ("I") is represented in the computer as 11001110 11010010.
Font code: when a Chinese character stored in the computer is displayed on the screen or output on a printer, its shape (glyph) information is needed. The inner code of a Chinese character cannot represent its shape, so a separate font code is required. The most common form of font information is a dot matrix: the glyph is divided into a grid of points. Each point is either black or white, and the strokes are drawn in black. Dot matrix data for Chinese characters is large. For example, a 16×16 dot matrix needs 256 bits, that is, 32 bytes of space.
The font codes of all Chinese characters are stored in the computer in what is called the font library. When a Chinese character is output or displayed, a font retrieval program uses the character's inner code to find the corresponding font code in the font library, and the character is then drawn on the output device from that font code.
Therefore, the text files or images we usually see are stored in binary format on computers, but are displayed in a way that can be recognized by people.
Test program
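#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Save this file as GB2312/GBK so that "我" occupies two bytes. */
    char s[] = "我";
    unsigned char *p = (unsigned char *)s;

    printf("%d\n", (int)strlen(s));  /* number of bytes, not characters */
    printf("%X\n", *p);              /* first byte of the inner code */
    printf("%X\n", *(p + 1));        /* second byte of the inner code */
    return 0;
}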
Output result:
2
CE
D2
Author: Hai Zi