First, let's review the ASCII table, which you probably think is hardly worth mentioning.
The ASCII table is generally divided into three parts:
- Non-printable system (control) codes, between 0 and 31.
- Lower ASCII (standard ASCII), between 32 and 127. This part of the table (as shown below) originates from older American systems, which worked with 7-bit character tables. Foreign (non-English) letters were not available then.
- Higher ASCII (extended ASCII), between 128 and 255. This part is programmable, in that you can swap in characters based on the language you want to write in. Foreign letters are placed in this part, and an example is shown below.
By function, ASCII can be roughly divided into three parts.
Part 1: There are 32 characters in total, from 00h to 1Fh, which are generally used for control or communication. Some of them can be displayed on the screen; others cannot be displayed, but their effect is visible (such as the line feed and carriage return characters). See the following table:
Part 2: There are 96 characters in total, from 20h to 7Fh. Apart from 7Fh (DEL), these 95 characters represent Arabic numerals, uppercase and lowercase English letters, and symbols such as the underscore and brackets, all of which can be displayed on the screen. See the following table:
Below are the standard ASCII characters.
| Dec | Char | Dec | Char | Dec | Char | Dec | Char | Dec | Char | Dec | Char |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 33 | ! | 49 | 1 | 65 | A | 81 | Q | 97 | a | 113 | q |
| 34 | " | 50 | 2 | 66 | B | 82 | R | 98 | b | 114 | r |
| 35 | # | 51 | 3 | 67 | C | 83 | S | 99 | c | 115 | s |
| 36 | $ | 52 | 4 | 68 | D | 84 | T | 100 | d | 116 | t |
| 37 | % | 53 | 5 | 69 | E | 85 | U | 101 | e | 117 | u |
| 38 | & | 54 | 6 | 70 | F | 86 | V | 102 | f | 118 | v |
| 39 | ' | 55 | 7 | 71 | G | 87 | W | 103 | g | 119 | w |
| 40 | ( | 56 | 8 | 72 | H | 88 | X | 104 | h | 120 | x |
| 41 | ) | 57 | 9 | 73 | I | 89 | Y | 105 | i | 121 | y |
| 42 | * | 58 | : | 74 | J | 90 | Z | 106 | j | 122 | z |
| 43 | + | 59 | ; | 75 | K | 91 | [ | 107 | k | 123 | { |
| 44 | , | 60 | < | 76 | L | 92 | \ | 108 | l | 124 | \| |
| 45 | - | 61 | = | 77 | M | 93 | ] | 109 | m | 125 | } |
| 46 | . | 62 | > | 78 | N | 94 | ^ | 110 | n | 126 | ~ |
| 47 | / | 63 | ? | 79 | O | 95 | _ | 111 | o | 127 | DEL |
| 48 | 0 | 64 | @ | 80 | P | 96 | ` | 112 | p | | |
Part 3: There are 128 characters in total, from 80h to 0FFh, generally called "extended characters". These 128 extended characters were defined by IBM and are not part of standard ASCII. They are used to represent box-drawing (frame) lines, phonetic symbols, and letters of other non-English languages.
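The three ranges described above can be sketched as a small lookup. This is only an illustration: the function name `ascii_part` and the labels it returns are made up for this example and are not part of any standard library.

```cpp
#include <cassert>
#include <string>

// Classify a single-byte code into one of the three parts described above.
// The name and the returned labels are illustrative assumptions.
std::string ascii_part(int code) {
    assert(code >= 0 && code <= 0xFF);      // only single-byte codes are covered
    if (code <= 0x1F) return "control";     // Part 1: 00h to 1Fh
    if (code <= 0x7F) return "standard";    // Part 2: 20h to 7Fh
    return "extended";                      // Part 3: 80h to 0FFh
}
```

For instance, `ascii_part('A')` falls in Part 2, while the line feed character (0Ah) falls in Part 1.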
I. Text files and binary files
As we all know, everything a computer stores is physically stored in binary. Therefore, the difference between a text file and a binary file is not physical but logical. The two differ only at the encoding level.
In short, text files are character-encoded files. Common encodings include ASCII and Unicode.
Binary files are value-encoded files. You decide what a value means based on the specific application (this can be thought of as a custom encoding).
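As a rough illustration of character encoding versus value encoding, the sketch below stores the same number both ways. The function names are hypothetical, and the binary form uses the host machine's byte order, which is an assumption of this example rather than something the text specifies.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Character encoding: the number is stored as its decimal digits,
// one character per digit.
std::string encode_as_text(std::int32_t value) {
    return std::to_string(value);
}

// Value encoding: the number is stored as its raw 4 bytes
// (in the host's byte order, an assumption of this sketch).
std::vector<unsigned char> encode_as_binary(std::int32_t value) {
    std::vector<unsigned char> bytes(sizeof value);
    std::memcpy(bytes.data(), &value, sizeof value);
    return bytes;
}

// Reverse of encode_as_binary: copy the 4 bytes back into an integer.
std::int32_t decode_binary(const std::vector<unsigned char>& bytes) {
    assert(bytes.size() == sizeof(std::int32_t));
    std::int32_t value = 0;
    std::memcpy(&value, bytes.data(), sizeof value);
    return value;
}
```

With these helpers, the value 1234567 takes 7 bytes as text but always 4 bytes as a 32-bit binary value.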
As we can see from the above, text files are basically fixed-length encodings: every character has a fixed width in a given encoding. An ASCII character is stored in 8 bits (a 7-bit code in one byte), and Unicode text is commonly stored in 16-bit units.

A binary file can be regarded as a variable-length encoding, because it encodes values, and it is up to you how many bits represent each value. If you are familiar with BMP files, take them as an example. The file begins with a fixed-length file header: the first 2 bytes identify the file as BMP, the next 4 bytes record the file size, the next 4 bytes are reserved, and the next 4 bytes record the offset at which the pixel data starts... We can see that the encoding is based on values of differing lengths (2 bytes, 4 bytes, and so on), so BMP is a binary file.
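To make the idea of value-based decoding concrete, here is a minimal sketch that reads the first fields of a BMP file header from a byte buffer. It assumes the commonly documented 14-byte layout (2-byte "BM" signature, 4-byte file size, 4 reserved bytes, 4-byte pixel data offset, all little-endian). The struct and function names are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative holder for the header fields this sketch extracts.
struct BmpFileHeader {
    char signature[2];          // expected to be 'B', 'M'
    std::uint32_t file_size;    // total file size in bytes
    std::uint32_t data_offset;  // where the pixel data begins
};

// Read an unsigned little-endian integer of `count` bytes starting at `pos`.
std::uint32_t read_le(const std::vector<unsigned char>& buf,
                      std::size_t pos, std::size_t count) {
    assert(count <= 4 && pos + count <= buf.size());
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < count; ++i)
        value |= static_cast<std::uint32_t>(buf[pos + i]) << (8 * i);
    return value;
}

// Decode the 14-byte file header; each field has its own fixed width.
BmpFileHeader parse_bmp_header(const std::vector<unsigned char>& buf) {
    assert(buf.size() >= 14);
    BmpFileHeader h;
    h.signature[0] = static_cast<char>(buf[0]);
    h.signature[1] = static_cast<char>(buf[1]);
    h.file_size = read_le(buf, 2, 4);     // bytes 2 to 5
    // bytes 6 to 9 are reserved and skipped here
    h.data_offset = read_le(buf, 10, 4);  // bytes 10 to 13
    return h;
}
```

The same bytes opened in a text editor would be shown as a few characters of gibberish, because the editor would decode them one character at a time instead of field by field.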
II. Access to text files and binary files
What happens when a text tool opens a file? Take Notepad as an example. It first reads the binary bit stream of the physical file (as mentioned earlier, storage is always binary), then interprets the stream according to the decoding method you selected, and finally displays the result. Usually the decoding method is ASCII (one ASCII character occupies 8 bits), so Notepad interprets the stream 8 bits at a time. For example, take the stream "01000001_01000010_01000011_01000100" (the underscores are added by hand for readability). Decoded as ASCII, the first 8 bits, "01000001", correspond to the character 'A'. Likewise, the other three groups of 8 bits decode to 'B', 'C', and 'D'. The stream is therefore interpreted as "ABCD", and Notepad displays "ABCD" on the screen.
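The 8-bits-at-a-time decoding described above can be sketched as follows. The function name `decode_bits` is hypothetical, and the input is a string of '0' and '1' characters standing in for the real bit stream.

```cpp
#include <cassert>
#include <string>

// Decode a string of '0'/'1' characters as ASCII, 8 bits per character.
// Underscores are treated as readability separators and skipped.
std::string decode_bits(const std::string& stream) {
    std::string text;
    unsigned value = 0;
    int bits = 0;
    for (char c : stream) {
        if (c == '_') continue;
        assert(c == '0' || c == '1');
        value = (value << 1) | static_cast<unsigned>(c - '0');
        if (++bits == 8) {                 // a full byte has been collected
            text.push_back(static_cast<char>(value));
            value = 0;
            bits = 0;
        }
    }
    assert(bits == 0);                     // the stream must hold whole bytes
    return text;
}
```

Calling `decode_bits("01000001_01000010_01000011_01000100")` yields the string "ABCD".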
In fact, for anything in the world to communicate with anything else, there must be an agreed protocol, an agreed encoding. People communicate with each other through writing: the Chinese character for "mom" stands for the person who gave birth to you, and that is an agreed encoding. But note that in Japanese the same character might stand for the person you gave birth to. So when Chinese user A and Japanese user B communicate using the character "mom", misunderstandings are only to be expected.
Opening a binary file in Notepad works the same way. Whatever file Notepad opens, it decodes it with the established character encoding (such as ASCII). So when it opens a binary file, garbled characters are inevitable, because the decoding does not match the encoding. For example, the stream "00000000_00000000_00000000_00000001" may represent the four-byte integer 1 in a binary file, but Notepad interprets it as four control characters: "NUL_NUL_NUL_SOH".
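The mismatch can be shown by decoding the same four bytes in two ways. This is a sketch with hypothetical function names. It reads the integer most-significant byte first (big-endian), which is an assumption made so that the bytes 00 00 00 01 come out as 1.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Value decoding: interpret 4 bytes as one big-endian 32-bit integer.
std::uint32_t as_big_endian_int(const std::vector<unsigned char>& bytes) {
    assert(bytes.size() == 4);
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8) |
            static_cast<std::uint32_t>(bytes[3]);
}

// Character decoding: interpret each byte as one ASCII character,
// spelling out the two control characters used in this example.
std::string as_ascii_names(const std::vector<unsigned char>& bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        if (!out.empty()) out += '_';
        if (b == 0x00) out += "NUL";
        else if (b == 0x01) out += "SOH";
        else out += static_cast<char>(b);
    }
    return out;
}
```

Given the bytes 00 00 00 01, the first function returns 1 and the second returns "NUL_NUL_NUL_SOH": same data, two different interpretations.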
Writing a text file is basically the inverse of reading it, so it is not described here. Accessing a binary file is much the same as accessing a text file; only the encoding and decoding methods differ, so that is not described either.
III. Advantages and disadvantages of text files and binary files
Because text files and binary files differ only in encoding, their advantages and disadvantages are those of their encodings, as any book on encoding will make clear. The usual view is that text files use a fixed-length, character-based encoding, so decoding is easier. Binary files use a variable-length encoding, so they are more flexible and use storage more efficiently, but they are harder to decode (each binary file format has its own decoding method). As for space utilization, consider that a binary file can use a single bit to carry a meaning (through bit operations), whereas any meaning in a text file takes at least one character.
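The space argument can be illustrated with eight yes/no flags: as text they need eight characters, while a binary format can pack them into one byte. The helper names below are hypothetical, and the bit order (first flag in the highest bit) is just one possible choice.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Pack eight flags, given as text such as "10100001", into a single byte.
// The first character goes into the most significant bit.
std::uint8_t pack_flags(const std::string& text_flags) {
    assert(text_flags.size() == 8);
    std::uint8_t packed = 0;
    for (char c : text_flags) {
        assert(c == '0' || c == '1');
        packed = static_cast<std::uint8_t>((packed << 1) | (c - '0'));
    }
    return packed;
}

// Read back flag number `index` (0 is the first flag) with bit operations.
bool flag_at(std::uint8_t packed, int index) {
    assert(index >= 0 && index < 8);
    return ((packed >> (7 - index)) & 1u) != 0;
}
```

The text form "10100001" occupies 8 bytes, while the packed form occupies 1 byte, at the cost of needing `flag_at` (that is, a format-specific decoder) to read it.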
Many books also say that text files are more readable but take conversion time to store (reading and writing require encoding and decoding), while binary files are less readable but need no conversion time (reading and writing involve no encoding or decoding; values are written directly).

Readability here is from the software user's point of view. Almost any text file can be browsed with a general-purpose tool such as Notepad, so text files are highly readable. Reading or writing a particular binary file requires a decoder for that specific format, so binary files are less readable. For example, to view a BMP file you must use image-viewing software.

Storage conversion time here is from the programmer's point of view. Some operating systems, such as Windows, convert line breaks in text files ('\n' is written as '\r\n'), so when reading or writing a file the system has to check character by character whether it has reached a '\n' or a '\r\n'. Linux needs no such conversion. Of course, when files are shared between two different operating systems, the conversion issue may come up again (for example, when sharing text files between Linux and Windows).
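The line-break conversion mentioned above can be mimicked by hand. The two functions below are only a sketch of the idea (they are not how any operating system or runtime library actually implements it), and their names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Mimic a Windows-style text-mode write: every '\n' becomes "\r\n".
std::string to_crlf(const std::string& text) {
    assert(text.find('\r') == std::string::npos);  // expects LF-only input
    std::string out;
    for (char c : text) {
        if (c == '\n') out += '\r';
        out += c;
    }
    return out;
}

// Mimic the reverse conversion on a text-mode read: "\r\n" becomes '\n'.
std::string from_crlf(const std::string& text) {
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        // Drop a '\r' only when it is immediately followed by '\n'.
        if (text[i] == '\r' && i + 1 < text.size() && text[i + 1] == '\n')
            continue;
        out += text[i];
    }
    return out;
}
```

Note that both functions must inspect every character, which is the per-character cost the paragraph above refers to. It also shows why a 4-character string such as "a\nb\n" grows to 6 bytes after conversion.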
From the programming point of view, the two kinds of files are handled in the same way: both are streams of 0s and 1s, and only their logical interpretation differs.
IV. C++ file I/O
C++ uses standard streams for I/O.
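As a minimal sketch of stream-based file access, the functions below write one integer as text and as binary using `std::ofstream` and read it back with `std::ifstream`. The function names and the file paths passed to them are placeholders chosen for this example, and the binary form uses the host's byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Text write: the stream formats the value into digit characters.
void write_text(const std::string& path, std::int32_t value) {
    std::ofstream out(path);
    out << value;
}

// Binary write: the raw bytes of the value are copied unchanged.
void write_binary(const std::string& path, std::int32_t value) {
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
}

// Text read: the stream parses digit characters back into a value.
std::int32_t read_text(const std::string& path) {
    std::ifstream in(path);
    std::int32_t value = 0;
    in >> value;
    assert(!in.fail());
    return value;
}

// Binary read: the raw bytes are copied straight into the value.
std::int32_t read_binary(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::int32_t value = 0;
    in.read(reinterpret_cast<char*>(&value), sizeof value);
    assert(!in.fail());
    return value;
}
```

Both pairs go through the same stream classes. The difference lies in whether `<<` and `>>` (formatting into characters) or `write` and `read` (copying raw bytes) are used, plus the `std::ios::binary` flag that disables any line-break translation.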
<To be continued...>