Problem description: A BinaryReader is used to read data from a file. No encoding is specified when the BinaryReader instance is created, and the program compiles without errors. At run time, however, it throws: "Unhandled Exception: System.ArgumentException: The output char buffer is too small to contain the decoded characters, encoding 'Unicode (UTF-8)' fallback 'System.Text.DecoderReplacementFallback'."
Solution Process:
First, here is the code that creates the file:
BinaryWriter
    using System;
    using System.IO;

    class binaryReader
    {
        static void Main()
        {
            FileInfo f = new FileInfo("BinFile2.dat");
            BinaryWriter bw = new BinaryWriter(f.OpenWrite());

            Console.WriteLine("Base Stream is: {0}", bw.BaseStream);

            double aDouble = 1234.67;
            int anInt = 32141;
            char[] aCharArray = { 'A', 'B', 'C' };
            string aString = @"teststring";

            // Write the four values one after another, then close the file.
            bw.Write(aDouble);
            bw.Write(anInt);
            bw.Write(aCharArray);
            bw.Write(aString);
            bw.Close();
        }
    }
Next, the BinaryReader test code:
BinaryReader
    using System;
    using System.IO;
    using System.Text;

    class binaryReader
    {
        static void Main()
        {
            FileInfo f2 = new FileInfo("BinFile2.dat");

            // No encoding is passed to the constructor here.
            BinaryReader br = new BinaryReader(f2.OpenRead());
            // BinaryReader br = new BinaryReader(f2.OpenRead(), Encoding.Default);

            int temp = 0;

            // PeekChar() is used to detect the end of the file.
            while (br.PeekChar() != -1)
            {
                Console.Write("{0,7:x}", br.ReadByte());

                // Start a new line after every four bytes.
                if (++temp == 4)
                {
                    Console.WriteLine();
                    temp = 0;
                }
            }
            Console.WriteLine();
            br.Close();
        }
    }
Running it produces the error quoted in the problem description: the hexadecimal values of the first few bytes are printed, and then the exception is thrown for the rest. "The output char buffer is too small" struck me as a very strange error, so I searched the internet to see how others had dealt with it.
The first solution I found was on CSDN. As that post suggests, specifying the encoding when creating the BinaryReader instance, as in the commented-out line in the code above, makes the problem go away, and the hexadecimal values of all the bytes are printed normally.
This narrows the problem down to encoding: the default encoding does not work here, and one has to be specified explicitly to avoid the error. Which encodings work, and which cause trouble? I tested six of the encodings that the Encoding class exposes as static properties: UTF7, UTF8, Unicode, BigEndianUnicode, UTF32, and Default. On my machine, Default resolves to System.Text.DBCSCodePageEncoding. I tried each of them in turn in the code above. On the BinFile2.dat test file I had written, every encoding succeeded except UTF8, which failed (the individual runs are omitted here). From this it can be inferred that the BinaryReader constructor that takes no encoding uses UTF-8 by default, and that this is where the read goes wrong.
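The encoding test described above can be sketched roughly as follows. This is only an illustration, not the code I originally ran: the names EncodingSurvey, BuildSampleData, TryPeekLoop and Survey are made up for this sketch, the data is kept in a MemoryStream instead of BinFile2.dat, and UTF7 is left out because recent .NET versions mark it obsolete. Which encodings fail depends on the runtime version (newer runtimes may not reproduce the exception at all), and on .NET Core and later Encoding.Default is UTF-8 rather than a code page, so the results should not be assumed to match mine.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class EncodingSurvey
{
    // Hypothetical helper: writes the same four values as the BinaryWriter
    // program above, but into memory instead of BinFile2.dat.
    public static byte[] BuildSampleData()
    {
        using (var ms = new MemoryStream())
        {
            using (var bw = new BinaryWriter(ms))
            {
                bw.Write(1234.67);
                bw.Write(32141);
                bw.Write(new[] { 'A', 'B', 'C' });
                bw.Write("teststring");
            }
            return ms.ToArray();
        }
    }

    // Runs the PeekChar-driven loop with the given encoding.
    // Returns "ok" on success, or the name of the exception type on failure.
    public static string TryPeekLoop(byte[] data, Encoding encoding)
    {
        try
        {
            using (var br = new BinaryReader(new MemoryStream(data), encoding))
            {
                while (br.PeekChar() != -1)
                {
                    br.ReadByte();
                }
            }
            return "ok";
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    // Tries each encoding in turn and records the outcome.
    public static Dictionary<string, string> Survey()
    {
        var encodings = new Dictionary<string, Encoding>
        {
            { "UTF8", Encoding.UTF8 },
            { "Unicode", Encoding.Unicode },
            { "BigEndianUnicode", Encoding.BigEndianUnicode },
            { "UTF32", Encoding.UTF32 },
            { "Default", Encoding.Default },
        };

        byte[] data = BuildSampleData();
        var results = new Dictionary<string, string>();
        foreach (var pair in encodings)
        {
            results[pair.Key] = TryPeekLoop(data, pair.Value);
        }
        return results;
    }

    static void Main()
    {
        foreach (var pair in Survey())
        {
            Console.WriteLine("{0,-18}{1}", pair.Key, pair.Value);
        }
    }
}
```

Catching the exception per encoding lets one run cover all candidates instead of editing and re-running the reader program each time.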
So at least we now know which encoding causes the problem.
Going one step further: which call in the reading code is "strict" about the encoding? Apart from ReadByte(), which returns raw bytes and does not involve the encoding, the loop calls only two things: PeekChar() in the while condition, and the Console output methods. The latter seemed unlikely to be the culprit, so I ran the following test:
BinaryReader2
    using System;
    using System.IO;
    using System.Text;

    class binaryReader
    {
        static void Main()
        {
            FileInfo f2 = new FileInfo("BinFile2.dat");

            BinaryReader br = new BinaryReader(f2.OpenRead(), Encoding.Default);
            int temp = 0;
            int count = 20;

            // Read a fixed number of bytes instead of calling PeekChar().
            while (count > 0)
            {
                Console.Write("{0,7:x}", br.ReadByte());

                // Start a new line after every four bytes.
                if (++temp == 4)
                {
                    Console.WriteLine();
                    temp = 0;
                }

                count--;
            }
            Console.WriteLine();
            br.Close();
        }
    }
This version runs without errors, although it does not print the whole file because it stops after 20 bytes. The problem is therefore concentrated on PeekChar(), which the earlier code used to detect the end of the file. MSDN describes its return value as "the next available character, or -1 if no more characters are available or the stream does not support seeking." In other words, to decide whether the end has been reached, PeekChar() reads ahead and decodes the next character. Combined with the encoding findings above, my guess is that during this read-ahead the bytes are decoded with an unsuitable encoding, and the decoded result does not fit into the method's internal character buffer.
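A small sketch can make the difference between the two calls visible. It is only an illustration under my own assumptions: the class name PeekDemo and the method PeekThenRead are made up, and the input is a single two-byte UTF-8 character (U+00E9) rather than data from BinFile2.dat. PeekChar() decodes a character through the reader's encoding and then restores the stream position, while ReadByte() returns the raw byte.

```csharp
using System;
using System.IO;
using System.Text;

static class PeekDemo
{
    // Returns { peeked character, stream position after the peek, first raw byte }.
    public static int[] PeekThenRead()
    {
        // U+00E9 is encoded in UTF-8 as the two bytes 0xC3 0xA9.
        byte[] data = Encoding.UTF8.GetBytes("\u00E9");

        using (var br = new BinaryReader(new MemoryStream(data), Encoding.UTF8))
        {
            // PeekChar() consumes as many bytes as the decoder needs for one
            // character, then seeks back to where it started.
            int peeked = br.PeekChar();
            long positionAfterPeek = br.BaseStream.Position;

            // ReadByte() does not decode anything; it returns the next raw byte.
            int firstByte = br.ReadByte();

            return new[] { peeked, (int)positionAfterPeek, firstByte };
        }
    }

    static void Main()
    {
        int[] r = PeekThenRead();
        Console.WriteLine("PeekChar: {0:x}, position after peek: {1}, ReadByte: {2:x}",
            r[0], r[1], r[2]);
    }
}
```

Here PeekChar() yields the decoded character 0xE9 while ReadByte() yields the first raw byte 0xC3, so a loop that mixes the two is decoding arbitrary binary data as text on every iteration.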
I also found another article on the internet, titled "Do not use PeekChar() to test for EOF". It only says that PeekChar() should not be used to detect EOF and that the condition (br.BaseStream.Position < br.BaseStream.Length) should be used instead, but it gives no detailed reasons.
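A minimal sketch of a read loop using that condition is shown below. It assumes a seekable stream, since Position and Length are not available otherwise. The names EofDemo, BuildSampleData and ReadAllBytes are invented for this sketch, and the data is held in a MemoryStream rather than BinFile2.dat so that the example is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class EofDemo
{
    // Hypothetical helper: writes the same four values as the BinaryWriter
    // program above, but into memory instead of BinFile2.dat.
    public static byte[] BuildSampleData()
    {
        using (var ms = new MemoryStream())
        {
            using (var bw = new BinaryWriter(ms))
            {
                bw.Write(1234.67);
                bw.Write(32141);
                bw.Write(new[] { 'A', 'B', 'C' });
                bw.Write("teststring");
            }
            return ms.ToArray();
        }
    }

    // Reads every byte, using Position < Length as the end-of-stream test.
    // No character decoding takes place, so the reader's encoding is irrelevant.
    public static List<byte> ReadAllBytes(byte[] data)
    {
        var result = new List<byte>();
        using (var br = new BinaryReader(new MemoryStream(data)))
        {
            while (br.BaseStream.Position < br.BaseStream.Length)
            {
                result.Add(br.ReadByte());
            }
        }
        return result;
    }

    static void Main()
    {
        List<byte> bytes = ReadAllBytes(BuildSampleData());
        for (int i = 0; i < bytes.Count; i++)
        {
            Console.Write("{0,7:x}", bytes[i]);

            // Start a new line after every four bytes.
            if ((i + 1) % 4 == 0)
            {
                Console.WriteLine();
            }
        }
        Console.WriteLine();
    }
}
```

With the default UTF-8 writer, the sample data comes to 26 bytes: 8 for the double, 4 for the int, 3 for the char array, and 11 for the length-prefixed string. The loop reads all of them without ever calling PeekChar().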
I then found that the same problem is being discussed on an English-language forum as well: http://bytes.com/topic/visual-basic-net/answers/349779-binaryreader-peekchar-argumentexception-conversion-buffer-overflow
......
To go deeper, two points still need to be resolved: 1. the UTF-8 encoding problem; 2. how PeekChar() works internally.
Conclusion: Through the experiments above, I now have a rough understanding of how to use BinaryReader, enough to use it properly and avoid this error. The root cause, however, has not yet been found.
I will dig deeper tomorrow.