I had never paid careful attention to the difference between binary mode and text mode when reading and writing files in C++, and this time it cost me dearly. (Platform: Windows, VS2012)
How the bug appeared:
I wrote a program A that generates a text file F and saves it locally, then read this file in program B to compute its MD5 value.
I then uploaded the file to the server and used program B to download it from the server and compute the MD5 value again. Strangely, the two MD5 values were different. Who changed the file??
Troubleshooting:
1. First I compared the generated file F with the copy uploaded to the server. The copy was error-free; they are the same file.
2. After program B downloaded file F and saved it locally, the saved file differed from the original F. A binary comparison showed one extra \r on every line.
3. I suspected that the server changed the file format before the transfer, so I captured the packets with Wireshark. The transmitted content was identical to the file on the server. So where did the extra \r come from, turning each line ending into \r\r\n?
4. Looking at file F itself, its line endings were \r\n, yet I remembered using \n as the newline character when generating it. After puzzling over this I thought of the file read/write modes, but I only remembered that text and binary modes differ; I did not remember anything about line breaks.
5. After much agonizing, I had an epiphany while reading C++ Primer Plus: both programs were reading and writing files in the default text mode. On Windows, text mode writes \n as \r\n and, when reading, converts \r\n back to a single \n. That is why file F ended up with \r\n line endings even though program A wrote \n. When program B read the local file F and computed the MD5, the value was computed over \n line endings. The data downloaded from the server, however, still had \r\n line endings, so computing the MD5 directly over it gave a different value. When the downloaded data was then saved, still in text mode, each \r\n became \r\r\n, which produced the result I originally could not make sense of.
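The translation described in step 5 can be sketched in portable C++ without touching any file. The helpers text_mode_write and text_mode_read are hypothetical names introduced only for this illustration. They are a simplified model of the Windows text-mode behaviour (for example, they ignore end-of-file handling), not the runtime's actual implementation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: models text-mode output on Windows,
// where every '\n' is written to the file as "\r\n".
std::string text_mode_write(const std::string& in) {
    std::string out;
    for (char c : in) {
        if (c == '\n') out += '\r';  // insert a carriage return before each line feed
        out += c;
    }
    return out;
}

// Hypothetical helper: models text-mode input on Windows,
// where each "\r\n" pair collapses to a single '\n'.
// A '\r' that is not followed by '\n' is kept as-is.
std::string text_mode_read(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\r' && i + 1 < in.size() && in[i + 1] == '\n')
            continue;  // drop the '\r'; the '\n' is appended on the next iteration
        out += in[i];
    }
    return out;
}
```

Under this model, text_mode_write("hello!\n") gives "hello!\r\n", and passing that result through text_mode_write a second time gives "hello!\r\r\n", which matches the doubled \r seen in the saved download.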
Summary:
This bug cost a lot of time, from its first appearance through investigating every part of the pipeline, and in the end the cause was a shaky grasp of the basics. I am copying the key passage from C++ Primer Plus here as a reminder.
“When you use the binary file mode, a program transfers data from memory to a file (and vice versa) without any hidden translation taking place. That is not the case for the default text mode. For example, Windows text files represent a newline with a combination of two characters (carriage return and line feed), Mac text files represent a newline with a carriage return, and UNIX and Linux files represent a newline with a line feed. C++ grew up on UNIX, so it also represents a newline with a line feed. For portability, a Windows C++ program automatically translates the C++ newline to a carriage return and line feed when writing to a text-mode file, and a Mac C++ program translates the newline to a carriage return when writing to a file. When reading a text file, these programs convert the local newline back to the C++ form. For binary data, the text format can cause problems, because a byte in the middle of a double value may have the same bit pattern as the ASCII code of the newline character. In addition, end-of-file is detected differently. Therefore, the binary file mode should be used when saving data in binary format.”
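As a minimal sketch of the advice in the quoted passage, the two helpers below open files with std::ios::binary so that the bytes on disk are exactly the bytes in memory. The names write_bytes and read_bytes are hypothetical and were introduced for this example; they are not part of the original programs A and B.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: write `data` to `path` byte for byte, with no newline translation.
bool write_bytes(const std::string& path, const std::string& data) {
    std::ofstream out(path.c_str(), std::ios::binary);
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    out.close();         // flush now so that a failed write is reflected in the stream state
    return !out.fail();
}

// Hypothetical helper: read the whole file at `path` as raw bytes.
// Returns an empty string if the file cannot be opened.
std::string read_bytes(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

If the MD5 is computed over the result of read_bytes, and the download is saved with write_bytes, the same file should give the same bytes on every platform, because no \r is added or removed along the way.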
Follow-up verification:
Later I wrote a small program to verify all this. If anything is still unclear, you can copy it and run it yourself. Note that it targets the Windows platform; the generated files can be inspected in binary form with wxHexEditor. One more aside: the string encoding may differ between languages, for example JavaScript uses UTF-16 while C++ defaults to ANSI, so computing the MD5 of the same downloaded file can also go wrong for that reason.
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string str1 = "hello!\n";
    ofstream fout("file1");              // default text mode: on Windows '\n' is written as "\r\n"
    fout << str1;
    fout.close();

    // Read file1 back in text mode: "\r\n" is converted back to '\n'.
    ifstream fin("file1");
    char ch = 0;
    string temp;
    if (fin) {
        while (fin.get(ch))
            temp += ch;
        cout << "text mode, length read from file1: " << temp.length() << endl;
        fin.close();
    }

    // Read file1 in binary mode: getline still splits on '\n', so the '\r' stays in the string.
    string temp2;
    fin.clear();                         // reset the eof/fail bits left by the loop above
    fin.open("file1", ios::binary);
    getline(fin, temp2);
    cout << "binary mode getline, length read from file1 (trailing \\r included): " << temp2.length() << endl;
    fin.close();

    // Write "\r\n" explicitly in text mode: on Windows the file ends with "\r\r\n".
    ofstream fout2("file2");
    fout2 << "hello!\r\n";
    fout2.close();

    // Read file2 in text mode: "\r\n" collapses to '\n', but the first '\r' remains.
    string temp3;
    fin.clear();
    fin.open("file2");
    if (fin) {
        getline(fin, temp3);
        cout << "text mode getline, length read from file2 (also with a trailing \\r): " << temp3.length() << endl;
    }

    return 0;
}
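If no hex editor is at hand, the bytes of the generated files can also be printed from C++. The following is a small sketch; hex_dump is a hypothetical helper name chosen for this example. It reads the file in binary mode, so the output shows what is really on disk.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: return the bytes of the file at `path` as
// space-separated two-digit hex values, e.g. "68 69 0d 0a".
// Returns an empty string if the file cannot be opened or is empty.
std::string hex_dump(const std::string& path) {
    static const char digits[] = "0123456789abcdef";
    std::ifstream in(path.c_str(), std::ios::binary);  // binary: no newline translation
    std::string out;
    char ch;
    while (in.get(ch)) {
        unsigned char b = static_cast<unsigned char>(ch);
        if (!out.empty()) out += ' ';
        out += digits[b >> 4];      // high nibble
        out += digits[b & 0x0f];    // low nibble
    }
    return out;
}
```

On Windows, after running the verification program, printing hex_dump("file2") should show the line ending as 0d 0d 0a, which is the \r\r\n sequence discussed earlier.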
by ascii0x03, 2015.9.25
[C++] Use file read/write modes with caution: a frustrating experience with carriage return ('\r') and line feed ('\n')