Let's start with the origins of, and the difference between, two concepts: "carriage return" ('\r') and "line feed" ('\n'). Before computers appeared, there was a device called a teletypewriter (the Teletype Model 33) that could print 10 characters per second. It had one problem: after finishing a line, returning to the start of the next line took 0.2 seconds, which is exactly the time needed to print two characters. If a new character arrived during those 0.2 seconds, it was lost. To solve this, the developers added two characters to mark the end of each line. One is called "carriage return", which tells the typewriter to position the print head at the left margin; the other is called "line feed", which tells the typewriter to move the paper down one line. This is the origin of "line feed" and "carriage return", and their English names already hint at what each one does.
Later, when computers were invented, these two concepts were carried over to them. At that time, memory was expensive, and some scientists thought that adding two characters at the end of every line was too wasteful. So a disagreement arose:
- On Unix systems, each line ends with only a line feed, that is, "\n";
- On Windows systems, each line ends with a carriage return followed by a line feed, that is, "\r\n";
- On Mac systems (classic Mac OS; Mac OS X and later use "\n" like Unix), each line ends with a carriage return, that is, "\r".
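The three conventions above can be told apart by counting which end-of-line sequences occur in a block of text. The following is only a minimal sketch: the names `eol_style` and `detect_eol` are made up for this illustration and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names used only for this sketch. */
enum eol_style { EOL_NONE, EOL_LF, EOL_CRLF, EOL_CR, EOL_MIXED };

/* Classify the line endings found in the first len bytes of buf. */
enum eol_style detect_eol(const char *buf, size_t len)
{
    size_t lf = 0, crlf = 0, cr = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r') {
            if (i + 1 < len && buf[i + 1] == '\n') {
                crlf++;   /* "\r\n": Windows style */
                i++;      /* skip the '\n' that belongs to this pair */
            } else {
                cr++;     /* lone '\r': classic Mac style */
            }
        } else if (buf[i] == '\n') {
            lf++;         /* lone '\n': Unix style */
        }
    }

    if (lf + crlf + cr == 0)  return EOL_NONE;
    if (crlf == 0 && cr == 0) return EOL_LF;
    if (lf == 0 && cr == 0)   return EOL_CRLF;
    if (lf == 0 && crlf == 0) return EOL_CR;
    return EOL_MIXED;         /* more than one convention in the same text */
}
```

For example, `detect_eol("a\r\nb\r\n", 6)` classifies the text as Windows style, while a buffer that mixes "\r\n" and "\n" is reported as mixed rather than guessed at.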
A direct consequence is that a file created on Unix/Mac may be displayed as a single line when it is opened on Windows, while a file created on Windows may show a ^M symbol at the end of each line when it is opened on Unix/Mac.
Note: On Windows, the Enter key is treated as the combination \r\n. When we press Enter on the keyboard, Windows handles it as \r\n, whereas Unix handles it as \n only. On either system, '\n' can be used to mark the end of a line. When programming, however, we need to be aware that text produced on Windows may contain the '\r' character when it is read, and we must distinguish that '\r' from normal input characters.
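One common way to handle the stray '\r' described in the note is to strip the line ending from each line right after reading it, for instance after a call to `fgets`. The sketch below assumes the line is a NUL-terminated string; `chomp_eol` is a hypothetical helper name, not a standard function.

```c
#include <assert.h>
#include <string.h>

/*
 * Remove one trailing line ending ("\n", "\r\n", or a lone "\r")
 * from line, in place. Returns the new length of the string.
 * chomp_eol is a hypothetical helper written for this sketch.
 */
size_t chomp_eol(char *line)
{
    size_t n;

    assert(line != NULL);
    n = strlen(line);
    if (n > 0 && line[n - 1] == '\n')
        line[--n] = '\0';   /* drop the line feed */
    if (n > 0 && line[n - 1] == '\r')
        line[--n] = '\0';   /* drop the carriage return left by a Windows file */
    return n;
}
```

After this call, a line read from a Windows file and the same line read from a Unix file compare equal, so the rest of the program does not need to know which system produced the input.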
The difference between Windows and Unix file formats generally comes down to \r\n. The carriage return (CR) and line feed (LF) characters are both used to denote "next line", but the standard does not specify which one to use. There are three different conventions:
- Windows uses carriage return + line feed (CR+LF) to indicate the next line (the so-called PC format)
- Unix uses a line feed (LF) to indicate the next line
- The Mac (classic Mac OS) uses a carriage return (CR) to indicate the next line
When files are transferred between different systems, the format has to be converted.
Converting between the two file formats:
1. Unix -> Windows: '\n' -> '\r\n'
    /* ch must be an int so it can hold EOF; in is an open FILE * */
    while ((ch = fgetc(in)) != EOF)
    {
        if (ch == '\n')
            putchar('\r');    /* emit a CR before every LF */
        putchar(ch);
    }
In other words, a '\r' character is inserted before every '\n' in the Unix file.
2. Unix <- Windows: '\n' <- '\r\n'
When converting from Windows to Unix, it is not enough to simply remove every '\r' from the file. A Windows file sometimes has a lone carriage return embedded in a line of text, which happens, for example, with output meant for impact printers. So before converting, check whether '\r' and '\n' appear together: if they do, remove the '\r'; if they do not, keep the '\r'.
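    cr_flag = 0;    /* no CR encountered yet */
    while ((ch = fgetc(in)) != EOF)
    {
        if (cr_flag && ch != '\n') {
            /* the previous CR did not precede an LF, so keep it */
            putchar('\r');
        }
        /* hold back a CR until the next character is known */
        if (!(cr_flag = (ch == '\r')))
            putchar(ch);
    }
    if (cr_flag)
        putchar('\r');    /* the input ended with a lone CR */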
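The two conversions can also be written as functions that work on a buffer in memory, which makes it easy to check that a round trip restores the original text. This is only a sketch under stated assumptions: `lf_to_crlf` and `crlf_to_lf` are hypothetical names, and the caller is assumed to supply an output buffer that is large enough.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Unix -> Windows: write a '\r' before every '\n'.
 * out must have room for up to 2 * len bytes.
 * Returns the number of bytes written. (Hypothetical helper.)
 */
size_t lf_to_crlf(const char *in, size_t len, char *out)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        if (in[i] == '\n')
            out[n++] = '\r';
        out[n++] = in[i];
    }
    return n;
}

/*
 * Windows -> Unix: drop a '\r' only when it is followed by '\n';
 * a lone '\r' is kept. out must have room for len bytes.
 * Returns the number of bytes written. (Hypothetical helper.)
 */
size_t crlf_to_lf(const char *in, size_t len, char *out)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        if (in[i] == '\r' && i + 1 < len && in[i + 1] == '\n')
            continue;    /* CR of a CR+LF pair: skip it, the LF is copied next */
        out[n++] = in[i];
    }
    return n;
}
```

Converting "a\nb\n" with `lf_to_crlf` and then passing the result to `crlf_to_lf` gives back the original four bytes. As with the loop in the first conversion, `lf_to_crlf` assumes its input contains only Unix line endings; text that already contains "\r\n" would end up with a doubled '\r'.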