Differences between text files and binary files

Source: Internet
Author: User

First, there is no strict distinction between a text file and a binary file at the file level. Any file can be opened in either text mode or binary mode; "text file" and "binary file" really describe how a file is opened. In practice, however, files encoded in ASCII or another human-readable character set are generally called text files and are usually opened in text mode: plain text files (*.txt), C source files, HTML, XML, and so on. Other files are called binary files, such as Word documents (.doc) and JPEG images (.jpg).

Second, the difference between text mode and binary mode matters only on DOS and Windows systems. On UNIX and similar operating systems the two modes behave identically (every file is effectively treated as binary).

So why do we need to distinguish between the two modes? Because they perform different operations when reading and writing files.

Binary mode is very simple. When reading, it returns the content of the file byte for byte, unchanged. When writing, it copies the content of the memory buffer to the file, also unchanged.
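As a minimal sketch of this behavior, the function below writes a buffer in binary mode and reads it back in binary mode. The function name `binary_roundtrip` and the file path passed to it are made up for illustration; only `fopen`, `fwrite`, `fread`, and `fclose` come from the C standard library.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes n bytes to path in binary mode, reads them
   back in binary mode into out, and returns the number of bytes read
   (0 on failure). out must have room for n bytes. */
size_t binary_roundtrip(const char *path, const unsigned char *data,
                        size_t n, unsigned char *out)
{
    assert(path != NULL && data != NULL && out != NULL);

    FILE *fp = fopen(path, "wb");   /* "b": no translation on write */
    if (fp == NULL)
        return 0;
    size_t written = fwrite(data, 1, n, fp);
    if (fclose(fp) != 0 || written != n)
        return 0;

    fp = fopen(path, "rb");         /* "b": no translation on read */
    if (fp == NULL)
        return 0;
    size_t got = fread(out, 1, n, fp);
    fclose(fp);
    return got;
}
```

Because both opens use the "b" flag, bytes such as 0x0d, 0x0a, and 0x1a should come back exactly as they were written, on any platform.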

Text mode is different. When reading a file, every line ending "\r\n" (0x0d 0x0a) is converted to "\n" (0x0a); in addition, when the end-of-file character Ctrl-Z (0x1a) is encountered, the file is treated as having ended. Correspondingly, when writing a file, every "\n" (0x0a) is replaced with "\r\n" (0x0d 0x0a). Therefore, if you open a binary file in text mode, you can easily get a truncated read or altered content. Even when you open a genuine text file in text mode, be cautious. For example, to copy a file, use binary mode, not text mode.
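The translation rules just described can be sketched as two small functions that operate on byte buffers instead of real files. This is only a simulation for illustration, not the code any C runtime actually uses; the names `simulate_text_read` and `simulate_text_write` are invented here, and edge cases such as a lone 0x0d are handled in the simplest way (left unchanged).

```c
#include <assert.h>
#include <stddef.h>

/* Simulates a DOS/Windows text-mode read on a buffer:
   CR LF becomes LF, and reading stops at the first Ctrl-Z (0x1A).
   out must have room for n bytes. Returns the bytes stored in out. */
size_t simulate_text_read(const unsigned char *in, size_t n,
                          unsigned char *out)
{
    assert(in != NULL && out != NULL);
    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i] == 0x1A)
            break;                  /* treated as end of file */
        if (in[i] == 0x0D && i + 1 < n && in[i + 1] == 0x0A)
            continue;               /* drop CR; the LF is copied next */
        out[j++] = in[i];
    }
    return j;
}

/* Simulates a DOS/Windows text-mode write on a buffer:
   every LF becomes CR LF. out must have room for 2 * n bytes. */
size_t simulate_text_write(const unsigned char *in, size_t n,
                           unsigned char *out)
{
    assert(in != NULL && out != NULL);
    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i] == 0x0A)
            out[j++] = 0x0D;        /* insert CR before each LF */
        out[j++] = in[i];
    }
    return j;
}
```

Feeding image data through `simulate_text_read` shows why text mode is risky for binary files: any 0x1a byte cuts the data short, and any 0x0d 0x0a pair loses a byte.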

As mentioned above, DOS and Windows systems use the two-byte sequence CRLF (0x0d 0x0a) as the line ending in text files, while UNIX text files use the single byte LF (0x0a). In the C language, the newline character is also LF, written '\n'.

Because the line ending defined by DOS/Windows differs from the one defined by C, the C standard input/output functions convert between CRLF and LF when reading and writing text files on those systems. The UNIX line ending is the same as C's, so no conversion is needed there.
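One way to observe this conversion is to write a string in text mode and then count the bytes actually stored by reading the file back in binary mode. The sketch below assumes a hypothetical helper name, `bytes_on_disk_after_text_write`, and a caller-chosen path; the result depends on the platform.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes text to path in text mode ("w"), then
   counts the bytes stored in the file by reading it in binary mode.
   Returns -1 on failure. */
long bytes_on_disk_after_text_write(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);

    FILE *fp = fopen(path, "w");    /* text mode: '\n' may be translated */
    if (fp == NULL)
        return -1;
    int rc = fputs(text, fp);
    if (fclose(fp) != 0 || rc == EOF)
        return -1;

    fp = fopen(path, "rb");         /* binary mode: see the raw bytes */
    if (fp == NULL)
        return -1;
    long count = 0;
    while (fgetc(fp) != EOF)
        count++;
    fclose(fp);
    return count;
}
```

For the string "a\nb\n", the expected count is 4 on UNIX-like systems, where '\n' is stored as a single LF, and 6 on DOS/Windows, where each '\n' is stored as CRLF.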

Reference: http://hi.baidu.com/hyredsnow/blog/item/aa236a3af8999cd2d562258
