File processing between DOS, Windows, and Linux/Unix

Source: Internet
Author: User

First, how to convert files:

DOS/Windows and Linux/Unix text files use different line endings. DOS/Windows text files end each line with a CR (carriage return) followed by an LF (line feed), while UNIX text files end each line with a single line feed.

1. Moving DOS/Windows files to Linux/Unix.

Although many programs do not mind being fed CR/LF text files in DOS/Windows format, several programs do. The best known is bash, which runs into problems as soon as it encounters a carriage return. The following sed call converts DOS/Windows text to a trusted UNIX format:

$ sed -e 's/.$//' mydos.txt > myunix.txt

The script is simple: the regular expression in the substitution matches the last character of each line, which in a DOS/Windows file is the carriage return. Replacing it with nothing deletes it from the output. If you use this script and notice that the last character of every line in the output has been deleted, the input was already in UNIX format, so no conversion was needed.
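Because 's/.$//' removes whatever character ends each line, a more defensive variant is to delete only a trailing carriage return, so that a file already in UNIX format passes through unchanged. The following is a minimal sketch under two assumptions: GNU sed is in use (it understands '\r' in the pattern), and the function name strip_cr is hypothetical, not something from the original article.

```shell
#!/bin/sh
# strip_cr: hypothetical helper, not part of the original article.
# Deletes only a carriage return at the end of each line, so input that is
# already in UNIX format is written out unchanged.
# Assumption: GNU sed, which interprets \r in a regular expression as CR.
strip_cr() {
    sed -e 's/\r$//' "$1"
}
```

It would be called in the same way as the command above, for example: strip_cr mydos.txt > myunix.txt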

2. Moving Linux/UNIX text to a Windows system. Use the following script to perform the required format conversion:

$ sed -e 's/$/\r/' myunix.txt > mydos.txt

In this script, the regular expression '$' matches the end of the line, and '\r' tells sed to insert a carriage return right before it. With a carriage return inserted before the line feed, each line ends with CR/LF. Note that '\r' is replaced with a CR only when GNU sed 3.02.80 or later is used.
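Running the command above on a file that already has CR/LF endings would add a second carriage return to each line. The sketch below shows one way to guard against that, plus a quick check of the result. It assumes GNU sed and grep; the function names add_cr and count_crlf are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# add_cr: hypothetical helper, not part of the original article.
# Ends every line with exactly one CR before the LF. Any CRs already at the
# end of a line are replaced rather than duplicated.
# Assumption: GNU sed, which understands \r in the pattern and the replacement.
add_cr() {
    sed -e 's/\r*$/\r/' "$1"
}

# count_crlf: hypothetical helper that prints how many lines end in CR.
count_crlf() {
    grep -c "$(printf '\r')\$" "$1"
}
```

For example, after add_cr myunix.txt > mydos.txt, the number printed by count_crlf mydos.txt should equal the number of lines in the file.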

 

Second, file encoding

1. Viewing the file encoding.

In vi (Vim), run :set fileencoding to view the encoding of the current file.

 

2. Using the iconv command to convert the encoding of file contents

Usage: iconv [option...] [file...]

 

The following options are available:

 

Input/output format specifications:

-f, --from-code=NAME    encoding of the original text

-t, --to-code=NAME      encoding of the output

 

Example:

iconv -f gb2312 -t UTF-8 aaa.txt > bbb.txt

This command reads the aaa.txt file and converts it from gb2312 to utf-8, and the output is directed to the bbb.txt file.
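If iconv stops partway through, for example on a byte sequence that is not valid in the source encoding, a plain redirect leaves a partial output file behind. The sketch below is one possible wrapper that writes to a temporary file first and only renames it on success. The function name convert_encoding and its argument order are assumptions made for this example; only the -f and -t options shown above are used.

```shell
#!/bin/sh
# convert_encoding: hypothetical wrapper around iconv, not from the original article.
# Usage: convert_encoding FROM TO INPUT OUTPUT
# Writes to a temporary file first, so OUTPUT is only created or replaced
# when iconv converts the whole input successfully.
convert_encoding() {
    from=$1
    to=$2
    src=$3
    dst=$4
    tmp="$dst.tmp.$$"
    if iconv -f "$from" -t "$to" "$src" > "$tmp"; then
        mv "$tmp" "$dst"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

The example above would then read: convert_encoding gb2312 UTF-8 aaa.txt bbb.txt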

 

3. File name encoding conversion

convmv can convert Chinese file names that were GBK-encoded on Windows to UTF-8.

 

convmv -f GBK -t UTF-8 *.mp3

However, this command does not actually rename anything; it only shows a before-and-after comparison. To perform the conversion, add the --notest option:

 

convmv -f GBK -t UTF-8 --notest *.mp3

-f specifies the encoding before conversion, and -t specifies the encoding after conversion. -r recursively converts all subdirectories of the current directory.

 

4. enca not only detects file encodings intelligently, but also supports batch conversion.

1) Viewing the current file encoding

enca -L zh_CN ip.txt

Simplified Chinese National Standard; GB2312

Surrounded by/intermixed with non-text data

 

2) Conversion

Command format:

$ enca -L <current language> -x <target encoding> <file name>

For example, convert all files in the current directory to UTF-8:

enca -L zh_CN -x UTF-8 *

enca -L zh_CN file                          checks the file encoding

enca -L zh_CN -x UTF-8 file                 converts the file to UTF-8 in place

enca -L zh_CN -x UTF-8 < file1 > file2      converts without overwriting the original file

 

Reference: http://www.2cto.com/os/201110/106727.html
