Linux file encoding conversion

Source: Internet
Author: User

Reference articles

http://blog.csdn.net/jnbbwyth/article/details/6991425/

http://blog.chinaunix.net/uid-27050514-id-3721035.html


View File Encoding
There are several ways to view file encoding in Linux:
1. View the file encoding directly in Vim:
:set fileencoding
This displays the encoding of the current file.
If you also want Vim to open files in other encodings without showing garbled text,
add the following to your ~/.vimrc file:

set encoding=utf-8 fileencodings=ucs-bom,utf-8,cp936

With this setting Vim detects the file encoding automatically (it can tell UTF-8 and GBK files apart). In practice it tries the encodings listed in fileencodings in order and uses the first one that decodes the file without errors. If none of them fits, fileencoding is left empty and the file is read using the value of encoding.
2. Use enca to view the file encoding (if it is not installed on your system, you can install it with sudo yum install -y enca):
$ enca filename
filename: Universal transformation format 8 bits; UTF-8
  CRLF line terminators
Note that enca does not recognize some GBK-encoded files well. For those files it reports:
Unrecognized encoding
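When enca cannot identify a file, one rough fallback is to test candidate encodings with iconv and take the first one that decodes cleanly. The sketch below is only an illustration: the helper names decodes_as and guess_encoding are hypothetical, the candidate list (UTF-8, then GBK) is an assumption, and a clean decode only shows that the bytes are valid in that encoding, not that the guess is correct.

```shell
# Hypothetical helper: succeeds if FILE decodes without errors as ENCODING.
# $1 = candidate encoding, $2 = file
decodes_as() {
    iconv -f "$1" -t UTF-8 "$2" >/dev/null 2>&1
}

# Hypothetical helper: try candidates in order and print the first that fits.
# The order matters: UTF-8 is stricter than GBK, so it is tested first.
guess_encoding() {
    for enc in UTF-8 GBK; do
        if decodes_as "$enc" "$1"; then
            echo "$enc"
            return 0
        fi
    done
    echo "unknown"
    return 1
}
```

Usage would look like guess_encoding ip.txt. Plain ASCII files are reported as UTF-8, since ASCII is a subset of UTF-8.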

File Encoding Conversion
1. Convert the file encoding directly in Vim. For example, to convert a file to UTF-8:
:set fileencoding=utf-8

2. Use enconv to convert the file encoding. For example, to convert a GBK-encoded file to UTF-8:
enconv -L zh_CN -x UTF-8 filename

3. Use iconv. The command format is as follows:
iconv -f encoding -t encoding inputfile
For example, to convert a UTF-8 encoded file to GBK:
iconv -f UTF-8 -t GBK file1 -o file2
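
As a quick sanity check of the iconv form above, the following sketch converts a small sample from UTF-8 to GBK and back, then compares the result with the original. The temporary directory and the file names in it are made up for the illustration, and it assumes the local iconv supports GBK.

```shell
# Work in a throwaway directory so nothing real is overwritten.
workdir=$(mktemp -d)
printf '中文\n' > "$workdir/utf8.txt"

# UTF-8 -> GBK, writing the result to a new file via redirection.
iconv -f UTF-8 -t GBK "$workdir/utf8.txt" > "$workdir/gbk.txt"

# GBK -> UTF-8 again; the round trip should reproduce the original bytes.
iconv -f GBK -t UTF-8 "$workdir/gbk.txt" > "$workdir/roundtrip.txt"

if cmp -s "$workdir/utf8.txt" "$workdir/roundtrip.txt"; then
    roundtrip=ok
else
    roundtrip=mismatch
fi
echo "round trip: $roundtrip"
```

A round trip like this only succeeds when every character in the file exists in both encodings, so it is a cheap way to spot lossy conversions before touching real data.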

View file encoding with the file command
$ file ip.txt
ip.txt: UTF-8 Unicode text, with escape sequences
First, converting file content with the iconv command
The iconv command converts the encoding of the specified files. By default it writes to standard output; an output file can also be specified.
Usage: iconv [OPTION...] [FILE...]
The following options are available:
Input/output format specification:
  -f, --from-code=NAME    encoding of the original text
  -t, --to-code=NAME      encoding for output
Information:
  -l, --list              list all known character sets
Output control:
  -c                      omit invalid characters from output
  -o, --output=FILE       output file
  -s, --silent            suppress warnings
  --verbose               print progress information
  -?, --help              give the help list
  --usage                 give a short usage message
  -V, --version           print program version
Example:
iconv -f utf-8 -t gb2312 aaa.txt > bbb.txt
This command reads aaa.txt, converts it from UTF-8 to GB2312, and redirects the output to bbb.txt.
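
Because iconv writes to standard output, redirecting straight back onto the input file would truncate it before it is read. One possible way around this is to convert into a temporary file and replace the original only if iconv succeeds. The helper name convert_in_place below is hypothetical; this is a sketch, not part of iconv itself.

```shell
# Hypothetical helper: convert FILE from encoding FROM to encoding TO in place.
# Usage: convert_in_place FROM TO FILE
convert_in_place() {
    from=$1 to=$2 file=$3
    # Create the temporary file next to the original so mv stays on one filesystem.
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if iconv -f "$from" -t "$to" "$file" > "$tmp"; then
        # Conversion succeeded: replace the original.
        # Note: the new file carries mktemp's permissions, not the original's.
        mv "$tmp" "$file"
    else
        # Conversion failed: keep the original untouched.
        rm -f "$tmp"
        return 1
    fi
}
```

For example, convert_in_place GBK UTF-8 aaa.txt would rewrite aaa.txt as UTF-8, and leave it unchanged if the file turns out not to be valid GBK.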
Second, file name encoding conversion
Files created on Windows usually have GBK-encoded names, so after they are copied to Linux the names appear garbled. File contents can be converted with iconv, but the Chinese file names remain garbled. The command that converts file name encodings is convmv.
The convmv command is used as follows, for example:
convmv -f gbk -t utf-8 *.mp3
This command does not rename anything; it only shows the names before and after conversion. To actually perform the conversion, add the --notest parameter:
convmv -f gbk -t utf-8 --notest *.mp3
The -f parameter specifies the encoding before conversion and -t the encoding after conversion. Do not mix them up, or the names may end up garbled. Another useful parameter is -r, which recursively converts all subdirectories under the current directory.
* Requires convmv to be installed (convmv-1.10-1.el5.noarch.rpm)
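
If convmv is not available, the same preview-then-apply idea can be approximated with iconv and mv. The sketch below is not convmv and does not reproduce its options; the function name rename_from_gbk is hypothetical, it borrows only the --notest convention from the text above, and it assumes the names really are GBK (names that are already UTF-8 would be garbled by it).

```shell
# Hypothetical helper: rename files whose names are GBK-encoded to UTF-8.
# Without --notest it only prints what it would do, like convmv's default.
# Usage: rename_from_gbk [--notest] FILE...
rename_from_gbk() {
    apply=no
    if [ "$1" = "--notest" ]; then
        apply=yes
        shift
    fi
    for path in "$@"; do
        dir=$(dirname -- "$path")
        old=$(basename -- "$path")
        # Decode the name as GBK; skip names that are not valid GBK.
        new=$(printf '%s' "$old" | iconv -f GBK -t UTF-8) || continue
        # Pure ASCII names come out unchanged and need no rename.
        [ "$old" = "$new" ] && continue
        if [ "$apply" = yes ]; then
            mv -- "$path" "$dir/$new"
        else
            echo "would rename: $path -> $dir/$new"
        fi
    done
}
```

Running rename_from_gbk *.mp3 first and checking the printed names before repeating the call with --notest mirrors the two-step workflow described for convmv.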
Third, enca, a more foolproof command-line tool that not only identifies file encodings intelligently but also supports batch conversion.
1. Installation
$ sudo apt-get install enca
2. View the encoding of a file
$ enca -L zh_CN ip.txt
Simplified Chinese National Standard; GB2312
  Surrounded by/intermixed with non-text data
3. Conversion
The command format is as follows:
$ enca -L <current language> -x <target encoding> <file name>
For example, to convert all files in the current directory to UTF-8:
$ enca -L zh_CN -x utf-8 *
To check a file's encoding:
$ enca -L zh_CN file
To convert a file to UTF-8:
$ enca -L zh_CN -x UTF-8 file
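If you don't want to overwrite the original file, have enca read from standard input and redirect the result to a new file:
$ enca -L zh_CN -x UTF-8 < file1 > file2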
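
For batch conversion that never touches the originals, one option is to write the converted copies into a separate directory. The sketch below uses iconv rather than enca so that it does not depend on enca being installed. The function name convert_dir is hypothetical, and it assumes every file is either already UTF-8 or GBK.

```shell
# Hypothetical helper: convert every regular file in SRC from GBK to UTF-8,
# writing the results into DEST and leaving SRC untouched.
# Usage: convert_dir SRC DEST
convert_dir() {
    src=$1 dest=$2
    mkdir -p "$dest" || return 1
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        name=$(basename -- "$f")
        if iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1; then
            # Already valid UTF-8 (this includes plain ASCII): copy as-is.
            cp -- "$f" "$dest/$name"
        elif ! iconv -f GBK -t UTF-8 "$f" > "$dest/$name"; then
            # Neither UTF-8 nor GBK: report it and drop the partial output.
            echo "skipped (not GBK?): $f" >&2
            rm -f -- "$dest/$name"
        fi
    done
}
```

Checking for valid UTF-8 first avoids converting the same file twice, which would otherwise turn correct text into garbled characters.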
