http://blog.csdn.net/jnbbwyth/article/details/6991425
View File Encoding
There are several ways to view file encodings in Linux:
1. View the file encoding directly in Vim
:set fileencoding
This displays the encoding of the current file.
If you just want to view files in other encodings, or want to fix files that appear garbled in Vim, add the following to your ~/.vimrc file:
set encoding=utf-8 fileencodings=ucs-bom,utf-8,cp936
This lets Vim identify the file encoding automatically (UTF-8 and GBK files are both recognized). In practice, Vim tries each encoding in the fileencodings list in order and uses the first one that fits; if none of them fits, fileencoding is left empty and the file is read using the value of encoding.
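The try-each-encoding-in-order idea can be imitated outside Vim. The following is a minimal sketch, not Vim's actual implementation: the function name guess_encoding is hypothetical, and it assumes an iconv that knows the encodings you list.

```shell
# guess_encoding FILE ENC1 [ENC2 ...]
# Hypothetical helper: print the first encoding in the list that decodes
# FILE without error. Returns non-zero if none of them fits.
guess_encoding() {
    file=$1
    shift
    for enc in "$@"; do
        # iconv exits non-zero when the input is not valid in $enc
        if iconv -f "$enc" -t UTF-8 "$file" >/dev/null 2>&1; then
            printf '%s\n' "$enc"
            return 0
        fi
    done
    return 1
}
```

For example, guess_encoding notes.txt UTF-8 GBK prints UTF-8 for a valid UTF-8 file and GBK for a file that is only valid as GBK. As with fileencodings, the order matters: stricter encodings such as UTF-8 should come first.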
2. Use enca to view the file encoding (if the command is not installed on your system, install it with sudo yum install -y enca)
$ enca filename
filename: Universal transformation format 8 bits; UTF-8
  CRLF line terminators
Note that enca is not very good at identifying certain GBK-encoded files. When it fails, it reports:
Unrecognized encoding
File Encoding Conversion
1. Convert the file encoding directly in Vim, for example to UTF-8 (the conversion is applied when the file is written with :w)
:set fileencoding=utf-8
2. Convert with enconv. For example, to convert a GBK-encoded file to UTF-8:
enconv -L zh_CN -x UTF-8 filename
3. Convert with iconv. The command format is as follows:
iconv -f encoding -t encoding inputfile
For example, to convert a GBK-encoded file to UTF-8:
iconv -f gbk -t UTF-8 file1 -o file2
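iconv should not be pointed at the same file for input and output, because redirecting output to the input file truncates it before it is read. A minimal sketch of an in-place conversion through a temporary file follows; the function name to_utf8_inplace is hypothetical, and the original is only replaced when iconv succeeds.

```shell
# to_utf8_inplace FROM FILE
# Hypothetical helper: convert FILE from encoding FROM to UTF-8 in place.
# FILE is left untouched if the conversion fails.
to_utf8_inplace() {
    from=$1
    target=$2
    tmp=$(mktemp) || return 1
    if iconv -f "$from" -t UTF-8 "$target" > "$tmp"; then
        # Copy the converted text back over the original file,
        # which keeps the original's permissions.
        cat "$tmp" > "$target"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Usage: to_utf8_inplace gbk file1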
You can also view the file encoding with the file command:
$ file ip.txt
ip.txt: UTF-8 Unicode text, with escape sequences
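When a script needs only the character set rather than the full description, file can print it on its own. This is a small sketch that assumes a file implementation supporting the --mime-encoding option (the one shipped with common Linux distributions does); the function name charset_of is hypothetical.

```shell
# charset_of FILE
# Hypothetical helper: print only the character set that file(1) detects,
# for example "utf-8" or "us-ascii".
charset_of() {
    file -b --mime-encoding "$1"
}
```

Usage: charset_of ip.txt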
First, converting file content encoding with the iconv command
The iconv command converts the encoding of the specified files. By default it writes to standard output; it can also write to an output file.
Usage: iconv [OPTION...] [FILE...]
The following options are available:
Input/output format specification:
-f, --from-code=NAME    encoding of the original text
-t, --to-code=NAME      encoding for output
Information:
-l, --list              list all known character sets
Output control:
-c                      omit invalid characters from output
-o, --output=FILE       output file
-s, --silent            suppress warnings
--verbose               print progress information
-?, --help              give this help list
--usage                 give a short usage message
-V, --version           print program version
Example:
iconv -f utf-8 -t gb2312 aaa.txt > bbb.txt
This command reads aaa.txt, converts it from UTF-8 to GB2312, and redirects the output to bbb.txt.
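Because iconv writes to standard output by default, converting many files is a matter of looping and redirecting each result. The following is a minimal sketch under stated assumptions: the function name convert_dir is hypothetical, only *.txt files are handled, and the results go to a separate directory so the originals are never overwritten.

```shell
# convert_dir FROM TO SRCDIR OUTDIR
# Hypothetical helper: convert every *.txt file in SRCDIR from encoding FROM
# to encoding TO, writing the results to OUTDIR under the same file names.
convert_dir() {
    from=$1 to=$2 srcdir=$3 outdir=$4
    mkdir -p "$outdir" || return 1
    status=0
    for src in "$srcdir"/*.txt; do
        [ -e "$src" ] || continue    # no match: the glob stays literal
        dst=$outdir/$(basename "$src")
        if ! iconv -f "$from" -t "$to" "$src" > "$dst"; then
            echo "skipped (conversion failed): $src" >&2
            rm -f "$dst"             # do not leave a partial output file
            status=1
        fi
    done
    return $status
}
```

Usage: convert_dir utf-8 gb2312 . converted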
Second, file name encoding conversion
Files originally created on Windows are usually GBK-encoded, so after they are copied to Linux, Chinese text appears garbled. File contents can be converted with iconv, but many Chinese file names remain garbled. The command for converting file name encodings is convmv.
Basic convmv usage looks like this:
convmv -f gbk -t utf-8 *.mp3
However, this command does not actually rename anything; it only shows a before-and-after comparison. To perform the conversion for real, add the --notest parameter:
convmv -f gbk -t utf-8 --notest *.mp3
The -f parameter specifies the encoding before conversion, and -t specifies the encoding after conversion. Do not mix them up, or the names may end up garbled. One more useful parameter is -r, which converts all subdirectories under the current directory recursively.
* Requires convmv-1.10-1.el5.noarch.rpm to be installed
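Where convmv is not available, its dry-run preview can be roughly approximated by passing the file names themselves through iconv. This is a sketch of that substitute technique, not of how convmv works internally; the function names convert_name and preview_renames are hypothetical, and nothing is renamed.

```shell
# convert_name FROM TO NAME
# Print NAME re-encoded from FROM to TO. Only the string is converted;
# no file is touched.
convert_name() {
    printf '%s' "$3" | iconv -f "$1" -t "$2"
}

# preview_renames FROM TO FILE...
# Print one "old -> new" line for each name that would change.
preview_renames() {
    from=$1 to=$2
    shift 2
    for old in "$@"; do
        # Skip names that are not valid in the source encoding
        new=$(convert_name "$from" "$to" "$old" 2>/dev/null) || continue
        [ "$old" = "$new" ] || printf '%s -> %s\n' "$old" "$new"
    done
}
```

Usage: preview_renames gbk utf-8 *.mp3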
Third, a better command-line tool: enca, which not only identifies file encodings intelligently but also supports batch conversion.
1. Installation
$ sudo apt-get install enca
2. View the encoding of a file
$ enca -L zh_CN ip.txt
Simplified Chinese National Standard; GB2312
  Surrounded by/intermixed with non-text data
3. The conversion command format is as follows
$ enca -L current_language -x target_encoding filename
For example, to convert all files in the current directory to UTF-8:
enca -L zh_CN -x utf-8 *
Check the encoding of a file:
enca -L zh_CN file
Convert a file to UTF-8:
enca -L zh_CN -x UTF-8 file
If you do not want to overwrite the original
View file encoding and modification code under Linux