Viewing and converting file encodings under Linux

Source: Internet
Author: User

View File Encoding
There are several ways to view file encodings in Linux:
1. View the file encoding directly in Vim:
:set fileencoding
This displays the encoding of the current file.
If you want Vim to open files in other encodings correctly, or to fix garbled text when viewing files in Vim, add the following to your ~/.vimrc file:

set encoding=utf-8 fileencodings=ucs-bom,utf-8,cp936

This lets Vim identify the file encoding automatically (here, UTF-8 or GBK files). In fact, Vim simply tries the encodings in the fileencodings list in order and uses the first one that decodes the file without error. If none fits, fileencoding is left empty and the file is read using the value of encoding, without conversion. An 8-bit encoding such as latin1 never fails, which is why it is often placed last in the list as a catch-all.
2. Use enca to view the file encoding (if the command is not installed on your system, install it with sudo yum install -y enca):
$ enca filename
filename: Universal transformation format 8 bits; UTF-8
  CRLF line terminators
Note that enca does not identify some GBK-encoded files well; for those files it reports:
Unrecognized encoding
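
When enca cannot recognize a file, the try-each-candidate idea behind fileencodings can be approximated with iconv. The following is a minimal sketch, not a replacement for enca: guess_encoding is a hypothetical helper name, and the default candidate list (UTF-8, then GBK) is an assumption taken from the vimrc example above. It only reports the first encoding in which the file decodes without error, which is not proof that the guess is right.

```shell
#!/bin/sh
# guess_encoding FILE [CANDIDATE...]
# Hypothetical helper: print the first candidate encoding in which FILE
# decodes without error. Defaults to UTF-8, then GBK (an assumption that
# mirrors the fileencodings example).
guess_encoding() {
    file=$1
    shift
    [ "$#" -gt 0 ] || set -- UTF-8 GBK
    for enc in "$@"; do
        # iconv exits non-zero when the input is not valid in $enc.
        # The target encoding is arbitrary; the output is discarded.
        if iconv -f "$enc" -t UTF-16 "$file" >/dev/null 2>&1; then
            printf '%s\n' "$enc"
            return 0
        fi
    done
    printf 'unknown\n'
    return 1
}
```

Order matters: strict encodings such as UTF-8 should come first, because permissive encodings accept almost any byte sequence.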

File Encoding Conversion
1. Convert the file encoding directly in Vim, for example to UTF-8 (the file is re-encoded the next time it is written):
:set fileencoding=utf-8

2. Use enconv to convert the file encoding. For example, to convert a GBK-encoded file to UTF-8:
enconv -L zh_CN -x UTF-8 filename

3. Use iconv. The command format is as follows:
iconv -f FROM_ENCODING -t TO_ENCODING inputfile
For example, to convert a GBK-encoded file to UTF-8:
iconv -f GBK -t UTF-8 file1 -o file2
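
Because iconv writes to a separate output, replacing a file with its converted version takes an extra step. The sketch below shows one cautious way to do it: convert into a temporary file and replace the original only if iconv succeeds. convert_in_place is a hypothetical helper name, not part of iconv.

```shell
#!/bin/sh
# convert_in_place FROM TO FILE
# Hypothetical helper: re-encode FILE from FROM to TO, replacing the
# original only when the whole conversion succeeds.
convert_in_place() {
    from=$1
    to=$2
    target=$3
    # Temporary file next to the target, so mv stays on one filesystem.
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if iconv -f "$from" -t "$to" "$target" > "$tmp"; then
        # Note: the result takes the temporary file's permissions.
        mv -f "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

For example, convert_in_place GBK UTF-8 file1 rewrites file1 as UTF-8. If file1 is not valid GBK, the function returns a non-zero status and leaves the file untouched.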

The file command can also show a file's encoding:
$ file ip.txt
ip.txt: UTF-8 Unicode text, with escape sequences
First, converting file content encoding with iconv
The iconv command converts the encoding of the specified files. By default it writes to standard output; it can also write to a named output file.
Usage: iconv [OPTION...] [FILE...]
The following options are available:
Input/output format specification:
  -f, --from-code=NAME   encoding of the original text
  -t, --to-code=NAME     encoding for the output
Information:
  -l, --list             list all known character sets
Output control:
  -c                     omit invalid characters from the output
  -o, --output=FILE      output file
  -s, --silent           suppress warnings
      --verbose          print progress information
  -?, --help             give this help list
      --usage            give a short usage message
  -V, --version          print the program version
Example:
iconv -f utf-8 -t gb2312 aaa.txt > bbb.txt
This command reads aaa.txt, converts it from UTF-8 to GB2312, and redirects the output to bbb.txt.
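
The single-file example extends naturally to a whole directory. The following sketch converts every .txt file in a source directory into a separate output directory, so the originals are never overwritten. batch_to_utf8 is a hypothetical helper name, and the .txt pattern and the GBK default for the source encoding are assumptions made for illustration.

```shell
#!/bin/sh
# batch_to_utf8 SRC_DIR DEST_DIR [FROM]
# Hypothetical helper: write UTF-8 copies of SRC_DIR/*.txt into DEST_DIR.
# FROM is the assumed source encoding (default GBK).
batch_to_utf8() {
    src=$1
    dest=$2
    from=${3:-GBK}
    mkdir -p "$dest" || return 1
    for f in "$src"/*.txt; do
        [ -f "$f" ] || continue            # the glob matched nothing
        out=$dest/$(basename "$f")
        if iconv -f UTF-8 -t UTF-16 "$f" >/dev/null 2>&1; then
            cp "$f" "$out"                 # already valid UTF-8: copy as-is
        elif ! iconv -f "$from" -t UTF-8 "$f" > "$out"; then
            echo "failed to convert: $f" >&2
            rm -f "$out"                   # do not leave partial output
        fi
    done
}
```

Checking for valid UTF-8 first avoids converting the same file twice, which would garble text that is already in the target encoding.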
Second, converting file name encoding
Files created on Windows usually have GBK-encoded names, so after they are copied to Linux the names appear garbled. File contents can be converted with iconv, but Chinese file names remain garbled. The convmv command converts file name encodings.
Example usage:
convmv -f GBK -t UTF-8 *.mp3
By default this command does not rename anything; it only shows the names before and after conversion. To actually perform the conversion, add the --notest parameter:
convmv -f GBK -t UTF-8 --notest *.mp3
The -f parameter gives the encoding before conversion and -t gives the encoding after conversion. Do not mix them up, or the names may end up garbled. Another useful parameter is -r, which converts all subdirectories under the current directory recursively.
* Requires convmv to be installed (for example, convmv-1.10-1.el5.noarch.rpm).
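
Where convmv is not available, the same preview-then-apply behaviour can be roughly imitated with iconv and mv. This is only a sketch under stated assumptions: to_utf8_name and rename_to_utf8 are hypothetical helper names, the source encoding defaults to GBK, and the arguments are expected to be plain file names in the current directory. The --notest flag here simply mirrors the convmv option of the same name.

```shell
#!/bin/sh
# to_utf8_name NAME [FROM]
# Hypothetical helper: print NAME re-encoded from FROM (default GBK) to UTF-8.
to_utf8_name() {
    printf '%s' "$1" | iconv -f "${2:-GBK}" -t UTF-8
}

# rename_to_utf8 [--notest] FILE...
# Hypothetical helper: preview renames by default; rename only with --notest.
rename_to_utf8() {
    apply=no
    if [ "$1" = "--notest" ]; then
        apply=yes
        shift
    fi
    for old in "$@"; do
        # Names that are already valid UTF-8 (including pure ASCII) are
        # skipped, so they are not converted a second time.
        if printf '%s' "$old" | iconv -f UTF-8 -t UTF-16 >/dev/null 2>&1; then
            continue
        fi
        new=$(to_utf8_name "$old" 2>/dev/null) || {
            echo "skip: cannot convert $old" >&2
            continue
        }
        if [ "$apply" = yes ]; then
            mv -- "$old" "$new"
        else
            printf 'would rename "%s" -> "%s"\n' "$old" "$new"
        fi
    done
}
```

Running rename_to_utf8 *.mp3 only prints the planned renames; rename_to_utf8 --notest *.mp3 performs them.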
Third, enca: a better command-line tool, which not only identifies file encodings intelligently but also supports batch conversion.
1. Installation
$ sudo apt-get install enca
2. View the current file encoding
$ enca -L zh_CN ip.txt
Simplified Chinese National Standard; GB2312
  Surrounded by/intermixed with non-text data
3. Conversion
The command format is as follows:
$ enca -L current_language -x target_encoding filename
For example, to convert all files in the current directory to UTF-8:
$ enca -L zh_CN -x utf-8 *
Check the encoding of a file:
$ enca -L zh_CN file
Convert a file to UTF-8 encoding:
$ enca -L zh_CN -x UTF-8 file
If you do not want to overwrite the original
