File character set encoding conversion in CentOS

Source: Internet
Author: User

1. Install the conversion tool

[root@master /]# yum install convmv

2. View the Linux character set

[root@master /]# locale
LANG=zh_CN.utf8
LC_CTYPE="zh_CN.utf8"
LC_NUMERIC="zh_CN.utf8"
LC_TIME="zh_CN.utf8"
LC_COLLATE="zh_CN.utf8"
LC_MONETARY="zh_CN.utf8"
LC_MESSAGES="zh_CN.utf8"
LC_PAPER="zh_CN.utf8"
LC_NAME="zh_CN.utf8"
LC_ADDRESS="zh_CN.utf8"
LC_TELEPHONE="zh_CN.utf8"
LC_MEASUREMENT="zh_CN.utf8"
LC_IDENTIFICATION="zh_CN.utf8"
LC_ALL=
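When a script needs to know which encoding a session uses, the codeset can be read from a locale name such as the LANG value shown above. The following is a minimal sketch: the function name locale_codeset is hypothetical, and it only parses the string rather than querying the system.

```shell
#!/bin/sh
# locale_codeset NAME
# Print the codeset part of a locale name such as "zh_CN.utf8".
# Prints nothing if the name has no ".codeset" suffix (for example "C").
locale_codeset() {
    name=$1
    case $name in
        *.*) ;;
        *)   return 0 ;;
    esac
    codeset=${name#*.}      # drop the leading "language_TERRITORY."
    codeset=${codeset%%@*}  # drop an optional "@modifier"
    printf '%s\n' "$codeset"
}
```

Calling locale_codeset "$LANG" in the session above would print utf8. To ask the system for the character map actually in effect, the standard command locale charmap can be used instead.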

3. Start the conversion

[root@master /]# convmv --notest --nosmart -f utf8 -t gb2312 -r test

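Explanation:

test: the directory to convert

-r: process subdirectories recursively

-f utf8: the encoding before conversion

-t gb2312: the encoding after conversion

--notest: actually rename the files; without it, convmv only prints what it would do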

Note: file names created under the local zh_CN.utf8 character set appear garbled when copied to Windows, so they must first be converted to GB2312 encoding.
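Because convmv renames files in place, it can be safer to preview the result first: without --notest it only prints what it would do. The following sketch wraps that two-step flow and assumes convmv is installed; the function name convert_names and its apply argument are hypothetical.

```shell
#!/bin/sh
# convert_names DIR [apply]
# Preview (default) or apply a UTF-8 -> GB2312 file-name conversion on DIR.
convert_names() {
    dir=$1
    mode=${2:-preview}
    if [ ! -d "$dir" ]; then
        echo "convert_names: not a directory: $dir" >&2
        return 2
    fi
    if ! command -v convmv >/dev/null 2>&1; then
        echo "convert_names: convmv is not installed" >&2
        return 127
    fi
    if [ "$mode" = apply ]; then
        # --notest makes convmv actually rename the files.
        convmv --notest --nosmart -f utf8 -t gb2312 -r "$dir"
    else
        # Without --notest, convmv only lists the planned renames.
        convmv --nosmart -f utf8 -t gb2312 -r "$dir"
    fi
}
```

A typical session would run convert_names test, check the listed renames, and then run convert_names test apply.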

 

 

The iconv command converts the encoding of file content. Its main options are:

Input/output format specification:
-f, --from-code=NAME   encoding of the original text
-t, --to-code=NAME     encoding for output

Information:
-l, --list             list all known character sets

Example:
iconv -f UTF-8 -t gb2312 aaa.txt > bbb.txt
This command reads aaa.txt, converts it from UTF-8 to GB2312, and redirects the output to bbb.txt.
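A small wrapper around the example above, assuming an iconv build that knows the GB2312 character set (iconv -l lists what is available). The function name utf8_to_gb2312 is hypothetical. It writes to a temporary file first, so a failed conversion does not leave a half-written output file behind.

```shell
#!/bin/sh
# utf8_to_gb2312 INPUT OUTPUT
# Convert INPUT from UTF-8 to GB2312 and write the result to OUTPUT.
utf8_to_gb2312() {
    in=$1
    out=$2
    tmp=$out.tmp.$$
    if iconv -f UTF-8 -t GB2312 "$in" > "$tmp"; then
        mv "$tmp" "$out"
    else
        # iconv exits non-zero when it meets input it cannot convert.
        rm -f "$tmp"
        return 1
    fi
}
```

For the example in the text, the call would be utf8_to_gb2312 aaa.txt bbb.txt.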
There are three commands for viewing files:

cat: displays the content of an entire file, so for long files it is often used together with more. cat can also combine several files into one.

more: pauses when the screen is full. Press the space bar to show the next screen, or press q to quit.

less: used much like more, and also suited to files longer than one page. The difference is that less can scroll backward as well as forward: besides the space bar, the up and down arrow keys move through the file. Press q to quit.

Apart from cat's ability to merge files, the three commands do similar jobs and differ mainly in browsing habits and display style.
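As a quick illustration of the merging behaviour of cat mentioned above, the following sketch concatenates two files into a third. The file names are placeholders created in a temporary directory.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)

printf 'first file\n'  > "$workdir/a.txt"
printf 'second file\n' > "$workdir/b.txt"

# cat writes its arguments to standard output in order;
# redirecting that output merges them into one file.
cat "$workdir/a.txt" "$workdir/b.txt" > "$workdir/merged.txt"

# Display the merged result.
cat "$workdir/merged.txt"
```

For a result longer than one screen, the last line could use more or less in place of cat.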
To view a file's encoding, use the file command:

file test.sql
test.sql: UTF-8 Unicode text, with escape sequences

Function: identifies the file type.

Syntax: file [-beLvz] [-f <name file>] [-m <magic number file>...] [file or directory...]

Supplementary note: the file command lets us identify the type of a file.

Parameters:
-b   do not show the file name when listing the identification results
-c   show the command's execution in detail, which helps with troubleshooting or analyzing how the program runs
-f <name file>   specify a name file containing one or more file names, one per line, which file identifies in sequence
-L   show the type of the file that a symbolic link points to
-m <magic number file>   specify the magic number file
-v   show version information
-z   try to interpret the content of compressed files

I. Converting file content encoding with iconv

The iconv command converts the encoding of the specified files. By default it writes to standard output; an output file can also be specified.

Usage: iconv [option...] [file...]

The following options are available:

Input/output format specification:
-f, --from-code=NAME   encoding of the original text
-t, --to-code=NAME     encoding for output

Information:
-l, --list             list all known character sets

Output control:
-c                     omit invalid characters from the output
-o, --output=FILE      output file
-s, --silent           suppress warnings
--verbose              print progress information

-?, --help             show the help list
--usage                show a brief usage message
-V, --version          print the program version

Example:
iconv -f UTF-8 -t gb2312 aaa.txt > bbb.txt

II. Converting file name encoding

Files on Windows are encoded in GBK, so they appear garbled after being copied to Linux. iconv can convert the file content, but many Chinese file names remain garbled. The convmv command converts file name encoding. For example:

convmv -f GBK -t UTF-8 *.mp3

This command does not convert anything directly; it only shows the names before and after conversion. To convert directly, add the --notest parameter:

convmv -f GBK -t UTF-8 --notest *.mp3

The -f parameter gives the encoding before conversion and the -t parameter gives the encoding after conversion. Do not mix them up, or the names may end up garbled. Another very useful parameter is -r, which recursively converts all subdirectories under the current directory.
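The file and iconv commands described above can be combined so that only files that are not yet UTF-8 get converted. This is a sketch under two assumptions: the installed file command supports --mime-encoding, and any input that is not UTF-8 or ASCII is GBK, as with files copied from Windows. The function name ensure_utf8 is hypothetical.

```shell
#!/bin/sh
# ensure_utf8 FILE
# Convert FILE in place from GBK to UTF-8 unless file(1) already
# reports it as UTF-8 or plain ASCII.
ensure_utf8() {
    f=$1
    [ -f "$f" ] || return 1
    enc=$(file -b --mime-encoding -- "$f") || return 1
    case $enc in
        utf-8|us-ascii)
            return 0 ;;   # already readable as UTF-8, nothing to do
    esac
    tmp=$f.tmp.$$
    # Assumption: everything else is GBK.
    if iconv -f GBK -t UTF-8 "$f" > "$tmp"; then
        mv "$tmp" "$f"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Since file cannot recognise GBK by name, the GBK assumption should be checked on a copy of the data before running this over a whole directory.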
    
   
  
 
