Transcoding text file encodings in CentOS

Source: Internet
Author: User
Tags: server, website

Recently I ran into a problem: every file in the website directory on our server was encoded in GB2312. Opening a file directly with cat or vim showed garbled text, and the pages were garbled in the browser as well. The task was therefore to transcode all files in the directory (including files in subfolders) from GB2312 to UTF-8. After searching the Internet I tried three methods in total before the problem was finally solved.

The first method is the vi editor. Open the file to be transcoded and run

    :set fileencoding

to view the file's current encoding, then

    :set fileencoding=utf-8

to transcode the current file to UTF-8 when it is saved. However, if the file already displays as garbled text when it is opened, it stays garbled after the conversion, and so does the page in the browser. In addition, even when transcoding succeeds, the site directory holds far too many text files to open and convert one by one. The workload would be huge.

The second method is iconv, which is installed by default. The command

    iconv -f gb2312 -t UTF-8 abc.html

converts abc.html to UTF-8 and prints the converted text on the terminal. The command

    iconv -f gb2312 -t UTF-8 abc.html -o abc.html

overwrites the original file with the transcoded one, which is the final goal. Be aware that reading and writing the same file in one iconv call can truncate larger files with some iconv versions, so keep a backup. To transcode all HTML files in the entire directory (including subdirectories):

    find . -type f -name "*.html" -exec iconv -f gb2312 -t UTF-8 {} -o {} \;

The -exec option runs iconv once for each file that find returns, substituting the file name for each {}. CSS and JavaScript files are converted the same way. Unfortunately, this produced a lot of error messages, mostly about invalid input: iconv could not decode some characters in many of the HTML files. Probably not all of the HTML files were actually encoded in GB2312.
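The recursive iconv step can be made more forgiving of files that fail to convert. What follows is a minimal sketch, not the method used in this article: the helper names to_utf8 and convert_tree are hypothetical, and gb2312 as the default source encoding is an assumption taken from this case. Each file is converted into a temporary file first, and the original is overwritten only when iconv reports success.

```shell
#!/bin/sh
# Sketch only: to_utf8 and convert_tree are hypothetical helper names.

# to_utf8 FILE [SOURCE_ENCODING]
# Converts FILE to UTF-8 via a temporary file; gb2312 is an assumed default.
to_utf8() {
    src_file=$1
    src_enc=${2:-gb2312}
    tmp_file=$(mktemp) || return 1
    if iconv -f "$src_enc" -t UTF-8 "$src_file" > "$tmp_file"; then
        # Copy the bytes back so the original keeps its owner and permissions.
        cat "$tmp_file" > "$src_file"
        rm -f "$tmp_file"
    else
        # Conversion failed: leave the original untouched and report it.
        rm -f "$tmp_file"
        echo "skipped (conversion failed): $src_file" >&2
        return 1
    fi
}

# convert_tree DIR PATTERN [SOURCE_ENCODING]
# Applies to_utf8 to every regular file under DIR whose name matches PATTERN.
convert_tree() {
    find "$1" -type f -name "$2" | while IFS= read -r found; do
        # Keep going when a single file cannot be converted.
        to_utf8 "$found" "${3:-gb2312}" || continue
    done
}
```

With these definitions loaded, a call such as convert_tree /path/to/site '*.html' (the path is a placeholder) would process every HTML file under that directory and print a "skipped" line on standard error for each file iconv rejects.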
Because the files did not all seem to be GB2312, I next removed the -f gb2312 parameter, so that the command became:

    find . -type f -name "*.html" -exec iconv -t UTF-8 {} -o {} \;

Unfortunately the errors persisted. Without -f, iconv takes the source encoding from the current locale instead of detecting it for each file, so this method does not work.

The third method is enca. CentOS does not install enca by default, so download it first:

    wget http://pkgs.repoforge.org/enca/enca-1.10-1.el6.rf.x86_64.rpm

Then install it:

    rpm -ivh enca-1.10-1.el6.rf.x86_64.rpm

enca usage:

    enca -L zh_CN file                       # view the file's encoding
    enca -L zh_CN -x UTF-8 file              # convert the file to UTF-8 in place
    enca -L zh_CN -x UTF-8 < file1 > file2   # save the result as file2 without overwriting file1

To convert all HTML files in the directory to UTF-8:

    find . -type f -name "*.html" -exec enca -L zh_CN -x UTF-8 {} \;

Only one or two files were reported as failing to transcode because their encoding was not recognized. All other HTML files were transcoded successfully. The next step is to use the same method on the files with the extensions htm, css, and js, and with that the problem is solved.
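Where enca is not available, a directory with mixed encodings can also be handled with iconv alone by checking each file before converting it. The sketch below is one possible approach under stated assumptions, not part of the original procedure: the helper names is_utf8, convert_if_needed and convert_site are hypothetical, the SRC_ENC variable is an invented setting that defaults to gb2312, and any file that decodes cleanly as UTF-8 (plain ASCII included) is assumed to need no conversion.

```shell
#!/bin/sh
# Sketch only: all helper names and SRC_ENC are hypothetical.
SRC_ENC=${SRC_ENC:-gb2312}

# Succeeds when the file already decodes cleanly as UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Converts one file from SRC_ENC to UTF-8 unless it is already UTF-8.
convert_if_needed() {
    target=$1
    if is_utf8 "$target"; then
        echo "already UTF-8: $target"
        return 0
    fi
    tmp_out=$(mktemp) || return 1
    if iconv -f "$SRC_ENC" -t UTF-8 "$target" > "$tmp_out" 2> /dev/null; then
        # Overwrite only after a clean conversion; permissions are preserved.
        cat "$tmp_out" > "$target"
        echo "converted: $target"
    else
        echo "unknown encoding, left as-is: $target"
    fi
    rm -f "$tmp_out"
}

# Walks DIR and handles the four extensions mentioned in the article.
convert_site() {
    find "$1" -type f \( -name '*.html' -o -name '*.htm' \
        -o -name '*.css' -o -name '*.js' \) |
    while IFS= read -r path; do
        convert_if_needed "$path"
    done
}
```

Running convert_site /path/to/site (placeholder path) prints one status line per file, so the few files reported as "unknown encoding" can be inspected by hand afterwards.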
