How to fix garbled UTF-8 documents in Vim on Linux

Source: Internet
Author: User
Vim is a widely used text editor on Linux, but it sometimes displays UTF-8 documents as garbled text. How can this be fixed? This article explains how Vim handles encodings and how to solve the garbled UTF-8 problem.

  1. Background: the encoding-related options

Vim has four encoding-related options: fileencodings, fileencoding, encoding, and termencoding. In practice, an error in any one of them results in garbled text, so every Vim user should understand what these four options mean. Let's look at the meaning and effect of each in detail.

(1) encoding

encoding is the character encoding Vim uses internally. Once it is set, all buffers, registers, and strings in scripts inside Vim use this encoding. When the text Vim is working with is in a different encoding, Vim first converts it to the internal encoding. If that text contains characters the internal encoding cannot represent, those characters are lost. So when choosing Vim's internal encoding, be sure to pick one expressive enough that normal work is not affected.
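The loss described above can be reproduced outside Vim. The following is a minimal Python sketch, not Vim's actual implementation; the helper name `to_internal` is made up for illustration. It round-trips text through a chosen "internal" encoding and replaces whatever that encoding cannot represent.

```python
def to_internal(text: str, internal: str) -> str:
    """Round-trip text through `internal`, replacing unrepresentable characters with '?'."""
    return text.encode(internal, errors="replace").decode(internal)


sample = "vim 中文"
print(to_internal(sample, "utf-8"))    # unchanged: UTF-8 can represent every character
print(to_internal(sample, "latin-1"))  # the two Chinese characters are replaced
```

With a weak internal encoding the damage happens as soon as the text is converted, which is why a universal encoding such as UTF-8 is the safer choice.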

Because the encoding option determines the internal representation of all characters in Vim, it should be set only once, when Vim starts. Changing encoding while Vim is running can cause many problems. The user manual recommends changing its value only in .vimrc, and in practice that is the only place where changing it makes sense. Unless you have a special reason not to, always set encoding to utf-8. To keep menus and system messages from appearing garbled on non-UTF-8 systems such as Windows, you can add these settings at the same time:

set encoding=utf-8

set langmenu=zh_CN.UTF-8

language messages zh_CN.UTF-8

(2) termencoding

termencoding is the encoding Vim uses for screen display. When displaying text, Vim converts it from the internal encoding to the screen encoding before output. If the internal text contains characters that cannot be converted to the screen encoding, those characters are shown as question marks, but editing is not affected. If termencoding is not set, Vim uses the value of encoding and performs no conversion.

For example, suppose you log in to a Linux workstation via Telnet from Windows. Because Windows Telnet uses GBK and the Linux system uses UTF-8, text in Vim appears garbled in the Telnet session. There are two ways to eliminate the garbling: change Vim's encoding to gbk, or keep encoding as utf-8 and set termencoding to gbk so that Vim transcodes at display time. With the first method, any characters in the edited file that GBK cannot represent are lost. With the second method, such characters cannot be displayed because of the terminal's limitations, but they are not lost during editing.

For gvim in a graphical environment, display does not depend on the terminal, so termencoding has no meaning for it. In the GTK2 build of gvim, termencoding is always utf-8 and cannot be changed. gvim on Windows ignores termencoding.
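The difference between the two methods can be sketched in Python. This only models the idea (it is not how Vim is implemented), and the variable names are illustrative. Hangul is used as sample text because GBK cannot represent it.

```python
text = "한글 and 中文"  # Hangul is not representable in GBK; the Chinese characters are

# Method 1: make GBK the internal encoding. The loss happens in the buffer
# itself, so it would also be written back to the file.
buffer_gbk = text.encode("gbk", errors="replace").decode("gbk")

# Method 2: keep UTF-8 internally and convert only what is sent to the screen.
buffer_utf8 = text
screen_bytes = buffer_utf8.encode("gbk", errors="replace")

print(buffer_gbk)                  # Hangul already replaced by '?'
print(screen_bytes.decode("gbk"))  # looks the same on the GBK terminal
print(buffer_utf8 == text)         # but the buffer still holds the original text
```

Both methods look identical on the GBK terminal; only the second keeps the original characters available for saving.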

(3) fileencoding

When Vim reads a file from disk, it probes the file's encoding. If that encoding differs from Vim's internal encoding, Vim converts the text, and once conversion is complete it sets the fileencoding option to the file's encoding. When Vim saves the file, it converts again if encoding and fileencoding differ. So by setting fileencoding after opening a file, we can convert the file from one encoding to another. Note, however, that fileencoding is set automatically from Vim's detection when the file is opened. If the text is already garbled, changing fileencoding after the file is open will not correct it.

In short, fileencoding is the character encoding of the file currently being edited in Vim, and Vim saves the file in this encoding (this holds whether or not the file is new).
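Converting a file by changing fileencoding and saving amounts to "decode with the old encoding, encode with the new one". Here is a minimal Python sketch of that idea; `convert_file` is a hypothetical helper, and the temporary file exists only for the demonstration.

```python
import os
import tempfile


def convert_file(path: str, src: str, dst: str) -> None:
    """Decode the file as `src`, then write it back encoded as `dst`."""
    with open(path, "r", encoding=src, newline="") as f:
        text = f.read()
    with open(path, "w", encoding=dst, newline="") as f:
        f.write(text)


# Demo: create a small GBK file, then convert it to UTF-8 in place.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write("中文\n".encode("gbk"))

convert_file(demo_path, "gbk", "utf-8")
with open(demo_path, "rb") as f:
    converted = f.read()
os.remove(demo_path)

print(converted == "中文\n".encode("utf-8"))
```

The conversion is only correct when `src` really is the file's encoding, which mirrors the point above: a wrong detection cannot be repaired by changing the target encoding afterwards.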

(4) fileencodings

Automatic detection of a file's encoding is controlled by fileencodings (note the plural). fileencodings is a comma-separated list in which each item is an encoding name. When a file is opened, Vim tries to decode it with each encoding in the list, in order. If decoding succeeds, Vim uses that encoding and sets fileencoding to it; if decoding fails, Vim moves on to the next encoding.

Therefore, when setting fileencodings, put the strict encodings, those most likely to fail when the file is not actually in that encoding, at the front, and the permissive ones at the back. For example, latin1 is extremely permissive: text in any encoding can be decoded as latin1 without failure, although the result is of course garbled. So if latin1 comes first in fileencodings, it is no surprise that every Chinese file opens garbled.

  The following fileencodings setting is widely recommended on the web:

set fileencodings=ucs-bom,utf-8,cp936,gb18030,big5,euc-jp,euc-kr,latin1
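The try-in-order behaviour described above can be sketched as follows. This is an illustration using Python's codecs, not Vim's detection code: the function name `detect` is made up, the candidate names are Python codec names rather than Vim's, and the "ucs-bom" entry is simplified to check only for a UTF-8 byte order mark.

```python
import codecs
from typing import List, Optional


def detect(data: bytes, candidates: List[str]) -> Optional[str]:
    """Return the first candidate that decodes `data` without error."""
    for name in candidates:
        if name == "ucs-bom":
            # Simplification: accept this entry only for a UTF-8 byte order mark.
            if data.startswith(codecs.BOM_UTF8):
                return "utf-8-sig"
            continue
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # decoding failed, try the next candidate
        return name
    return None


order = ["ucs-bom", "utf-8", "gbk", "latin-1"]
print(detect("中文".encode("utf-8"), order))  # first candidate that succeeds
print(detect("中文".encode("gbk"), order))    # utf-8 fails, so the next one is tried
```

Because the first success wins, the order of the list decides the outcome whenever more than one candidate could decode the bytes.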
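A short Python check makes the permissiveness of latin1 concrete: every possible byte value decodes successfully, so a decoder that only asks "did decoding fail?" can never reject it. The variable names are illustrative.

```python
every_byte = bytes(range(256))
decoded = every_byte.decode("latin-1")  # never raises: each byte maps to one character

utf8_bytes = "中文".encode("utf-8")
garbled = utf8_bytes.decode("latin-1")  # "succeeds", but the result is mojibake

print(len(decoded))
print(repr(garbled))
```

The second decode raises no error, yet two Chinese characters turn into six unrelated Latin-1 characters, which is exactly the garbling seen when latin1 is tried too early.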

Here, ucs-bom is a very strict encoding; a file in some other encoding is almost never misjudged as ucs-bom, so it goes first.

utf-8 is also quite strict. Apart from very short files (the classic example, which many people love to cite, is the GBK-encoded word "联通" (Unicom) being mistaken for UTF-8), real-world files are almost never misjudged as utf-8, so it goes second.

Next come cp936 and gb18030. Both are relatively permissive and would cause many misjudgments if placed near the front, so they go further back. The coding space of cp936 is smaller than that of gb18030, so cp936 is placed before gb18030.

As for big5, euc-jp, and euc-kr, they are about as strict as cp936. Placing them after cp936 and gb18030 means a fair number of misjudgments when editing files in these encodings, but Vim's built-in detection mechanism cannot solve that. Since Chinese users rarely need to edit files in these encodings, we put cp936 and gb18030 ahead of them so that those two are detected reliably.

Last is latin1. It is so permissive that it has to go at the end. Unfortunately, when you do encounter a genuine latin1 file, Vim usually never gets the chance to fall back to latin1, because the file has already been misjudged as one of the earlier encodings. As noted above, though, Chinese users seldom come across such files.
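The "Unicom" case can be examined byte by byte. The sketch below is an assumption-laden illustration: `fits_utf8_bit_pattern` is a hypothetical lax checker that looks only at lead and continuation bit patterns, which is the kind of check that gets fooled. Whether a particular Vim version accepts these bytes is not verified here; Python's strict UTF-8 decoder, for one, rejects them.

```python
def fits_utf8_bit_pattern(data: bytes) -> bool:
    """Lax check: lead/continuation bit patterns only, no overlong or range checks."""
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            need = 0
        elif b >> 5 == 0b110:
            need = 1
        elif b >> 4 == 0b1110:
            need = 2
        elif b >> 3 == 0b11110:
            need = 3
        else:
            return False
        tail = data[i + 1 : i + 1 + need]
        if len(tail) != need or any(t >> 6 != 0b10 for t in tail):
            return False
        i += 1 + need
    return True


def strict_utf8_ok(data: bytes) -> bool:
    """Strict check using Python's own UTF-8 decoder."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


raw = "联通".encode("gbk")  # the GBK bytes of the word "Unicom"
print(raw.hex())
print(fits_utf8_bit_pattern(raw), strict_utf8_ok(raw))
```

The four bytes happen to look like two well-formed two-byte UTF-8 sequences, so a pattern-only check accepts them. A longer file gives the checker many more chances to hit an invalid sequence, which is why the misjudgment is mostly confined to very short files.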
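The size difference between the two coding spaces can be checked with Python's codecs, using Python's `gbk` codec as a stand-in for cp936. The helper `can_encode` is illustrative only.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


print(can_encode("中文", "gbk"), can_encode("中文", "gb18030"))    # both cover common Chinese
print(can_encode("한글", "gbk"), can_encode("한글", "gb18030"))    # only gb18030 covers Hangul
print("中文".encode("gbk") == "中文".encode("gb18030"))            # same bytes where they overlap
```

Since the smaller encoding has fewer valid byte sequences, it is the stricter of the two, which matches the ordering rule of putting stricter encodings first.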
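The fall-back problem is easy to reproduce. In this Python sketch (codec names are Python's, and the candidate list is a shortened stand-in for fileencodings), a genuine Latin-1 text is claimed by an earlier, permissive encoding before latin1 is ever tried.

```python
raw = "ÖÐÎÄ".encode("latin-1")  # a genuine Latin-1 text: bytes D6 D0 CE C4

chosen = None
decoded = None
for name in ["utf-8", "gbk", "latin-1"]:
    try:
        decoded = raw.decode(name)
    except UnicodeDecodeError:
        continue  # this candidate rejects the bytes
    chosen = name
    break  # the first success wins; later candidates are never tried

print(chosen, decoded)
```

The bytes are invalid as UTF-8 but valid as GBK, so the loop stops at `gbk` and the four accented letters are shown as two Chinese characters.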

If the encoding is misjudged, the decoded result is unreadable, and we say the file is garbled. In that case, if you know the file's correct encoding, you can open it with an explicit ++enc=encoding argument, for example:

:e ++enc=utf-8 myfile.txt

That concludes this look at fixing garbled UTF-8 documents in Vim on Linux. When this problem appears, it can usually be solved by adjusting fileencodings. Hope this helps.
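Why re-reading with an explicit encoding works, while relabelling already-garbled text does not, can be sketched in Python. This is an analogy under the assumption that the wrong guess was latin1; the variable names are illustrative.

```python
raw = "中文 notes".encode("utf-8")  # the bytes as stored on disk

wrong = raw.decode("latin-1")       # what a wrong guess produces
right = raw.decode("utf-8")         # decoding the same bytes with the known encoding

# Re-encoding the garbled text in another encoding does not bring the original back.
relabelled = wrong.encode("utf-8")

print(repr(wrong))
print(right)
print(relabelled == raw)
```

A wrong guess does not change the bytes on disk, so decoding them again with the correct encoding recovers the text, which is the role ++enc plays when the file is reopened.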
