Fixing garbled Chinese characters in Vim

Source: Internet
Author: User
Tags windows ssh client

On Windows the default encoding is GB (GBK), while Vim on Linux usually runs in UTF-8 (gedit also defaults to UTF-8). Edit the configuration file so that Vim can also read GB-encoded files.

$ vim ~/.vimrc

let &termencoding = &encoding
set fileencodings=utf-8,gbk

:wq

Open vi again and the text should display correctly. If it does not, open a new terminal and start vi again.
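
As a rough illustration of what the fileencodings line does: Vim tries each listed encoding in order and keeps the first one that reads the file without a conversion error. The Python sketch below only mimics that idea. detect_encoding is a hypothetical helper, not Vim's implementation, and Python's codecs stand in for Vim's converters.

```python
def detect_encoding(data: bytes, candidates=("utf-8", "gbk")) -> str:
    """Return the first candidate encoding that decodes data without error."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # conversion error: fall through to the next candidate
        return name
    raise ValueError("no candidate encoding matched")


gbk_bytes = "中文".encode("gbk")     # bytes as a GBK editor would save them
utf8_bytes = "中文".encode("utf-8")  # the same text saved as UTF-8

print(detect_encoding(gbk_bytes))   # gbk
print(detect_encoding(utf8_bytes))  # utf-8
```

UTF-8 goes first because it is strict: GBK bytes are very rarely valid UTF-8, so a GBK file fails the first candidate and falls through to the second.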


More details:
Notes on editing files with different encodings in Vim

This article covers the basics of editing multi-byte (Chinese) documents in Vim. Note that it deals only with Vim on a character terminal, not with gvim.
Vim encoding basics:
1. There are three variables (Vim options):
encoding -- the encoding Vim uses internally for buffer text (the files you are editing), registers, Vim script files, and so on. You can think of 'encoding' as the encoding of Vim's internal processing.
fileencoding -- the encoding Vim uses when writing the file.
termencoding -- the encoding Vim uses for output to the client terminal (the term).
2. Default values of the three variables:
encoding -- the same as the current system locale. Take the current locale into account when editing files; otherwise more settings have to be changed.
fileencoding -- when a file is opened, Vim tries to detect its encoding automatically, and fileencoding is set to the detected value. If it is empty, the file is saved in the encoding given by encoding; if encoding has not been changed, that is the encoding of the current system locale.
termencoding -- empty by default, meaning that output to the terminal is not converted.
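
The fallback rules in this list can be summarized in a small sketch. It is written in Python purely for illustration: the function names are made up for this example, and an empty string stands for an option that has not been set.

```python
def effective_fileencoding(fileencoding: str, encoding: str) -> str:
    """Encoding used when writing: fileencoding, or encoding if it is empty."""
    return fileencoding or encoding


def effective_termencoding(termencoding: str, encoding: str) -> str:
    """Encoding sent to the terminal: termencoding, or encoding if it is empty."""
    return termencoding or encoding


def needs_terminal_conversion(termencoding: str, encoding: str) -> bool:
    """True only when the terminal encoding differs from the internal one."""
    return effective_termencoding(termencoding, encoding).lower() != encoding.lower()


# Empty options fall back to the internal encoding, so nothing is converted.
print(effective_fileencoding("", "utf-8"))          # utf-8
print(needs_terminal_conversion("", "utf-8"))       # False
print(needs_terminal_conversion("cp936", "utf-8"))  # True
```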
So editing files in different encodings depends not only on these three variables but also on three key points: the current system locale; the file's encoding and its automatic detection; and the encoding used by the client (terminal) in which Vim runs. These three key points determine how the three variables should be set.
Someone may ask: why do I see garbled characters when I open a Chinese document in Vim?
There is no single answer, for the reasons given above. Unless you have worked out the three key points and the values of the three variables, garbled text is the normal outcome, and a clean display is a coincidence.
Let's look at the usual values of these three key points and the resulting values of the three variables:
1. Locale -- most Linux systems currently use UTF-8 as the default locale, but not all do; some systems use a Chinese locale such as zh_CN.GB18030. When the locale is UTF-8, encoding is set to UTF-8 when Vim starts. This is the most compatible mode: because Vim processes text internally as UTF-8, it can convert to and from any external storage encoding without loss. The locale determines the encoding of the data Vim processes internally, that is, encoding.
2. File encoding and automatic detection -- this involves the rules of the various encodings, which are not covered in detail here. What you need to understand is that the encoding is not stored in the file: there is no descriptive field that records the document's encoding. So when editing a document we must either know which encoding it was saved in or determine the encoding by other means.
The other means is to infer the encoding from features of its code table, such as the number of bytes per character or whether the byte values exceed a certain threshold. Vim uses this approach; it is Vim's automatic encoding detection mechanism. The mechanism is not 100% accurate, however, because there are many encodings and not every one has distinctive features. GB2312, for example, represents a Chinese character with two bytes whose values are above 127, so a GB2312 file cannot be told apart from a Latin1 file. Automatic detection therefore fails for GB2312 and identifies the file as Latin1. The same problem occurs with GBK, Big5, and so on. When editing such documents you need to set encoding and fileencoding manually. If the file is UTF-8, Vim can identify the encoding correctly on its own.
3. The encoding used by the client in which Vim runs -- like the second key point, this one is hard to determine. The second key point governs the encoding used to read from and write to the file; this one governs the encoding Vim uses when it sends content to the terminal. If that encoding differs from the one the terminal expects, garbled characters may appear. In a local X environment on Linux, the terminal generally assumes that incoming data uses the encoding of the system locale, so there is nothing to worry about. Remote terminals are another matter, for example when you log in to a server over SSH. Suppose you SSH from a system whose locale is GB2312 (the client) to a system whose locale is UTF-8 (the server) and start Vim to edit a document. With no changes, the server sends UTF-8 but the client interprets it as GB2312, so the text is garbled. In this case, set termencoding to gb2312. The problem is even more common when logging in over SSH from a Windows desktop, because it involves encoding conversion between different systems and so depends heavily on Windows and the SSH client.
Windows software falls into two types: software written for Unicode, and ANSI software, which processes data as raw byte streams without regard to encoding. The former displays multilingual text correctly on a Windows system of any language; the latter displays text correctly only on a system of the matching language. The two types need to be treated differently. Take SSH clients as an example: the PuTTY we use is a Unicode program, while SecureCRT is an ANSI program. For the former, Chinese displays correctly as long as Vim's output to the terminal is UTF-8, that is, termencoding=utf-8. For the latter, make sure that the default Windows code page is cp936 (the default on Chinese Windows) and set termencoding=cp936 in Vim.
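
The two failure modes described in key points 2 and 3 can be reproduced outside Vim. The Python sketch below is an illustration only: decodes_cleanly and shown_on_terminal are hypothetical helpers, and Python's codecs stand in for Vim and for the terminal.

```python
def decodes_cleanly(data: bytes, encoding: str) -> bool:
    """True if data is a valid byte sequence in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def shown_on_terminal(text: str, sent_as: str, terminal_expects: str) -> str:
    """Encode text as the program sends it, then decode it as the terminal does."""
    raw = text.encode(sent_as)
    return raw.decode(terminal_expects, errors="replace")


text = "中文"
gb_bytes = text.encode("gb2312")

# Key point 2: UTF-8 rejects these bytes, but Latin1 accepts any byte at all,
# so a GB2312 file "detects" as Latin1 and shows up as accented letters.
print(decodes_cleanly(gb_bytes, "utf-8"))    # False
print(decodes_cleanly(gb_bytes, "latin-1"))  # True
print(gb_bytes.decode("latin-1"))            # ÖÐÎÄ

# Key point 3: the server sends UTF-8, the client reads it as GB2312.
print(shown_on_terminal(text, "utf-8", "gb2312") == text)   # False (garbled)
print(shown_on_terminal(text, "gb2312", "gb2312") == text)  # True
```

The last line corresponds to setting termencoding to the encoding the client expects, so that what is sent matches what is read.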
Finally, let's look at the typical situations when handling Chinese documents and how to configure Vim for each:
1. The system locale is UTF-8 (the default on many Linux systems), the document is in GB2312 or GBK (the default when saving from Windows Notepad and in most editors, so this is the most common case), and the terminal is UTF-8 (that is, the client is assumed to be Unicode software such as PuTTY).
After Vim opens the document, encoding=utf-8 (determined by the locale), fileencoding=latin1 (the automatic detection guessed wrong), and termencoding is empty (no terminal conversion by default). The file is displayed as garbled text.
Solution 1: change fileencoding to cp936 or euc-cn (for GB2312 text the two behave the same; cp936 is a superset of euc-cn). Note that the right way is not :set fileencoding=cp936, which only causes the file to be saved as cp936. The right way is to reload the file as cp936 with :edit ++enc=cp936, which can be abbreviated to :e ++enc=cp936.
Solution 2: temporarily change the locale Vim runs under by starting it as LANG=zh_CN vim abc.txt. Then encoding=euc-cn (determined by the locale), fileencoding is empty (automatic detection is not enabled in this locale, so the file is read in the same encoding as encoding, that is, euc-cn), and termencoding is empty (the default). The display is still garbled at this point, because the SSH terminal treats incoming data as UTF-8 while Vim sends euc-cn. Run :set termencoding=utf-8 so that output to the terminal is UTF-8.
2. The same as scenario 1, except that the SSH client is ANSI software such as SecureCRT.
After Vim opens the document, encoding=utf-8 (determined by the locale), fileencoding=latin1 (the automatic detection guessed wrong), and termencoding is empty (no terminal conversion by default). The file is displayed as garbled text.
Solution 1: make sure that the default code page of the Windows machine running SecureCRT is cp936, which is already the default on Chinese Windows. The rest is the same as Solution 1 above, with one extra step: :set termencoding=cp936
Solution 2: the same as Solution 2 above, but without the final step of changing termencoding. This requires the fewest changes: as long as Vim is started with the zh_CN locale, encoding=euc-cn, and fileencoding and termencoding are both empty, which means they take the value of encoding.
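
The scenarios above can be modeled as a two-stage pipeline: file bytes are decoded into the buffer, and the buffer is encoded for the terminal. The Python sketch below is a simplified model under that assumption, not a description of Vim's internals. load and display are hypothetical names, and Python strings play the role of Vim's internal encoding.

```python
def load(raw: bytes, read_as: str) -> str:
    """Decode file bytes into the buffer (the role of :e ++enc=...)."""
    return raw.decode(read_as, errors="replace")


def display(buffer: str, termencoding: str, terminal_charset: str) -> str:
    """Encode the buffer for the terminal, then decode it as the terminal would."""
    sent = buffer.encode(termencoding, errors="replace")
    return sent.decode(terminal_charset, errors="replace")


raw = "中文".encode("cp936")  # a file saved on Chinese Windows

# Misdetected as Latin1: the buffer itself is already wrong.
bad = load(raw, "latin-1")
print(bad)  # ÖÐÎÄ

# Changing only the save encoding does not repair a wrong buffer:
# writing it out as cp936 yields different bytes from the original file.
print(bad.encode("cp936", errors="replace") == raw)  # False

# Reloading with the right encoding fixes the buffer.
good = load(raw, "cp936")

print(display(good, "utf-8", "utf-8") == "中文")  # True: scenario 1, Unicode client
print(display(good, "cp936", "cp936") == "中文")  # True: scenario 2, ANSI client
print(display(good, "utf-8", "cp936") == "中文")  # False: termencoding left unset
```

This is why Solution 1 reloads the file rather than setting fileencoding: the first stage has to be redone with the right encoding before the second stage can produce readable output.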
Understanding the three key points and the meaning of the three variables goes a long way toward solving encoding problems. It lets you handle documents as you wish, and not only in Vim: the same reasoning applies in any other environment that involves encodings and conversion between them.

Finally, we recommend a powerful Windows SSH client, Xshell. Like SecureCRT it supports multiple SSH sessions in tabs, but its most convenient feature is that it can change the terminal encoding, so there is no need to keep adjusting termencoding; you just switch the encoding in the SSH client. It is the most convenient SSH tool I have used. It is commercial software, but unregistered users face no functional limits: after the 30-day trial period it prompts for registration at every start, with no other impact.




From:

Http://hi.baidu.com/denglish/item/66f7dc6f4b0ce8106895e634

Http://www.guizhu.net/knowledge/post/104.html
