Solutions to Vim encoding problems

Source: Internet
Author: User
Tags windows ssh client

Here are a few videos I made over the past two years. I got lazy: I planned to make eight and only finished four.

Http://www.boobooke.com/v/bbk4407

Http://www.boobooke.com/v/bbk4414

Http://www.boobooke.com/v/bbk4415

Http://www.boobooke.com/v/bbk4416

This article covers the basics of editing multi-byte encoded (Chinese) documents in Vim. It deals only with Vim on a character terminal; gvim is not covered.
Vim encoding basics:
1. Three options are involved:
encoding -- the encoding Vim uses internally for buffer text (the files you are editing), registers, Vim script files, and so on. You can think of 'encoding' as the setting for Vim's internal working encoding.
fileencoding -- the encoding Vim uses when it writes the file.
termencoding -- the encoding Vim uses for output to the client terminal (TERM).
2. Default values of the three options:
encoding -- the same as the current system locale. So take the current locale into account when editing files; otherwise there is more to set by hand.
fileencoding -- Vim tries to identify the encoding automatically when it opens a file, and fileencoding is set to the result. If it is empty, the file is saved in the encoding given by encoding, which is the current system locale unless encoding has been changed.
termencoding -- empty by default, meaning that output to the terminal goes through no encoding conversion.
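One way to picture the three options is as three conversion steps around Vim's internal text. The sketch below is a toy model in Python, not Vim's implementation: the function names are hypothetical, and a Python str stands in for text held in Vim's internal encoding.

```python
# Toy model only: each function mimics the role of one Vim option.
# All names here are hypothetical and exist purely for illustration.

def open_file(raw: bytes, fileencoding: str) -> str:
    """Disk bytes -> internal text (the role 'fileencoding' plays on read)."""
    return raw.decode(fileencoding)

def write_file(text: str, fileencoding: str) -> bytes:
    """Internal text -> disk bytes (the role 'fileencoding' plays on write)."""
    return text.encode(fileencoding)

def to_terminal(text: str, termencoding: str) -> bytes:
    """Internal text -> bytes sent to the terminal (the role of 'termencoding')."""
    return text.encode(termencoding)

raw = "中文".encode("gb2312")             # a gb2312 file on disk
text = open_file(raw, "gb2312")           # correct only if we name the right encoding
on_terminal = to_terminal(text, "utf-8")  # what a UTF-8 terminal should receive
on_disk = write_file(text, "gb2312")      # the round trip gives back the same bytes

print(on_terminal)     # b'\xe4\xb8\xad\xe6\x96\x87'
print(on_disk == raw)  # True
```

Each of the three steps can go wrong independently, which is why all three options have to agree with the real file, the real locale, and the real terminal.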
So editing files in different encodings depends not only on these three options but also on three key points: the current locale, the file's encoding and whether automatic detection identifies it, and the encoding used by the client terminal that Vim runs in. These three key points determine how the three options should be set.
Suppose someone asks: why do I get garbled text when I open a Chinese document in Vim?
There is no single answer, for the reasons given above. Unless you have worked out the three key points and the values of the three options, garbled text is the normal outcome, and a clean display is a coincidence.
Let's look at the values these three key points commonly take, and the values of the three options in each case:
1. Locale. Most Linux systems now use UTF-8 as the default locale, but not all do; some use a Chinese locale such as zh_CN.GB18030. When the locale is UTF-8, encoding is set to UTF-8 when Vim starts. This is the most compatible mode, because with UTF-8 as the internal encoding Vim can convert without loss whatever the on-disk encoding is. The locale determines the encoding of the data Vim processes internally, that is, encoding.
2. File encoding and automatic detection. This involves the rules of the various encodings, which are not covered in detail here. What you need to understand is that a file's encoding is not stored in the file: there is no descriptive field that records it. So when editing a document we must either know the encoding it was saved in or determine it by other means. One such means is to rely on characteristics of the code tables, such as the number of bytes each character occupies and whether each byte value is above a certain threshold. Vim's automatic detection works along these lines, but because there are so many encodings it cannot be 100% accurate. Take gb2312: each Chinese character consists of two bytes with values above 127, so a gb2312 file cannot be told apart from Latin1 text that uses the upper half of its range. Automatic detection therefore fails for gb2312 and reports the file as Latin1. The same problem affects GBK, big5, and others. When editing such documents you need to set encoding and fileencoding by hand. If the file is UTF-8 encoded, Vim identifies the encoding correctly.
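The fallthrough to Latin1 can be reproduced with a small trial-decoding loop. This is a sketch in Python rather than Vim's actual code; `detect` and the candidate lists are assumptions made for illustration, similar in spirit to the way Vim's 'fileencodings' option lists encodings to try in order.

```python
from typing import List, Optional

def detect(raw: bytes, candidates: List[str]) -> Optional[str]:
    """Return the first candidate encoding that decodes raw without error."""
    for enc in candidates:
        try:
            raw.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    return None

utf8_bytes = "中文".encode("utf-8")
gb_bytes = "中文".encode("gb2312")   # bytes D6 D0 CE C4

# UTF-8 has a strict byte structure, so valid UTF-8 is recognized.
print(detect(utf8_bytes, ["utf-8", "latin-1"]))   # utf-8

# The gb2312 bytes are not valid UTF-8, and Latin1 accepts any byte,
# so the file is "recognized" as Latin1 and shows up garbled.
print(detect(gb_bytes, ["utf-8", "latin-1"]))     # latin-1
print(gb_bytes.decode("latin-1"))                 # ÖÐÎÄ

# Assumption for illustration: putting cp936 ahead of latin-1 in the list.
print(detect(gb_bytes, ["utf-8", "cp936", "latin-1"]))   # cp936
```

Order matters in such a list: a permissive single-byte encoding like Latin1 never fails, so anything placed after it is never tried.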
3. The encoding used by the client terminal that Vim runs in. Like the second point, this one is hard to determine. The second point concerns the encoding used when reading a file and writing it back; this one concerns the encoding used when Vim outputs content to the terminal. If that encoding differs from the one the terminal assumes for the data it receives, the text is garbled. In a local X environment on Linux, the terminal generally assumes that the data it receives matches the system locale, so there is nothing to worry about. Remote terminals are another matter, for example when you log in to a server over SSH. Suppose you SSH from a system whose locale is gb2312 (the client) to a system whose locale is UTF-8 (the server) and start Vim to edit a document. With nothing changed, the server sends UTF-8 data, but the client assumes it is gb2312 and interprets it that way, so the result is garbled. In this case we need to set termencoding to gb2312.
This problem shows up most often when we log in to a server over SSH from a Windows desktop, because conversion between the encodings of different systems is involved; it is therefore closely tied to Windows and the SSH client. Windows software falls into two types with respect to encoding: software written for Unicode, and ANSI software, which handles data as raw byte streams and pays no attention to encoding. The former displays multilingual text correctly on Windows in any language; the latter displays text correctly only on a system of the matching language. The two types need to be treated differently. Take SSH clients as an example: PuTTY is Unicode software, while SecureCRT is ANSI software. For the former, to handle Chinese correctly we only need to make sure that Vim's output to the terminal is UTF-8, that is, termencoding=utf-8. For the latter, make sure that the default Windows code page is cp936 (the default on Chinese Windows) and set termencoding=cp936.
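The mismatch between what Vim sends and what the client assumes can be imitated in a few lines. In this Python sketch, `show` is a hypothetical helper: it encodes text with the encoding Vim outputs and decodes it with the encoding the client assumes, replacing undecodable bytes instead of failing.

```python
def show(text: str, termencoding: str, client_encoding: str) -> str:
    """Hypothetical helper: what the user sees when Vim sends text in
    termencoding and the client interprets the bytes as client_encoding."""
    sent = text.encode(termencoding)
    return sent.decode(client_encoding, errors="replace")

text = "中文"

# Server sends UTF-8, client assumes cp936: the display is garbled.
mismatch = show(text, "utf-8", "cp936")
print(mismatch == text)                       # False

# ANSI client on a cp936 system, with output converted to cp936.
print(show(text, "cp936", "cp936") == text)   # True

# Unicode client that expects UTF-8, with output sent as UTF-8.
print(show(text, "utf-8", "utf-8") == text)   # True
```

The display is correct exactly when the two encodings agree, which is all that setting termencoding is meant to achieve.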
Finally, let's look at typical situations when handling Chinese documents and how to set things up:
1. The system locale is UTF-8 (the default on many Linux systems), the document is in gb2312 or GBK (the default save format of Windows Notepad and of most editors, so this is the most common case), and the terminal is UTF-8 (that is, the client is assumed to be Unicode software of the PuTTY kind).
After Vim opens the document, encoding=utf-8 (determined by the locale), fileencoding=latin1 (the automatic detection got it wrong), and termencoding is empty (no terminal conversion by default). The document is displayed garbled.
Solution 1: first change fileencoding to cp936 or euc-cn (cp936 is Microsoft's GBK code page, a superset of the gb2312 set that euc-cn encodes; either works for a gb2312 file). Note that the right way to do this is not :set fileencoding=cp936, which only makes Vim save the file as cp936. The right way is to reload the file with the cp936 encoding: :edit ++enc=cp936, which can be abbreviated to :e ++enc=cp936.
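The difference between the two commands is the direction they act on. Here is a toy model in Python, with a hypothetical `Buffer` class standing in for a Vim buffer: setting the file encoding only changes what a later write would use, while reloading decodes the on-disk bytes again.

```python
class Buffer:
    """Hypothetical stand-in for a Vim buffer; not Vim's real data model."""

    def __init__(self, raw: bytes, fileencoding: str):
        self.raw = raw                    # bytes as stored on disk
        self.fileencoding = fileencoding
        self.text = raw.decode(fileencoding)

    def set_fileencoding(self, enc: str) -> None:
        # Plays the role of ":set fileencoding=...": only the encoding used
        # for the next write changes; the text already loaded stays as it is.
        self.fileencoding = enc

    def reload(self, enc: str) -> None:
        # Plays the role of ":e ++enc=...": read the same bytes again,
        # this time interpreting them as enc.
        self.fileencoding = enc
        self.text = self.raw.decode(enc)

raw = "中文".encode("cp936")
buf = Buffer(raw, "latin-1")      # what the wrong detection produced

buf.set_fileencoding("cp936")
still_garbled = buf.text
print(still_garbled)              # ÖÐÎÄ

buf.reload("cp936")
fixed = buf.text
print(fixed)                      # 中文
```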
Solution 2: temporarily change the locale Vim runs under by starting it as LANG=zh_CN vim abc.txt. Now encoding=euc-cn (determined by the locale), fileencoding is empty (automatic detection is not enabled under this locale, so the file is taken to be in the same encoding as encoding, that is, euc-cn), and termencoding is empty (the default). The display is still garbled at this point, because our SSH terminal treats the data it receives as UTF-8 while Vim is sending euc-cn. Run the command :set termencoding=utf-8 so that terminal output is converted to UTF-8, and the display becomes normal.
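Solution 2 comes down to whether a conversion happens on the way to the terminal. This is a minimal Python sketch, assuming a hypothetical `output` helper in which an empty termencoding means the internal bytes are sent unchanged; Python's gb2312 codec stands in for what Vim calls euc-cn.

```python
def output(internal: bytes, encoding: str, termencoding: str) -> bytes:
    """Hypothetical helper: bytes sent to the terminal.
    An empty termencoding means no conversion takes place."""
    if not termencoding:
        return internal
    return internal.decode(encoding).encode(termencoding)

# Buffer content as held internally when encoding is euc-cn.
internal = "中文".encode("gb2312")

# termencoding empty: the raw euc-cn bytes reach a terminal expecting UTF-8.
unconverted = output(internal, "gb2312", "")
seen_before = unconverted.decode("utf-8", errors="replace")
print(seen_before == "中文")      # False

# termencoding set to utf-8: output is converted before it is sent.
converted = output(internal, "gb2312", "utf-8")
seen_after = converted.decode("utf-8")
print(seen_after)                 # 中文
```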
2. The same as scenario 1, except that the SSH client is ANSI software of the SecureCRT kind.
After Vim opens the document, encoding=utf-8 (determined by the locale), fileencoding=latin1 (the automatic detection got it wrong), and termencoding is empty (no terminal conversion by default). The document is displayed garbled.
Solution 1: make sure that the default code page of the Windows machine running SecureCRT is cp936, which is already the default on Chinese Windows. The rest is the same as solution 1 above, with one extra step: :set termencoding=cp936
Solution 2: similar to solution 2 above, except that the final step of changing termencoding is left out. This case needs the fewest changes: once the locale is set to zh_CN, encoding=euc-cn while fileencoding and termencoding are both empty, that is, in a state where they follow the value of encoding.
So understanding the three key points and the meaning of the three options goes a long way toward solving encoding problems, and you will be able to handle documents however you need to. This applies beyond Vim: in any environment that requires encoding conversion, the same reasoning helps.
Finally, I recommend a powerful Windows SSH client, Xshell, which has multiple tabs like SecureCRT. Most convenient of all, it can change the terminal encoding, so instead of adjusting termencoding all the time we just switch the encoding in the SSH software. It is the most convenient SSH tool I have used. It is commercial software, but unregistered users face no restrictions: after the 30-day trial expires it only prompts for registration each time it starts, with no other impact.
