Why is a file still garbled in Vim after setting the encoding? With a practical vimrc

Source: Internet
Author: User
Vim has four encoding-related options: 'fileencodings', 'fileencoding', 'encoding', and 'termencoding'. In practice, a wrong value for any one of them can produce garbled text, so every Vim user should understand what each option means. The sections below describe the meaning and role of each of the four in detail.
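
Before changing anything, it can help to see what the four options are currently set to. A small sketch, typed on Vim's command line; the values printed depend on your system and vimrc:

```vim
" Show the current value of each encoding-related option.
:set encoding? termencoding? fileencoding? fileencodings?

" Show which script, if any, last set 'encoding'.
:verbose set encoding?
```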

 

* Encoding: 'encoding' is the character encoding Vim uses internally. Once 'encoding' is set, all buffers, registers, and strings in scripts are stored in this encoding. Whenever Vim handles text in some other encoding, it first converts that text to the internal encoding. If the text contains characters that cannot be represented in the internal encoding, those characters are lost. The internal encoding must therefore be expressive enough to represent everything you edit. Because 'encoding' determines the internal representation of every character in Vim, it should be set only once, at startup. Changing 'encoding' while Vim is running can cause many problems. Unless you have a special reason, always set 'encoding' to 'utf-8'. To avoid garbled menus and system messages on non-UTF-8 systems such as Windows, you can also add these settings:

    set encoding=utf-8
    set langmenu=zh_CN.UTF-8
    language messages zh_CN.UTF-8
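
On Windows gvim, 'langmenu' has no effect on menus that are already loaded, so the menu settings are usually combined with a menu reload. The following is a sketch of that idiom, assuming a gvim installation that ships the Chinese translations; the `silent!` prefix is added here only to suppress the error when the locale is missing:

```vim
" Assumes a Windows gvim with zh_CN message translations installed.
set encoding=utf-8
set langmenu=zh_CN.UTF-8
" silent! hides the error if the zh_CN.UTF-8 locale is not available.
silent! language messages zh_CN.UTF-8
" Delete the menus and load them again so they use the new settings.
source $VIMRUNTIME/delmenu.vim
source $VIMRUNTIME/menu.vim
```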

 

* Termencoding: 'termencoding' is the encoding Vim uses for screen output. When displaying text, Vim converts it from the internal encoding to the screen encoding before writing it out. If a character in the internal encoding cannot be converted to the screen encoding, it is shown as a question mark, but editing is not affected. If 'termencoding' is not set, Vim uses 'encoding' directly and performs no conversion. For example, when you log on to a Linux workstation from Windows via Telnet, the Windows telnet client uses GBK while Linux uses UTF-8, so Vim shows garbled text inside the telnet session. There are two ways to fix this. One is to change Vim's 'encoding' to 'gbk'. The other is to keep 'encoding' as 'utf-8' and set 'termencoding' to 'gbk', so that Vim transcodes on output. With the first method, any characters in the file that GBK cannot represent are lost. With the second, those characters cannot be displayed because of the terminal's limitations, but they are not lost during editing. gvim, running in a graphical interface, does not display through a terminal, so 'termencoding' is meaningless to it. In gvim under GTK2, 'termencoding' is always 'utf-8' and cannot be changed. gvim on Windows ignores 'termencoding' altogether.
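
The second approach from the Telnet example might look like this in a vimrc. This is a sketch that assumes the terminal really does display GBK; cp936 is the name Vim's own encoding list uses for that code page:

```vim
" Keep the internal encoding as UTF-8 so no characters are lost.
set encoding=utf-8
if !has('gui_running')
  " Assumption: the terminal (e.g. a Windows telnet client) shows GBK/cp936.
  " The GUI does not draw through a terminal, so this is skipped there.
  set termencoding=cp936
endif
```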

 

* Fileencoding: When Vim reads a file from disk, it detects the file's encoding. If that encoding differs from Vim's internal encoding, Vim converts the text, and afterwards it sets the 'fileencoding' option to the detected file encoding. When Vim writes a file to disk, it converts the text again if 'encoding' and 'fileencoding' differ. So by opening a file and then changing 'fileencoding', we can convert the file from one encoding to another. However, as described above, 'fileencoding' is set automatically by Vim's detection when the file is opened. By the time we see garbled text, the bytes have already been decoded with the wrong encoding, so setting 'fileencoding' after opening the file cannot correct it.
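
Since changing 'fileencoding' does not re-decode a buffer that is already loaded, the file has to be read again with the right encoding. One way to do that is the `++enc` argument of `:edit`. A short sketch, in which cp936 and utf-8 are only example values:

```vim
" Re-read the current file, forcing it to be decoded as cp936 (GBK).
" cp936 is only an example; use the encoding the file is really in.
:edit ++enc=cp936

" Once the text displays correctly, convert the file on the next write.
:set fileencoding=utf-8
:write
```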

 

* Fileencodings: Automatic encoding detection is controlled by the 'fileencodings' option. Note the plural. 'fileencodings' is a comma-separated list of encoding names. When a file is opened, Vim tries to decode it with each encoding in the list, in order. If decoding succeeds, Vim uses that encoding and sets 'fileencoding' to it; if it fails, Vim goes on to the next entry. So when setting 'fileencodings', put the strict encodings first (those that readily fail on a file that is not actually in that encoding) and the loose ones last. latin1, for example, is extremely loose: text in any encoding can be decoded as latin1 without error, although the result is of course garbled. If 'latin1' comes first in 'fileencodings', it is only natural that every Chinese file opens as garbage. The following is the 'fileencodings' value recommended by Dian Hu:

    set fileencodings=ucs-bom,utf-8,cp936,gb18030,big5,euc-jp,euc-kr,latin1

Here ucs-bom is a very strict check: a file in any other encoding is almost never mistaken for it, so it comes first. utf-8 is also quite strict. Apart from very short files (the classic example that many people like to cite is the GBK encoding of "联通" (Unicom) being misdetected as UTF-8), real-world files are almost never misjudged, so it comes second. Next are cp936 and gb18030. These two are relatively loose, and putting them earlier would cause many misdetections, so they go further back. The code space of cp936 is smaller than that of gb18030, so cp936 is placed before gb18030. big5, euc-jp, and euc-kr are about as strict as cp936. Placing them after cp936 inevitably means that files in these encodings are often misdetected, but that is a problem Vim's built-in detection cannot solve. Since Chinese users rarely edit files in these encodings, we chose to make sure cp936 and gb18030 are recognized. Last comes latin1. It is an extremely loose encoding, so it has to go at the end. Unfortunately, when you do meet a latin1 file, in most cases Vim never gets as far as the latin1 fallback, because an earlier entry has already claimed the file incorrectly. As mentioned, though, Chinese users seldom encounter such files.

If the wrong encoding is chosen, the decoded text is unreadable, and that is what we call a garbled file. If you know the file's correct encoding, you can set 'fileencodings' to that one encoding, so that no fallback can take place, and re-open the file.

* Fencview: As described above, Vim's built-in detection has a low recognition rate, especially when it has to distinguish between Simplified Chinese (GBK/gb18030), Traditional Chinese (big5), Japanese (euc-jp), and Korean (euc-kr). For ordinary users, working out a file's encoding by eye is unrealistic. Dian Hu therefore strongly recommends the fencview plug-in developed by mbbill of the Shuimu community. The plug-in identifies encodings using word-frequency statistics and is highly accurate. It can be downloaded from http://www.vim.org/scripts/script.php?script_id=1708.
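
Returning to the case where a file's encoding is already known: restricting 'fileencodings' to that single encoding and re-opening the file can be sketched as follows. The choice of big5 and the variable name `g:saved_fencs` are illustrative assumptions, not part of the original advice:

```vim
" Assumption: the current file is known to be Big5.
" Remember the normal detection list so it can be restored afterwards.
let g:saved_fencs = &fileencodings
set fileencodings=big5
" Re-read the current file (fails if the buffer has unsaved changes).
edit
" Put the original detection list back for files opened later.
let &fileencodings = g:saved_fencs
unlet g:saved_fencs
```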

 

Practical vimrc:

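" Indentation: 4-column tabs, expanded to spaces.
set ts=4
set sw=4
set expandtab
set nobackup
colors desert
syntax enable
" The next two paths are specific to the author's machine; adjust them.
set tags=/home/Eric/access/8.rtsp_rtp/workcodes/tags
source /usr/share/vim/vim71/mswin.vim
" Detection order: strict encodings first, loose ones last.
set fileencodings=ucs-bom,utf-8,cp936,gb18030,big5,euc-jp,euc-kr,latin1
set encoding=utf-8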

 

 
