Linux garbled characters: the secrets of LANG variables

Last Update:2014-05-19 Source: Internet

Author: User

Tags ssh secure shell

For Linux users in China, a common headache is that the system often shows garbled text when it should display Chinese characters. Conversely, when a system has to run with an English interface for some reason, it often cannot enter or display Chinese characters normally. In addition, because most major Linux distributions are developed primarily in English, the system and applications look slightly better and are slightly more stable with an English interface than with a Chinese one, and there are fewer strange bugs. Therefore, many Linux users with basic English prefer an English-interface system. But then the contradiction appears again: on an English system, how can we display and enter Chinese characters properly? Is there a solution that gives the best of both worlds?

So I set out to solve this problem. My ideal state is that the system and applications are entirely in English (system menu, application toolbars, default input method, and so on), while Chinese characters are displayed correctly and the Chinese input method can be called up whenever I need to read or write Chinese documents. Having set this up successfully, I use FC4 (Fedora Core 4) Linux as an example to explain the related knowledge and the setup process. This article mainly describes the general idea and process of solving the problem by modifying the system configuration. If you are not patient, skip sections 1-4 and go directly to section 5, "Quick setting steps".

1. Introduction to the related variables

As we know, most Linux systems do not come in separate Chinese and English versions.
Taking FC4 Linux as an example, the same system is released all over the world; whether it runs in Chinese or in English depends on the language packs you select. During installation and use, people in different countries select the language packs for their own language. The interface language of an application is not hard-coded; the application loads the relevant language based on the system settings. Once an application is written, users around the world can use it with an interface in their mother tongue. This is called internationalization, or i18n for short, and it is also the future trend of software development.

If I have installed several language packs and fonts, how does the system determine which interface language I want and which fonts to use? Which files and variables control this? On Red Hat and FC series Linux systems, the default language file is /etc/sysconfig/i18n. If the system was installed in Chinese by default, the content of i18n is as follows:

Code:
    LANG="zh_CN.UTF-8"
    SYSFONT="latarcyrheb-sun16"
    SUPPORTED="zh_CN.UTF-8:zh_CN:zh"

LANG is short for language. This variable determines the default language of the system, that is, the language of the system menu and program toolbars and the default language of the input method. SYSFONT is short for system font and determines which font is used by default. The SUPPORTED variable determines the languages supported by the system, that is, the languages the system can display. Note that because computers originated in English-speaking countries, English is always supported by default no matter how you set these variables, and whatever font you use, the English glyphs are always included.

Among these variables, LANG is used both in character mode and in the graphical interface.
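The values of these variables can be checked by reading the file directly. The following is a minimal sketch, assuming a temporary stand-in file with the sample contents above rather than the real /etc/sysconfig/i18n; the helper name i18n_get is hypothetical.

```shell
# Stand-in for /etc/sysconfig/i18n, written to a temporary file (assumption:
# the real file uses the same VAR="value" layout shown in the article).
i18n_file=$(mktemp)
cat > "$i18n_file" <<'EOF'
LANG="zh_CN.UTF-8"
SYSFONT="latarcyrheb-sun16"
SUPPORTED="zh_CN.UTF-8:zh_CN:zh"
EOF

# i18n_get VAR FILE: print the value assigned to VAR, without the quotes.
# Hypothetical helper; commented-out lines (#VAR=...) are ignored.
i18n_get() {
    sed -n "s/^$1=\"\{0,1\}\([^\"]*\)\"\{0,1\}\$/\1/p" "$2"
}

default_lang=$(i18n_get LANG "$i18n_file")
supported=$(i18n_get SUPPORTED "$i18n_file")
echo "default language: $default_lang"
echo "supported locales: $supported"
rm -f "$i18n_file"
```

The file is parsed with sed instead of being sourced, so the LANG of the shell running the check is not changed as a side effect.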
These variables are read and take effect when you log on to the system. I believe many users have seen garbled error messages when entering commands on the Linux character interface; to display Chinese error messages normally there, you must install character-mode Chinese software such as zhcon or cce. What if I neither want garbled Chinese nor want to start zhcon just to read a very simple error message? A simple temporary solution is to set the LANG variable:

Code:
    [root@gucuiwen ~]# LANG="en_US.UTF-8"

This sets the system language to English temporarily. Or, even simpler:

Code:
    [root@gucuiwen ~]# LANG=""

This clears the LANG variable. Because English is supported in every situation, the system falls back to English once LANG is cleared. After this setting, the error messages output in character mode are all in English.

However, this setting only changes the bash variable LANG temporarily. It is lost when you log out and log on again or switch to another character terminal. By now the reader will have realized that setting LANG to "en_US.UTF-8" in the i18n file solves the problem permanently. The modified file is as follows:

Code:
    #LANG="zh_CN.UTF-8"
    LANG="en_US.UTF-8"
    SYSFONT="latarcyrheb-sun16"
    SUPPORTED="zh_CN.UTF-8:zh_CN:zh"

Please do not simply leave LANG empty in the file, because this variable is used not only in character mode but also in the graphical interface. Clearing it causes no problem in character mode, but in the graphical interface Chinese characters can then no longer be displayed normally. In the past, the i18n file of the Red Hat series also contained a LANGUAGE variable, which specifically controlled the language settings of the graphical interface. The FC series has now merged these two variables into one.
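The permanent change to the i18n file described above can also be scripted. This is a minimal sketch that works on a temporary copy with the article's sample contents; on a real system the target would be /etc/sysconfig/i18n, edited as root after making a backup.

```shell
# Temporary stand-in for /etc/sysconfig/i18n (sample contents from the article).
i18n_copy=$(mktemp)
cat > "$i18n_copy" <<'EOF'
LANG="zh_CN.UTF-8"
SYSFONT="latarcyrheb-sun16"
SUPPORTED="zh_CN.UTF-8:zh_CN:zh"
EOF

# Comment out the active LANG line and put the English setting right below it.
# "&" is the matched line; backslash-newline starts a new line in the output.
new_i18n=$(sed -e 's/^LANG=.*$/#&\
LANG="en_US.UTF-8"/' "$i18n_copy")

printf '%s\n' "$new_i18n"
rm -f "$i18n_copy"
```

Keeping the old line as a comment mirrors the modified file shown in the article and makes it easy to switch back.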
After modifying this variable and restarting the graphical interface, you can see that the interface is completely in English. However, you can no longer call up the Chinese input method by pressing Ctrl+Space, and you cannot add a Chinese input method from the input method menu. So far we have simply modified the LANG variable to change the system language setting. Of course, this step can also be done with a tool in the graphical interface instead of editing the configuration file.

2. The run level issue

This seems to have nothing to do with the topic of this article, but more and more Linux beginners run into problems with the Linux GUI, and these problems also come up while setting up the Chinese input method, so I mention it in passing. After installation, a current Linux system runs at run level 5 by default. A System V style Unix system, unlike the BSD branch of Unix, is divided into run levels; the commonly used ones are the seven levels 0-6:

    0  shutdown
    1  single user
    2  multi-user without network
    3  multi-user with network
    4  reserved, user-definable
    5  multi-user with GUI
    6  reboot

Because the current Linux system runs at level 5 after installation, it enters the graphical interface directly after startup, and you do not need to run startx or xinit to start the GUI after logging on in character mode. This looks very convenient. But what are the disadvantages? Once a changed setting causes a display problem, the system keeps switching back and forth between the graphical and character screens. This is very troublesome for new users to deal with, and for those studying Linux it is not conducive to understanding and learning the lower-level parts of the system. Long-time Linux users know that earlier releases such as Red Hat 6.0 had a default run level of 3.
Even later, with Red Hat 9.0, you could choose during installation whether character logon or graphical logon should be the default. However, the current FC series and most other distributions send users straight to the graphical logon without asking. Although Linux is becoming easier for most newcomers, a lot of the fun is lost and new users never get to experience it. You may not believe it, but logging on graphically causes plenty of problems. Therefore, as a Linux and Solaris system administrator with 6 years of Linux experience, I strongly recommend setting the default run level to 3 after installation; after logging on at the character terminal, enter the startx command manually to start the graphical interface. To do so, modify the /etc/inittab file with a text editor, changing the line

Code:
    id:5:initdefault:

to

Code:
    id:3:initdefault:

After saving, the system boots to the character interface by default from the next restart. The only difference between run levels is which services are started by default; for example, level 3 does not start the X GUI service, while level 5 does. There is no difference in nature, and no level is more or less powerful than another. Users can define the default services of each level themselves. At any run level, you can use the init command to switch to another run level.

3. Calling up the Chinese input method

The reason I pay so much attention to the run level is that understanding a system starts from the bottom up. First, change the default run level to 3; or, if you really do not want to change it, use the init 3 command to switch to level 3 temporarily. You can then start the graphical interface with startx and exit it with Ctrl+Alt+Backspace.
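The default run level change recommended in the previous section (editing the initdefault line of /etc/inittab) can be sketched as follows. The sketch assumes a temporary stand-in file that contains only the relevant entry; the helper name default_runlevel is hypothetical.

```shell
# Temporary stand-in for /etc/inittab containing only the relevant entry.
inittab_copy=$(mktemp)
printf '%s\n' '# Default runlevel' 'id:5:initdefault:' > "$inittab_copy"

# default_runlevel FILE: print the run level from the initdefault entry.
# Hypothetical helper; inittab fields are colon-separated (id:runlevels:action:process).
default_runlevel() {
    awk -F: '$3 == "initdefault" { print $2 }' "$1"
}

old_level=$(default_runlevel "$inittab_copy")

# Rewrite the initdefault entry so that the default becomes run level 3.
new_inittab=$(sed -e 's/^id:[0-6]:initdefault:/id:3:initdefault:/' "$inittab_copy")
printf '%s\n' "$new_inittab" > "$inittab_copy"
new_level=$(default_runlevel "$inittab_copy")

echo "default run level: $old_level -> $new_level"
rm -f "$inittab_copy"
```

On a real system the same sed expression would be applied to /etc/inittab as root, and the change takes effect at the next boot.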
Note that by exiting with Ctrl+Alt+Backspace I mean really leaving the graphical interface, not pressing Ctrl+Alt+F2 to switch to a character terminal. Okay, everything starts with startx. When you need to change a setting in a Linux system or configure a service, the most important thing is to know how it starts; you must know not only what to do but why. If you have time, read through the scripts under /etc/rc.d that run at system startup, and you will fully understand what the configuration files under /etc are for, how to modify them, and what effect each modification has. You can then change the system as you like. This is what I have always stressed: you must know why. Go deep into the system, read the scripts, and learn to use commands and to edit the configuration files by hand; only then will you understand the system thoroughly. Using graphical tools all day cannot give you that understanding. Different Linux systems ship different GUI configuration programs, but the commands and configuration files are the same; the lower the level, the more universal it is. Therefore, first learn to edit the configuration files by hand, and only then use graphical tools to reduce the workload.

That is my general approach to solving problems. Here I started from this idea: the Chinese input method is used in the graphical interface and is a program running there, and everything in the graphical interface is started by the startx program. So that is where to look for the root of the problem.
Locate startx:

Code:
    [root@gucuiwen ~]# which startx
    /usr/X11R6/bin/startx

Check whether startx is a script or a binary file:

Code:
    [root@gucuiwen ~]# file /usr/X11R6/bin/startx
    /usr/X11R6/bin/startx: Bourne shell script text executable

startx turned out to be a shell script, so I opened it to read and analyze it, looking for clues about the input method startup process and the related variables:

Code:
    [root@gucuiwen ~]# vi /usr/X11R6/bin/startx

I found the other scripts and configuration files that the script calls while running:

Code:
    userclientrc=$HOME/.xinitrc
    userserverrc=$HOME/.xserverrc
    sysclientrc=/etc/X11/xinit/xinitrc
    sysserverrc=/etc/X11/xinit/xserverrc

I also learned that startx looks for the available desktop system, the X server, and the system and user-defined parameters, and finally calls xinit to initialize the X graphical interface. I found no code directly related to starting the input method in startx itself, so I could be sure that the relevant code is in a script called by startx. I therefore went to the /etc/X11/xinit/ directory and read and analyzed the scripts there. Some are called directly by startx and some are called in turn by those scripts, in several levels of nesting; it is hard to untangle without patience. In the end, I found the code related to the input method in the script /etc/X11/xinit/xinitrc.d/xinput.sh:

Code:
    lang_region=$(echo $tmplang | sed -e 's/\..*//')
    lang_region="zh_CN"  # This line is added after modification
    for f in "$HOME/.xinput.d/${lang_region}" \
             "$HOME/.xinput.d/default" \
             "/etc/X11/xinit/xinput.d/${lang_region}" \
             "/etc/X11/xinit/xinput.d/default"; do
        [ -r $f ] && source $f && break
    done

From analyzing the script I learned that, when the GUI starts, the script decides from the LANG variable whether to enable an input method and which language's input method to use.
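To make the lookup order concrete, the sketch below reproduces the same logic against temporary directories. The function names region_of and pick_xinput_config are hypothetical, and the two directories are stand-ins for $HOME/.xinput.d and /etc/X11/xinit/xinput.d; the chosen file is printed instead of being sourced.

```shell
# region_of LOCALE: strip everything from the first dot, as xinput.sh does.
region_of() {
    echo "$1" | sed -e 's/\..*//'
}

# pick_xinput_config REGION USERDIR SYSDIR: print the first readable candidate,
# in the same order as the for loop in xinput.sh.
pick_xinput_config() {
    for f in "$2/$1" "$2/default" "$3/$1" "$3/default"; do
        if [ -r "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    return 1
}

# Temporary stand-ins for the user and system xinput.d directories.
userdir=$(mktemp -d)
sysdir=$(mktemp -d)
: > "$sysdir/zh_CN"
: > "$sysdir/default"

region_en=$(region_of "en_US.UTF-8")
region_zh=$(region_of "zh_CN.UTF-8")
picked_en=$(pick_xinput_config "$region_en" "$userdir" "$sysdir")
picked_zh=$(pick_xinput_config "$region_zh" "$userdir" "$sysdir")

echo "$region_en -> ${picked_en##*/}"
echo "$region_zh -> ${picked_zh##*/}"
rm -rf "$userdir" "$sysdir"
```

With an English locale there is no en_US file in this setup, so the loop falls through to default; only a zh_CN region reaches the Chinese input method configuration.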
The problem is this: before we changed the LANG variable to English, the LANG value the system obtained was Chinese, so it knew that the Chinese input method had to be enabled during GUI startup. After LANG is changed to English, the system concludes from LANG that it is an English system, so it neither starts the Chinese input method nor sets and exports the related variables, and the Chinese input method becomes unavailable. So if we "cheat" in this script and make the input method script "think" that the system is Chinese, will it run the Chinese input method and export the related variables? With that in mind, in xinput.sh, after the line

Code:
    lang_region=$(echo $tmplang | sed -e 's/\..*//')

I added the line lang_region="zh_CN". You could also change the original line directly to lang_region="zh_CN"; I added an extra line for convenience, because restoring the original behavior then only requires deleting the added line. Alternatively, you can change /etc/X11/xinit/xinput.d/${lang_region} in the for loop to /etc/X11/xinit/xinput.d/zh_CN. There are of course other ways to make the change, provided that you understand shell script syntax and the meaning of the script.

After this modification, even on an English system, xinput.sh reads the /etc/X11/xinit/xinput.d/zh_CN file, exports its content, sets input method variables such as XMODIFIERS, and runs the iiimx input method program. So why not simply run iiimx directly after the GUI has started? That does not work. An input method program has to cooperate with the application that receives the input, and many variables must be exported for that. Running iiimx directly starts only the main program without the related variables, so it cannot work with applications to complete the input.

When the modification is complete, save the script file.
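Why starting the input method program by itself is not enough can be illustrated with plain environment variables. In this sketch the configuration contents (XIM=iiimx, GTK_IM_MODULE=iiim) and the way XMODIFIERS is built from XIM are assumptions for illustration, not a copy of the real /etc/X11/xinit/xinput.d/zh_CN; the child sh process stands in for an application.

```shell
# Hypothetical stand-in for an xinput.d configuration file; values are illustrative.
xinput_conf=$(mktemp)
cat > "$xinput_conf" <<'EOF'
XIM=iiimx
GTK_IM_MODULE=iiim
EOF

# Case 1: an application started without the setup sees no input method variables.
without_setup=$(
    unset XMODIFIERS GTK_IM_MODULE
    sh -c 'echo "XMODIFIERS=${XMODIFIERS:-unset}"'
)

# Case 2: the startup script sources the configuration and exports the variables,
# so every application started afterwards inherits them.
with_setup=$(
    unset XMODIFIERS GTK_IM_MODULE
    . "$xinput_conf"
    XMODIFIERS="@im=$XIM"
    export XMODIFIERS GTK_IM_MODULE
    sh -c 'echo "XMODIFIERS=${XMODIFIERS:-unset}"'
)

echo "without setup: $without_setup"
echo "with setup:    $with_setup"
rm -f "$xinput_conf"
```

The exported variables live in the environment of the session that starts the applications, which is why they have to be set during GUI startup and not only in the input method program itself.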
Enter the startx command to start the graphical interface. You now have a fully English system interface together with the Chinese input method. Note that because the system is entirely in English and the default input method is English, in an application started from the GNOME or KDE menu you cannot switch to Chinese with Ctrl+Space the first time you want to enter Chinese. You need to click the input method icon on the taskbar to switch; after this first switch, Ctrl+Space toggles between the Chinese and English input methods.

4. Some follow-up problems

Some software, such as OpenOffice, cannot enter Chinese when started from the GNOME or KDE menu, even after switching to the Chinese input method. Because the whole desktop environment is in English, the software "inherits" the variables of the English environment and concludes that Chinese input is not allowed. In this case, open a GNOME terminal and temporarily set the LANG variable to zh_CN.UTF-8:

Code:
    [root@gucuiwen ~]# LANG="zh_CN.UTF-8"

Then, in the same GNOME terminal, run the command that opens OpenOffice:

Code:
    [root@gucuiwen ~]# oowriter &

In this way OpenOffice "inherits" the LANG variable of the GNOME terminal. After it starts, its toolbars and menus are in Chinese and it accepts Chinese input. By extension, this method lets you open any software with a Chinese or an English interface as needed. To run software with the English interface, start it from the GNOME or KDE menu; to run it with the Chinese interface, change the LANG variable in a terminal and start the software from that terminal. Of course, if you have installed fonts for other languages, you can run programs with interfaces in those languages too.
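The inheritance of LANG described in this section can be observed without starting a desktop application. The sketch below uses a child sh process as a stand-in for oowriter, and the one-command form LANG=... command, which sets the variable for that command only.

```shell
# Remember the terminal's own setting (may be empty or unset).
lang_before=${LANG-}

# Set LANG for a single command only; the child process inherits it.
child_lang=$(LANG="zh_CN.UTF-8" sh -c 'echo "$LANG"')

# The current shell is unaffected by the one-command assignment.
lang_after=${LANG-}

echo "child saw LANG=$child_lang"
echo "shell still has LANG=$lang_after"
```

With a real application this becomes, for example, LANG="zh_CN.UTF-8" oowriter &, which leaves the terminal itself in English. Note that if LC_ALL is set in the environment, it takes precedence over LANG.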
For example, to run a program with a Japanese interface:

Code:
    [root@gucuiwen ~]# LANG="ja_JP.UTF-8"
    [root@gucuiwen ~]# gedit &

The gedit editor opened with these two commands has a fully Japanese interface, yet it can enter Chinese and English as well as display Japanese. In this way multiple languages and scripts coexist on one system. The premise, of course, is that the Japanese fonts and locale are installed; otherwise all the text is displayed as a row of question marks. In short, first understand the principles, and then enjoy the fun of using Linux.

5. Quick setting steps

1. Modify the /etc/sysconfig/i18n file, changing LANG="zh_CN.UTF-8" to LANG="en_US.UTF-8".
2. Modify the /etc/X11/xinit/xinitrc.d/xinput.sh file, changing the line lang_region=$(echo $tmplang | sed -e 's/\..*//') to lang_region="zh_CN".
3. Restart the graphical interface. Chinese characters can now be displayed and entered in the English environment.

--------------------------------------------------------------------

The following is a personal supplement:

An ssh terminal is sometimes garbled when logging on to Linux over ssh, even with LANG=en_US.UTF-8 set. I have tried the SecureCRT, OpenSSH, and SSH Secure Shell clients. Sometimes changing the client settings helps, but sometimes only one command works after changing them and then the output becomes garbled again. By chance, I read someone on the Internet saying that changing the LANG value to "C" would fix it. It is uncanny: with just LANG=C, all the problems were solved, and there was no longer any need to configure the client. Truly inexplicable! I still do not know what the "C" means; it is amazing.
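On the meaning of "C": it names the standard C locale (also called the POSIX locale), the built-in default in which programs print their original, untranslated messages. That is a likely reason it sidesteps encoding mismatches between server and ssh client, although the article does not confirm the cause. A minimal sketch, assuming that the path /no/such/path does not exist:

```shell
# Run one command in the C locale. LC_ALL and LC_MESSAGES are cleared in a
# subshell first, because they would otherwise take precedence over LANG.
c_message=$(
    unset LC_ALL LC_MESSAGES
    LANG=C ls /no/such/path 2>&1 || true
)
echo "$c_message"
```

The error text comes out as the program's untranslated built-in message, regardless of which language packs are installed.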

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
