Linux character set and system language settings: LANG, locale, LC_ALL, POSIX commands and parameters

Source: Internet
Author: User
Tags i18n locale posix

Preface:

This article introduces Linux character sets and system language settings from a personal perspective, covering LANG, locale, LC_ALL, POSIX, and the related commands and parameters. It reflects my understanding as of June 21, 2017. My technical level is limited, so some points may not be deep or complete enough. Please point out any problems so we can discuss them. If later work or study shows that something here does not match actual behavior, I will update this post.


Reference links for this article:

1. http://blog.csdn.net/z4213489/article/details/7937894 (a good article, must read)
2. http://www.360doc.com/content/14/0103/13/10384031_342301450.shtml (a detailed comparison of the different formats, must read)


Body:


One: Character Set section


A character set is a lookup table that maps binary data to characters (also called a mapping table or a collection of correspondences).


Example:

Suppose this is the UTF8 character set mapping:
00010101 ---- 例
10101100 ---- 如

Suppose this is the GBK character set mapping:
00010101 ---- 乱
10101100 ---- 码


Explanation:

As the example shows, the same binary data is displayed differently under different character sets. This is where garbled text comes from: each character set maps the same data to different characters.
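The same effect can be reproduced with real encodings. The sketch below is an illustration added here, not part of the original example: it assumes the iconv tool is installed with GBK support, and the helper names raw_bytes and decode_as are made up for this purpose. The two bytes D6 D0 are read once as GBK and once as ISO-8859-1.

```shell
#!/bin/sh
# Two raw bytes, 0xD6 0xD0, written with octal escapes so the script
# itself stays plain ASCII.
raw_bytes() { printf '\326\320'; }

# decode_as CHARSET: interpret the same bytes under CHARSET and
# re-encode the result as UTF-8 (hypothetical helper for this example).
decode_as() { raw_bytes | iconv -f "$1" -t UTF-8; }

# Only run the demonstration when iconv is actually available.
if command -v iconv >/dev/null 2>&1; then
    decode_as GBK;        echo    # read as GBK: one Chinese character
    decode_as ISO-8859-1; echo    # read as Latin-1: two accented letters
fi
```

On a UTF-8 terminal the first line should show 中 and the second ÖÐ: identical bytes, different mapping table, different text.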


PS: When a browser renders a web page whose character set differs from the system character set, the page is usually not garbled (for example, an English Windows system can display Chinese pages normally). This is because the browser detects the character set of the page content when it fetches the file and then uses the corresponding character set to display it, as long as that character set is available on the operating system.

Supplementary knowledge: fonts

Font: the visual style in which text is rendered. The same character can be drawn in SimSun (Songti), Heiti, FangSong, Lishu, Kaiti, Microsoft YaHei, and so on. Fonts are built on top of a character set. However, a font may not contain glyphs for every character in a character set, so some characters may occasionally fail to display.


Summary:

Language support is built on the character set. For example, saying that our system uses Chinese means the following:
1. Write: the Chinese characters we type can be organized, saved, and transmitted by the computer in the format of a Chinese character set.
2. Read: when data (binary data) arrives over the network, my computer must be able to map it back to Chinese characters and display them (this is the mapping table described above).
In short, a Chinese environment means being able to read Chinese, write Chinese (the focus here), and transmit Chinese.

Location of character set files on Linux: /usr/share/i18n/charmaps


Two: System language settings (locale) section


1. What is locale


In Linux, the locale command displays the locale in which programs run, and the locale itself is set through environment variables. (This is the system's runtime language environment: application processes run on top of system processes, with the init process as their ancestor.)

Translated literally, locale means place, region, or area, but in Linux it carries more meaning: a locale defines the language environment a program runs in, based on the user's language, country or region, and local cultural conventions.

The main role of a locale is to describe the language habits and cultural conventions of people in a particular region. A locale is defined by several categories of conventions (variables).


2. Naming rules for locale


Locale naming rule: <language>_<region>.<character set encoding><@modifier>
For example:
zh_CN.utf8
zh stands for Chinese, CN for mainland China, and utf8 for the character set.

de_DE.UTF-8@euro
de stands for German, DE for Germany, UTF-8 for the character set, and euro for a modifier that adjusts the locale to European conventions.

This naming rule is the format we use when assigning values to the locale variables.
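To make the naming rule concrete, here is a minimal sketch that splits a locale name into its four parts using only POSIX parameter expansion. The function name parse_locale and the variable names it sets are assumptions chosen for this illustration; they are not part of any standard tool.

```shell
#!/bin/sh
# parse_locale NAME: split language_region.codeset@modifier into the
# variables language, territory, codeset and modifier.
# Parts that are missing from NAME are left empty.
parse_locale() {
    name=$1
    modifier=
    codeset=
    territory=
    # Peel off the optional parts from right to left.
    case $name in
        *@*) modifier=${name#*@}; name=${name%%@*} ;;
    esac
    case $name in
        *.*) codeset=${name#*.}; name=${name%%.*} ;;
    esac
    case $name in
        *_*) territory=${name#*_}; name=${name%%_*} ;;
    esac
    language=$name
}

parse_locale 'de_DE.UTF-8@euro'
echo "language=$language territory=$territory codeset=$codeset modifier=$modifier"
```

For de_DE.UTF-8@euro this prints language=de territory=DE codeset=UTF-8 modifier=euro. A bare name such as POSIX ends up entirely in language, with the other three parts empty.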



3. Locale command and parameter explanation


Setting the locale essentially means setting a group of variables: 12 variables that begin with LC_ (not counting LANG and LC_ALL).

Location of locale definition files on Linux: /usr/share/i18n/locales


Example:

~/.ssh> locale
LANG=zh_CN.utf8
LC_CTYPE="zh_CN.utf8"
LC_NUMERIC="zh_CN.utf8"
LC_TIME="zh_CN.utf8"
LC_COLLATE="zh_CN.utf8"
LC_MONETARY="zh_CN.utf8"
LC_MESSAGES="zh_CN.utf8"
LC_PAPER="zh_CN.utf8"
LC_NAME="zh_CN.utf8"
LC_ADDRESS="zh_CN.utf8"
LC_TELEPHONE="zh_CN.utf8"
LC_MEASUREMENT="zh_CN.utf8"
LC_IDENTIFICATION="zh_CN.utf8"
LC_ALL=


Explanation:


LANG # Has the lowest priority and is the default value for all LC_* variables. For each variable below that begins with LC_ (excluding LC_ALL), if no value is set, the system uses the value of LANG for it. If the variable already has a value, it stays unchanged. In the example output above, the values of the LC_* variables are in fact all determined by LANG.

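LC_CTYPE # Character classification and string handling. Controls how all characters are handled, including the character encoding, whether characters are single-byte or multibyte, how they are printed, and so on. This is the most important variable.
LC_NUMERIC # Formatting of non-monetary numbers.
LC_TIME # Formatting of times and dates.
LC_COLLATE # Comparison and sorting.
LC_MONETARY # Formatting of monetary amounts.
LC_MESSAGES # The language programs use for output, mainly informational messages, error messages, status information, titles, labels, buttons, and menus.
LC_PAPER # Default paper size.
LC_NAME # Format for writing personal names.
LC_ADDRESS # Format for writing addresses.
LC_TELEPHONE # Format for writing telephone numbers.
LC_MEASUREMENT # Units of measurement.
LC_IDENTIFICATION # Summary of the information the locale contains about itself.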

LC_ALL # This is not a locale category of its own. In C it is a macro, defined in <locale.h>, that selects every category at once when passed to setlocale(); as an environment variable it overrides all LC_* variables. Once it is set, every LC_* category takes the value of LC_ALL regardless of its own setting. Note that the LANG variable is not affected.

Macro: some readers may not be familiar with the term. Put simply, in computing a macro describes a batch operation. A macro processes data according to a specified rule and can be thought of as syntax substitution (much like find-and-replace in an editor, though somewhat more complex): it takes some input (usually a string), transforms it according to predefined rules, and produces the corresponding output (usually also a string). Real macros are more complicated than this, and interested readers can look up the details.
Here, the effect is that the value of LC_ALL overrides the values of the LC_* variables.
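A small sketch of this override in practice: LC_COLLATE is given one value and LC_ALL another for the same command, and the sort order follows LC_ALL. The function name sort_with_override is made up for the example, and en_US.UTF-8 is only a sample value that does not need to be installed, because LC_ALL=C takes over.

```shell
#!/bin/sh
# sort_with_override: LC_COLLATE asks for one locale, but LC_ALL is also
# set for the same command, so LC_ALL decides the sort order.
sort_with_override() {
    printf '%s\n' b A a B | LC_COLLATE=en_US.UTF-8 LC_ALL=C sort
}

sort_with_override
```

The output is A, B, a, b on separate lines: plain byte order, as the C locale defines it, with the LC_COLLATE setting ignored.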

Formatting: the meaning of "formatting" above may be unclear. To format is to reset the rules by which data is organized. Take an everyday example: to record some data we can write it on grid paper, on lined paper, on blank paper, and so on. The paper format is a way of organizing data, and different formats differ in how data is recorded, how much fits, and so on. Formatting a USB flash drive in Windows is the same idea: the drive may come formatted as FAT32 and we reformat it as NTFS, which you can picture as changing the drive's record keeping from grid paper to lined paper.

Priority: LC_ALL > LC_* > LANG

Note: having so many variables is useful in some cases. For example, when I need an English environment in which Chinese can still be entered, I can set LC_CTYPE to zh_CN.GB18030 and everything else to en_US.UTF-8.

Summary: LANG is the default value for LC_*, and LC_ALL takes precedence over LC_*. Once LC_ALL is set, it overrides the values of all LC_* variables, so changes to individual LC_* variables have no effect until LC_ALL is unset or set to an empty value.
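The priority rule can be written down as a tiny lookup. This is a sketch of the rule as described in this article, not the C library's actual code; the function name resolve_locale and its argument order are assumptions made for the example. An empty value is treated the same as an unset one, and POSIX is used as the fallback because that is what glibc's locale command reports when nothing is set.

```shell
#!/bin/sh
# resolve_locale LC_ALL_VALUE CATEGORY_VALUE LANG_VALUE
# Print the value that applies to one category, following
# LC_ALL > LC_* > LANG. Empty strings count as "not set".
resolve_locale() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"        # LC_ALL wins over everything
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"        # then the category's own variable
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"        # then LANG as the default
    else
        printf '%s\n' POSIX       # nothing set: fall back to POSIX
    fi
}

# The mixed setup from the note: English everywhere, Chinese LC_CTYPE.
resolve_locale '' 'zh_CN.GB18030' 'en_US.UTF-8'   # LC_CTYPE -> zh_CN.GB18030
resolve_locale '' ''              'en_US.UTF-8'   # LC_TIME  -> en_US.UTF-8
resolve_locale 'C' 'zh_CN.GB18030' 'en_US.UTF-8'  # LC_ALL set -> C
```

To ask about the current shell, pass the real variables, for example: resolve_locale "${LC_ALL:-}" "${LC_TIME:-}" "${LANG:-}".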

Addendum: in general, right after the system is installed the variables have values like the following:
ntp-slave:~ # locale
LANG=POSIX
LC_CTYPE=en_US.UTF-8
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

Explanation: C is the system's default locale, and POSIX is an alias for C, the standard C locale. Its properties and behavior are specified by the ISO C standard. On a freshly installed system the default locale is C or POSIX.
The C locale mentioned here uses ASCII encoding.
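One visible consequence of the C locale being ASCII-only is that it has no notion of multibyte characters, so text outside ASCII is handled byte by byte. The sketch below measures the length of one UTF-8 encoded Chinese character from inside the C locale. The function name length_in_c is an assumption for this example.

```shell
#!/bin/sh
# One Chinese character encoded as UTF-8 takes three bytes (E4 B8 AD).
text=$(printf '\344\270\255')

# length_in_c STRING: report the length of STRING as seen from the C
# locale. The function body is a subshell, so the LC_ALL change does
# not leak into the calling shell.
length_in_c() (
    LC_ALL=C
    printf '%s\n' "${#1}"
)

length_in_c "$text"    # 3: the C locale counts bytes, not characters
```

In a UTF-8 locale, shells that are multibyte-aware (bash, for instance) would report this string as a single character instead.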

POSIX: the Portable Operating System Interface (Portable Operating System Interface of UNIX, abbreviated POSIX). The POSIX standard defines the interfaces an operating system should provide to applications. It is the collective name for a family of API standards that IEEE defined so that software can run on a variety of UNIX operating systems. It is formally known as IEEE 1003, and its international standard name is ISO/IEC 9945.
In other words, a program written for one POSIX-compliant operating system can be compiled on any other POSIX operating system, even one from a different vendor.
Summary: POSIX is a common interface standard for UNIX-like systems, and programs developed against it can be ported flexibly between different systems.

In the context of locales, POSIX is an industry-wide default locale that is not tied to any region and is supported by all Linux distributions.


4. Common commands:

1. View current locale settings
# locale

ntp-slave:~ # locale
LANG=zh_CN.utf8
LC_CTYPE=en_US.UTF-8
LC_NUMERIC="zh_CN.utf8"
LC_TIME="zh_CN.utf8"
LC_COLLATE="zh_CN.utf8"
LC_MONETARY="zh_CN.utf8"
LC_MESSAGES="zh_CN.utf8"
LC_PAPER="zh_CN.utf8"
LC_NAME="zh_CN.utf8"
LC_ADDRESS="zh_CN.utf8"
LC_TELEPHONE="zh_CN.utf8"
LC_MEASUREMENT="zh_CN.utf8"
LC_IDENTIFICATION="zh_CN.utf8"
LC_ALL=

2. View all locales available on the current system
# locale -a
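Before choosing a locale it can help to check that it is actually installed. The following is a minimal sketch built on locale -a. The helper names normalize_locale and has_locale are made up for this example, and the normalization (lowercasing and dropping hyphens) is an assumption meant to cope with the same locale being written as zh_CN.UTF-8 in one place and zh_CN.utf8 in another.

```shell
#!/bin/sh
# normalize_locale NAME: lowercase NAME and drop hyphens, so that
# zh_CN.UTF-8 and zh_CN.utf8 compare equal.
normalize_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '-'
}

# has_locale NAME: succeed if NAME appears in the output of `locale -a`
# after both sides have been normalized the same way.
has_locale() {
    wanted=$(normalize_locale "$1")
    locale -a 2>/dev/null | tr '[:upper:]' '[:lower:]' | tr -d '-' |
        grep -Fqx -- "$wanted"
}

if has_locale 'zh_CN.UTF-8'; then
    echo "zh_CN.UTF-8 is installed"
else
    echo "zh_CN.UTF-8 is not installed"
fi
```

If the locale is reported as missing, it has to be generated or installed with the distribution's own tools first; the exact procedure differs between distributions and is outside the scope of this article.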

3. Set the system locale (using zh_CN.utf8 as an example)

1) Edit the file /etc/profile, add the following lines at the end of the file, then save and exit
# vim /etc/profile
export LC_ALL=zh_CN.utf8
export LANG=zh_CN.utf8
2) Make the change take effect:
# source /etc/profile



End:


Thank you for reading, I wish you a rewarding day, thank you!





This article is from the "Breeze Month Blog" blog, please be sure to keep this source http://watchmen.blog.51cto.com/6091957/1940609
