[Architecture, 13] Oracle Character Sets in Detail
I. A simple understanding of character sets: a character set is a table that maps characters to their encodings. Software that has its own character set uses it; software that does not have one uses the OS character set.
II. Character set usage:
1. Oracle has two character sets: the database character set and the national character set. Both are chosen when the database is created.
Database character set:
(1) stores data of types CHAR, VARCHAR2, CLOB, LONG, and so on
(2) identifies table names, column names, and PL/SQL variables
(3) stores SQL and PL/SQL program units
National character set:
(1) stores data of types NCHAR, NVARCHAR2, NCLOB, and so on
2. View the character sets of the database:
SQL> select * from nls_database_parameters;
PARAMETER                      VALUE
------------------------------ ----------------------------------------
...
NLS_CHARACTERSET               WE8ISO8859P1    -- database character set
NLS_CALENDAR                   GREGORIAN
NLS_DATE_FORMAT                DD-MON-RR
NLS_DATE_LANGUAGE              AMERICAN
...
NLS_TIME_FORMAT                HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT           DD-MON-RR HH.MI.SSXFF AM
...
NLS_NCHAR_CHARACTERSET         AL16UTF16       -- national character set
NLS_RDBMS_VERSION              10.2.0.1.0
20 rows selected.
III. Character set naming:
1. Oracle character set names follow the pattern <Language><bit size><encoding>. For example, ZHS16GBK denotes a Simplified Chinese character set that uses the GBK encoding with 16-bit (two-byte) characters.
Common character sets:
US7ASCII: commonly used by American users.
ZHS16CGB231280: the old Chinese character set, used only by Chinese users.
ZHS16GBK: the new Chinese character set. It is a superset of ZHS16CGB231280, but not a strict superset.
AL32UTF8: the latest Unicode character set. It covers more than the UTF8 character set and is generally selected as the database character set.
AL16UTF16: generally selected as the national character set.
UTF8: the older Unicode character set.
2. View all Oracle character sets:
SQL> select * from v$nls_valid_values;
3. Check the OS character set: on Linux, run locale or locale -a; on Windows, run chcp.
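The naming rule in this section can be illustrated with a small parser. This is only a sketch: the function name parse_charset_name and the regular expression are illustrative assumptions, not anything Oracle provides, and names that do not fit the pattern (such as UTF8) simply return None.

```python
import re
from typing import NamedTuple, Optional


class CharsetName(NamedTuple):
    language: str   # language or region prefix, e.g. "ZHS"
    bits: int       # bit size taken from the name, e.g. 16
    encoding: str   # encoding label, e.g. "GBK"


# <Language><bit size><encoding>: letters, then digits, then the rest.
_PATTERN = re.compile(r"^([A-Z]+)(\d+)([A-Z][A-Z0-9]*)$")


def parse_charset_name(name: str) -> Optional[CharsetName]:
    """Split an Oracle character set name; return None if it does not fit the pattern."""
    match = _PATTERN.match(name.upper())
    if match is None:
        return None
    language, bits, encoding = match.groups()
    return CharsetName(language, int(bits), encoding)


for name in ("ZHS16GBK", "US7ASCII", "AL32UTF8", "UTF8"):
    print(name, parse_charset_name(name))
```

The parser only splits the name into its three parts; whether a name is actually a valid character set is still something to check in v$nls_valid_values.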
IV. Relationship between the client OS character set, the NLS_LANG setting, the server OS character set, and the Oracle database character set:
1. The client program sqlplus has no character set of its own; it uses the character set of the operating system.
2. When software has its own character set, the operating system character set is not used. Oracle therefore does not use the character set of the server operating system.
3. All character set conversion is performed by Oracle.
4. Main flow: when you type Chinese text into sqlplus, the text is encoded with the client operating system's character set and sent to Oracle. If the database character set differs from the client character set, Oracle decodes the bytes into characters and then re-encodes them in the database character set before storing them.
5. How Oracle knows the client's character set: through the NLS_LANG parameter, which is set on the client. For example: set NLS_LANG=american_america.zhs16gbk
6. How to set each character set:
① Client operating system character set: a Chinese character set or UTF8.
② Oracle character set: generally specified when the database is created.
③ Client NLS_LANG parameter: must be consistent with the character set of the client operating system.
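The main flow in item 4 (client bytes are decoded into characters, then re-encoded in the database character set) can be imitated in a few lines. This is a rough sketch, not Oracle's implementation: the function store_in_database is hypothetical, and the Python codecs "gbk" and "utf-8" are assumed stand-ins for ZHS16GBK and AL32UTF8.

```python
def store_in_database(client_bytes: bytes, client_charset: str, db_charset: str) -> bytes:
    """Imitate the conversion: decode with the client charset, re-encode with the database charset."""
    if client_charset == db_charset:
        # Same character set on both sides: the bytes are stored unchanged.
        return client_bytes
    text = client_bytes.decode(client_charset)   # bytes -> characters
    return text.encode(db_charset)               # characters -> database encoding


# The client terminal encodes the typed text with the OS character set (GBK here).
client_bytes = "你好".encode("gbk")

# NLS_LANG correctly declares GBK, and the database stores UTF-8.
stored = store_in_database(client_bytes, "gbk", "utf-8")
print(client_bytes.hex(), "->", stored.hex())

# If NLS_LANG declared the wrong client character set, the same bytes
# would be interpreted as different (here: invalid) characters.
garbled = client_bytes.decode("utf-8", errors="replace")
print(garbled == "你好")
```

The last two lines show why NLS_LANG has to match the client operating system: the bytes themselves do not say which character set produced them, so a wrong declaration yields wrong characters.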
V. NLS_LANG settings: the client character set part must be consistent with the character set of the client operating system.
Format: NLS_LANG=<language>_<territory>.<client character set>
Language: the language of Oracle messages, the sort order, and day and month names.
Territory: the default date, number, and currency formats.
Client character set: the character set that the client uses.
For example, in NLS_LANG=AMERICAN_AMERICA.US7ASCII, AMERICAN is the language, AMERICA is the territory, and US7ASCII is the client character set.
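The three-part format can be made concrete with a small parser. This is a minimal sketch that assumes the full <language>_<territory>.<charset> form; the function name parse_nls_lang is illustrative and the sketch does not handle values with parts left out.

```python
from typing import NamedTuple


class NlsLang(NamedTuple):
    language: str
    territory: str
    charset: str


def parse_nls_lang(value: str) -> NlsLang:
    """Split '<language>_<territory>.<charset>' into its three parts."""
    locale_part, dot, charset = value.strip().rpartition(".")
    if not dot or not charset:
        raise ValueError(f"missing '.<charset>' part: {value!r}")
    language, underscore, territory = locale_part.partition("_")
    if not underscore or not language or not territory:
        raise ValueError(f"missing '<language>_<territory>' part: {value!r}")
    return NlsLang(language, territory, charset)


print(parse_nls_lang("AMERICAN_AMERICA.US7ASCII"))
print(parse_nls_lang("SIMPLIFIED CHINESE_CHINA.ZHS16GBK"))
```

Splitting on the last dot first keeps language names that contain spaces, such as SIMPLIFIED CHINESE, intact.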
VI. Example: check how data is encoded
# dump(name, 1016): format 16 shows the bytes of the name column in hexadecimal; adding 1000 also shows the CharacterSet in the result.
SQL> select id, name, dump(name, 1016) from t2;
# View the encoding of a string literal
SQL> select dump('你好', 1016) from dual;
DUMP('你好',1016)
---------------------------------------------
Typ=96 Len=4 CharacterSet=ZHS16GBK: c4,e3,ba,c3
# With format 16 alone, the CharacterSet is not shown
SQL> select dump('你好', 16) from dual;
DUMP('你好',16)
-------------------------
Typ=96 Len=4: c4,e3,ba,c3
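The bytes c4,e3,ba,c3 in the output above are the GBK encoding of the two characters 你好. The sketch below reproduces the same byte list outside the database; the helper dump_hex is hypothetical and only imitates the tail of the DUMP output, and the Python codec "gbk" is assumed to correspond to ZHS16GBK for these characters.

```python
def dump_hex(text: str, codec: str) -> str:
    """Encode text and list its bytes in hexadecimal, similar to the tail of DUMP(..., 16)."""
    data = text.encode(codec)
    return f"Len={len(data)}: " + ",".join(f"{byte:02x}" for byte in data)


print(dump_hex("你好", "gbk"))     # two bytes per character
print(dump_hex("你好", "utf-8"))   # three bytes per character
```

Comparing the two lines shows how the same characters take 4 bytes in GBK and 6 bytes in UTF-8, which is why DUMP is a convenient way to tell which encoding is actually stored.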
How to query the character set of Oracle
1. What is the Oracle character set?
An Oracle character set is a collection of symbols used to interpret byte data. Character sets differ in size, and one may contain another. Oracle's national language support architecture lets you store, process, and retrieve data in your local language. It makes database tools, error messages, sort order, dates, times, currency, numbers, and calendars adapt automatically to the local language and platform.
The parameter that most affects character set handling in an Oracle database is NLS_LANG. Its format is: NLS_LANG = language_territory.charset
It has three components (language, territory, and character set), each of which controls a subset of the NLS features: language specifies the language of server messages, territory specifies the server's date and number formats, and charset specifies the character set. For example: AMERICAN_AMERICA.ZHS16GBK
As this composition shows, only the third part really affects the database character set. So if the third part is the same for two databases, data can be exported and imported between them; the first two parts only determine things such as whether messages appear in Chinese or English.
2. Many people have run into data import failures caused by differing character sets. Three character sets are involved: the character set of the Oracle server, the character set of the Oracle client, and the character set of the dmp file. For data to be imported correctly, all three must be consistent.
How to view the character set of the Oracle database: detailed operations
Database server character set: select * from nls_database_parameters; the data comes from props$ and represents the character set of the database.
Client character set environment: select * from nls_instance_parameters; the data comes from v$parameter and represents the client's character set settings, which may come from a parameter file, an environment variable, or the registry.
Session character set environment: select * from nls_session_parameters; the data comes from v$nls_parameters and represents the session's own settings, which may come from the session's environment variables or from an alter session statement. If the session has no special settings, it is consistent with nls_instance_parameters.
The client character set must be the same as the server character set for non-ASCII characters in the database to display correctly. When several settings exist, the order of precedence is: alter session > environment variable > registry > parameter file.
The character sets must be consistent, but the language settings can differ. We recommend English for the language setting. For example, if the character set is zhs16gbk, NLS_LANG can be American_America.zhs16gbk.
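The precedence order described above (alter session > environment variable > registry > parameter file) amounts to picking the first source that has a value. The following is a minimal sketch of that rule only; the function effective_setting and the source labels are illustrative assumptions, and nothing here queries a real database.

```python
from typing import Mapping, Optional, Tuple

# Highest precedence first, as stated in the text.
PRECEDENCE = ("alter session", "environment variable", "registry", "parameter file")


def effective_setting(settings: Mapping[str, Optional[str]]) -> Optional[Tuple[str, str]]:
    """Return (source, value) of the highest-precedence source that is set, or None."""
    for source in PRECEDENCE:
        value = settings.get(source)
        if value:
            return source, value
    return None


example = {
    "parameter file": "AMERICAN_AMERICA.WE8ISO8859P1",
    "registry": "AMERICAN_AMERICA.ZHS16GBK",
    "environment variable": None,   # not set in this example
}
print(effective_setting(example))
```

In this example the registry value wins because no alter session or environment variable setting is present.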
2.1 Querying the character set of the Oracle server
There are many ways to find the character set of the Oracle server. The most direct query is:
SQL> select userenv('language') from dual;
USERENV('LANGUAGE')
----------------------------------------------------
SIMPLIFIED CHINESE_CHINA.ZHS16GBK
In a session with different language settings, the same query returns:
SQL> select userenv('language') from dual;
AMERICAN_AMERICA.ZHS16GBK
2.2 Querying the character set of a dmp file
A dmp file exported with Oracle's exp tool also carries character set information: the 2nd and 3rd bytes of the file record its character set. If the dmp file is small, for example a few MB or a few dozen MB, open it in UltraEdit in hexadecimal mode and read the 2nd and 3rd bytes, for example 0354. Then use the following SQL statement to find the corresponding character set:
SQL> select nls_charset_name(to_number('0354','xxxx')) from dual;
ZHS16GBK
If the dmp file is large, for example 2 GB or more (the most common case), a text editor opens it slowly or not at all. In that case, use the following command (on a Unix host):
cat exp.dmp | od -x | head -1 | awk '{print $2 $3}' | cut -c 3-6
Then use the SQL statement above to obtain the character set.
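As an alternative to a hex editor or the od pipeline, the 2nd and 3rd bytes can be read directly. This is a sketch under the assumption stated above (that those two bytes hold the character set ID); the function name dmp_charset_id is hypothetical, and the demo file is a made-up three-byte header, not a real export. Reading the bytes directly also avoids depending on how od -x groups bytes on a given host.

```python
import os
import tempfile
from typing import Tuple


def dmp_charset_id(path: str) -> Tuple[str, int]:
    """Return the 2nd and 3rd bytes of a dmp file as (hex string, integer)."""
    with open(path, "rb") as dmp:
        header = dmp.read(3)          # only the first 3 bytes are needed, even for huge files
    if len(header) < 3:
        raise ValueError("file is too short to contain a character set ID")
    hex_id = header[1:3].hex()        # bytes 2 and 3 (1-based)
    return hex_id, int(hex_id, 16)


# Demo with a made-up header: the first byte and the trailing text are placeholders.
with tempfile.NamedTemporaryFile(suffix=".dmp", delete=False) as tmp:
    tmp.write(b"\x03\x03\x54" + b"placeholder body")
    demo_path = tmp.name

hex_id, numeric_id = dmp_charset_id(demo_path)
os.remove(demo_path)
print(hex_id, numeric_id)
```

The hex string can then be passed to the nls_charset_name(to_number(..., 'xxxx')) query shown above to get the character set name.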
2.3 Querying the character set of the Oracle client
On Windows, it is the NLS_LANG value of the corresponding OracleHome in the registry. You can also set it yourself in a DOS window, for example:
set nls_lang=AMERICAN_AMERICA.ZHS16GBK
This affects only the environment variables of that window.
On the Unix platform, the environment variable NLS_...