How to view and modify Oracle character sets

Source: Internet
Author: User

I. What is an Oracle character set

An Oracle character set is a collection of symbols together with the byte values used to encode them. Character sets differ in size and can contain one another. Oracle's National Language Support (NLS) architecture lets you store, process, and retrieve data in local languages. It makes database tools, error messages, sort orders, dates, times, currencies, numbers, and calendars adapt automatically to the local language and platform.

The most important parameter affecting the Oracle database character set is NLS_LANG.

Its format is: NLS_LANG = language_territory.charset

It has three components (language, territory, and character set), and each component controls a subset of the NLS features.

Where:

Language: specifies the language of server messages, for example whether messages appear in Chinese or English

Territory: specifies the server's date and number formats

Charset: specifies the character set

For example: AMERICAN_AMERICA.ZHS16GBK

From the composition of NLS_LANG we can see that the component that actually affects the database character set is the third one.

So two databases can exchange data through export and import as long as the third component, the character set, is the same. The first two components only affect things such as whether prompts and messages appear in Chinese or English.
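As a minimal sketch of the format described above, the following Python splits an NLS_LANG value into its three components and compares the third component of two settings. The names `NlsLang`, `parse_nls_lang`, and `same_charset` are hypothetical helpers, and the sketch assumes the full `language_territory.charset` form is given.

```python
from typing import NamedTuple


class NlsLang(NamedTuple):
    language: str
    territory: str
    charset: str


def parse_nls_lang(value: str) -> NlsLang:
    """Split 'LANGUAGE_TERRITORY.CHARSET' into its three components."""
    # The character set is whatever follows the last dot.
    prefix, dot, charset = value.strip().rpartition(".")
    # Language and territory are separated by the first underscore.
    language, underscore, territory = prefix.partition("_")
    if not (dot and underscore and language and territory and charset):
        raise ValueError(f"not a full NLS_LANG value: {value!r}")
    return NlsLang(language.upper(), territory.upper(), charset.upper())


def same_charset(a: str, b: str) -> bool:
    """True when two NLS_LANG values agree on the third component."""
    return parse_nls_lang(a).charset == parse_nls_lang(b).charset
```

For instance, `same_charset("AMERICAN_AMERICA.ZHS16GBK", "SIMPLIFIED CHINESE_CHINA.ZHS16GBK")` is true even though the language and territory differ.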

II. Character set concepts

2.1 Character Set

In essence, a character set assigns a distinct numeric code to each symbol in a specific group of symbols, according to a character encoding scheme. The earliest encoding scheme supported by Oracle databases is US7ASCII.

Oracle character set names follow this pattern:

<Language><bit size><encoding>

That is, a language abbreviation, the number of bits per character, and the encoding name.

For example, ZHS16GBK is a Simplified Chinese character set in the GBK encoding, with characters of up to 16 bits (two bytes).
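The naming pattern can be illustrated with a small regular expression. This is only a sketch: `split_charset_name` is a hypothetical helper, and it assumes the name really follows the language, bit size, encoding pattern (names such as UTF8 do not, and yield `None`).

```python
import re

# Leading letters (language), then digits (bit size), then the encoding name.
_NAME = re.compile(r"^([A-Z]+)(\d+)(.+)$")


def split_charset_name(name: str):
    """Return (language, bits, encoding), or None if the name does not fit."""
    match = _NAME.match(name.upper())
    if match is None:
        return None
    return match.group(1), int(match.group(2)), match.group(3)
```

For example, `split_charset_name("ZHS16GBK")` gives `("ZHS", 16, "GBK")`.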

2.2 Character encoding scheme

2.2.1 Single byte encoding

(1) Single-byte 7-bit character sets can define 128 characters. The most commonly used one is US7ASCII.

(2) Single-byte 8-bit character sets can define 256 characters and are suitable for most European countries.

For example: WE8ISO8859P1 (Western European, 8-bit, ISO 8859 Part 1 encoding)
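To see the difference between 7-bit and 8-bit single-byte sets, the sketch below uses Python's built-in `ascii` and `iso-8859-1` codecs as stand-ins for US7ASCII and WE8ISO8859P1. That mapping is an assumption made for illustration, and `fits` and `encoded_size` are hypothetical helper names.

```python
def fits(text: str, codec: str) -> bool:
    """True if every character of text can be encoded in the given codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


def encoded_size(text: str, codec: str) -> int:
    """Number of bytes text occupies in the given codec."""
    return len(text.encode(codec))
```

With these helpers, `fits("café", "ascii")` is false because "é" is outside the 128 ASCII codes, while `encoded_size("café", "iso-8859-1")` is 4: one byte per character.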

2.2.2 Multibyte encoding

(1) Variable-length multi-byte coding

Some characters are represented in one byte and others in two or more bytes. Variable-length multibyte encodings are often used to support Asian languages such as Japanese, Chinese, and Hindi.

For example: AL32UTF8 (where AL stands for All, meaning all languages) and ZHS16CGB231280

(2) fixed-length multi-byte encoding

Each character is encoded in a fixed number of bytes. Currently the only fixed-length multibyte encoding that Oracle supports is AL16UTF16, and it can be used only as a national character set.

2.2.3 Unicode encoding

Unicode is a single encoding scheme that covers all known characters in use worldwide; that is, Unicode assigns a unique code to each character. UTF-16 is the 16-bit encoding of Unicode. Oracle treats it as a fixed-length multibyte encoding that represents a Unicode character in 2 bytes (characters outside the Basic Multilingual Plane take a 4-byte surrogate pair), and AL16UTF16 is the UTF-16 encoded character set.

UTF-8 is the 8-bit encoding of Unicode, a variable-length multibyte encoding that represents a Unicode character in 1 to 4 bytes. AL32UTF8, UTF8, and UTFE are UTF-8 encoded character sets (UTFE is the variant for EBCDIC platforms).
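The byte widths of the two Unicode encodings can be checked directly with Python's standard `utf-8` and `utf-16-be` codecs. This is a sketch of the encodings themselves, not of Oracle's implementation, and `utf_sizes` is a hypothetical helper name.

```python
def utf_sizes(ch: str):
    """Return (bytes in UTF-8, bytes in UTF-16) for a single character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))


# One sample character from each UTF-8 width class.
SAMPLES = {
    "A": utf_sizes("A"),                    # Basic Latin
    "\u00e9": utf_sizes("\u00e9"),          # Latin small e with acute
    "\u4e2d": utf_sizes("\u4e2d"),          # CJK ideograph
    "\U0001F600": utf_sizes("\U0001F600"),  # outside the Basic Multilingual Plane
}
```

The UTF-8 width grows from 1 to 4 bytes across the samples, while UTF-16 stays at 2 bytes until the last sample, which needs a 4-byte surrogate pair.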

2.3 Character set supersets

When the encoded values of one character set (character set A) include all the encoded values of another character set (character set B), and the same encoded value represents the same character in both, character set A is a superset of character set B, and character set B is a subset of character set A.

The official Oracle8i and Oracle9i documentation provides a table of subset-superset pairs; for example, WE8ISO8859P1 is a subset of WE8MSWIN1252. Because US7ASCII is the earliest Oracle database encoding, many character sets are supersets of it: WE8ISO8859P1, ZHS16CGB231280, and ZHS16GBK are all supersets of US7ASCII.
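The superset definition can be tested mechanically for single-byte codes. The sketch below uses Python's codecs as stand-ins for the Oracle character sets (an assumption for illustration) and only examines the 256 one-byte values, so it is a partial check rather than a full proof. `single_byte_superset` is a hypothetical helper.

```python
def single_byte_superset(sub: str, sup: str) -> bool:
    """True if every one-byte code defined in `sub` means the same in `sup`."""
    for value in range(256):
        raw = bytes([value])
        try:
            expected = raw.decode(sub)
        except UnicodeDecodeError:
            continue  # this code is not defined in the candidate subset
        try:
            if raw.decode(sup) != expected:
                return False  # same code, different character
        except UnicodeDecodeError:
            return False  # code missing from the candidate superset
    return True
```

Under this check, `ascii` is contained in `iso-8859-1`, `gbk`, and `utf-8`, whereas `iso-8859-1` is not contained in `utf-8`, because bytes 0x80 to 0xFF are not valid one-byte UTF-8 codes.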

2.4 Database Character Set (Oracle server-side character set)

The database character set is specified when the database is created and normally cannot be changed afterwards. When you create a database, you can specify the character set (CHARACTER SET) and the national character set (NATIONAL CHARACTER SET).

2.4.1 Character Set

(1) It is used to store data of types such as CHAR, VARCHAR2, CLOB, and LONG.

(2) It is used for identifiers such as table names, column names, and PL/SQL variables.

(3) It is used to store SQL and PL/SQL program units.

2.4.2 National Character Set:

(1) It is used to store data of types such as NCHAR, NVARCHAR2, and NCLOB.

(2) The national character set is essentially an additional character set chosen for the database, mainly to strengthen Oracle's character handling: the NCHAR data types can support the fixed-length multibyte encodings used in Asia, while the database character set cannot. The national character set was redefined in Oracle9i and can only be one of the Unicode encodings AL16UTF16 and UTF8; the default is AL16UTF16.

2.4.3 Query Character Set parameters

You can query the following data dictionary tables and views to see the character set settings:

NLS_DATABASE_PARAMETERS, PROPS$, V$NLS_PARAMETERS

In the query results, NLS_CHARACTERSET is the database character set and NLS_NCHAR_CHARACTERSET is the national character set.
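One way to read the two settings from a program is sketched below. It assumes a DB-API 2.0 cursor that is already connected to the database (for example one obtained from the python-oracledb driver; the connection itself is not shown), and `fetch_charsets` is a hypothetical helper. The query reads the PARAMETER and VALUE columns of NLS_DATABASE_PARAMETERS.

```python
QUERY = (
    "SELECT parameter, value FROM nls_database_parameters "
    "WHERE parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET')"
)


def fetch_charsets(cursor) -> dict:
    """Return {parameter name: value} for the two character set parameters."""
    cursor.execute(QUERY)
    # Each row is a (parameter, value) pair, so the rows convert to a dict.
    return dict(cursor.fetchall())
```

The same SELECT statement can also be run interactively in SQL*Plus.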

2.4.4 Modify the database character set

As stated above, in principle the database character set cannot be changed after the database is created. There are, however, two ways to do it.

1. Export the database data, rebuild the database with the new character set, and then import the data so that it is converted. This is the usual approach.

2. Use the ALTER DATABASE CHARACTER SET statement. This is restricted: the character set can be changed only if the new character set is a superset of the current one. For example, UTF8 is a superset of US7ASCII, so the database character set can be changed with ALTER DATABASE CHARACTER SET UTF8.
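The superset restriction on the second method can be expressed as a guard that refuses to build the statement for an unsupported pair. This is only a sketch: `KNOWN_SUPERSETS` and `alter_charset_statement` are hypothetical names, the table holds just the one pair mentioned in the text, and the function builds the statement text without executing it.

```python
# (current, new) pairs where `new` is a superset of `current`.
# Only the pair named in the text is listed; a real list would have to
# come from Oracle's documentation.
KNOWN_SUPERSETS = {("US7ASCII", "UTF8")}


def alter_charset_statement(current: str, new: str, known=KNOWN_SUPERSETS) -> str:
    """Build the ALTER DATABASE statement, or raise if `new` is not a known superset."""
    pair = (current.upper(), new.upper())
    if pair not in known:
        raise ValueError(f"{pair[1]} is not a known superset of {pair[0]}")
    return f"ALTER DATABASE CHARACTER SET {pair[1]}"
```

For any pair not in the table, such as going from UTF8 back to US7ASCII, the function raises instead of returning a statement, which corresponds to falling back on the export and import method.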

2.5 Client Character Set (Nls_lang parameter)

2.5.1 Client Character Set meaning

The client character set defines how client character data is encoded. Any character data that originates from or is destined for the client uses the character set defined on the client. A client here means any application that connects directly to the database, such as SQL*Plus or exp/imp. The client character set is set through the NLS_LANG parameter.

2.5.2 Nls_lang parameter format

NLS_LANG=language_territory.client character set

Language: specifies the language used for Oracle messages, sorting, and day and month names

Territory: specifies the default date, number, and currency formats

Client character set: specifies the character set that the client uses

For example: NLS_LANG=AMERICAN_AMERICA.US7ASCII

AMERICAN is the language, AMERICA is the territory, and US7ASCII is the client character set.
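Since NLS_LANG is read from the client's environment, a script that launches a client tool can set it per process. The sketch below composes the value from its three parts and returns a copy of the environment with NLS_LANG set; `client_env` is a hypothetical helper and the current process environment is left untouched.

```python
import os


def client_env(language: str, territory: str, charset: str, base=None) -> dict:
    """Return a copy of the environment with NLS_LANG set for a client process."""
    env = dict(os.environ if base is None else base)
    env["NLS_LANG"] = f"{language}_{territory}.{charset}"
    return env


# The result can be passed as the env argument when starting a client,
# for example with subprocess.run([...], env=client_env(...)).
```

Here `client_env("AMERICAN", "AMERICA", "US7ASCII")["NLS_LANG"]` is `"AMERICAN_AMERICA.US7ASCII"`.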
