A Preliminary Discussion of Character Set Problems (II)


-- The character set of the database

Saturday, 2004-09-11 11:38 eygle

Originally published in the ITPUB technology series "Oracle Database DBA topic Technology Pristine". Reproduction of this article without permission is prohibited.
Original link: http://www.eygle.com/special/NLS_CHARACTER_SET_02.htm


2. Character set of the database


The character set is specified when the database is created and is usually not changed afterwards, so choosing the right character set at creation time is particularly important.
When creating a database, we can specify the character set (CHARACTER SET) and the national character set (NATIONAL CHARACTER SET).
The character set is used to store:
CHAR, VARCHAR2, CLOB, LONG and other such types of data
Identifiers such as table names, column names, and PL/SQL variables
SQL and PL/SQL program units, etc.
The national character set is used to store:
NCHAR, NVARCHAR2, NCLOB and other such types of data

These settings are specified when the database is created, and we can look at the creation script for the database:





connect SYS/change_on_install as SYSDBA
set echo on
spool E:\oracle\ora92\assistants\dbca\logs\CreateDB.log
-- Start the instance without mounting a database, using the given parameter file
startup nomount pfile="E:\oracle\admin\eygle\scripts\init.ora";
CREATE DATABASE eygle
MAXINSTANCES 1
MAXLOGHISTORY 1
MAXLOGFILES 5
MAXLOGMEMBERS 3
MAXDATAFILES 100
DATAFILE 'E:\oracle\oradata\eygle\system01.dbf' SIZE 250M REUSE AUTOEXTEND ON NEXT 10240K MAXSIZE UNLIMITED
EXTENT MANAGEMENT LOCAL
DEFAULT TEMPORARY TABLESPACE TEMP TEMPFILE 'E:\oracle\oradata\eygle\temp01.dbf' SIZE 40M REUSE AUTOEXTEND
ON NEXT 640K MAXSIZE UNLIMITED
UNDO TABLESPACE "UNDOTBS1" DATAFILE 'E:\oracle\oradata\eygle\undotbs01.dbf' SIZE 50M REUSE AUTOEXTEND
ON NEXT 5120K MAXSIZE UNLIMITED
-- Database character set and national character set
CHARACTER SET ZHS16GBK
NATIONAL CHARACTER SET AL16UTF16
LOGFILE GROUP 1 ('E:\oracle\oradata\eygle\redo01.log') SIZE 10M,
GROUP 2 ('E:\oracle\oradata\eygle\redo02.log') SIZE 10M,
GROUP 3 ('E:\oracle\oradata\eygle\redo03.log') SIZE 10M;
spool off
exit





The CHARACTER SET and NATIONAL CHARACTER SET clauses in the script above are the character set settings that matter to us.

When you create the database with the Database Configuration Assistant (DBCA), you select the character set on the character set page of the wizard. On a Simplified Chinese platform, the default character set is ZHS16GBK.






Once your character set is selected, the characters that can be stored in the database are restricted, so the character set you choose should be able to hold all the characters you will use.
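
To make this restriction concrete, here is a minimal sketch that checks whether a piece of text can be represented in a given encoding. It assumes that Python's `gb2312` and `gbk` codecs are reasonable approximations of the Oracle character sets ZHS16CGB231280 and ZHS16GBK; they are not Oracle's own tables, and `can_store` is a hypothetical helper written for this example.

```python
# Assumption: Python's "gb2312" and "gbk" codecs stand in for the Oracle
# character sets ZHS16CGB231280 and ZHS16GBK.

def can_store(text: str, codec: str) -> bool:
    """Return True if every character of `text` is encodable in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# Simplified and traditional forms of the character for "dragon".
for ch in ("龙", "龍"):
    print(ch, "gb2312:", can_store(ch, "gb2312"), "gbk:", can_store(ch, "gbk"))
```

The traditional form is encodable in GBK but not in GB2312, so a database created with the narrower character set could not hold it without loss. Running a check like this over representative application data before choosing a character set is one inexpensive way to catch such gaps.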


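The common Chinese character sets are:

Name            Description                            Comments
ZHS16CGB231280  CGB2312-80 16-bit Simplified Chinese   MB, ASCII
ZHS16GBK        GBK 16-bit Simplified Chinese          MB, ASCII, UDC

(MB: multibyte; ASCII: strict superset of ASCII; UDC: supports user-defined characters)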


GB2312 is the national standard of the People's Republic of China for Chinese character information interchange. Its full name is "Chinese Character Coded Character Set for Information Interchange, Basic Set". It was issued by the National Standards Bureau, took effect on May 1, 1981, and is in general use in mainland China. It is also used in Singapore and other places.
GBK is a guiding (non-mandatory) specification issued in December 1995.
GBK is compatible with the de facto internal code standard that corresponds to the GB 2312-80 interchange code. At the repertoire level, it also supports all 20,902 Chinese, Japanese, and Korean (CJK) ideographs of ISO/IEC 10646-1 and GB 13000-1, so it contains many more codes than GB2312.

Note, however, that ZHS16GBK is not a strict superset of ZHS16CGB231280: although every character of the latter exists in the former, the same code may represent different characters in the two character sets. Database character set conversions between them therefore still need special care.
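
One way to look for such discrepancies is to decode every two-byte code in the GB2312 range with both encodings and compare the results. The sketch below does this with Python's `gb2312` and `gbk` codecs, which are assumed here to be approximations of ZHS16CGB231280 and ZHS16GBK. Whether Oracle's conversion tables disagree at the same codes is not something this sketch can show, and `differing_codes` is a hypothetical helper.

```python
# Assumption: Python's "gb2312" and "gbk" codecs stand in for
# ZHS16CGB231280 and ZHS16GBK; Oracle's tables may differ.

def differing_codes():
    """List (bytes, gb2312_char, gbk_char) where the two codecs disagree."""
    diffs = []
    for lead in range(0xA1, 0xF8):        # GB2312 lead-byte range
        for trail in range(0xA1, 0xFF):   # GB2312 trail-byte range
            raw = bytes([lead, trail])
            try:
                narrow = raw.decode("gb2312")
            except UnicodeDecodeError:
                continue                  # code not assigned in GB2312
            try:
                wide = raw.decode("gbk")
            except UnicodeDecodeError:
                wide = None               # code not assigned in GBK
            if narrow != wide:
                diffs.append((raw, narrow, wide))
    return diffs

for raw, narrow, wide in differing_codes():
    print(raw.hex(), repr(narrow), repr(wide))
```

Any code this prints is a point where copying bytes unchanged from one encoding to the other would silently change the character, which is why a conversion should go through a proper character set conversion rather than a byte-for-byte copy.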



Oracle character set names follow this convention:

<language><bit size><encoding>

For example, ZHS16GBK breaks down as ZHS (Simplified Chinese) + 16 (bit size) + GBK (encoding).



Note that some character set names break this convention; UTF8 in Oracle8/Oracle8i was the first character set to do so.
One family of character set names begins with AL, for example:
AL16UTF16
Here AL stands for "all", meaning the character set applies to all languages. Under the convention, UTF8 should have been named AL24UTF8.
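
The naming convention can be sketched as a small parser. The regular expression below is a simplification made for this example (letters, then digits, then the remainder) and the function name `parse_charset_name` is hypothetical. It checks only the shape of a name and does not verify that the name is a character set Oracle actually supports.

```python
import re
from typing import Optional, Tuple

# <language><bit size><encoding>: letters, then digits, then the rest.
_NAME = re.compile(r"([A-Z]+?)(\d+)([A-Z0-9]+)")

def parse_charset_name(name: str) -> Optional[Tuple[str, int, str]]:
    """Split a name into (language, bit size, encoding), or return None
    when the name does not fit the convention."""
    match = _NAME.fullmatch(name.upper())
    if match is None:
        return None
    language, bits, encoding = match.groups()
    return language, int(bits), encoding

for name in ("ZHS16GBK", "ZHS16CGB231280", "AL16UTF16", "UTF8"):
    print(name, parse_charset_name(name))
```

UTF8 yields `None` because it has no language prefix and no separate bit size, which is exactly the way it departs from the convention.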


