The current database character set is AL32UTF8. Suppose you intend to change it to ZHS16GBK by executing the following (internal, unsupported) command:
ALTER DATABASE CHARACTER SET INTERNAL_USE ZHS16GBK;
Will this achieve the desired goal?
Let's look at what actually happens at the database level when this command is executed.
In a database whose character set is AL32UTF8, create a table and insert some Chinese characters into one column and some printable characters into another.
1. Create a table
CREATE TABLE test
(
  id   NUMBER,        -- record number
  name VARCHAR2(60),  -- a randomly generated kanji string; the generator is shown in Appendix I
  key  VARCHAR2(60)   -- a randomly generated printable string
);
2. Insert 50,000 records
DECLARE
  leng NUMBER;
BEGIN
  FOR x IN 1 .. 5 LOOP
    FOR y IN 1 .. 10000 LOOP
      leng := CEIL(dbms_random.value(0, 10));
      INSERT INTO test VALUES (y, gen_hanzi(leng), dbms_random.string('p', leng));
    END LOOP;
    COMMIT;
  END LOOP;
END;
/
3. View some records at random, along with their encodings under the current character set
Query the UTF-8 encoding of the Chinese character "Lei" (using DUMP):
This confirms that the database currently encodes Chinese characters in UTF-8.
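The DUMP result can be cross-checked outside Oracle. The sketch below assumes the character in question is 雷 (U+96F7, pinyin "lei"); that specific character is an assumption, but any common CJK ideograph behaves the same way under UTF-8:

```python
# UTF-8 encodes CJK Unified Ideographs as 3-byte sequences.
# Assumption: the article's character "Lei" is 雷 (U+96F7).
ch = "\u96f7"                      # 雷
utf8_bytes = ch.encode("utf-8")    # the bytes Oracle's DUMP would show under AL32UTF8
print([hex(b) for b in utf8_bytes])  # ['0xe9', '0x9b', '0xb7']
```

U+96F7 is 1001 0110 1111 0111 in binary; packed into the 3-byte UTF-8 pattern 1110xxxx 10xxxxxx 10xxxxxx it yields E9 9B B7, matching the 3-byte dump the article observes.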
4. Execute the character set conversion command on the database
After the conversion, query the current database character set:
At the data dictionary level, the database character set has been changed from AL32UTF8 to ZHS16GBK.
The data dictionary has been adjusted, but has the stored character data been re-encoded under the new character set?
As you can see, none of the dump values have changed; at the disk storage level, the data is untouched. The bytes on disk are still UTF-8 encoded, but they are now decoded as GBK, so all of the Chinese characters decode incorrectly (the correct symbols can no longer be recovered). The printable characters decode correctly, because they have the same encoded values in both character sets.
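The store-in-one-encoding, decode-in-another effect is easy to reproduce. A minimal sketch (again assuming 雷, U+96F7, stands in for the kanji in the dump output):

```python
# Bytes written under AL32UTF8 stay on disk unchanged; only the decoder changes.
stored = "\u96f7".encode("utf-8")          # bytes as written under UTF-8

# Decoding the same bytes as GBK yields mojibake, not the original character.
garbled = stored.decode("gbk", errors="replace")
print(garbled)                              # wrong character(s), not 雷

# Printable ASCII has identical byte values in both character sets,
# so it survives the decoder switch intact.
ascii_stored = "abc123".encode("utf-8")
print(ascii_stored.decode("gbk"))           # abc123
```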
It follows that if the target character set is not a superset of the original one, the unchanged byte values are very likely to decode incorrectly, producing garbled text. We have now seen the encoded values of these Chinese characters under UTF-8; how are the same characters encoded under GBK?
5. Insert the same records into a database that uses the ZHS16GBK character set, and view the record values and their encoded values:
As you can see, GBK encodes a Chinese character in 2 bytes, while UTF-8 encodes it in 3 bytes.
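The 2-byte-versus-3-byte difference can be checked directly (using 雷 as a representative character, an assumption as above):

```python
# The same character occupies a different number of bytes in each character set.
ch = "\u96f7"                  # 雷
gbk_len  = len(ch.encode("gbk"))     # 2 bytes under GBK
utf8_len = len(ch.encode("utf-8"))   # 3 bytes under UTF-8
print(gbk_len, utf8_len)             # 2 3
```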
One question remains: the conversion did not change the encoding of previously stored characters, but what about characters stored afterwards?
6. Insert Chinese characters and printable characters after the character set change
Insert 100 records with id > 10000, to distinguish them from the earlier records.
DECLARE
  leng NUMBER;
BEGIN
  FOR y IN 10001 .. 10100 LOOP
    leng := CEIL(dbms_random.value(0, 10));
    INSERT INTO test VALUES (y, gen_hanzi(leng), dbms_random.string('p', leng));
  END LOOP;
  COMMIT;
END;
/
View the inserted values, along with their encodings:
From the encoding length of the 7th record, we can already tell that GBK-encoded values are being used. Let's verify:
Conclusion:
1. The character set conversion command modifies the data dictionary.
2. The command does not re-encode previously stored characters in the new character set. Consequently, if the target character set is not a superset of the original one, previously stored text (especially Chinese characters) may appear garbled after the conversion.
3. After the command is executed, characters entered subsequently are encoded with the new character set.
Appendix I: stored function for generating random-length kanji strings
CREATE OR REPLACE FUNCTION gen_hanzi (max_length NUMBER) RETURN VARCHAR2 AS
  leng   NUMBER;
  hanzi  VARCHAR2(1 CHAR);
  hanzis VARCHAR2(4000 CHAR) := '';
BEGIN
  leng := CEIL(dbms_random.value(0, max_length));
  FOR x IN 1 .. leng LOOP
    -- 19968..40869 is the CJK Unified Ideographs range U+4E00..U+9FA5
    hanzi  := UNISTR('\' || TRIM(TO_CHAR(CEIL(dbms_random.value(19968, 40869)), 'XXXX')));
    hanzis := hanzis || hanzi;
  END LOOP;
  RETURN hanzis;
END;
/
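The same idea can be sketched outside the database. This is a rough Python analogue of gen_hanzi, not the article's code: pick a random length up to max_length, then build a string of random code points from the same U+4E00..U+9FA5 range:

```python
import math
import random

def gen_hanzi(max_length: int) -> str:
    # Random length in 0..max_length, mirroring CEIL(dbms_random.value(0, max_length)).
    leng = math.ceil(random.uniform(0, max_length))
    # Random code points from the CJK Unified Ideographs range 19968..40869.
    return "".join(chr(random.randint(0x4E00, 0x9FA5)) for _ in range(leng))

s = gen_hanzi(10)
print(s, len(s))
```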