On the length of Chinese characters in Oracle database

Source: Internet
Author: User

Problem:

VARCHAR2 (4000) ABC;

Intert into table_name (ABC) VALUES (' Here are 1500 characters ... ');

Error: Insert character too long. After testing, found that a Chinese character accounted for 3 bytes, it will be an error.

The crux of the problem:

This error occurs when the character set used is UTF8.

You can use the command to view:

Sql> SELECT * from v$nls_parameters where parameter= ' nls_characterset ';

PARAMETER
------------------------------------------------
VALUE
------------------------------------------------
Nls_characterset
Al32utf8

Workaround:

It is recommended that you use the ZHS16GBK character set.

After you do this, the problem can be resolved.

Sql> SHUTDOWN IMMEDIATE;  




sql> ALTER DATABASE CHARACTER SET AL32UTF8/ZHS16GBK;
Sql> SHUTDOWN IMMEDIATE;
Sql> STARTUP;



Chinese characters occupy bytes in Oracle

One, the Chinese character occupies the number of bytes in Oracle

Be sure to choose a good character set when you are building a library, or it may cause problems for subsequent development or migration. Character set problems in development often result in character crossings or garbled problems at the application level.

Let's take a look at 2 databases that use different character sets:

First Look at Server1:

Sql> SELECT * from V$nls_parameters A;



PARAMETER VALUE

---------------------------------------------------------------- ----------------------------------------------- -----------------

Nls_language Simplified Chinese

Nls_territory

Nls_currency RMB

Nls_iso_currency

Nls_numeric_characters.,

Nls_calendar Gregorian

Nls_date_format DD-MON-RR

Nls_date_language Simplified Chinese

Nls_characterset WE8ISO8859P1

Nls_sort BINARY

Nls_time_format HH.MI. Ssxff AM

Nls_timestamp_format DD-MON-RR HH.MI. Ssxff AM

Nls_time_tz_format HH.MI. Ssxff AM TZR

Nls_timestamp_tz_format DD-MON-RR HH.MI. Ssxff AM TZR

Nls_dual_currency RMB

Nls_nchar_characterset AL16UTF16

Nls_comp BINARY

Nls_length_semantics BYTE

NLS_NCHAR_CONV_EXCP FALSE



Rows selected



sql> select ' Kanji ' from dual;



'。。 '

------





And look at Server2:

Sql> SELECT * from V$nls_parameters A;



PARAMETER VALUE

---------------------------------------------------------------- ----------------------------------------------- -----------------

Nls_language Simplified Chinese

Nls_territory

Nls_currency RMB

Nls_iso_currency

Nls_numeric_characters.,

Nls_calendar Gregorian

Nls_date_format DD-MON-RR

Nls_date_language Simplified Chinese

Nls_characterset Al32utf8

Nls_sort BINARY

Nls_time_format HH.MI. Ssxff AM

Nls_timestamp_format DD-MON-RR HH.MI. Ssxff AM

Nls_time_tz_format HH.MI. Ssxff AM TZR

Nls_timestamp_tz_format DD-MON-RR HH.MI. Ssxff AM TZR

Nls_dual_currency RMB

Nls_nchar_characterset AL16UTF16

Nls_comp BINARY

Nls_length_semantics BYTE

NLS_NCHAR_CONV_EXCP FALSE



Rows selected



sql> select ' Kanji ' from dual;



Characters

------

Chinese characters



By comparing the two databases above, they found that their nls_characterset were different, one was we8iso8859p1, and the other was Al32utf8. WE8ISO8859P1 is a single-byte 8-bit character set, and Al32utf8 is a variable-length multi-byte encoding.

The current client's character set is: Simplified Chinese_china. Zhs16gbk

WE8ISO8859P1 no encoding, generally speaking, there are Chinese characters should not use this character set, although the modification of the client and server side of the same can solve the garbled problem.

Questions about the number of bytes that Chinese characters occupy in Oracle:

In the WE8ISO8859P1 character set, one Chinese character takes up 1 bytes.

Under the Al32utf8 character set, one Chinese character takes up 3 bytes.

The ZHS16GBK is occupied by 2 bytes.

VARCHAR2,VARCHAR,NVARCHAR2 are variable-length character types, while char is fixed.

In terms of the vsize function, it's obvious:

Server1:

Sql> Select Vsize (' Han ') from dual;



Vsize ('. ')

-----------

1

Server2:

Sql> Select Vsize (' Han ') from dual;



Vsize (' Han ')

-----------

3

And in length, they are all 1:

Server1:

Sql> Select Length (' Han ') from dual;



LENGTH ('. ')

------------

1

Server2:

Sql> Select Length (' Han ') from dual;



LENGTH (' Han ')

------------

1



Here's a look at the comparison:

The following tables are established in both Server1 and Server2:

CREATE TABLE T_test_var

(

V_char2 char (2),

V_char3 char (3),

V_varchar22 VARCHAR2 (2),

V_varchar23 VARCHAR2 (3),

V_VARCHAR2 varchar (2),

V_VARCHAR3 varchar (3),

V_nvarchar22 nvarchar2 (2),

V_nvarchar23 NVARCHAR2 (3)

);

The test data is then inserted into two databases:

Server1:

sql> INSERT into T_test_var values (' I ', ' I ', ' I ', ' I ', ' I ', ' I ', ' I ', ' Me ');



1 row inserted

sql> INSERT into T_test_var values (' we ', ' we ', ' us ', ' we ', ' us ', ' us ', ' us ', ' us ');



1 row inserted

sql> INSERT into T_test_var values (' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' we are ');



INSERT into T_test_var values (' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' we Are ')



Ora-12899:value too large for column "TEST". T_test_var "." V_char2 "(Actual:3, Maximum:2)



Description Char uses a byte to store a Chinese character in Server1.

Modify the last INSERT statement so that the char type is not out of bounds:



sql> INSERT into T_test_var values (' we ', ' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' we are ');



INSERT into T_test_var values (' we ', ' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' we Are ')



Ora-12899:value too large for column "TEST". T_test_var "." V_varchar22 "(Actual:3, Maximum:2)



Sql>

It shows that the varchar2 type in server1 is also a byte to store a Chinese character, modified again:

sql> INSERT into T_test_var values (' we ', ' We are ', ' we ', ' We are ', ' We are ', ' We are ', ' We are ', ' we are ');



INSERT into T_test_var values (' we ', ' We are ', ' we ', ' We are ', ' We are ', ' We are ', ' We are ', ' we Are ')



Ora-12899:value too large for column "TEST". T_test_var "." V_varchar2 "(Actual:3, Maximum:2)

It means that varchar is also a Chinese character for a byte storage space, continue to modify:



sql> INSERT into T_test_var values (' we ', ' We are ', ' we ', ' We are ', ' we ', ' We are ', ' We are ', ' we are ');



INSERT into T_test_var values (' we ', ' We are ', ' we ', ' We are ', ' we ', ' We are ', ' We are ', ' we Are ')



Ora-12899:value too large for column "TEST". T_test_var "." V_nvarchar22 "(Actual:3, Maximum:2)

It means that NVARCHAR2 is also a Chinese character that occupies a byte of storage space.

This is because there is no encoding in the WE8ISO8859P1 character. So we get the result of the above experiment.



Let's look at Server2:

sql> INSERT into T_test_var values (' I ', ' I ', ' I ', ' I ', ' I ', ' I ', ' I ', ' Me ');



INSERT into T_test_var values (' I ', ' I ', ' Me ', ' Me ', ' Me ', ' Me ', ' Me ', ' me ')



Ora-01401:inserted value too large for column

sql> INSERT into T_test_var values (' we ', ' we ', ' us ', ' we ', ' us ', ' us ', ' us ', ' us ');



INSERT into T_test_var values (' we ', ' we ', ' us ', ' we ', ' us ', ' us ', ' us ', ' us ')



Ora-01401:inserted value too large for column

sql> INSERT into T_test_var values (' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' we are ');



INSERT into T_test_var values (' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' We are ', ' we Are ')



Ora-01401:inserted value too large for column



Sql> Select cast (' Han ' as char (1)) from dual;



Select cast (' Han ' as char (1)) from dual



Ora-25137:data value out of range



Sql> Select cast (' Han ' as char (2)) from dual;



Select cast (' Han ' as char (2)) from dual



Ora-25137:data value out of range



Sql> Select cast (' Han ' as char (3)) from dual;



CAST (' Han ' Aschar (3))

-------------------

Chinese

sql> INSERT INTO T_test_var (V_VARCHAR22) VALUES (' Han ');



Insert into T_test_var (V_VARCHAR22) VALUES (' Han ')



Ora-01401:inserted value too large for column

sql> INSERT INTO T_test_var (V_VARCHAR23) VALUES (' Han ');



1 row inserted


Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.