Today found a problem, a length set to 2000 bytes of the field, inserting a length of 1000 Chinese characters data, unexpectedly error.
A Chinese character accounts for two bytes, which is arguably exactly 2000 bytes. However, by looking at the log, it is found that the length of the inserted data is 3000 bytes.
Baidu a bit, the original is a different character set caused.
In general, the nls_characterset of a database is Al32utf8 or UTF8, that is, a Chinese character occupies three to four bytes. If Nls_characterset is ZHS16GBK, a character occupies two bytes.
And the company's operating environment is Al32utf8, so a Chinese character accounted for 3 of bytes.
1. Use the LENGTHB method to view the number of bytes in the current database for text characters
You can see that the current database kanji is accounted for 3 bytes.
2. VARCHAR2 type: For VARCHAR2 characters to be stored in a few bytes, to see the database used by the character set, such as GBK, Chinese characters will account for two bytes, 1 in English. If it is UTF-8, Chinese characters generally account for 3 bytes, and English is 1. Therefore, the definition of varchar2 length can be up to 4000 (4,000 English characters can be stored), but the actual length can be stored, depending on the character set.
NVARCHAR2 type: For NVARCHAR2 characters, all characters are stored in 2 bytes, so you can define up to 2000 of the length. At the same time, for pure English storage, then it will take more space (because English in the VARCHAR2 type as long as 1 bytes of space). But for the Chinese character storage is actually more suitable, this can guarantee the Chinese character storage Most (2000).
Here is an example:
First, create two tables, each with only one field, with a field length of tblvarchar2 of 20 and a type of VARCHAR2. TBLNVARCHAR2 has a field length of 20 and a type of nvarchar2.
Then insert Chinese with a length of 20.
The length of the visible nvarchar2 is indeed the maximum length inserted by the word RP, regardless of the character set itself.
Query Result:
Oracle Kanji takes up bytes in different character sets