The difference between a byte and char of an Oracle varchar2 or char type

Source: Internet
Author: User
Tags postgresql



Oracle defines string types VARCHAR2 and char specify the length of the usage as follows:



VARCHAR2 (<SIZE> <byte| char>) <SIZE> is a number between 1~4000 that represents up to 4000 bytes of storage space.
char (<SIZE> <byte| char>) <SIZE> is a number between 1~2000 that represents up to 2000 bytes of storage space.
What's the difference between a byte and a char?
Byte, specified in bytes: VARCHAR2 (ten byte). This can support up to 10 bytes of data in a multibyte character set, which may be just two characters. When using multibyte character sets, bytes are not the same as characters.



Char, specified by character: VARCHAR2 (ten CHAR). This will support up to 10 characters of data, possibly up to 40 bytes of information. In addition, VARCHAR2 (4000 CHAR) theoretically supports up to 4,000 characters of data, but because the string data type in Oracle is limited to 4000 bytes, all 4,000 characters may not be available.


When using multibyte character sets such as UTF8, it is recommended that you use a CHAR modifier in the Varchar2/char definition, that is, to use VARCHAR2 (30 char) instead of VARCHAR2, Because you're probably meant to define a column that actually stores 30 characters of data. You can also modify the default behavior by using the session parameter or the system parameter Nls_length_semantics, which is to change the default setting of Byte to char. Modifying this setting at the system level is not recommended, and you should modify the session level with ALTER session. It is also important that the upper bound of bytes stored in the VARCHAR2 is 4000. However, even if you specify varchar (4000 CHAR), you may not be able to drop 4,000 characters in this field in fact, if all characters are represented by 4 bytes in the character set you choose, you can only drop 1000 characters in this field!
The following is a small example of the difference between a byte and a char, and shows the effect of an upper bound.
The test environment, 11.2.0.4, is done on the multibyte character set database, where character set Al32utf8 is used, which supports the latest version of the Unicode standard and encodes each character using 1~4 bytes in a variable-length manner
[Email protected]>col value for A30
[email protected]>col parameter for A30
[Email Protected]>select * from nls_database_parameters where parameter= ' nls_characterset ';

PARAMETER VALUE
------------------------------ ------------------------------
Nls_characterset Al32utf8
[Email protected]>show parameter Nls_leng

NAME TYPE VALUE
------------------------------------ --------------------------------- ------------------------------
Nls_length_semantics string BYTE


Create a test table



[Email protected]>create table T (a varchar2 (1), B varchar2 (1 char), c varchar2 (4000 char));

Table created.



Now, this table inserts a UTF character unistr (' \00d6 '), which is 2 bytes long, and the following results can be observed:
[Email protected]>select Length (unistr (' \ 00d6 '), LENGTHB (Unistr (' \00d6 ')) from dual; ,
  
LENGTH (Unistr (' \00d6 ')) LENGTHB (Unistr (' \ 00d6 ')  
----------------------------------------------- 
              1            2 
  
[Email protected]>insert into T (a) VALUES (unistr (' \00d6 '));  
insert into t (a) VALUES (Unistr (' \00d6 '))  
                          * 
ERROR at line 1: 
Ora-12899:value too L Arge for column "ZX". " T "." A "(Actual:2, Maximum:1)



This indicates that VARCHAR (1) is a unit of bytes instead of characters. There is really only one Unicode character here, but it does not fit in one byte; When you migrate an app from a single-byte fixed-width character set to a multibyte character set, you may find that the text that you can put down in the field now cannot be dropped. The 2nd reason is that in a single-byte character set, a string length of 20 characters is 20 bytes, which can be completely dropped in VARCHAR2 (20). However, in a multibyte character set, the length of 20 characters can reach 80 bytes (if each character is represented by 4 bytes), so that a Jay, 20 Unicode characters will probably not be able to drop in 20 bytes. You might consider modifying the DDL to VARCHAR2, or use the Nls_length_sementics session parameters mentioned earlier when you run the DDL creation table.
Inserting a field with one character observes the following results:
[Email protected]>insert into t (b) VALUES (UNISTR (' \00d6 '));  
   
1 row created. 
  
[Email protected]>col Dump for a30 
[email  Protected]>select Length (b), LENGTHB (b), dump (b) dump from t; 
  
 length (b) LENGTHB (b) dump 
-------------------------------------------------- 
    1    2 typ=1 len=2 : 195,150



This insert succeeds and you can see that the length of all inserted data is one character, and all string functions work in characters. The LENGTHB function (byte length) shows that this field occupies 2 bytes of storage space, and the dump function shows what the bytes are. This example shows that VARCHAR2 (n) does not necessarily store n characters, but simply stores n bytes.
Test VARCHAR2 (4000) below may not store 4,000 characters
[email protected]>declare
  2  l_date varchar2 (4000 char);  
  3  l_ch  varchar2 (1 char): = Unistr (' \00d6 ');  
  4  begin
  5  L_date:=rpad (l_ch,4000,l_ch);  
  6  insert into T (c) VALUES (l_date);  
  7   end; 
  8 / 
declare

ERROR at line 1: 
Ora-01461:can bind a Long value only for insert to a long column
Ora-06512:at line 6



It is shown here that a 4000 character is actually 8000 bytes long, so a string cannot be stored permanently in a varchar (4000 char) field, which can be placed in a PL/SQL variable because the pl/ The VARCHAR2 in SQL can reach 32K maximum. However, stored in the table, VARCHAR2 is rigidly limited to a maximum of 4000 bytes. We can successfully store 2000 of these characters:
[Email Protected]>declare
2 l_date varchar2 (4000 char);
3 L_ch varchar2 (1 char): = Unistr (' \00d6 ');
4 begin
5 L_date:=rpad (L_CH,2000,L_CH);
6 INSERT into T (c) VALUES (l_date);
7 End;
8/

PL/SQL procedure successfully completed.

[Email protected]>
[Email protected]>select length (c), LENGTHB (c) from T where C was not null;

LENGTH (c) LENGTHB (c)
---------- ----------
2000 4000









Original address: http://www.linuxidc.com/Linux/2017-01/139853.htm



The difference between a byte and char of an Oracle varchar2 or char type


Alibaba Cloud Hot Products

Elastic Compute Service (ECS) Dedicated Host (DDH) ApsaraDB RDS for MySQL (RDS) ApsaraDB for PolarDB(PolarDB) AnalyticDB for PostgreSQL (ADB for PG)
AnalyticDB for MySQL(ADB for MySQL) Data Transmission Service (DTS) Server Load Balancer (SLB) Global Accelerator (GA) Cloud Enterprise Network (CEN)
Object Storage Service (OSS) Content Delivery Network (CDN) Short Message Service (SMS) Container Service for Kubernetes (ACK) Data Lake Analytics (DLA)

ApsaraDB for Redis (Redis)

ApsaraDB for MongoDB (MongoDB) NAT Gateway VPN Gateway Cloud Firewall
Anti-DDoS Web Application Firewall (WAF) Log Service DataWorks MaxCompute
Elastic MapReduce (EMR) Elasticsearch

Alibaba Cloud Free Trail

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.