Oracle Data Types and storage methods (1)

Source: Internet
Author: User

Overview
This article analyzes Oracle's basic data types and how they are stored, working through concrete examples. It is based on Oracle 10g and also covers the data types newly introduced in that release. The aim is to give a fresh view of Oracle data types, to expose some little-known details and commonly overlooked pitfalls, and to discuss each type from the standpoint of practical use and optimization. Start your Oracle journey here!
Part 1: Character types
§ 1.1 char
A fixed-length string type. Values are padded with spaces up to the declared length, which can be at most 2000 bytes.
1. Create a test table test_char with a single char column of length 10.
SQL> create table test_char (colA char(10));
Table created
2. Insert some data into the table.
SQL> insert into test_char values ('a');
1 row inserted
SQL> insert into test_char values ('aa');
1 row inserted
SQL> insert into test_char values ('aaa');
1 row inserted
SQL> insert into test_char values ('aaaa');
1 row inserted
SQL> insert into test_char values ('aaaaaaaaaa');
1 row inserted
Note: at most 10 bytes can be inserted. A longer value raises an error.
SQL> insert into test_char values ('aaaaaaaaaaa');
insert into test_char values ('aaaaaaaaaaa')
ORA-12899: value too large for column "PUB_TEST"."TEST_CHAR"."COLA" (actual: 11, maximum: 10)
3. Use the dump function to view the internal storage of each row.
SQL> select colA, dump(colA) from test_char;
COLA       DUMP(COLA)
---------- --------------------------------------------------------------------------------
a          Typ=96 Len=10: 97,32,32,32,32,32,32,32,32,32
aa         Typ=96 Len=10: 97,97,32,32,32,32,32,32,32,32
aaa        Typ=96 Len=10: 97,97,97,32,32,32,32,32,32,32
aaaa       Typ=96 Len=10: 97,97,97,97,32,32,32,32,32,32
aaaaaaaaaa Typ=96 Len=10: 97,97,97,97,97,97,97,97,97,97
Note: Typ=96 is the data type ID. Oracle assigns a number to every data type, and 96 is the number of the char type.
Len=10 is the length of the stored value in bytes. Although the first row holds only the single character 'a', it still occupies 10 bytes.
97,32,32,32,32,32,32,32,32,32 is the stored byte sequence. It shows that Oracle stores character data encoded in the database character set.
97 is the ASCII code of the character 'a'.
The chr function converts an ASCII code into its character.
SQL> select chr(97) from dual;
CHR(97)
-------
a
The ascii function returns the ASCII code of a character.
SQL> select ascii('a') from dual;
ASCII('A')
----------
        97
32 is the ASCII code of the space character.
char is a fixed-length type: values are always padded with spaces to the declared width.
As a result, char wastes storage space whenever values are shorter than that width.
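The padding behaviour described above can be modelled outside the database. The following Python sketch is only an illustration, not Oracle code: the helper name char_dump is hypothetical, it assumes single-byte ASCII input, and it simply reuses the Typ=96 label seen in the dump output.

```python
def char_dump(value: str, width: int = 10) -> str:
    """Model DUMP() for a CHAR(width) column holding ASCII text (illustration only)."""
    raw = value.encode("ascii")
    if len(raw) > width:
        # Oracle would raise ORA-12899 here; a plain ValueError stands in for it.
        raise ValueError(f"value too large (actual: {len(raw)}, maximum: {width})")
    padded = raw.ljust(width, b" ")  # pad with spaces (byte 32) to the fixed width
    return f"Typ=96 Len={len(padded)}: " + ",".join(str(b) for b in padded)


print(char_dump("a"))           # Typ=96 Len=10: 97,32,32,32,32,32,32,32,32,32
print(char_dump("aaaaaaaaaa"))  # Typ=96 Len=10: 97,97,97,97,97,97,97,97,97,97
```

However short the input is, the modelled length is always 10, which is the space cost of a fixed-length type.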
By default, the length of an Oracle character type is specified in bytes.
SQL> select dump('汉') from dual;
DUMP('汉')
---------------------
Typ=96 Len=2: 186,186
This shows that a Chinese character occupies two bytes in this database.
English letters and symbols occupy only one byte.
char(10) can therefore store at most 5 Chinese characters.
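The byte counts can be checked without a database. This is a minimal Python sketch; it assumes that Python's "gbk" codec encodes these characters the same way as a ZHS16GBK database, and that 186,186 in the dump is the character 汉. The helper name byte_values is hypothetical.

```python
def byte_values(text: str, encoding: str = "gbk") -> list:
    """Return the encoded bytes of text as a list of integers."""
    return list(text.encode(encoding))


print(byte_values("a"))   # [97]
print(byte_values("汉"))  # [186, 186]

# A char(10) column has room for 10 bytes, i.e. five two-byte characters.
print(len(byte_values("汉" * 5)))  # 10
```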
§ 1.2 varchar2
A variable-length character type that can occupy up to 4000 bytes.
1. Create a table with a single varchar2 column of length 10.
SQL> create table test_varchar (col varchar2(10));
Table created
2. Insert some data.
SQL> insert into test_varchar values ('a');
1 row inserted
SQL> insert into test_varchar values ('aa');
1 row inserted
SQL> insert into test_varchar values ('aaa');
1 row inserted
SQL> insert into test_varchar values ('aaaaaaaaaa');
1 row inserted
SQL> insert into test_varchar values ('aaaaaaaaaaa');
3. Use the dump function to view the internal storage of each row.
SQL> select col, dump(col) from test_varchar;
COL        DUMP(COL)
---------- --------------------------------------------------------------------------------
a          Typ=1 Len=1: 97
aa         Typ=1 Len=2: 97,97
aaa        Typ=1 Len=3: 97,97,97
aaaaaaaaaa Typ=1 Len=10: 97,97,97,97,97,97,97,97,97,97
Typ=1 shows that the type ID of varchar2 is 1.
Len is the number of bytes occupied by the value in each row.
The stored bytes follow the colon.
varchar2 therefore occupies only as much space as the value needs. It saves space because, unlike char, it is not padded with spaces.
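To put a number on the space saving, the sketch below totals the data bytes for the four sample values under both types. It is a simplified Python model with hypothetical names (stored_bytes, WIDTH); it counts only the value bytes shown by dump and ignores any per-row or per-column overhead the database adds.

```python
samples = ["a", "aa", "aaa", "aaaaaaaaaa"]
WIDTH = 10  # declared column width in bytes


def stored_bytes(value: str, fixed: bool) -> int:
    """Data bytes for an ASCII value in a fixed-width or variable-width column."""
    size = len(value.encode("ascii"))
    if size > WIDTH:
        raise ValueError("value too large for column")
    return WIDTH if fixed else size


char_total = sum(stored_bytes(s, fixed=True) for s in samples)
varchar2_total = sum(stored_bytes(s, fixed=False) for s in samples)
print(char_total, varchar2_total)  # 40 16
```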
§ 1.3 byte and char
In 10g you can specify the unit when defining the width of a character type.
byte means the width is counted in bytes.
char means the width is counted in characters.
varchar2(10 byte) is 10 bytes long.
varchar2(10 char) is 10 characters long.
char(10 byte) is 10 bytes long.
char(10 char) is 10 characters long.
The number of bytes a character occupies is determined by the character set of the database.
For example, a Chinese character occupies two bytes in the character set used here.
To view the character set used by the database:
SQL> select * from nls_database_parameters where parameter = 'NLS_CHARACTERSET';
PARAMETER                      VALUE
------------------------------ ----------------------------------------
NLS_CHARACTERSET               ZHS16GBK
If no unit is specified in the type definition, byte is used by default.
Specifying the width in char is most useful with a multi-byte character set.
For example, a Chinese character occupies two bytes in the ZHS16GBK character set.
To define a column that can hold 10 Chinese characters:
create table test_varchar (col_char varchar2(10 char));
This is simpler than working out the byte length by hand and is worth keeping in mind when designing tables.
Continuing the experiment, create a new table with two columns, one defined in bytes and one in characters.
SQL> create table test_varchar2 (col_char varchar2(10 char), col_byte varchar2(10 byte));
Table created
The col_char column is defined as a string of 10 characters.
The col_byte column is defined as a string of 10 bytes.
The database uses the ZHS16GBK character set, so one Chinese character occupies two bytes.
Insert some data into the table:
SQL> insert into test_varchar2 values ('a', 'a');
1 row inserted
SQL> insert into test_varchar2 values ('袁', 'a');
1 row inserted
SQL> insert into test_varchar2 values ('袁袁袁袁袁袁袁袁袁袁', 'aaaaaaaaaa');
1 row inserted
SQL> insert into test_varchar2 values ('袁袁袁袁袁袁袁袁袁袁', '袁袁袁袁袁袁袁袁袁袁');
insert into test_varchar2 values ('袁袁袁袁袁袁袁袁袁袁', '袁袁袁袁袁袁袁袁袁袁')
ORA-12899: value too large for column "PUB_TEST"."TEST_VARCHAR2"."COL_BYTE" (actual: 20, maximum: 10)
The first insert puts the character a into both columns.
The second puts the Chinese character '袁' into col_char and the character a into col_byte.
The third puts 10 copies of '袁' into col_char and 10 copies of a into col_byte.
The fourth tries to put 10 copies of '袁' into both columns and fails: at 20 bytes, the value is too long for col_byte.
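The outcome of the four inserts follows from how each unit is counted. The Python sketch below is a rough model under stated assumptions: the function fits is hypothetical, Python's "gbk" codec stands in for ZHS16GBK, and 袁 is used as the sample two-byte Chinese character.

```python
def fits(value: str, limit: int, unit: str, encoding: str = "gbk") -> bool:
    """Check a value against a VARCHAR2(limit unit) definition (illustration only)."""
    if unit == "char":
        return len(value) <= limit                   # count characters
    if unit == "byte":
        return len(value.encode(encoding)) <= limit  # count encoded bytes
    raise ValueError("unit must be 'char' or 'byte'")


ten_chinese = "袁" * 10  # 10 characters, 20 bytes in GBK

print(fits("a", 10, "char"), fits("a", 10, "byte"))  # True True
print(fits(ten_chinese, 10, "char"))                 # True
print(fits(ten_chinese, 10, "byte"))                 # False
```

The last line corresponds to the rejected insert: ten two-byte characters need 20 bytes, which exceeds a 10-byte column.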
Now look at how each row is stored.
SQL> select col_char, dump(col_char) from test_varchar2;
COL_CHAR             DUMP(COL_CHAR)
-------------------- --------------------------------------------------------------------------------
a                    Typ=1 Len=1: 97
袁                   Typ=1 Len=2: 212,172
袁袁袁袁袁袁袁袁袁袁 Typ=1 Len=20: 212,172,212,172,212,172,212,172,212,172,212,172,212,172,212,172,21
After 10 Chinese characters are inserted into col_char, the stored length is 20 bytes,
even though the column was defined as varchar2(10 char).
Oracle therefore derives the number of bytes a column may occupy from the character set used by the database.
In this example, varchar2(10 char) can occupy as many bytes as varchar2(20).
To confirm this, describe the table:
SQL> desc test_varchar2;
Name     Type         Nullable Default Comments
-------- ------------ -------- ------- --------
COL_CHAR VARCHAR2(20) Y
COL_BYTE VARCHAR2(10) Y
With a multi-byte character set it is better to specify column lengths in char, which prevents a column from turning out too short for its data.
When you do not know how many bytes a character occupies in the current database, use the lengthb function.
SQL> select lengthb('袁') from dual;
LENGTHB('袁')
-------------
            2
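Because the byte length depends on the character set, the same character can give different results in differently configured databases. The Python sketch below illustrates the idea with a hypothetical lengthb helper. The codecs "gbk" and "utf-8" are stand-ins chosen for illustration, and 袁 is assumed to be the sample character.

```python
def lengthb(text: str, encoding: str) -> int:
    """Byte length of text in the given encoding (a stand-in for Oracle's LENGTHB)."""
    return len(text.encode(encoding))


print(lengthb("袁", "gbk"))    # 2
print(lengthb("袁", "utf-8"))  # 3
print(lengthb("a", "gbk"))     # 1
```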
§ 1.4 char or varchar2
1. Create a new table with one char column and one varchar2 column.
SQL> create table test_char_varchar (char_col char(20), varchar_col varchar2(20));
Table created
2. Insert the same value into both columns.
SQL> insert into test_char_varchar values ('Hello World', 'Hello World');
1 row inserted
SQL> select * from test_char_varchar;
CHAR_COL             VARCHAR_COL
-------------------- --------------------
Hello World          Hello World
3. Query using char_col as the condition.
SQL> select * from test_char_varchar where char_col = 'Hello World';
CHAR_COL             VARCHAR_COL
-------------------- --------------------
Hello World          Hello World
4. Query using varchar_col as the condition.
SQL> select * from test_char_varchar where varchar_col = 'Hello World';
CHAR_COL             VARCHAR_COL
-------------------- --------------------
Hello World          Hello World
5. So far char and varchar2 seem to behave the same. Now look at the following statement.
SQL> select * from test_char_varchar where varchar_col = char_col;
CHAR_COL             VARCHAR_COL
-------------------- --------------------
No rows are returned, so the two values are not the same. The reason lies in how strings are compared.
char_col is a char(20) column, so 'Hello World' is stored padded with spaces to 20 characters, whereas varchar_col stores only the 11 characters. When char_col is compared with a literal, the literal is treated as char and trailing spaces are ignored, which is why step 3 matched. When a varchar2 value takes part in the comparison, no padding is applied and the trailing spaces count.
There are two ways to make the char_col column compare equal to the varchar_col column.
First, use trim to remove the trailing spaces from char_col.
Second, use rpad to pad varchar_col with spaces to a length of 20 characters.
SQL> select * from test_char_varchar where trim(char_col) = varchar_col;
CHAR_COL             VARCHAR_COL
-------------------- --------------------
Hello World          Hello World
SQL> select * from test_char_varchar where char_col = rpad(varchar_col, 20);
CHAR_COL             VARCHAR_COL
-------------------- --------------------
Hello World          Hello World
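The two comparison behaviours can be sketched in a few lines of Python. This is a simplified model, not Oracle's implementation: the function names are hypothetical, and it assumes that a comparison between a char column and a literal ignores trailing spaces while any comparison involving varchar2 does not.

```python
def blank_padded_equal(left: str, right: str) -> bool:
    """Compare after padding the shorter value with spaces (char vs. literal)."""
    width = max(len(left), len(right))
    return left.ljust(width) == right.ljust(width)


def nonpadded_equal(left: str, right: str) -> bool:
    """Compare exactly, trailing spaces included (varchar2 involved)."""
    return left == right


char_col = "Hello World".ljust(20)  # what a char(20) column holds
varchar_col = "Hello World"         # what a varchar2(20) column holds

print(blank_padded_equal(char_col, "Hello World"))       # True  (step 3)
print(nonpadded_equal(varchar_col, char_col))            # False (step 5)
print(nonpadded_equal(char_col.strip(" "), varchar_col)) # True  (trim)
print(nonpadded_equal(char_col, varchar_col.ljust(20)))  # True  (rpad)
```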
Note that applying trim to char_col means an ordinary index on char_col cannot be used for this condition.
Problems can also occur when bind variables are used.

§ 1.5 NCHAR and NVARCHAR2
If a system has to store and manage text in several character sets in one place, use these two types. With NCHAR and NVARCHAR2, text is stored and managed in the national character set rather than the default database character set.
The length of these two types is specified in characters, not bytes.
(NLS stands for National Language Support.)
In Oracle 9i and later, the national character set can be either UTF8 or AL16UTF16.
The 10g database used in this article uses AL16UTF16.
1. Create a new table with two columns, one nchar and one nvarchar2.
SQL> create table test_nvarchar (col_nchar nchar(10), col_nvarchar2 nvarchar2(10));
Table created
2. Insert some data.
SQL> insert into test_nvarchar values ('袁', 'yuan Guangdong');
1 row inserted
SQL> insert into test_nvarchar values (N'袁', N'yuan Guangdong');
1 row inserted
(In versions before 9i, a literal inserted with the N prefix was handled differently from an ordinary literal. In 10g this has changed, and the result is the same with or without N. It is used here only as a test.)
SQL> insert into test_nvarchar values ('a', 'b');
1 row inserted
This inserts a row of English letters.
3. View how the col_nchar column is stored in each row.
SQL> select col_nchar, dump(col_nchar) from test_nvarchar;
COL_NCHAR  DUMP(COL_NCHAR)
---------- --------------------------------------------------------------------------------
袁         Typ=96 Len=20: 136,129,0,32,0,32,0,32,0,32,0,32,0,32,0,32,0,32,0,32
a          Typ=96 Len=20: 0,97,0,32,0,32,0,32,0,32,0,32,0,32,0,32,0,32,0,32
袁         Typ=96 Len=20: 136,129,0,32,0,32,0,32,0,32,0,32,0,32,0,32,0,32,0,32
Typ=96 is the same type code as char.
Len=20 means every row is 20 bytes long. As with char, the length is fixed and values are padded with spaces.
Note that every character is represented by two bytes.
136,129 represents '袁'.
0,97 represents 'a'.
0,32 represents a space.
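These byte pairs match big-endian UTF-16. The Python sketch below reproduces them under two assumptions: Python's "utf-16-be" codec is used as a stand-in for AL16UTF16, and 136,129 is the character 袁 (U+8881). The helper name nchar_bytes is hypothetical.

```python
def nchar_bytes(value: str, width: int = 10) -> list:
    """Model the bytes of an NCHAR(width) value: pad to width characters, then encode."""
    padded = value.ljust(width)  # pad with spaces to the fixed character width
    return list(padded.encode("utf-16-be"))


yuan = nchar_bytes("袁")
letter = nchar_bytes("a")

print(len(yuan), yuan[:4])      # 20 [136, 129, 0, 32]
print(len(letter), letter[:4])  # 20 [0, 97, 0, 32]
```

Ten characters at two bytes each give the Len=20 reported by dump, with every padding space stored as 0,32.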
4. How nvarchar2 is stored
SQL> select col_nvarchar2, dump(col_nvarchar2) from test_nvarchar;
COL_NVARCHAR2   DUMP(COL_NVARCHAR2)
--------------- --------------------------------------------------------------------------------
yuan Guangdong  Typ=1 Len=6: 136,129
b               Typ=1 Len=2: 0,98
yuan Guangdong  Typ=1 Len=6: 136,129
Typ=1 is the same type code as varchar2.
The Len values differ from row to row, because no space padding is applied.
Each character is stored in two bytes: the three Chinese characters shown as 'yuan Guangdong' take 6 bytes and the letter b takes 2.
b is stored as 0,98.
'袁' is stored as 136,129.
5. Column definitions of nchar and nvarchar2
SQL> desc test_nvarchar;
Name          Type          Nullable Default Comments
------------- ------------- -------- ------- --------
COL_NCHAR     NCHAR(20)     Y
COL_NVARCHAR2 NVARCHAR2(20) Y
Although the length given when defining nchar and nvarchar2 is a number of characters, the table description shows a number of bytes.
In the definition, nchar(10) means that up to 10 characters can be stored.
When the table structure is viewed, the maximum number of bytes the column can occupy is displayed.
Note that the same Chinese character is stored as different byte values in char and nchar columns, because the two use different character sets and therefore different encodings.
SQL> insert into test_varchar values ('袁');
1 row inserted
SQL> select col, dump(col) from test_varchar where col = '袁';
COL        DUMP(COL)
---------- --------------------------------------------------------------------------------
袁         Typ=1 Len=2: 212,172
Here the default character set ZHS16GBK is used.
The two bytes are easily combined into a single character code:
high byte * 256 (2 to the power 8) + low byte.
212 * 256 + 172 = 54444
SQL> select chr(54444) from dual;
CHR(54444)
----------
袁
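The same arithmetic can be checked in Python. This is a small sketch; it assumes the "gbk" codec matches ZHS16GBK for this character, which the text renders as 'yuan' and which is 袁.

```python
high, low = 212, 172

# Combine the two bytes into one number: high byte * 2**8 + low byte.
code = high * 256 + low
print(code)  # 54444

# Splitting the number back into two bytes and decoding them gives the character.
restored = bytes([code >> 8, code & 0xFF]).decode("gbk")
print(restored)  # 袁
```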
nchar and nvarchar2 use the national character set, which is UTF8 or AL16UTF16.
SQL> insert into test_nvarchar values ('袁', '袁');
1 row inserted
SQL> select col_nvarchar2, dump(col_nvarchar2) from test_nvarchar where col_nvarchar2 = '袁';
COL_NVARCHAR2   DUMP(COL_NVARCHAR2)
--------------- --------------------------------------------------------------------------------
袁              Typ=1 Len=2: 136,129
Here the value of '袁' is 136,129.
In Oracle 10g and later, nchar and nvarchar2 use the UTF-16 character set AL16UTF16 by default. Most characters, including ASCII letters and common Chinese characters, are stored in exactly two bytes, which supports text in many languages and keeps processing efficient. The encoding is not compatible with ASCII, however: 'a' is stored as 0,97 rather than 97.
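A side-by-side view makes the difference between the two encodings concrete. The Python sketch below uses the "gbk" and "utf-16-be" codecs as stand-ins for ZHS16GBK and AL16UTF16, which is an assumption. The helper name both_encodings is hypothetical, and 袁 is the assumed sample character.

```python
def both_encodings(text: str) -> dict:
    """Encode text in both character sets and return the byte values."""
    return {
        "gbk": list(text.encode("gbk")),
        "utf-16-be": list(text.encode("utf-16-be")),
    }


for sample in ("a", "袁"):
    print(sample, both_encodings(sample))
# a {'gbk': [97], 'utf-16-be': [0, 97]}
# 袁 {'gbk': [212, 172], 'utf-16-be': [136, 129]}
```

The letter a keeps its single ASCII byte in GBK but becomes two bytes in UTF-16, which is what the text means by incompatibility with ASCII.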
§ 1.6 RAW
In contrast to char and varchar2, RAW holds binary data and can be thought of as a binary string. Values of type char and varchar2 undergo character set conversion when they are stored; binary data does not.
SQL> create table test_raw (col_chr varchar2(10), col_raw raw(10));
Table created
SQL> insert into test_raw values ('aa', 'aa');
1 row inserted
SQL> commit;
Commit complete
SQL> select * from test_raw;
COL_CHR    COL_RAW
---------- --------------------
aa         AA
SQL> select col_chr, dump(col_chr) from test_raw;
COL_CHR    DUMP(COL_CHR)
---------- --------------------------------------------------------------------------------
aa         Typ=1 Len=2: 97,97
SQL> select col_raw, dump(col_raw) from test_raw;
COL_RAW              DUMP(COL_RAW)
-------------------- --------------------------------------------------------------------------------
AA                   Typ=23 Len=1: 170
The select query displays the raw column just as it was typed in, but the dump function shows that raw is not stored as characters: the inserted characters are interpreted as hexadecimal digits.
In this example, aa was inserted into the raw column, yet it occupies only 1 byte, with the value 170.
170 in hexadecimal is AA.
An implicit HEXTORAW conversion takes place when data is inserted into a raw column.
An implicit RAWTOHEX conversion takes place when data is read from a raw column.
If the value inserted into a raw column is not a valid hexadecimal string, an error is returned.
SQL> insert into test_raw values ('h', 'h');
insert into test_raw values ('h', 'h')
ORA-01465: invalid hex number
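The two implicit conversions can be imitated in Python. This is a rough model only: the functions hextoraw and rawtohex below are hypothetical Python helpers named after the Oracle functions, and edge cases such as odd-length hex strings may not behave the same way as in the database.

```python
def hextoraw(text: str) -> bytes:
    """Interpret a string of hexadecimal digits as raw bytes."""
    try:
        return bytes.fromhex(text)
    except ValueError as exc:
        # Stands in for ORA-01465 in this model.
        raise ValueError("invalid hex number") from exc


def rawtohex(raw: bytes) -> str:
    """Render raw bytes as an uppercase hexadecimal string."""
    return raw.hex().upper()


stored = hextoraw("aa")
print(len(stored), list(stored))  # 1 [170]
print(rawtohex(stored))           # AA
```

Calling hextoraw("h") raises ValueError in this model, which mirrors the rejected insert.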
