Brief description of Oracle IOT (Index organized Table)

Source: Internet
Author: User

An important feature of relational database products (RDBMS) is that data information is organized into two-dimensional data tables, and the expression of information can be accomplished through a series of associations (joins). Specific database products in the implementation of this standard, there are a variety of characteristics. is a specific database RDBMS product, often also provides different implementation methods.

1. From the heap table to the Index organization table (index Organization table)

Oracle, as a mature database software product, provides a variety of data table storage structure. The three most common are the heap table, the Index organization table (index Organization table, which is referred to as the IOT), and the Clustered tables (Cluster table).

Heap table is the most commonly used data table in Oracle and the default data table storage structure for Oracle. In the heap table, the data rows are managed in a "random access" manner. From the end of the block to the high watermark, Oracle is "extensive" in random ways. When a piece of data needs to be inserted into a data table, by default, Oracle will look for free space below the high watermark to accommodate the new data row. If such a place could be found, Oracle would put this line of data on the vacancy. Note that this vacancy is chosen entirely according to the "can be put down" principle, this vacancy may be deleted data row of the overlay.

If no suitable location is found under the HWM of the heap table segment, the Oracle heap table is pushed up the high watermark. On the data row store, the data rows of the Heap table are completely out of order. We call this a "random access" feature.

For Heap Table, the addition of the index independent segment can effectively alleviate the retrieval pressure caused by random access. The data row key value and rowID value recorded on the index leaf node allow the Server process to navigate directly to the block position of the data row.

A clustered (Cluster Table) is a case where a merge segment is stored. Oracle believes that if some data tables are not updated at a high frequency, but are often linked to another data table (join) display, it can be organized in a storage structure, so as to maximize performance efficiency. For clustered tables, multiple data tables are kept together in the order of the connection keys.

In a system environment, we don't use cluster table much. Data dictionaries in Oracle are heavily clustered. The scenario is determined by comparing the number of scenarios in which a fixed connection is retrieved between the various associated base tables.

Finally, this series of IoT (Index Organization Table). Like cluster table, IoT is a "non-mainstream" approach to Oracle's data-table strategy, with narrow application scenarios. But using it in some cases can often have a very good effect.

In a nutshell, the biggest feature of IoT that distinguishes the heap table is that the organization of the data rows is not random, but is saved by the index tree according to the primary key of the data table. From the segment segment structure, the IoT index segment includes all rows of data, and there are no separate data table segments.

The IoT has something special to preserve the structure, and it's applied to some special scenarios. This series will analyze some of the characteristics of the IoT one by one and finally discuss what scenarios we can choose to use IoT as a data table scenario.

2. IoT Basics

In creating the use of IoT, we want to emphasize the role of primary key. For a typical heap table, Primary key is optional. One argument is that when a heap table does not have a primary key set, the ROWID pseudo-column is the corresponding primary key value. Also, Primary key can be appended after the data table is created.

However, the IoT has an exceptionally strict setting for primary keys, requiring that you specify explicit primary key columns when creating tables. Below we have demonstrated through a series of experiments that the experimental environment is Oracle 11g.

Sql> select * from V$version;

BANNER

------------------------------------

Oracle Database 11g Enterprise Edition Release 11.2.0.1.0-production

PL/SQL Release 11.2.0.1.0-production

CORE 11.2.0.1.0 Production

We use the same structure to create the IoT and heap table controls.

--You cannot create an IoT without specifying a primary key;

Sql> CREATE table m (ID number) organization index;

CREATE table m (ID number) organization index

ORA-25175: no PRIMARY KEY constraint found

Use organization index after the CREATE TABLE statement to specify that the data table creation structure is IoT. However, it is not allowed to build a table without specifying the primary key primary key.

Sql> CREATE TABLE T_iot (object_id number (Ten) primary key, object_name varchar2 (+)) organization index;

Table created

Sql> CREATE TABLE T_heap (object_id number (Ten) primary key, object_name VARCHAR2 (100));

Table created

(Insert the same data source line ...) )

sql> exec dbms_stats.gather_table_stats (user, ' T_iot ', cascade = TRUE);

PL/SQL procedure successfully completed

sql> exec dbms_stats.gather_table_stats (user, ' t_heap ', cascade = TRUE);

PL/SQL procedure successfully completed

From the data dictionary level, we analyze the differences of the two data tables to see the characteristics of the IoT.

Sql> Select table_name, tablespace_name, blocks, num_rows fromuser_tableswhere table_name in (' T_iot ', ' T_HEAP ');

TABLE_NAME Tablespace_name BLOCKS Num_rows

------------------------------ ------------------------ ---------- ----------

T_heap SYSTEM 157 72638

T_iot 72638

Sql> Select Segment_name, blocks, extents from user_segments wheresegment_name in (' T_iot ', ' t_heap ');

Segment_name BLOCKS Extents

-------------------- ---------- ----------

T_HEAP 256 17

The above two sentences of SQL reveal several problems. First, Oracle acknowledges that IoT is a data table and counts the number of rows of data. However, there is no explicit description of the storage table space and size of the data table, and the contents of this section in the User_tables view are empty.

Second, from the segment structure, Oracle explicitly does not recognize the existence of the T_iot segment. Because if there is a segment segment object, it means there is space allocation. But the data sheet has data, where is it stored?

We know that when you add an index to a data table, Oracle automatically adds a unique index. So let's check out the structure of this part.

Sql> Select Index_name, Index_type, TABLE_NAME, Pct_threshold, clustering_factor from user_indexes where table_name in (' T_iot ', ' t_heap ');

Index_name Index_type table_name Pct_threshold clustering_factor

-------------------- -------- ---------- ------------- -----------------

sys_c0012408 NORMAL t_heap 256

Sys_iot_top_75124iot-top T_iot 50 0

Sql> Select Segment_name, blocks, extents from user_segments where segment_name in (' sys_c0012408 ', ' sys_iot_top_75124 ‘);

Segment_name BLOCKS Extents

-------------------- ---------- ----------

sys_c0012408 256 17

SYS_IOT_TOP_75124 256 17

The index segment is present, and the index type is explicitly labeled as an IoT index. This illustrates several issues:

First, for IoT, there is only an index segment and no data segment. As a general index, only the values and ROWID of indexed columns are available on leaf nodes. For IoT, the corresponding data row and index columns are the values on the primary key index.

Second, the overflow segment threshold for the IoT (Pct_threshold). This is a special strategy for Oracle IoT. Simply put, when we keep all the data rows on the leaf node, the split motion of the index leaf block is frequent once the change of the primary key value, the new value is inserted, the deletion and so on. Data rows stored on leaf nodes only make such splits more frequent and serious. Oracle proposes to store a portion of the non-primary key columns separately, which is the scale value.

Finally, let's explore the clustering Factor of the IoT index. Clustering factor is a comprehensive index that reflects the order of leaf nodes and the direct dispersion of data preservation lines. In general, the clustering factor of a heap table is a process that continues to degrade with DML operations. Clustering factor is an important parameter (http://space.itpub.net/17203031/viewspace-680936) that affects the cost of the Oracle index path, which affects the cost decision of the CBO. The value of this part of the IoT index is always 0, because the order of the indexes is the order of the data rows, both of which are stored in the same order and are absolutely identical.

3. IoT and execution plan

What happens to our usual execution plans under the IoT data table? Common heap table and IoT differ greatly in this section.

In general, the index path of the Heap table is accompanied by the reading of the two-time segment structure-the index segment and the data segment. Read the index segment header first, go through the root node, branch node, leaf node, and finally get to the result collection rowID list. After the table operation, use ROWID to query the rows of the data table in turn.

But IoT tables can be different. The index and the data are kept together, theoretically to the leaf node, that is, to get the data row. The IoT does not have a back-table operation, so relative to the heap table, the cost of the back-table portion is saved.

Let's look at the characteristics of the IoT by executing the plan.

Sql> explain plan for SELECT * from T_iot whereobject_id=1000;

Explained

Sql> select * FROM table (dbms_xplan.display);

Plan_table_output

---------------------------------------------------------------------------

Plan Hash value:2277898128

--------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Tim

---------------------------------------------------------------------------

| 0 |                SELECT STATEMENT |    |   1 |    11 | 1 (0) | 00:

|* 1 | INDEX UNIQUE scan|   sys_iot_top_75124|   1 |    11 | 1 (0) | 00:

-------------------------------------------------------------------------------

predicate information (identified by Operation ID):

---------------------------------------------------

1-access ("object_id" =1000)

Rows selected

Sql> explain plan for SELECT * from T_iot;

Explained

Sql> select * FROM table (dbms_xplan.display);

Plan_table_output

-------------------------------------------------------------------------------

Plan Hash value:4201110863

-------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU) |

-------------------------------------------------------------------------------

| 0 |                  SELECT STATEMENT | |  72638 |   780k| 47 (0) |

| 1 | INDEX FAST full scan| sys_iot_top_75124 |  72638 |   780k| 47 (0) |

------------------------------------------------------------------------

8 Rows selected

For the IoT, we want to ensure that the data table accessed is primarily the primary key path. In the above two execution plans, we follow the primary key to retrieve the path as index Unique Scan. The full table scan is index Fast fully scan. Neither of them has obvious back-table action.

Imagine that if the data table is small, Index full scan is also the execution path that the IoT table often appears.

For the general heap Table, what is the execution path?

Sql> explain plan for SELECT * from T_heap where object_id=1000;

Explained

Sql> select * FROM table (dbms_xplan.display);

Plan_table_output

----------------------------------------------------------------------------

Plan Hash value:1833345710

-----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)

---------------------------------------------------------------------------

| 0 |             SELECT STATEMENT |    |   1 |    11 | 2 (0)

| 1 | TABLE ACCESS by INDEX rowid|    T_heap |   1 |    11 | 2 (0)

|* 2 | INDEX UNIQUE SCAN |    sys_c0012408 |      1 |    | 1 (0)

--------------------------------------------------------------------------

predicate information (identified by Operation ID):

---------------------------------------------------

2-access ("object_id" =1000)

Rows selected

Sql> explain plan for SELECT * from T_heap;

Explained

Sql> select * FROM table (dbms_xplan.display);

Plan_table_output

---------------------------------------------------------------------------

Plan Hash value:1253663840

---------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |

----------------------------------------------------------------------------

| 0 |       SELECT STATEMENT | |  72638 |   780k| 42 (0) | 00:00:01 |

| 1 | TABLE ACCESS full| T_heap |  72638 |   780k| 42 (0) | 00:00:01 |

----------------------------------------------------------------------------

8 Rows selected

Normal heap tables do not prevent back-table actions.

Finally, we want to declare the cost impact of the back table action. The IoT and heap table have a large execution plan difference, which is the back table. But in terms of cost, the CBO does not determine the execution plan because of the clustering, but the impact of the factor.

For heap tables, clustering factor is a big problem, whether it is the CBO's cost formula or the prospect of constant degrade. A prominent advantage of IoT is the direct elimination of the cost factors of clustering factor.

But this also leads to the problem that a data table can only be organized in the order of primary keys, and the question of secondary indexes (secondary index) is a controversial topic for many versions of Oracle and IoT users. Secondary index question we will continue to discuss it later.

Brief description of Oracle IOT (Index organized Table)

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.