Reprint: http://space.itpub.net/17203031/viewspace-744477
An important feature of relational database products (RDBMS) is that data information is organized into two-dimensional data tables, and the expression of information can be accomplished through a series of associations (joins). Specific database products in the implementation of this standard, there are different characteristics. is a specific database RDBMS product, often also provides different implementation methods.
1, from the heap table (Heap table) to the Index organization table (index organization table)
Oracle as a mature database software product, it provides a variety of data table storage structure. Our most common is three species, namely the heap table (Heap table), the Index organization table (Indexed organization table, referred to as IOT) and the Clustered table (Cluster table).
Heap table is the data table that we use most often in Oracle and is the default datasheet storage structure for Oracle. In heap table, data rows are managed in a "random access" manner. From the block head to the high water level of the space, Oracle is in accordance with the random way of "extensive" management. When a piece of data needs to be inserted into a datasheet, by default, Oracle searches below the high watermark for free space to hold the new data row. If you can find such a place, Oracle will put this line of data on the empty space. Note that this vacancy selection is entirely based on the principle of "being able to drop", which may be the overwrite bit of the deleted data row.
If the heap table segment does not find a suitable location under the HWM, the Oracle heap table goes up and pushes the watermark up high. On a data row store, the data rows of the Heap table are completely out of order. We call this "random access" feature.
For Heap Table, the addition of indexed independent segments can generally effectively mitigate the retrieval pressure caused by random access. The data row keys and ROWID values that are recorded on the index leaf node allow the Server process to navigate directly to the block position of the data row.
A clustered (Cluster Table) is a case of merging segment storage. Oracle is of the view that if some data tables are not updated frequently but are often displayed in a connection query (join) with another datasheet, they can be organized in a storage structure to maximize performance efficiencies. In the case of a clustered table, multiple data tables are saved in the order of the join keys.
In a system environment, we don't use cluster table too much. Data dictionaries in Oracle use clusters in large numbers. Compared to the various associated base tables, a fixed connection between the search scenarios is more, thus determining the scenario.
Finally, this series of IoT (Index organization Table). Like cluster table, IoT is a "non-mainstream" of the Oracle Datasheet strategy, with a narrower application scenario. But some cases use it, often can have very good effect.
Simply put, IoT is different from the biggest feature of the heap table, is that the data row organization is not random, but according to the data table primary key, according to the index tree to save. From the segment segment structure, the IoT index segment includes all rows of data, and no separate data table segments exist.
IoT has some special place in the preservation structure, which is applied to some special scenes. This series will analyze some of the characteristics of IoT, and finally discuss what kind of scenarios we can choose IoT as a datasheet scenario.
2. IoT Foundation
In creating use IoT, we should emphasize the role of primary key. Primary key is dispensable for general heap tables. One argument is that when a heap table does not set a primary key, the ROWID pseudo column is the corresponding primary key value. Also, Primary key can append settings after the datasheet is created.
However, IoT is particularly strict with primary keys, requiring that a clear primary key column be specified when the table is created. Below we have a series of experiments to prove that the experimental environment is Oracle 11g.
Sql> select * from V$version;
BANNER
------------------------------------
Oracle Database 11g Enterprise Edition release 11.2.0.1.0-production
Pl/sql Release 11.2.0.1.0-production
CORE 11.2.0.1.0 Production
We use the same structure to create the IoT and heap table controls.
--You cannot create a IoT without specifying a primary key;
Sql> CREATE table m (ID number) organization index;
CREATE table m (ID number) organization index
ORA-25175: No PRIMARY KEY constraint was found
Use organization index after the CREATE TABLE statement to specify that the data table creation structure is IoT. However, it is not allowed to create a table without specifying a primary key primary key.
Sql> CREATE TABLE T_iot (object_id number (a) primary key, object_name varchar2 (MB)) organization Index;
Table created
Sql> CREATE TABLE T_heap (object_id number (a) primary key, object_name VARCHAR2 (100));
Table created
(Insert the same data source row ...) )
sql> exec dbms_stats.gather_table_stats (user, ' T_iot ', cascade => True);
Pl/sql procedure successfully completed
sql> exec dbms_stats.gather_table_stats (user, ' T_heap ', cascade => True);
Pl/sql procedure successfully completed
From the data dictionary level, we analyze the difference of two data tables, a glimpse of the characteristics of IoT.
Sql> Select table_name, tablespace_name, blocks, num_rows Fromuser_tableswhere, table_name in (' T_iot ', ' T_HEAP ');
TABLE_NAME Tablespace_name BLOCKS Num_rows
------------------------------ ------------------------ ---------- ----------
T_heap SYSTEM 157 72638
T_iot 72638
Sql> Select Segment_name, blocks, extents from user_segments wheresegment_name in (' T_iot ', ' t_heap ');
Segment_name BLOCKS Extents
-------------------- ---------- ----------
T_HEAP 256 17
The above two-sentence SQL reveals several issues. First, Oracle acknowledges that IoT is a data table and counts the number of rows of data. However, there is no explicit description of the storage table space and size for the datasheet, and the contents of this section of the User_tables view are empty.
Secondly, in terms of segment structure, Oracle explicitly does not recognize the existence of T_iot segments. Because if there is a segment segment object, it means there is space allocation. But the data table has data, where it is stored.
We know that Oracle automatically adds a unique index to the datasheet when it is indexed. So let's check out the structure of this section.
Sql> Select Index_name, Index_type, TABLE_NAME, Pct_threshold, clustering_factor from user_indexes where table_name (' T_iot ', ' t_heap ');
Index_name Index_type table_name Pct_threshold clustering_factor
-------------------- -------- ---------- ------------- -----------------
sys_c0012408 NORMAL t_heap 256
Sys_iot_top_75124iot-top T_iot 50 0
Sql> Select Segment_name, blocks, extents from user_segments where segment_name in (' sys_c0012408 ', ' sys_iot_top_ 75124 ');
Segment_name BLOCKS Extents
-------------------- ---------- ----------
sys_c0012408 256 17
SYS_IOT_TOP_75124 256 17
The index segment is present, and the index type is explicitly annotated with the IoT index. This illustrates several issues:
First, for IoT, there are only index segments and no data segments. As a general index, only the index columns and rowid are on the leaf node. For IoT, the primary key index corresponds to the data row and index column values.
Second, the IoT overflow segment Threshold (pct_threshold). This is a special strategy for Oracle IoT. Simply put, when we save all the data rows on the leaf node, once the primary key value changes, the new value inserts, deletes and so on, the index leaf block's splitting action is frequent. Data rows stored on leaf nodes will only make such splitting action more frequent and serious. Oracle proposes to store part of the Non-key columns separately, which is the proportional value.
Finally, we discuss the clustering Factor of the IoT index. Clustering factor is a comprehensive index which reflects the order of index leaf nodes and the direct dispersion of data preservation rows. In general, the clustering factor of the heap table is the process of deteriorating with the DML operation. Clustering factor is an important parameter (http://space.itpub.net/17203031/viewspace-680936) that affects the cost of the Oracle index path, which affects the cost decisions of the CBO. The value of the IoT index is always 0, because the order of the indexes is the order of the data rows, both stored in the same order and absolutely consistent.
3, IoT and implementation plan
What is our usual execution plan under the IoT datasheet? Common heap table and IoT differ greatly in this section.
Generally, the index path of the Heap table accompanies the reading of the two-segment structure-the index segment and the data segment. Read the index segment header first, experience the root node, branch node, leaf node, and finally get to the result set ROWID list. After that, you use the ROWID to query the rows of the datasheet in turn.
But IoT tables can be different. The index and the data remain together, theoretically getting the leaf node, which is the data row. IoT there is no back-table operation, so the cost of the return table is saved relative to heap table.
Let's look at the characteristics of the IoT by implementing the plan below.
Sql> explain plan for SELECT * from T_iot whereobject_id=1000;
Explained
Sql> select * FROM table (dbms_xplan.display);
Plan_table_output
---------------------------------------------------------------------------
Plan Hash value:2277898128
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Tim
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 11 | 1 (0) | 00:
|* 1 | INDEX UNIQUE scan| sys_iot_top_75124| 1 | 11 | 1 (0) | 00:
-------------------------------------------------------------------------------
predicate information (identified by Operation ID):
---------------------------------------------------
1-access ("object_id" =1000)
Rows selected
Sql> explain plan for SELECT * from T_iot;
Explained
Sql> select * FROM table (dbms_xplan.display);
Plan_table_output
-------------------------------------------------------------------------------
Plan Hash value:4201110863
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 72638 | 780k| 47 (0) |
| 1 | INDEX FAST full scan| sys_iot_top_75124 | 72638 | 780k| 47 (0) |
------------------------------------------------------------------------
8 Rows selected
For IoT, we want to make sure that the data table is accessed primarily by the primary key path. In the two execution plans above, we searched by the primary key and the path is the index Unique Scan. The entire table is scanned for index Fast full Scan. Neither has an obvious return action.
Imagine that if the datasheet is smaller, the Index full scan is also the execution path that often occurs in the IoT table.
For the general heap Table, what about the execution path?
Sql> explain plan for SELECT * from T_heap where object_id=1000;
Explained
Sql> select * FROM table (dbms_xplan.display);
Plan_table_output
----------------------------------------------------------------------------
Plan Hash value:1833345710
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 11 | 2 (0)
| 1 | TABLE ACCESS by INDEX rowid| T_heap | 1 | 11 | 2 (0)
|* 2 | INDEX UNIQUE SCAN | sys_c0012408 | 1 | | 1 (0)
--------------------------------------------------------------------------
predicate information (identified by Operation ID):
---------------------------------------------------
2-access ("object_id" =1000)
Rows selected
Sql> explain plan for SELECT * from T_heap;
Explained
Sql> select * FROM table (dbms_xplan.display);
Plan_table_output
---------------------------------------------------------------------------
Plan Hash value:1253663840
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 72638 | 780k| 42 (0) | 00:00:01 |
| 1 | TABLE ACCESS full| T_heap | 72638 | 780k| 42 (0) | 00:00:01 |
----------------------------------------------------------------------------
8 Rows selected
Normal heap tables do not prevent back-table actions.
Finally, we want to make a statement about the cost impact of the return action. IoT and heap Table A big difference in the execution plan is to return to the table. But in terms of cost, the CBO did not determine the execution plan because of the return action, but clustering the factor effect.
For the heap table, clustering factor is a big problem, whether it is the CBO cost formula, or the prospect of continuous degrade. IoT a prominent advantage is the direct elimination of clustering factor cost factors.
But this also poses the problem that a data table can only be organized according to the order of the primary key, and the problem of the secondary index (secondary index) is a controversial topic for many versions of Oracle and IoT users. The secondary index problem we will continue to discuss later.
In the next section we will continue to discuss other content such as IoT maintenance.