Index experiment example

Source: Internet
Author: User

Index experiment example
Index-based examples of simple experiments

**************************************** ******************************** **************** 

Original works, from "Deep Blue blog" blog, welcome to reprint, reprint please be sure to indicate the source (http://blog.csdn.net/huangyanlong ).

Please leave a message if the statement is incorrect. Thank you very much.

Reminder: Click the directory to view the directory.

****************************************************************** ***********************************


I have summarized the previous small examples, hoping to help you further understand the index.

[Example 1] indexing is not required if the data volume is small.

// If the table has a small amount of data and the full table scan cost is very small, it is unnecessary to use the index. Tutorial operation: SQL> SELECT ENAME, JOB, SAL FROM SCOTT. EMP; // first, find a small table for the experiment and view the table information, only 14 rows of ename job sal ---------- --------- ------ smith clerk 800 allen salesman 1600 ward salesman 1250 jones manager 2975 martin salesman 1250 blke MANAGER 2850 clark manager 2450 scott analyst 3000 king president 5000 turner salesman 1500 ADAMS CLERK 1100 james clerk 950 ford analyst 3000 miller clerk 1300 has selected 14 rows. SQL> SET AUTOTRACE ONSQL> SET AUTOTRACE TRACEONLYSQL> SELECT * FROM SCOTT. emp where ename = 'James '; // scan the entire table to find information about JAMES. Required | Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time | ------------------------------------------------------------------------ | 0 | select statement | 1 | 38 | 3 (0) | 00:00:01 | * 1 | table access full | EMP | 1 | 38 | 3 (0) | 00:00:01 | -------------------------------------------------------------------------- SQL> CREATE INDEX IND_EMP_ENAME ON SCOTT. EMP (ENAME); // index the ENAME column SQL> SELECT * FROM SCOTT. emp where ename = 'James '; // query information about JAMES using a column index | Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time | ------------------------- -------------------------------------------------- | 0 | select statement | 1 | 38 | 2 (0) | 00:00:01 | 1 | table access by index rowid | EMP | 1 | 38 | 2 (0) | 00:00:01 | * 2 | index range scan | IND_EMP_ENAME | 1 | 1 (0) | 00:00:01 | cost of full table SCAN is 3%, the index cost is 2% // from the above experiments, we found that the full table scan and index cost are not much different when the table data volume is small.

[Example 2] the I/O cost of full table scan is lower than the index usage

**************************************** * ********************************** For example, improper indexing is more costly than full table scanning. **************************************** *********************************** Answer: idea: creating a group of rowids is an index scattered in multiple table data blocks. As a result, the distribution of the index column data is very different from the order in the index, as a result, the cost of I/O usage is much lower by scanning the entire table than by indexing. The operation is AS follows: SQL> create table TAB_HYL as select * FROM DBA_OBJECTS; // CREATE a TAB_HYL TABLE for experiment SQL> ANALYZE TABLE TAB_HYL COMPUTE STATISTICS; // analyze the TAB_HYL Experiment Table SQL> SELECT NUM_ROWS, BLOCKS FROM USER_TABLES WHERE TABLE_NAME = 'tab _ hyl '; // find the number of rows and number of BLOCKS in the experiment table NUM_ROWS BLOCKS ---------- 72606 1033SQL> SELECT 72606/1033 from dual; // calculate the average number of rows in each block as 70 rows: 72606/1033 ---------- 70.286544SQL> drop table TAB_HYL PURGE; // Delete this TABLE, which is The number of rows occupied by each partition is calculated, and the experiment SQL> CREATE TABLE TAB_HYL AS SELECT * FROM DBA_OBJECTS WHERE ROWNUM <= 70; // re-create the experiment table so that it loads 70 rows to form the first SQL block> INSERT INTO TAB_HYL SELECT * FROM TAB_HYL; // copy the same 70 rows and INSERT them INTO the experiment table, that is, the experiment table has a total of 140 rows of data, and the two blocks of SQL> // re-execute the same operation, but the benchmark experiment table is 140 rows, so 140 rows of data are inserted for the third time, that is to say, the experiment table now has 280 rows of Data SQL> // follow the above method to create multiple blocks consecutively, make each block have the same key value and form a group of ROWIDSQL>/SQL> COMMIT for the experiment; SQL> CREATE INDEX IND_H1 ON TAB_HYL (OBJECT_ID );// Create an index for the OBJECT_ID column in the experiment TABLE, and then use the column value for query to describe the cost of the query. SQL> analyze table TAB_HYL compute statistics; // ANALYZE the SQL> SELECT NUM_ROWS, blocks from USER_TABLES WHERE TABLE_NAME = 'tab _ hyl'; // check that the number of rows and BLOCKS in the experiment table have reached the test preparation conditions, you can start experimenting with NUM_ROWS BLOCKS ---------- 8960 103SQL> set autotrace onsql> set autotrace traceonly // SET the trace SQL> SELECT * FROM TAB_HYL WHERE OBJECT_ID = 70; // search for the index column above and get the following analysis result. Remember that the cpu cost is 30 and the database is automatically completed. The full table scan means that the database has determined the query method, and the cost is lower. Role | Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time | percent | 0 | select statement | 128 | 10112 | 30 (0) | 00:00:01 | * 1 | table access full | TAB_HYL | 128 | 10112 | 30 (0) | 00:00:01 | ----------------------------------------------------------------------/later Let's take an index on the query and look at the analysis results. SQL> SELECT/* + INDEX (TAB_HYL IND_H1) */* FROM TAB_HYL WHERE OBJECT_ID = 70; // force the query to take the INDEX and output the result. The cost is 102, the cost is much higher than the full table scan cost (the full table scan is 30, as shown in the table above ). STATEMENT | Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time interval | 0 | select statement | 128 | 10112 | 102 (0) | 00:00:02 | 1 | table access by index rowid | TAB_HYL | 128 | 10112 | 102 (0) | 00:00:02 | * 2 | index range scan | IND_H1 | 128 | 1 (0) | 00:00:01 | ------- --------------------------------------------------------------- The preceding experiment shows that when the distribution of index column data is very different from the order in the index, the efficiency of index range scanning is low.

[Example 3] When constructing a table, the number of cluster factors is close to the number of blocks and the number of close rows.

**************************************** ************************************ To create two tables, create an index. It is required that the index set cluster factor of Table A be close to the number of table blocks, and the index set cluster factor of Table B be close to the number of table rows. **************************************** * ********************************** (1) create table a: The index set cluster factor is close to the number of TABLE blocks. Operation: SQL> CREATE TABLE TAB_HYL AS SELECT * FROM DBA_OBJECTS; // A TAB_HYL TABLE is created first for the experiment source TABLE, to ANALYZE the number of rows occupied by a block in the TABLE through this TABLE, SQL> analyze table TAB_HYL compute statistics; // ANALYZE the TAB_HYL Experiment table SQL> SELECT NUM_ROWS, blocks from USER_TABLES WHERE TABLE_NAME = 'tab _ hyl'; // you can find the number of rows and number of BLOCKS in the experiment table NUM_ROWS BLOCKS ---------- 72606 1033SQL> SEL ECT 72606/1033 from dual; // calculate the average number of rows in each block as 70 rows 72606/1033 ---------- 70.286544SQL> drop table TAB_HYL PURGE; // Delete this table SQL> CREATE TABLE TAB_HYL AS SELECT * FROM DBA_OBJECTS WHERE ROWNUM <= 70; // re-create the experiment table so that it loads 70 rows to form the first SQL block> INSERT INTO TAB_HYL SELECT * FROM TAB_HYL; // copy the same 70 rows and INSERT them INTO the experiment table, that is, the experiment table has a total of 140 rows of data, and the two blocks of SQL> // re-execute the same operation, but the benchmark experiment table is 140 rows, so 140 rows of data are inserted for the third time, that is, the experiment table now has 280 rows of Data SQL> // follow the above method to create continuously, this is to construct the cluster factor SQL of the Experiment Table>/SQL>/ SQL>/SQL> COMMIT; SQL> CREATE TABLE TAB_A AS SELECT * FROM TAB_HYL ORDER BY OBJECT_ID; // CREATE TABLE A based on the experiment TABLE, and TABLE A is sorted BY OBJECT_ID, therefore, the block SQL> CREATE INDEX IND_H1 ON TAB_A (OBJECT_ID) with the same key value distribution is obtained ); // create an index SQL> analyze table TAB_A compute statistics for the OBJECT_ID column in table a; // analyze table a SQL> SELECT NUM_ROWS, BLOCKS FROM USER_TABLES WHERE TABLE_NAME = 'tab _ '; NUM_ROWS BLOCKS ---------- 8960 102SQL> select blevel, LEAF_BLOCKS, DISTINCT_KEYS, AVG_LEAF_BLOCKS_PER_KEY, CLUSTERING_FACTOR 2 FROM USER_INDEXES 3 WHERE INDEX_NAME = 'ind _ H1 '; // view the B-tree level of the index column of Table A, the number of leaf blocks, different key values, the average number of leaf blocks occupied by each key, and the clustering factor BLEVEL LEAF_BLOCKS DISTINCT_KEYS AVG_LEAF_BLOCKS_PER_KEY CLUSTERING_FACTOR ------- ----------- ------------- --------------------- ------------------- 1 18 70 1 102 // obtain the number of cluster factors in the index column of Table A (102) it is the same as the number of blocks (102) in table. (2) create table B: The index set cluster factor is close to the number of rows in the TABLE operation: SQL> CREATE TABLE TAB_HYL AS SELECT * FROM DBA_OBJECTS; // create an experiment TABLE for the experiment SQL> analyze table TAB_HYL compute statistics; // ANALYZE this TAB_HYL Experiment table SQL> SELECT NUM_ROWS, blocks from USER_TABLES WHERE TABLE_NAME = 'tab _ hyl'; // you can find the number of rows and number of BLOCKS in the experiment table NUM_ROWS BLOCKS ---------- 72606 1033SQL> SELECT 72606/1033 from dual; // calculate the average number of rows in each block as 70 rows: 72606/1033 ---------- 70.286544SQL> drop table TAB_HYL PURGE; // Delete this table SQL> create table TAB_ B AS SELECT * FROM DBA_OBJECTS WHERE ROWNUM <= 70; // Create Table B and load it INTO 70 rows to form the first SQL block> insert into TAB_ B SELECT * FROM TAB_ B; // copy the same 70 rows and INSERT them INTO Table B, that is to say, table B has a total of 140 rows of data, and two blocks of SQL> // perform the same operation again, but the base table B is 140 rows, so 140 rows of data are inserted for the third time, that is to say, table B now has 280 rows of Data SQL> // follow the above method to create the following rows consecutively, this is to construct the cluster factor SQL>/SQL> COMMIT of Table B; SQL> create index IND_H2 ON TAB_ B (OBJECT_ID ); // create an index SQL> analyze table TAB_ B compute statistics for the OBJECT_ID column in table B; // analyze table B SQL> SELECT NUM_ROWS, blocks from USER_TABLES WHERE TABLE_NAME = 'tab _ B '; // check the number of rows and number of BLOCKS in Table B. NUM_ROWS BLOCKS ---------- 8960 103SQL> SELECT BLEVEL, LEAF_BLOCKS, DISTINCT_KEYS, AVG_LEAF_BLOCKS_PER_KEY, CLUSTERING_FACTOR 2 FROM USER_INDEXES 3 WHERE INDEX_NAME = 'ind _ H2 '; // view the B-tree level of the index column of Table B, the number of leaf blocks, different key values, the average number of leaf blocks occupied by each key, the set clustering factor BLEVEL LEAF_BLOCKS DISTINCT_KEYS AVG_LEAF_BLOCKS_PER_KEY CLUSTERING_FACTOR ------- ----------- ------------- --------------------- ------------------- 1 18 70 1 7070 // The cluster factor (7070) of the index column of Table B) it is close to the number of rows (8960) in table B.

[Example 4] index monitoring

**************************************** * ************************************ For the index of a table monitoring, check whether it is used. **************************************** * ********************************** Session: SQL> alter index IND_H1 monitoring usage; // enable monitoring SQL for the IND_H1 INDEX used in the above exercise> SELECT * FROM V $ OBJECT_USAGE; // view the IND_H1 index monitoring information in the V $ OBJECT_USAGE view. If MON is YES, monitoring is enabled, currently, NO one uses INDEX_NAME TABLE_NAME mon use START_MONITORING END_MONITORING before reading --- begin ------------------- IND_H1 TAB_A yes no 03/18/2014 16:54:29 session B: SQL> SET AUTOTRACE ON; // enables monitoring, make sure that the following operations are performed using the indexed SQL> SELECT * FROM TAB_A WHERE OBJECT_ID = 70; // use the indexed column for query, analysis results: -------------------------------------------------------------------------------- | 0 | select statement | 128 | 10112 | 3 (0) | 00:00:01 | 1 | table access by index rowid | TAB_A | 128 | 10112 | 3 (0) | 00:00:01 | * 2 | index range scan | IND_H1 | 128 | 1 (0) | 00:00:01 | explain session A: SQL> SELECT * FROM V $ OBJECT_USAGE; // view the IND_H1 index monitoring information in the V $ OBJECT_USAGE view again. If MON is YES, monitoring is enabled, if USE is YES, someone is using INDEX_NAME TABLE_NAME mon use START_MONITORING END_MONITORING --------------------- begin --- --------------------------------- IND_H1 TAB_A YES 03/18/2014 16:54:29

**************************************** *******Statement************************************************ 

Original works, from "Deep Blue blog" blog, welcome to reprint, reprint please be sure to indicate the source (http://blog.csdn.net/huangyanlong ).

Please leave a message if the statement is incorrect. Thank you very much.

Reminder: Click the directory to view the directory.

****************************************************************** ***********************************









What is an index? Why index creation? Example (take a specific DBMS as an example)

Indexing is another way to represent data. The data sequence provided by indexing is different from the physical storage sequence of data on disks. The special function of an index is to rearrange the physical location of records in a table. Indexes can be created on a column of a data table or on a combination of several columns of a table.

In retrospect, a book is stored in the library. Suppose you want to find a book, and the librarian looks for it as needed until he finds the book he wants. This search is time-consuming and laborious, so the administrator can create index cards for all book titles by letter and put them in different office drawers in alphabetical order, in this way, the process of searching for a book becomes to querying the index card in the drawer in front of the desk, and it is easy to get information about the book.

Further imagine that the librarian is very diligent. He not only classifies the titles by letter, but also creates another index card by the author's name and press, this process will give readers more flexibility in searching book information. Therefore, you can create multiple indexes in a data table.

In a database, querying a table without an index is generally called a full table scan. Full table scan is used by the database server to search for each record of the table until all records meeting the given conditions are returned. This operation can be compared to checking books in the library, starting from the first book on the first shelf, browsing every book until you find the desired book. For efficient query, you can create an index for a certain field in the data table. Because the index includes a pointer to the data, the Database Server reads the index of only one column of data (only one index is created) in the order of index arrangement until the index Pointer Points to the corresponding record. Because the index is only searched by a field, but does not traverse the entire table, index search is generally faster than full table scan.

So, is index query faster than full table scan? The answer is no. If you query a small data table (with few records) or a large data table (with many records), full table scan is more practical. For example, to query the "gender" field, the value can only be "male or female", and it is of little significance to create an index on it, it is not allowed to create indexes on Boolean or large binary (remark or image.

Example: www.diylsoft.com/..ze.htm
Clustered index:
Baike.baidu.com/view/1028053.html? Wtp = tt

Give me an example of a small scientific experiment.

You don't need to say that it's about that experiment. As long as the experiment design is reasonable, it's a small scientific experiment.
 

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.