Original link 1 original link 2
3. Type of index
1) Clustered index: The physical order of the rows in the table is the same as the logical (indexed) Order of the key values. Because there can be only one physical order of data, a table can have only one clustered index. If a table does not have a clustered index, the table has no sequential concept, and all new rows are inserted at the end of the table. For clustered indexes, the leaf nodes store data rows and no longer have separate data pages. For example, when I looked at the dictionary never looked at the directory, I think the dictionary itself is a directory, such as Check PEI, only need to turn to the beginning of the P-letter, and then in order to find E. Through this method I can find the quickest way to the teacher said that word, get the teacher's praise.
2) Nonclustered indexes: the physical order of rows in a table is independent of the index order. For nonclustered indexes, the leaf node stores the index field values and pointers to the corresponding data pages. The leaf node is directly above the data, and the corresponding index row corresponds to each row of the data page. Sometimes in the dictionary, I do not know what the word read, then I have to through the dictionary directory of the "radicals" to find. At this time I will find that the sort in the directory and the actual body of the sort is not the same, this is very distressing for me, because I can not faster than others, I need to find the word in the directory, and then according to the number of pages to find the text of the word.
4. Index and data query, insert and delete
1) query. The query operation is the same as the dictionary search. When we go to find the specified record, the database finds the root node, compares the unknown origin data to the root node's data, and then queries the next record through the pointer of the root node until the record is found. This is a simple balance tree of the binary search process, I will not repeat. In the clustered index, the data row is found when the page node is found, and in the nonclustered index we need to read the data page again.
2) Insert. The insert operation for a clustered index is more complex, in the simplest case, where the insert operation finds the data page for, and then makes room for the new data to perform the insert operation. If the data page already has no space, you need to split the data page, which is a very resource-intensive operation. For tables with only nonclustered indexes, inserts are simply inserted at the end of the table. If a clustered index is also included, the insert operation required for the clustered index is also performed.
3) Delete. The data below the delete line moves up to fill the vacancy. If the deleted data is the last row of the data page, the data page is recycled, the pointer to the front and back of the page is changed, and the retrieved data page is reused in a specific situation. At the same time, for a clustered index, if there is only one record left on the index page, the record may be moved to a neighboring index table and the original index page will be recycled. The non-clustered index does not have the means to do this, which results in a situation where multiple data pages have only a small amount of data.
5. Advantages and disadvantages of indexes
In fact, through the previous introduction, the advantages and disadvantages of the index is clear.
First, the advantages:
1) greatly accelerate the retrieval speed of data, which is the main reason for creating indexes
2) accelerates the connection between tables and tables, especially when it is important to achieve referential integrity of data.
3) When using grouping and sorting clauses for data retrieval, you can also significantly reduce the time to group and sort in queries.
Besides the disadvantages:
1) It takes a while to create an index, but the problem is small, and the general index is built once
2) The index needs to occupy physical space, especially the clustered index, which requires a large space
3) When the data in the table is added, deleted and modified, the index should be maintained dynamically, reducing the maintenance speed of the data, this is a relatively large problem.
6. Use of indexes
Based on the above analysis, we have a rough idea of when to use the index (if you don't, look back again ...) )。 In general, we need to index these columns:
1) There is no doubt about the columns that are often needed for searching;
2) often query multiple columns at the same time, and each column contains duplicate values can be set up a composite index, combined index as far as possible to make common queries to form an index overlay (the query contains the required fields are contained in an index, we only need to search the index page to complete the query). Also, the leading column of the combined index must be the most frequently used column. For issues with leading columns, a discussion is also made in the introduction to the index usage of SQLite later.
3) often used in connected columns, these columns are mainly foreign keys, you can speed up the connection, the connection conditions to fully consider the table with the index. ;
4) Create an index on a column that often needs to search for a range, because the index is sorted, its specified range is contiguous, and, similarly, it is best to create indexes on columns that often need to be sorted.
5) Create an index on the column that is often placed in the WHERE clause to speed up the judgment of the condition. It is important to note that any action on a column in the WHERE clause, such as a calculation expression, a function, requires an entire table search for the table without using the index of that column. So try to move the operation to the right of the equals sign when querying.
We should not create an index for the following columns:
1) columns that are rarely used in queries
2) columns that contain very few non-repeating data values, such as 0, 1, are generally more effective when scanning the entire table
3) The index should not be created for data defined as Text,image. These fields are not fixed in length, may be long, or may be empty.
Of course, the index is not indexed when the update operation is much larger than the query operation. You can also consider a drop index before a large-scale update operation, and then recreate it, but this requires taking into account the resource consumption of the creation index. In short, using an index requires balancing inputs and outputs to find the best point for the output.
7. Using indexes in SQLite
1) SQLite does not support clustered indexes, Android requires a _id field by default, which ensures that the data you insert will be inserted in an integer order of "_id", and that the integer type's primary key will play the same role as the clustered index. So do not create an index on the primary key that is declared as: INTEGER PRIMARY key.
2) Many friends who are unfamiliar with the index have created an index in the table, but found that it did not take effect, in fact this is mostly related to what I am going to say next. There are some restrictions on the columns that appear in the WHERE clause, which are related to the leading column, for the index to take effect. The so-called leading column is the first column in the creation of a compound index statement or a contiguous number of columns. For example, by creating index comp_ind on table1 (x, y, z), X,XY,XYZ is the leading column, and Yz,y,z is not. These, for other databases, may have some minor differences, here is the standard SQLite. In the WHERE clause, the leading column must use the equals or in operation, and the rightmost column can use inequalities so that the index can take effect completely. Also, the columns in the WHERE clause do not need to be fully indexed, but you must ensure that there are no gaps between the indexed columns. Let's take a few examples:
Create an index with the following statement:
CREATE INDEX idx_ex1 on Ex1 (a,b,c,d,e,..., y,z);
Here is a query statement:
... WHERE a=5 and B in (A-D) and C is NULL and d= ' Hello '
This is obviously valid for ABCD four columns, because only equals and in operations, and is the leading column.
Then look at a query statement:
... WHERE a=5 and B in (c>12) and d= ' Hello '
The index of only A, B and C is valid, and the index of the D column is invalidated because it is on the right side of column C, and the C column uses an inequality, and column C already belongs to the far right, based on the restriction of using inequalities.
Finally, look at one more article:
... WHERE b in (All-in-a) and C not NULL and d= ' Hello '
The index will not be used because the leading column is not used, and the query will be a full table query.
In fact, in addition to the index, there are many factors affecting the performance of the query, such as table connections, whether sorting. The overall performance of the database operation needs to be considered for more factors, using a more right technique, which has to be said to be a great learning.
Finally, use SQLite on Android to write a simple example of how the index affects database operations.
Create the following tables and indexes:
Db.execsql ("CREATE table if not exists T1 (A, b)");
Db.execsql ("CREATE index if not exists IA on T1 (A, b)");
Insert 100,000 data, and do the following for each table:
SELECT * from t1 where a= ' 90012 '
Insert: INSERT INTO T1 (A, B) values (' 10008 ', ' name1.6982235534984673 ')
Updated: Update t1 set b= ' name1.999999 ' where a = ' 887 '
Delete: Delete from t1 where a = ' 1010 '
The data is as follows (5 different operations averaging):
Operations are indexed without indexes
Query 170ms 5ms
Inserting 65ms 75ms
Update 240ms 52ms
Delete 234ms 78ms
You can see a significant increase in query speed, a slight slowdown in insertion speed, and a slight increase in the speed of updating data and deleting data. If you change the column in the WHERE clause in the update and delete to B, the speed is the same as no index because the index is invalid. Therefore, the index can greatly increase the query speed, for delete and update operations, if the column in the WHERE clause uses an index, even if the index needs to be re-build, it is likely that the speed is faster than not using the index. The index is obviously a burden on the insert operation. At the same time, the index increases the size of the db by more than twice times.
There's another one to spit on, Android Rawqurey method, after executing the SQL statement returned a cursor, actually did not complete a query operation, I calculate the query time before and after Rawquery, Forever is 1ms ... It makes me very depressed. Looked at the source, when the cursor calls MoveToNext these move cursor methods, will eventually call the GetCount method, and the GetCount method calls the native method calls the real query operation. This design is clearly more reasonable.
Use of indexes in SQLite database, advantages and disadvantages of indexes [Goto]