Structure and features of Bitmap indexes in massive database solutions

Source: Internet
Author: User

2.2.2 structure and features of Bitmap indexes
The structure of the Base block and branch block of the bitmap index is exactly the same as that of the B-tree index, but the leaf block and the B-tree index are different. The specific structure is 2-8.

■ Figure 2-8
Assume that the storage order of the "color" column values in each data block is "yellow, green, red, null". If you create a bitmap index based on this column, the values of this column in the leaf block must be converted to bits. As shown in the upper-right corner of 2-8, the bitmap is converted using the values table, that is, the rows in the table represent the bits corresponding to the values of each column, and the columns in the table represent the column values corresponding to each row.
From Figure 2-8, we can clearly see the bitmap Conversion Method of column values. Now we can observe the specific conversion method of the bitmap based on the behavior units in the orders table. The bitwise (BIT) is 1 only when the column value overlaps with the corresponding row, and the rest are 0, for example, if the column value is "yellow", the result is "1000". If the column value is "red", the result is "0010 ". This is just a conceptual expression. In fact, when querying values that meet the query conditions, you only need to read the corresponding "color" leaf block from the branch block. If you want to read the leaf blocks of two colors due to or, you only need to read the leaf blocks of the two colors that meet the conditions in the same way.
Bit storage is based on the "rowid range (start rowid ~ End rowid), so it can also be considered as a concept of compression. Key compression can save a lot of storage space. In addition, column values are not directly stored, but "1" is stored only when valid values exist. Otherwise, all values are replaced by "0.
In addition, when performing a data query, because it can directly convert the final bit that meets all query conditions to rowid, therefore, unlike B-tree indexes, each index row has a rowid. Although Bitmap indexes have advantages that cannot be compared with other indexes, they are not omnipotent. Only when a bitmap index is created based on columns with low discretization and repeated values of the same value, in order to save storage space to a large extent, for example, creating a bitmap index for a table with 10 million rows of data can ensure that the occupied space cannot exceed 10 MB.
When creating an index based on columns with poor discretization, the B-tree index needs to combine multiple columns to create a composite index to improve the index efficiency, while the bitmap index does not. This is not needed because it supports set operations, that is, after reading the final result from each individual bitmap index, then, we can perform a set operation to obtain the same effect as the composite index. Now let's take a look at bitmap set operations.
As shown in 2-9, in the in-place or operation, the result of 1 is both 1 and 0 only when both are 0. In the operation, 0: whether it is an operation with 1 or 0, the result is 0. The result is 1 only when both are 1. Because the bitwise operation result is actually the combination of query conditions, if the bitwise operation result is found, it means that the data row that meets the condition is found. Bitmap indexes support bitwise operations, which make the original complex "or" operations much easier.

■ Figure 2-9
From the lower-right corner of Figure 2-8, we can see that the null value is also converted into a bit. From the perspective of bit, if it is null, it is 1, and if it is not null, It is 0. It can be seen that the null problem that B-tree cannot solve is also completely solved in the bitmap index. Similarly, even if "not" exists, you only need to change 1 in the bit to 0 and 0 to 1. Its existence does not negatively affect the performance of the bitmap index.
However, when the query conditions use not equal comparison operators "=", but like, between,>, =, because of the existence of these constraints, this allows Bitmap indexes to be used only in data warehouses. Of course, bitmap indexes can also be used flexibly in OLTP statistics or summary processing. In addition, for code tables with a small amount of data, when users have diverse query requirements and frequently need to query data from the table, although bitmap indexing has a certain burden in the OLTP business, it is barely acceptable.
In a partitioned table, bitmap indexes can be used only in local indexes.
2.2.3 reading Bitmap indexes
When a bitmap index is used to read data from a table, it first reads all the bits that meet the individual query conditions, then performs and operations on the data, and finally filters out the data that meets the conditions. As shown in 2-10, "010010" is the final bit after calculation, converts it to rowid, and then reads the corresponding data row from the table using this rowid.

■ Figure 2-10
The following figure shows the execution plan of the query steps.
Execution Plan
0 SELECT statement
1 0 sort (aggregate)
2 1 Table Access (by index rowid) of 'parts'
3 2 bitmap conversion (to rowid)
4 3 bitmap and
5 4 bitmap index (single value) of 'color _ bix'
6 4 bitmap index (single value) of 'size _ bix'
As shown in the execution plan, the bitmap that meets the two conditions is read from the two Bitmap indexes, and the bitmap that meets the two query conditions is filtered out through the and operation, convert these bitmaps to rowid (Bitmap conversion). Then, read and sort the corresponding data rows from the table based on the rowid. Table 2-1 details each unit operation in the execution plan.
If you sort and combine the unit execution plans described in Table 2-1 in different ways, you can get a variety of execution plans. Some basic types are described in detail. To better understand the differences between B-tree indexes and bitmap indexes, Let's first look at the SQL statements listed in Table 2-1.
Table 2-1 SQL statements

Option det. Option description
Bitmap
Conversion to rowids convert bitmap to rowid to read Table Data
From rowids rowid to bitmap
Count does not need the actual column value but only reads the number of rowids that meet the condition.
Bitmap index single value queries the bitmap corresponding to a key value in the index Block
Query of multiple bitmaps corresponding to one key value in range scan
Full scan: Overall bitmap scan without providing the start/end value
Bitmap merge-merge the bitmap obtained by range scan
Bitmap minus-negative operation or set Difference Operation
Bitmap or-performs or operations on two bitmap Sets
Bitmap and-perform and operate on two bitmap Sets
Bitmap key
Iteration-perform a continuous scan on the rows obtained from a table using the specified bitmap index until a qualified bitmap is found. This is the type shown in star transformation.
Figure 2-11 describes the use of the "not" query conditions that cannot be used in the B-tree index in the bitmap index, and the "or" operation on it.

Read the bitmap corresponding to 123 from the index of col1, read the bitmap corresponding to ABC from the index of col2, and then perform bitmap minus operation on the two ).
Read the bitmap whose column value is null from the index of col2, And then subtract the bitmap whose col2 is null from the calculation result. Note that when using a column in a query condition, all null values must be excluded unconditionally. Therefore, to satisfy the condition col2 'abc', you must exclude the values of col2 as 'abc' and null.
The bitmap or operation is performed on the col3 pair of the query condition and the obtained result. The final bitmap that meets all the query conditions is obtained, and the bitmap is converted to the rowid, finally, the rowid is used to read the data in the table.

 

Title: massive database solutions

Author :[Han] Li Hua Zhi

Translator: translated by Zheng Baoguo Qiang

ISBN 978-7-121-11883-8

Publication date: 2011January 1, January

Pricing: 69.00 RMB

Start: 16Open

Page number: 460Page

Publicity

Classic RDBMS books covering the latest core technologies of database experts

It contains an advanced method to reduce the code to 1/10 times and increase the speed to 10 times.

It unveiled the true nature of relational databases and demonstrated the new technologies that have not been flexibly used so far.

Description

This book divides the entire content into two parts. In Part 1, all the elements that affect data reading efficiency are classified into their own concepts, principles, features, and application principles, the table's structural features, diversified index types, the internal function of the optimizer, And the execution plans developed by the optimizer for various results are described in detail, based on the correct understanding of the optimizer, the index building strategy scheme that has the greatest impact on the execution plan and speed is proposed; in Part 1, we will mainly introduce specific strategic solutions to improve data reading efficiency. In this section, we will introduce the principle and specific application methods of local range scanning related to data reading efficiency, all types of table connections that are considered to be the basis for improving database usage efficiency are described in detail.

The series of massive database solutions have been favored by many readers for more than 10 years. While being hailed as the "Bible", it has become an indispensable book for database users. The author tries his best to find a clever method that allows IT workers to easily apply and control in their actual work, and provides a solution to a massive database with twice the result with half the effort.

This book is suitable for database developers and database administrators.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.