In-memory columnar storage vs. Buffer Cache

Last Update:2018-05-26 Source: Internet

Author: User


The In-Memory option (DBIM) of Oracle Database 12c loads the data of all rows in a table into memory. Why not simply keep the frequently accessed data blocks in memory, as the Buffer Cache does?

Access Patterns of In-Memory Columnar Storage and the Buffer Cache

The reason is that the two serve different access patterns. The Buffer Cache supports OLTP applications, whose access pattern is non-uniform: some rows in a table are accessed far more often than others, so caching only 10% of the data can cover 95% of data accesses. Under that assumption, caching 10% of the data can improve performance by about 20 times.

In-memory columnar storage supports analytical applications, which access only a few columns but need to scan all rows in the table, so caching only some of the rows is of little use. For example, if in-memory columnar storage can improve performance by 100 times when the whole table is loaded, loading only 10% of the table improves performance by only about 1.1 times. Therefore, DBIM lets you specify a whole table, some of its columns, some of its partitions, or a tablespace, but you cannot use a where condition to load only some rows of a column.
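The two figures above (about 20 times and about 1.1 times) can be reproduced with a simple Amdahl-style estimate. This is a minimal sketch, not Oracle's own model: the function name `overall_speedup` is hypothetical, and the OLTP case assumes that a cache hit costs almost nothing compared with a disk read.

```python
def overall_speedup(accelerated_fraction: float, factor: float) -> float:
    """Amdahl-style estimate: only part of the work runs `factor` times faster."""
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / factor)


# OLTP on the Buffer Cache: 95% of accesses hit the cached 10% of the data.
# Assumption: a hit is treated as almost free, modelled with a very large factor.
oltp = overall_speedup(0.95, factor=1e9)

# Analytics: a full scan touches every row, so loading 10% of the table into a
# format that is 100 times faster only accelerates 10% of the work.
analytics = overall_speedup(0.10, factor=100)

print(f"OLTP, 10% of data cached: {oltp:.1f}x")            # 20.0x
print(f"Analytics, 10% of data cached: {analytics:.2f}x")  # 1.11x
```

The same 10% of memory helps very differently because the OLTP workload is skewed toward hot rows while the analytical scan is uniform over all rows.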

Therefore, for analytical applications, in-memory columnar storage outperforms row-based storage, even when the row-based table is kept in memory through an alter table statement. The most important reason is that the column storage format is very well suited to analytical applications.

Column Storage Format

The following figure shows why columnar storage is suitable for analysis.
If you use traditional row-based storage for analysis, for example to query the 4th column, you have to read the data row by row and pass over the irrelevant data in the 1st to 3rd columns.

If column-based storage is used, you only need to read the 4th column, which avoids the wasted I/O and naturally improves efficiency.
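The difference can be sketched with a toy table held in both layouts. This is an illustrative example only: the data, the function names, and the "values touched" counter are assumptions made for the sketch and do not reflect Oracle's internal formats.

```python
# A toy table with four columns: id, name, region, amount.
rows = [
    (1, "alice", "north", 120),
    (2, "bob", "south", 80),
    (3, "carol", "north", 200),
]


def sum_row_store(rows, col):
    """Sum one column in a row layout; every value of each row is read."""
    total, touched = 0, 0
    for row in rows:
        touched += len(row)  # the whole row is read to reach one column
        total += row[col]
    return total, touched


# The same data pivoted into a column layout: one list per column.
columns = [list(values) for values in zip(*rows)]


def sum_column_store(columns, col):
    """Sum one column in a column layout; only that column is read."""
    values = columns[col]
    return sum(values), len(values)


row_total, row_touched = sum_row_store(rows, 3)
col_total, col_touched = sum_column_store(columns, 3)
print(row_total, row_touched)  # 400 12
print(col_total, col_touched)  # 400 3
```

Both layouts return the same answer, but the row layout reads four values per row to use one of them.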

Let's take a look at the test results that Oracle presented at Oracle OpenWorld 2013:

Both the row format and the column format were held in memory, yet DBIM was nearly 800 times faster, with a single core scanning billions of rows per second. Incredible, isn't it?

SIMD

SIMD (Single Instruction, Multiple Data) is a technology previously used mainly in high-performance computing and image processing. It applies one instruction to a batch of data values at a time, which makes it a very good fit for columnar data.

Storage index

Storage indexes were already available in Exadata. In DBIM, each column is divided into IMCUs (In-Memory Compression Units), and the minimum and maximum values of each IMCU are pre-calculated and maintained in real time. During a query, the where condition is compared against these values, so many irrelevant IMCUs can be skipped, which saves I/O and time. The principle is similar to that of partition pruning.
However, the storage index has to be recomputed after the database is restarted.
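The min/max pruning idea can be sketched as follows. This is a simplified illustration under stated assumptions: the `Chunk` class is a hypothetical stand-in for an IMCU, the chunk size is arbitrary, and only an equality predicate is handled.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for an IMCU: a slice of one column plus its min/max."""
    values: list
    min_value: int = field(init=False)
    max_value: int = field(init=False)

    def __post_init__(self):
        # Pre-calculated once, when the chunk is populated.
        self.min_value = min(self.values)
        self.max_value = max(self.values)


def build_chunks(column, chunk_size):
    """Split a column into fixed-size chunks."""
    return [Chunk(column[i:i + chunk_size]) for i in range(0, len(column), chunk_size)]


def scan_equal(chunks, target):
    """Return matching values and the number of chunks that were actually scanned."""
    matches, scanned = [], 0
    for chunk in chunks:
        if target < chunk.min_value or target > chunk.max_value:
            continue  # the predicate cannot match here, so skip the whole chunk
        scanned += 1
        matches.extend(v for v in chunk.values if v == target)
    return matches, scanned


chunks = build_chunks(list(range(1, 101)), chunk_size=25)
matches, scanned = scan_equal(chunks, 60)
print(matches, scanned, len(chunks))  # [60] 1 4
```

Pruning works best when values are clustered, as in this sorted column; with randomly distributed values most chunks would span the full range and few could be skipped.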

Compression

Column-based storage is usually compressed because a column tends to contain many duplicate values, and compression is enabled by default in DBIM.
Compression not only allows more data to fit in memory but also reduces I/O. However, if the workload includes a lot of OLTP access, avoid the higher compression levels, so that compression and decompression do not consume excessive resources.
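To see why duplicate values make a column compress well, here is a minimal dictionary-encoding sketch. Dictionary encoding is used only as a generic illustration of columnar compression; the function names are hypothetical and this is not a description of the exact formats DBIM uses.

```python
def dictionary_encode(column):
    """Replace each value with a small integer code into a dictionary of distinct values."""
    dictionary, index_of, codes = [], {}, []
    for value in column:
        if value not in index_of:
            index_of[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index_of[value])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


column = ["north", "south", "north", "north", "east", "south"]
dictionary, codes = dictionary_encode(column)
print(dictionary)  # ['north', 'south', 'east']
print(codes)       # [0, 1, 0, 0, 2, 1]
```

Each distinct string is stored once and every row keeps only a small code, so the saving grows with the number of repeats. Decoding costs extra CPU on every access, which is the trade-off behind the advice for OLTP-heavy workloads.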

Optimization of Join and Aggregation in memory

Using a Bloom filter to convert a join into a filtered column scan can speed up the join operation, especially in memory.
The key vector works on a principle similar to that of the Bloom filter, and it also allows aggregation results to be built on the fly during the scan. For more information, see the white paper.
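The Bloom filter idea can be sketched as follows: build a filter from the join keys of the small, already filtered table, then use it as a cheap predicate while scanning the join column of the large table. This is a simplified illustration; the `BloomFilter` class, its sizes, and the sample tables are assumptions made for the sketch, not Oracle's implementation.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive several bit positions per key by salting a SHA-256 hash.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


# Small dimension table (key -> region), filtered by a where condition.
dim = {10: "north", 20: "south", 30: "east"}
wanted_keys = {key for key, region in dim.items() if region == "north"}

bloom = BloomFilter()
for key in wanted_keys:
    bloom.add(key)

# Large fact table: (dimension key, amount). The join becomes a filtered scan.
fact = [(10, 5), (20, 7), (10, 1), (30, 2)]
candidates = [row for row in fact if bloom.might_contain(row[0])]

# False positives are possible, so candidates are checked against the exact keys.
joined = [row for row in candidates if row[0] in wanted_keys]
print(joined)  # [(10, 5), (10, 1)]
```

Most non-matching rows are rejected during the scan itself, so only a small candidate set reaches the exact join check.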

Reference

In-Memory Column Store versus the Buffer Cache

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
