In-memory Columnstore vs. Buffer Cache

Source: Internet
Author: User

The Oracle DB 12c in-memory option (DBIM) loads data from all rows of columns in the table into memory, so why not just place frequently accessed chunks of data into memory like buffer cache?

In-memory columnstore and buffer cache access modes

The reason is that the access modes supported by the two are different, for the buffer cache, an OLTP application is supported, the access mode is Non-uniform access patterns, which means that some rows in the table are accessed more frequently than other rows, so that only 10% of the data can be cached. Can cover 95% of data access. You can assume that 10% of the data is cached for 20 times times the performance gain.

While in-memory Columnstore supports analytic applications, it accesses a few columns, but it needs to scan the data for all rows in the table. Caching some rows of data makes little sense, for example, if the memory columnstore can get a 100 times-fold performance boost, if only 10% of the data in the cache table is only 1.1 times times the performance gain, not 100 times times. So in Dbim settings, you can specify a full table, a partial column in a table, a partial partition, a table space, but you cannot specify a partial row in a cache column with a where condition.

So for analytical applications, the most important reason for memory-columnstore is faster than row-based storage (even if it is cached all through the ALTER TABLE tablename cache) is that the Columnstore format is well suited for analytic applications.

Column storage format

The following figure illustrates why Columnstore is suitable for analysis.
If you use traditional row-type storage for analysis, such as querying the 4th column, you need to row-by-line access, you need to query the 1th to 3rd column of these unrelated data.

If you are using Columnstore, you only need to access the 4th column to avoid invalid I/O, and the efficiency naturally increases.

Take a look at the test results that Oracle released on Open World 2013:

Row and column are in memory, Dbim nearly 800 times times faster, a single core processing 3 billion rows of data every 1/6 seconds, unbelievable?!

Simd

The previous technology used in high-performance computing and image processing, that is, single instruction multiple data, is actually a batch processing of the information, except that it is well suited for column-based data.

Storage Index

Storage index In fact in the exadata, in fact, is the column partition for IMCU, pre-calculation and real-time maintenance of each IMCU the maximum and minimum values, query matching where conditions, you can skip a lot of unrelated imcu, thus saving I/O and time. The principle and partitioning are similar.
However, a recalculation is required after the database restarts.

Compression

Columnstore is usually compressed, because there are more data duplicates, and compression in Dbim is the default option.
Compression not only allows more data to be cached in memory, but also reduces I/O. However, consider that if there is more access to OLTP, then do not choose to compress the high compression mode, so as not to compress and decompress the excessive consumption of resources.

Optimization of in-memory joins and aggregation

Converting a join to a column scan with Bloom filter can speed up join, especially in memory.
With the key vector, the principle is similar to the Bloom filter, and the results of the aggregation table can be built on-line, with the specific principles of the White Paper.

Reference

In-memory Column Store versus the Buffer Cache

In-memory Columnstore vs. Buffer Cache

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.