DB2 10.5 BLU: A Hadoop and in-memory database killer?


Developed by IBM's research and development labs, BLU (the code name stands for "big data, lightning fast, ultra-easy") integrates technologies such as columnar processing, deduplication, parallel vector processing, and data compression.

BLU focuses on making the database "memory optimized," said Tim Vincent, CTO of IBM's information management software division. "It will run in memory, but you don't have to put everything in memory." In addition, BLU largely eliminates the manual tuning otherwise required to improve SQL query performance.

IBM claims that because BLU integrates a range of technologies that greatly improve analysis capabilities and simplify management, DB2 10.5 can speed up data analysis by a factor of 25 or more. This improvement eliminates the need for enterprises to purchase a separate in-memory database (such as Oracle's TimesTen) for fast data analysis and transaction processing.

IBM offers an example of a 32-core system using BLU technology to query a 10 TB data set in less than a second. "Of the 10 TB, you [may] only interact with 25% of the data in daily operations, so you only need to keep 25% of the data in memory," Vincent said. "Today you can buy a server with 1 TB of RAM and 5 TB of solid-state storage for less than $35,000."

In addition, using DB2 can reduce the labor cost of running a separate data warehouse, since database administrators are generally easier to find than data warehouse specialists. In some cases DB2 can even replace Hadoop as an easier-to-maintain data-processing platform, Vincent said. One of the new technologies is a compression algorithm that stores data in a form that, in some cases, can be read without decompressing it. Vincent explained that the compression is order-preserving: compressed values sort in the same order as the original data, which means a predicate operation, such as a query with a WHERE clause, can be evaluated without first decompressing the data set. Data stays compressed throughout the analysis process, which can greatly shorten the time an analysis takes.
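The order-preserving idea Vincent describes can be sketched with a simple dictionary encoding. This is an assumption-level illustration, not IBM's actual algorithm: distinct values receive integer codes in sorted order, so a WHERE-style comparison can run directly on the compressed codes without decompressing any rows.

```python
# Sketch of order-preserving dictionary compression (hypothetical,
# for illustration only): codes sort the same way the values do,
# so predicates can be evaluated on the compressed representation.

def build_dictionary(values):
    """Map each distinct value to a code that preserves sort order."""
    return {v: code for code, v in enumerate(sorted(set(values)))}

def encode(values, dictionary):
    """Replace each value with its (smaller) integer code."""
    return [dictionary[v] for v in values]

def filter_gt(encoded, dictionary, threshold):
    """Evaluate 'column > threshold' directly on compressed codes.

    Find the smallest code whose value exceeds the threshold; a row
    qualifies iff its code is >= that cutoff. No decompression needed.
    """
    cutoff = min((code for v, code in dictionary.items() if v > threshold),
                 default=len(dictionary))
    return [i for i, code in enumerate(encoded) if code >= cutoff]

column = ["cherry", "apple", "fig", "banana", "fig", "apple"]
d = build_dictionary(column)
codes = encode(column, d)
print(filter_gt(codes, d, "banana"))  # rows with value > "banana": [0, 2, 4]
```

Because the comparison happens on small integer codes rather than the original strings, the predicate is both decompression-free and cheaper per row.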

Another time-saving trick: the software maintains a metadata table that records the range of key values on each data page of a column. When running a query, the database can therefore check whether a given value can possibly appear on a data page. "If the page is not in memory, we don't need to read it into memory; if it is in memory, we don't need to push it across the bus to the CPU and burn CPU cycles analyzing all the values on the page," Vincent said. "This lets us use our CPUs and bandwidth more effectively." Through columnar processing, a query pulls only the selected columns of a database table into memory, rather than entire rows, which would consume far more memory. Vincent said: "We have developed an algorithm that is very effective at determining which columns and which ranges you want to cache in memory."
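The page-skipping Vincent describes can be sketched as follows. The `Page` class and its per-page min/max metadata are hypothetical stand-ins for DB2's internal metadata (synopsis) tables, which differ in detail:

```python
# Sketch: page-level min/max metadata used to skip pages that
# cannot possibly contain a queried value (illustrative only).

class Page:
    def __init__(self, values):
        self.values = values
        # Metadata kept per page: the range of values it holds.
        self.min, self.max = min(values), max(values)

def query_equals(pages, target):
    """Scan only pages whose [min, max] range could contain target."""
    hits, pages_read = [], 0
    for page in pages:
        if not (page.min <= target <= page.max):
            continue  # skip: metadata proves the page has no match
        pages_read += 1
        hits.extend(v for v in page.values if v == target)
    return hits, pages_read

pages = [Page([1, 3, 7]), Page([10, 12, 15]), Page([2, 5, 9])]
hits, pages_read = query_equals(pages, 12)
print(hits, pages_read)  # [12] 1 -- only one of three pages is scanned
```

The metadata check is a constant-time comparison, so pages that cannot match cost neither I/O nor CPU time, which is exactly the saving Vincent points to.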

On the hardware side, the software exploits parallel vector processing: a single instruction is applied to multiple data elements at once, using the SIMD (single instruction, multiple data) instruction sets available on Intel and PowerPC chips. The software can therefore run a single query operation against many column values at a time and parallelize the analysis across processors. Data can also be kept in CPU registers. "Registers are the most efficient use of system memory," Vincent said.
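The principle of one instruction operating on several data elements can be mimicked in plain Python with a wide integer acting as a vector register ("SIMD within a register"). Real SIMD uses CPU vector instructions such as SSE/AVX or AltiVec, so this is only a conceptual sketch:

```python
# Sketch: "SIMD within a register" -- pack several small values into
# one wide integer so a single addition updates every lane at once.
# Illustrative only; real engines use hardware vector instructions.

LANES, BITS = 8, 8  # eight 8-bit lanes in one 64-bit "register"

def pack(values):
    """Pack small values into one integer, one value per 8-bit lane."""
    reg = 0
    for i, v in enumerate(values):
        assert 0 <= v < 2 ** BITS
        reg |= v << (i * BITS)
    return reg

def unpack(reg):
    """Split the wide integer back into its per-lane values."""
    return [(reg >> (i * BITS)) & (2 ** BITS - 1) for i in range(LANES)]

def add_to_all_lanes(reg, k):
    """Add k to every lane with one integer addition.

    Valid as long as no lane overflows its 8 bits (assumed here).
    """
    broadcast = sum(k << (i * BITS) for i in range(LANES))
    return reg + broadcast

reg = pack([1, 2, 3, 4, 5, 6, 7, 8])
print(unpack(add_to_all_lanes(reg, 10)))  # [11, 12, 13, ..., 18]
```

One addition updates eight column values simultaneously, which is the same per-instruction amplification a SIMD query engine gets from vector registers.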

IBM reports that in testing, with the new BLU acceleration enabled, some queries in individual analytic workloads ran more than 1,000 times faster.

IBM is not alone in finding new ways to fit large databases into server memory. Last week, Microsoft announced that SQL Server 2014 will also ship with a set of new technologies, collectively known as Hekaton, including making maximum use of main memory and columnar processing technology borrowed from Excel workbooks.

Curt Monash, a database analyst at Monash Research, pointed out that with the release of IBM DB2 10.5, Oracle is now the only mainstream relational database management system (RDBMS) vendor without true columnar processing capabilities.

IBM is using the BLU component of DB2 10.5 as a cornerstone of its DB2 SmartCloud infrastructure-as-a-service (IaaS) offering to enhance data reporting and analysis. BLU technology can also be added to other IBM data storage and analysis products, such as Informix.

