Oracle Exadata All-in-One and Cloud Computing Applications (2)

3. Main Exadata Technologies

3.1 Cell Offloading

In the Database Machine, storage is not just dumb disk space. The storage cells are smart enough to handle certain workloads internally, which reduces the load on the database nodes. This process is called cell offloading.

3.2 Smart Scan

Smart Scan is one of Exadata's most important features. Parts of a SQL statement are executed on each cell, and each cell returns only the qualifying data to the database. This greatly reduces database-server load and network traffic, and makes full use of the cells' CPU and I/O resources.
In a traditional Oracle database, when a single row (or even a single column of a row) is selected, the entire data block containing that row is read from disk into the buffer cache; the selected row (or column) is then extracted from the block and handed to the user session. In the Database Machine most access types still work this way, but several very important types of direct-path access (such as full table scans and full index scans) are different: the storage can extract just the relevant rows (or columns) from disk and send only those to the database node. This capability is called Smart Scan, and it can reduce I/O dramatically.
For example, suppose a table holds 1 billion rows, of which only 1,000 match your query. A full table scan in a traditional database retrieves every data block and then filters out the matching rows, whereas a Smart Scan extracts only the 1,000 matching rows (or even just the requested columns of those rows), up to a millionfold reduction in the rows transferred. Cell offloading is what enables the cells to perform this filtering.


Note that with Smart Scan, each cell returns a result set to the database server instead of traditional blocks. The database server merges the result sets and returns the final result to the client.
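The division of labor described above can be sketched in a few lines of Python. This is a toy model, not Oracle's implementation: the class `Cell` and method `smart_scan` are illustrative names, and each "cell" simply holds its slice of a table as a list of dicts. The point is that predicate filtering and column projection happen inside the cell, so only small result sets travel to the database node.

```python
class Cell:
    """A toy storage cell holding one slice of a table's rows."""
    def __init__(self, rows):
        self.rows = rows

    def smart_scan(self, predicate, columns):
        # Filtering (predicate) and projection (column selection) happen
        # inside the cell, so only matching values cross the network.
        return [{c: r[c] for c in columns} for r in self.rows if predicate(r)]

# Three cells each hold part of the table.
cells = [
    Cell([{"id": 1, "b": 1, "name": "a"}, {"id": 2, "b": 5, "name": "b"}]),
    Cell([{"id": 3, "b": 1, "name": "c"}]),
    Cell([{"id": 4, "b": 9, "name": "d"}]),
]

# The database node merges the small result sets instead of raw blocks.
result = [row for cell in cells
              for row in cell.smart_scan(lambda r: r["b"] == 1, ["id"])]
print(result)  # [{'id': 1}, {'id': 3}]
```

In the traditional path, every block of every cell would have crossed the network; here only the two matching `id` values do.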

3.3 Storage Indexes

How does Smart Scan manage to send only the relevant rows and columns instead of blocks? Part of the answer is a special data structure that the storage cells build over the data they hold. For each storage region of a segment (usually 1 MB of disk), the cell records the minimum and maximum values of the columns in that region, and whether nulls are present. This structure is called a storage index. When a cell receives a Smart Scan-eligible query from a database node, it checks which storage regions cannot contain the requested data. For example, if the query predicate is WHERE B = 1, and the storage index shows that the minimum and maximum values of column B in a region are 3 and 8, then no row in that region can match the predicate, so the cell skips reading that part of the disk. By consulting the storage index, the cell can exclude many regions that cannot contain the value, eliminating a large amount of I/O.

Although the storage index has "index" in its name, it is completely different from an ordinary index. An ordinary index tells you where rows are most likely to be; a storage index serves the opposite purpose, telling you where rows cannot be. Also, unlike ordinary index segments, storage indexes are not stored on disk: they reside in the cells' memory.
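The region-pruning idea can be sketched as follows. This is a toy model under the assumptions above (per-region min/max tracking); function names like `build_storage_index` are illustrative, not Oracle APIs.

```python
def build_storage_index(regions, column):
    """For each region, record the min and max values of one column."""
    return [(min(r[column] for r in region), max(r[column] for r in region))
            for region in regions]

def regions_to_scan(index, value):
    """Skip any region whose [min, max] range cannot contain the value."""
    return [i for i, (lo, hi) in enumerate(index) if lo <= value <= hi]

regions = [
    [{"b": 3}, {"b": 8}],   # region 0: min 3, max 8 -> cannot hold b = 1
    [{"b": 1}, {"b": 4}],   # region 1: min 1, max 4 -> must be read
    [{"b": 6}, {"b": 9}],   # region 2: min 6, max 9 -> skipped
]

index = build_storage_index(regions, "b")
print(regions_to_scan(index, 1))  # [1]
```

For the predicate `b = 1`, two of the three regions are excluded without a single disk read, which is exactly the effect the storage index provides at scale.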
3.4 Hybrid Columnar Compression

First we need to distinguish row compression from column compression. The databases we are familiar with, such as Oracle and MySQL, are row-oriented: the different fields of a row are stored together physically. Column-oriented databases instead store each column's values together across rows. The advantages and disadvantages of each are equally pronounced:
In a row-oriented database it is convenient to access a whole row, but the values of any one column are scattered, so querying a single column requires scanning the entire table. Compression in a row-oriented database is called row compression.
Because the values of a column are stored together physically in a column-oriented database, accessing a single column is very convenient: a query on one column scans only that column's data, not the whole table. Accessing all the fields of one row, however, is inconvenient. Compression in a column-oriented database is called column compression.

In general, Oracle's compression features (including Advanced Compression in 11g Release 2) perform row compression, because Oracle is a row-oriented database. The usual technique is to keep a symbol table in the block header: duplicate values are stored once there, and each occurrence in the rows points to the symbol-table entry. Row compression is often not very effective, because different rows rarely share much identical data. Column compression is different: since all values in a column have the same data type, and often similar values, a good compression ratio is easy to achieve.


Row compression and column compression each have strengths and weaknesses. Oracle's hybrid columnar compression (HCC) combines the high compression ratio of column compression with the row-access characteristics of a row-oriented database. Oracle introduces the concept of a compression unit (CU). Within a CU, data is stored column by column and compressed with column compression, but a CU contains all the fields of the rows it holds. Between CUs, Oracle still behaves as a row-oriented database: a given row always lives entirely within a single CU (a CU itself typically spans several data blocks).


As mentioned above, a new structure called the compression unit (CU) is introduced to store a set of rows compressed with hybrid columnar compression. When new data is loaded, the column values are appended to the existing row set, then sorted, grouped, and compressed; once these steps complete, a compression unit is formed. Simply put, the column store is segmented, and the compression unit maintains the relationship between the segments. Note that HCC is applied only to data written with bulk (direct-path) loads; conventional DML against compressed data is handled differently. The Exadata engine decompresses stored data on demand.
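The CU idea, pivot a group of rows into columns so that each column compresses well, while the whole row stays inside one unit, can be illustrated with a toy model. Real HCC uses different codecs and on-disk layout; this sketch uses simple run-length encoding, and the function names are hypothetical.

```python
from itertools import groupby

def compress_cu(rows):
    """Pivot a group of rows into columns and run-length encode each column."""
    columns = {k: [r[k] for r in rows] for k in rows[0]}
    return {k: [(v, len(list(g))) for v, g in groupby(vals)]
            for k, vals in columns.items()}

def decompress_cu(cu):
    """Rebuild the original rows: every row lives entirely in this one CU."""
    columns = {k: [v for v, n in runs for _ in range(n)] for k, runs in cu.items()}
    n_rows = len(next(iter(columns.values())))
    return [{k: columns[k][i] for k in columns} for i in range(n_rows)]

rows = [{"region": "EU", "qty": 1}, {"region": "EU", "qty": 1},
        {"region": "US", "qty": 1}]
cu = compress_cu(rows)
print(cu["region"])               # [('EU', 2), ('US', 1)]
print(decompress_cu(cu) == rows)  # True
```

Because all values of a column sit adjacent inside the CU, repeated values collapse into short runs, which is why column-wise storage compresses so much better than row-wise storage; yet any single row can still be reconstructed from that one CU.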

This is quite different from the conventional Oracle compression engine. HCC offers warehouse compression and archive compression modes, applying different compression densities to different scenarios. One reason Oracle emphasizes Exadata's compression capability is that HCC is currently available only on Exadata storage, which is constrained in capacity and price.
In short, hybrid columnar compression combines column compression with row-level access, providing a very high compression ratio while preserving row-based access.

3.5 Smart Flash Cache

Data blocks pass through the database buffer cache before being sent to the end user. If the data is found there, no storage access is needed; if it is not, which is common in large databases, the I/O is unavoidable. In the Database Machine there can be a second-level cache between the database buffer cache and the disks, known as the Smart Flash Cache. It holds frequently accessed data, so requests from database nodes can often be satisfied from the cache instead of going to disk, improving performance.
The Oracle database works closely with Exadata to cache frequently accessed data. When the database sends a read or write request to the Exadata Storage Server, the request carries additional information indicating whether the data is likely to be read again and whether it should be cached. Based on this information, the Exadata Storage Server software intelligently decides which data is worth caching and which is not. Random reads of tables and indexes are likely to be followed by further reads, so that data is usually cached and subsequently served from the flash cache. Scans (sequential reads of tables) are usually not cached, because sequentially accessed data is unlikely to be re-read soon. Writes normally go directly to disk, but if the software determines that the data will be read again later, it is also placed in the cache.
Knowing which data not to cache is just as important for cache performance. For example, when blocks are being overwritten, backed up, or mirrored, the software avoids caching them: since they will not be read again in the near future, there is no point spending valuable cache space on them. Only the combination of the Oracle database and the Exadata Storage Server software has this visibility into the nature of every I/O operation in the system. With that visibility, the Exadata Smart Flash Cache can be used optimally, storing only frequently accessed data. All of this happens automatically, with no customer configuration or tuning required, and in most cases it makes full use of the flash cache.
The Exadata Smart Flash Cache is the driving force behind the OLTP capability of the Exadata Database Machine. It delivers very high IOPS for the most demanding database applications, more than doubles scan throughput for data warehouse and reporting applications, and provides special support for database log writes. By understanding what to cache and managing the cache automatically, Oracle Database combined with the Exadata Smart Flash Cache was among the first flash-optimized databases on the market.
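The caching decisions described above amount to a small policy function driven by hints that the database attaches to each I/O. The sketch below is hypothetical: the hint names (`purpose`, `access`, `will_reread`) are invented for illustration and do not reflect the real Exadata protocol.

```python
def should_cache(io):
    """Decide whether a storage cell should keep this I/O in flash cache."""
    # Backup and mirroring I/O will not be re-read soon: never cache it.
    if io["purpose"] in ("backup", "mirror"):
        return False
    # Sequential table scans are unlikely to see repeat reads: skip them.
    if io["access"] == "sequential_scan":
        return False
    # Random single-block reads of tables and indexes usually repeat,
    # and writes flagged as likely-to-be-read are cached too.
    return io["access"] == "random_read" or io.get("will_reread", False)

print(should_cache({"purpose": "query", "access": "random_read"}))      # True
print(should_cache({"purpose": "query", "access": "sequential_scan"}))  # False
print(should_cache({"purpose": "backup", "access": "random_read"}))     # False
```

The key design point is that the decision uses semantic knowledge (what kind of I/O this is), which a generic storage array sitting below the database cannot see.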


3.6 I/O Resource Management

With traditional storage, it is difficult to prioritize the different jobs and users competing for I/O bandwidth in the storage subsystem, which hinders building a shared storage grid. The same problem arises when multiple databases share a storage subsystem. With Exadata storage, Database Resource Manager (DBRM) and Exadata's I/O Resource Management (IORM) prevent one class of work, or one database, from monopolizing disk resources and bandwidth, and ensure that user-defined SLAs are met. DBRM coordinates and prioritizes the I/O bandwidth used among databases, among users, and among types of work. Because the database is tightly integrated with the storage, the Database Machine knows what types of work are running and how much I/O bandwidth each consumes. You can therefore have the system identify the various workloads, assign them priorities, and guarantee that the most critical workloads are served first.
In a data warehousing or mixed-workload environment, you may want to guarantee different users and tasks in the database particular shares of I/O resources. For example, you might allocate 70% of I/O resources to interactive users on the system and 30% to batch report jobs. This is easily done with DBRM and Exadata's I/O Resource Management.
The Database Machine administrator can create a resource plan specifying how I/O requests should be prioritized. This is done by placing different types of work into consumer groups. A consumer group can be defined by several attributes, including user name, client program name, function, or query duration. Once the consumer groups are defined, you set a hierarchy determining which group gets I/O resources first and how much I/O is allocated to each group. These priorities can be applied both to intra-database operations (operations within one database) and to inter-database operations (operations across different databases).
When multiple databases share Exadata storage, you can prioritize the I/O resources allocated to each database, preventing any one database from monopolizing disk resources and bandwidth and ensuring that user-defined SLAs are met.
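The 70/30 split described above is, at its core, proportional-share allocation. The sketch below is a toy model of that idea, assuming bandwidth is divided among consumer groups by plan percentages; the function name and plan shape are illustrative, not the real IORM plan syntax.

```python
def allocate_bandwidth(total_mbps, plan):
    """Split total I/O bandwidth according to a resource plan of % shares."""
    assert sum(plan.values()) == 100, "plan percentages must total 100"
    return {group: total_mbps * pct / 100 for group, pct in plan.items()}

# The example from the text: 70% interactive, 30% batch reporting.
plan = {"interactive": 70, "batch_reports": 30}
print(allocate_bandwidth(1000, plan))
# {'interactive': 700.0, 'batch_reports': 300.0}
```

In practice IORM enforces shares by reordering queued I/O requests rather than by statically splitting bandwidth, so an idle group's share is available to the others, but the guaranteed minimums follow the same proportional logic.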


In fact, the Exadata I/O Resource Manager solves a challenge that traditional storage technology cannot: it balances and prioritizes the work of multiple databases and users in a shared storage subsystem, making a shared storage grid practical. I/O Resource Management ensures that multiple databases sharing Exadata storage meet their user-defined SLAs, so that each database or user gets an appropriate share of disk bandwidth to meet its business objectives.

3.7 Exadata Elasticity and Extreme Performance
Performance assurance:
  • Cloud architecture: parallel, horizontal scaling at the compute, storage, and network layers
  • Storage indexes: data-aware, dynamically maintained in-memory indexes
  • Smart Scan: query workload pushed down to the storage layer
  • Smart Flash Cache: flash storage for real-time random I/O
  • Data compression: optimized for OLTP, data warehousing, and archiving
  • InfiniBand: a network built for large data transfers
  • Balanced design: no single component becomes the bottleneck (the "barrel effect")
Resource pooling:
  • InfiniBand partitioning: safer, more efficient hardware network partitioning
  • ASM (Automatic Storage Management): storage shared by all databases
  • RAC (Real Application Clusters): large-database sharing across multiple nodes
  • QoS and IORM (I/O Resource Management): server and storage resources managed according to customer-defined SLAs
  • Instance Caging: multiple databases share the resources of one node
Typical results:
  • Deployment is simple and fast, and the time windows for batch work such as data loading and backup shrink significantly. Typical case: after a large bank migrated to Exadata, backup time dropped from 12 hours to 1.5 hours.
  • Query performance typically improves 20-30x.
  • OLTP application performance can improve 8-10x.
  • With hybrid columnar compression, data warehouse compression ratios are typically around 10x, and OLTP compression ratios 2-4x.
  • Data center footprint and power consumption drop, typically to 1/4 of the original floor space and 1/8 of the energy.
  • Overall: greatly improved performance, fewer indexes needed, less database space used, and lower management complexity.
