Thinking about the relationship between Oracle's buffer mechanism and the edit log in HDFS

Source: Internet
Author: User

You might ask: why draw a connection between Oracle and HDFS, two storage systems that serve entirely different scenarios? Indeed, from a purely technical point of view they are unrelated. But if you apply the idea of "holistic learning" and step outside the technology itself, you will find that Oracle's buffers and the HDFS edit log both exist to tame frequent IO, that is, to solve the poor performance caused by frequent disk reads and writes. The details follow.

I. Oracle's buffer mechanism

Oracle has two main buffers: the database buffer cache (hereafter referred to as the DB cache) and the log buffer (the redo log buffer). (Readers interested in the memory structure of an Oracle instance can look it up on their own; it is not covered further here.)

1. Database buffer cache (DB cache)

The DB cache is the work area in which SQL executes: all data operations in a user session happen in this cache rather than directly against the disk. For example, when a SELECT is executed, the DB cache is searched first; if the data is found, it is returned directly, and if not, it is loaded from the disk into the DB cache and then returned. Likewise, an UPDATE modifies only the data in the DB cache. When data in the DB cache is inconsistent with the data on disk, that data is called dirty data.
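The read and write paths just described can be sketched as a toy read-through cache. This is a simplified model for illustration only, not Oracle's actual implementation; the block names and the dict standing in for the disk are invented:

```python
class DbCache:
    """Toy model of the DB cache: reads go through the cache,
    updates dirty the cache without touching the disk."""

    def __init__(self, disk):
        self.disk = disk      # backing store: block_id -> value
        self.cache = {}       # in-memory buffer cache
        self.dirty = set()    # blocks changed in cache but not on disk

    def select(self, block_id):
        # Search the cache first; only go to disk on a miss.
        if block_id not in self.cache:
            self.cache[block_id] = self.disk[block_id]
        return self.cache[block_id]

    def update(self, block_id, value):
        self.select(block_id)          # make sure the block is cached
        self.cache[block_id] = value   # modify only the cached copy
        self.dirty.add(block_id)       # it is now dirty

disk = {"emp.1": "Alice"}
db = DbCache(disk)
db.update("emp.1", "Bob")
print(db.select("emp.1"))  # "Bob", served from the cache
print(disk["emp.1"])       # still "Alice": nothing written to disk yet
```

Note that after the update the disk still holds the old value; it is exactly this gap between cache and disk that the dirty-data write mechanism below has to close.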

So when does the DB cache's dirty data get written to disk? Oracle has a background process dedicated to this, the database writer (DBWn), which writes data from the DB cache to disk. DBWn writes when any of the following occurs:

1) There are no free buffers left in the DB cache;

2) There is too much dirty data;

3) The 3-second timeout is reached;

4) A checkpoint is encountered.

As you can see, DBWn writes as lazily as it possibly can, precisely to reduce IO.
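Those four triggers can be modeled as a write-behind flusher. This is only a hedged sketch: the class, thresholds, and trigger checks are invented for illustration, and real DBWn behavior is far more involved:

```python
import time

class WriteBehindCache:
    """Toy DBWn: dirty blocks reach disk only when a trigger fires."""

    def __init__(self, disk, capacity=4, dirty_limit=2, timeout=3.0):
        self.disk = disk
        self.cache = {}
        self.dirty = set()
        self.capacity = capacity        # total buffers available
        self.dirty_limit = dirty_limit  # "too much dirty data" threshold
        self.timeout = timeout          # the 3-second-style timeout
        self.last_flush = time.monotonic()

    def update(self, block_id, value):
        self.cache[block_id] = value
        self.dirty.add(block_id)
        self._maybe_flush()

    def _maybe_flush(self, checkpoint=False):
        if (checkpoint                                  # 4) checkpoint
                or len(self.cache) >= self.capacity     # 1) no free buffers
                or len(self.dirty) >= self.dirty_limit  # 2) too much dirty data
                or time.monotonic() - self.last_flush >= self.timeout):  # 3) timeout
            for block_id in self.dirty:
                self.disk[block_id] = self.cache[block_id]
            self.dirty.clear()
            self.last_flush = time.monotonic()

    def checkpoint(self):
        self._maybe_flush(checkpoint=True)

disk = {}
dbwn = WriteBehindCache(disk, capacity=4, dirty_limit=2, timeout=60.0)
dbwn.update("a", 1)   # one dirty block: below every threshold, nothing written
dbwn.update("b", 2)   # dirty_limit reached: both blocks flushed to disk
dbwn.update("c", 3)   # dirty again, still only in the cache
dbwn.checkpoint()     # a checkpoint forces the flush
```

The point of the sketch is the shape of the logic: every write goes to memory first, and the disk is touched only when one of the four conditions leaves no other choice.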

2. Log buffer

The log buffer is a small area of memory used to stage the change vectors destined for the redo log files. If the database fails while a large amount of dirty data has yet to be written to disk, recovery is based on the redo log files (note: redo and undo are different things; do not confuse them). Like the DB cache with DBWn, the log buffer has its own dedicated background process, the log writer (LGWR), which writes it to disk in the following situations:

1) A commit is performed (note that it is the redo log that the commit forces to disk, not the contents of the DB cache);

2) The log buffer is more than one-third full;

3) DBWn is about to write dirty data (think about why: DBWn may write data from uncommitted transactions to disk, and to guarantee that such data can be rolled back, the corresponding redo must also reach disk first; in effect, it exists for transaction rollback).
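These triggers can also be sketched in miniature. The class and threshold below are invented for illustration, and the coupling to DBWn is only hinted at in a comment:

```python
class RedoLogBuffer:
    """Toy log buffer: appends change vectors in memory and
    flushes them to the 'on-disk' redo log when a trigger fires."""

    def __init__(self, capacity=9):
        self.buffer = []       # in-memory log buffer
        self.on_disk = []      # redo records persisted to disk
        self.capacity = capacity

    def append(self, change_vector):
        self.buffer.append(change_vector)
        if len(self.buffer) > self.capacity // 3:  # 2) buffer over 1/3 full
            self.flush()

    def flush(self):
        # Called on 1) COMMIT, and 3) just before DBWn writes dirty data,
        # so the redo always reaches disk ahead of the data it protects.
        self.on_disk.extend(self.buffer)
        self.buffer.clear()

log = RedoLogBuffer(capacity=9)
for i in range(4):
    log.append(("update", f"blk.{i}", i))
# the fourth append pushed the buffer past 1/3 of capacity, forcing a flush
log.append(("update", "blk.9", 9))
log.flush()  # COMMIT: the remaining redo goes to disk immediately
```

The essential design choice mirrored here is write-ahead logging: the cheap, sequential redo write is forced early, so the expensive data-block writes can stay lazy.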

In conclusion, both kinds of buffer exist for the same reason: to minimize IO.

II. The edit log of HDFS

Readers familiar with the HDFS NameNode startup process know that when the NameNode starts, it loads the fsimage into memory and replays the edit log on top of it, then writes the merged result back to disk as a new fsimage; every restart repeats this procedure. In this sense the edit log can be seen as a kind of buffer for the NameNode, similar in spirit to Oracle's buffers, in that the heavyweight fsimage is not rewritten on every change. As a result, you face the following problems:

1) When the edit log file grows large, the NameNode takes too long to restart;

2) If the NameNode goes down unexpectedly, the edit log may lose many changes.

This is where the Secondary NameNode comes in: it periodically fetches the edit log, applies the recorded changes to the fsimage, and then writes the updated fsimage back to the NameNode's disk, ready for the next restart. Of course, with the arrival of HA in Hadoop 2.0, the Secondary NameNode has been superseded by other HA solutions, which may be introduced in depth another time.
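The checkpoint the Secondary NameNode performs, replaying edit-log records onto the fsimage so the edit log can be truncated, can be sketched as follows. This is a toy model: the operation format and the dict standing in for the fsimage are invented, and the real fsimage and edits files are binary formats:

```python
def checkpoint(fsimage, edit_log):
    """Toy Secondary NameNode checkpoint: replay the edit log onto
    the fsimage snapshot and return the merged namespace."""
    merged = dict(fsimage)             # copy the old snapshot
    for op, path, meta in edit_log:    # replay each recorded change
        if op == "create":
            merged[path] = meta
        elif op == "delete":
            merged.pop(path, None)
    return merged

fsimage = {"/a": "file"}
edits = [("create", "/b", "file"), ("delete", "/a", None)]
fsimage = checkpoint(fsimage, edits)   # write the updated fsimage back
edits.clear()                          # edit log can now be truncated
print(fsimage)  # {'/b': 'file'}
```

After such a checkpoint, the next restart only has to load the fresh fsimage plus a short edit log, which is exactly how the two problems above are mitigated.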

In summary, Oracle's buffers and the HDFS edit log are alike: both are designed to keep frequent IO from dragging down performance. The mechanisms for writing to disk differ, but the underlying idea is the same.

