JMS Learning (VII): KahaDB, the persistent message store of ActiveMQ

Source: Internet
Author: User

First, introduction

Since ActiveMQ 5.4, KahaDB has been the default persistence store for ActiveMQ. According to the official documentation, KahaDB uses fewer file descriptors than the original AMQ store and recovers faster.

Second, KAHADB storage configuration

The configuration in conf/activemq.xml is as follows:

<broker brokerName="broker" ... >
  <persistenceAdapter>
    <kahaDB directory="activemq-data" journalMaxFileLength="32mb"/>
  </persistenceAdapter>
  ...
</broker>

The <persistenceAdapter> element selects KahaDB, stores the data in the "activemq-data" directory, and limits each log file to a maximum length of 32MB.

For example, an actual ActiveMQ data directory under KahaDB storage contains the following four files:

①db.data

This is the message index file. It is essentially a B-tree implementation: the B-tree serves as an index pointing to the messages stored in db-*.log.

②db.redo

Used primarily for message recovery.

③db-*.log

These files store the message contents: not only the message data itself, but also all of the information about destinations, subscriptions, transactions, etc.

The data log stores messages as a journal, and new data is always appended to the end of the log file, which makes storing a message very fast. For example, for a persistent message, the producer sends the message to the broker, the broker first stores it on disk (subject to the enableJournalDiskSyncs configuration option), and only then returns the acknowledgment to the producer. Appending reduces, to some extent, the time the broker takes to return that acknowledgment.

④lock file
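The append-then-acknowledge behaviour described for the data logs above can be sketched as follows. This is a minimal illustration using only the Java standard library, not ActiveMQ code: the class AppendJournalSketch, its record layout (a 4-byte length followed by the body) and the syncBeforeAck flag are assumptions made for the example, with the flag playing the role that enableJournalDiskSyncs plays in KahaDB.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical append-only journal; a sketch, not the KahaDB implementation. */
final class AppendJournalSketch {
    private final FileChannel channel;
    private final boolean syncBeforeAck; // stands in for enableJournalDiskSyncs

    AppendJournalSketch(Path file, boolean syncBeforeAck) throws IOException {
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        this.syncBeforeAck = syncBeforeAck;
    }

    /** Appends one length-prefixed record and returns the offset at which it starts. */
    long append(String message) throws IOException {
        byte[] body = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer record = ByteBuffer.allocate(Integer.BYTES + body.length);
        record.putInt(body.length).put(body).flip();
        long offset = channel.size();      // new data always goes at the end of the file
        while (record.hasRemaining()) {
            channel.write(record);
        }
        if (syncBeforeAck) {
            channel.force(false);          // ask the OS to flush to the device first
        }
        return offset;                     // returning here stands in for sending the ack
    }

    void close() throws IOException {
        channel.close();
    }

    /** Appends two messages to a temporary file and returns their start offsets. */
    static long[] demo() {
        try {
            Path file = Files.createTempFile("db-1", ".log");
            AppendJournalSketch journal = new AppendJournalSketch(file, true);
            try {
                return new long[] { journal.append("hello"), journal.append("world!") };
            } finally {
                journal.close();
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the first record starts at offset 0 and the second at offset 9 (4 length bytes plus the 5 bytes of "hello"). Passing false for syncBeforeAck skips the flush, which trades durability for a faster acknowledgment.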

In addition, some of the configuration options for KahaDB are as follows:

1) indexWriteBatchSize: default 1000. When the number of updated index entries in the metadata cache reaches 1000, they are synchronized to the metadata store on disk. Because disk writes are comparatively expensive, updates are written in batches rather than one at a time.

2) indexCacheSize: default 10000. The number of index pages cached in memory. The more of the index is cached, the higher the hit rate and the more efficient the lookups.

3) journalMaxFileLength: default 32MB. When a data log file reaches 32MB, a new file is created for subsequent messages. The right value depends on the producer and consumer rates; for example, if producers are fast and consumers are slow, a larger value works better.

4) enableJournalDiskSyncs: default true. Writes are synchronous by default: the message is stored on disk first, and only then is the ACK returned to the producer.

Normally, the broker performs a disk sync (ensuring that a message has been physically written to disk) before sending the ACK back to a producer.

5) cleanupInterval: default 30000ms. The interval at which the broker checks for messages that have been successfully consumed and can therefore be deleted.

6) checkpointInterval: default 5s. Every 5s the in-memory index (metadata cache) is written to the index file on disk (metadata store).
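To make the interaction between indexWriteBatchSize and the checkpoint more concrete, here is a small model in plain Java. It is only a sketch under stated assumptions: the class BatchedIndexSketch is hypothetical, the "disk" is simulated by an in-memory map, and the flush rule (write when the number of pending updates reaches the batch size, or when a checkpoint is requested) is a simplification of what the options above describe, not KahaDB's actual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of batched index writes; not ActiveMQ code. */
final class BatchedIndexSketch {
    private final int indexWriteBatchSize;
    private final Map<String, Long> dirty = new HashMap<>(); // updates not yet written out
    private final Map<String, Long> store = new TreeMap<>(); // stands in for db.data
    private int diskWrites = 0;

    BatchedIndexSketch(int indexWriteBatchSize) {
        this.indexWriteBatchSize = indexWriteBatchSize;
    }

    /** Records where a message lives in the journal; flushes once the batch is full. */
    void update(String messageId, long journalOffset) {
        dirty.put(messageId, journalOffset);
        if (dirty.size() >= indexWriteBatchSize) {
            flush();
        }
    }

    /** What a periodic checkpoint would do: write out whatever is still pending. */
    void checkpoint() {
        if (!dirty.isEmpty()) {
            flush();
        }
    }

    private void flush() {
        store.putAll(dirty);   // one simulated bulk write instead of many small ones
        dirty.clear();
        diskWrites++;
    }

    int diskWrites() {
        return diskWrites;
    }

    int storedEntries() {
        return store.size();
    }

    /** 2500 updates with a batch size of 1000, followed by one checkpoint. */
    static int[] demo() {
        BatchedIndexSketch index = new BatchedIndexSketch(1000);
        for (int i = 0; i < 2500; i++) {
            index.update("msg-" + i, i * 64L);
        }
        index.checkpoint();
        return new int[] { index.diskWrites(), index.storedEntries() };
    }
}
```

In this model 2500 index updates cost three simulated disk writes (two full batches plus the checkpoint) rather than 2500, which is the saving the batch size is meant to buy.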

Third, a brief analysis of the KahaDB storage internals

The architecture of KahaDB consists of the following parts, which correspond to the files in the KahaDB storage directory:

①The B-tree held in memory (the cache) is the metadata cache

Caching the index in memory speeds up queries (quick retrieval of message data). However, the metadata cache must be synchronized with the metadata store at regular intervals.

This synchronization is called a checkpoint. The checkpointInterval option determines how often the checkpoint operation takes place.

②The B-tree index stored on disk is called the metadata store. It corresponds to the file db.data and indexes the data logs in the form of a B-tree.

With it, the broker (the message server) can restart and recover quickly, because it is the message index and records the location of every message.

If the metadata store is damaged, the only option is to scan all of the data logs to rebuild the B-tree, which is complex and slow.

The presence of the metadata store, however, enables the broker instance to restart rapidly. If the metadata store got damaged or was accidentally deleted, the broker could recover by reading the data logs, but the restart would then take a considerable length of time.
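The slow path mentioned above, rebuilding the index by scanning the data logs, can be illustrated with a toy journal. The sketch below is an assumption-laden model, not KahaDB's recovery code: the class IndexRebuildSketch, the record layout (a type byte, a 4-byte id length, the id bytes) and the use of a ByteBuffer in place of the db-*.log files are all invented for the example, and a TreeMap stands in for the B-tree.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical index rebuild by replaying a journal; not ActiveMQ code. */
final class IndexRebuildSketch {
    static final byte MESSAGE = 'M';
    static final byte ACK = 'A';

    /** Appends one record: type byte, id length, id bytes (payload omitted for brevity). */
    static void appendRecord(ByteBuffer log, byte type, String messageId) {
        byte[] id = messageId.getBytes(StandardCharsets.UTF_8);
        log.put(type).putInt(id.length).put(id);
    }

    /** Scans the whole log from the start and rebuilds the map messageId -> offset. */
    static Map<String, Integer> rebuild(ByteBuffer log) {
        Map<String, Integer> index = new TreeMap<>();
        ByteBuffer in = log.duplicate();
        in.flip();                              // read from 0 up to the last byte written
        while (in.hasRemaining()) {
            int offset = in.position();
            byte type = in.get();
            byte[] id = new byte[in.getInt()];
            in.get(id);
            String messageId = new String(id, StandardCharsets.UTF_8);
            if (type == MESSAGE) {
                index.put(messageId, offset);   // message is live at this offset
            } else if (type == ACK) {
                index.remove(messageId);        // consumed, so no longer indexed
            }
        }
        return index;
    }

    /** Writes three messages and one acknowledgment, then rebuilds the index. */
    static Map<String, Integer> demo() {
        ByteBuffer log = ByteBuffer.allocate(256);
        appendRecord(log, MESSAGE, "m1");
        appendRecord(log, MESSAGE, "m2");
        appendRecord(log, ACK, "m1");
        appendRecord(log, MESSAGE, "m3");
        return rebuild(log);
    }
}
```

Every record has to be read to find out which messages are still live, so the cost grows with the total size of the logs. That is why a restart without an intact metadata store takes so much longer than one that can simply load the index.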

③The data logs correspond to the files db-*.log; each is 32MB by default

The data logs store messages in journal form and are the real carrier of the data that producers produce.

The data logs are used to store data in the form of journals, where events of all kinds (messages, acknowledgments, subscriptions, subscription cancellations, transaction boundaries, etc.) are stored in a rolling log.

④The redo log corresponds to the file db.redo

The redo log works on the "double write" principle.

Briefly, my understanding is this: the page size used when writing to disk differs from the operating system's page size. The disk page size is typically 16KB and the OS page size is 4KB. Data is written to disk one disk page at a time, and each disk page spans four OS pages (4 × 4KB = 16KB). If a failure such as a sudden power loss occurs during the write, only part of the data may be written (a partial page write).

With double write, the data is first written to a recovery buffer and then written to the actual destination file, so an incomplete page can be restored from the recovery copy. The corresponding implementation in the ActiveMQ source code is in PageFile.java.
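The double write idea can be sketched in a few lines of standard Java. This is an illustration of the general technique only and makes no claim to mirror PageFile.java: the class DoubleWriteSketch, the file layout (an 8-byte CRC32 followed by one page), and the 16KB page size taken from the paragraph above are all assumptions. The demo simulates a power loss by writing only the first 4KB of the page to the data file, then replays the recovery copy.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.zip.CRC32;

/** Hypothetical double-write sketch; not the PageFile.java implementation. */
final class DoubleWriteSketch {
    static final int PAGE_SIZE = 16 * 1024; // assumed page size, as in the text above

    static long checksum(byte[] page) {
        CRC32 crc = new CRC32();
        crc.update(page);
        return crc.getValue();
    }

    static void writeFully(FileChannel ch, ByteBuffer buf, long pos) throws IOException {
        while (buf.hasRemaining()) {
            pos += ch.write(buf, pos);
        }
    }

    /** Step 1: write checksum + page to the recovery file and sync it. */
    static void writeRecoveryCopy(FileChannel redo, byte[] page) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES + PAGE_SIZE);
        buf.putLong(checksum(page)).put(page).flip();
        writeFully(redo, buf, 0);
        redo.force(true);
    }

    /** Step 2: write the page to its real location; bytes < PAGE_SIZE simulates a torn write. */
    static void writeToData(FileChannel data, byte[] page, int bytes) throws IOException {
        writeFully(data, ByteBuffer.wrap(page, 0, bytes), 0);
        data.force(true);
    }

    /** On restart: if the recovery copy is intact, replay it over the data page. */
    static boolean recover(FileChannel redo, FileChannel data) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES + PAGE_SIZE);
        long pos = 0;
        while (buf.hasRemaining()) {
            int n = redo.read(buf, pos);
            if (n < 0) {
                return false;               // the recovery copy itself is incomplete
            }
            pos += n;
        }
        buf.flip();
        long expected = buf.getLong();
        byte[] page = new byte[PAGE_SIZE];
        buf.get(page);
        if (checksum(page) != expected) {
            return false;                   // recovery copy is corrupt, do not replay it
        }
        writeToData(data, page, PAGE_SIZE);
        return true;
    }

    /** Returns true if the torn page was restored from the recovery copy. */
    static boolean demo() {
        try {
            Path redoFile = Files.createTempFile("db", ".redo");
            Path dataFile = Files.createTempFile("db", ".data");
            try (FileChannel redo = FileChannel.open(redoFile,
                         StandardOpenOption.READ, StandardOpenOption.WRITE);
                 FileChannel data = FileChannel.open(dataFile,
                         StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                byte[] page = new byte[PAGE_SIZE];
                Arrays.fill(page, (byte) 7);
                writeRecoveryCopy(redo, page);
                writeToData(data, page, 4 * 1024);  // "power loss" after one OS page
                boolean replayed = recover(redo, data);
                return replayed && Arrays.equals(Files.readAllBytes(dataFile), page);
            } finally {
                Files.deleteIfExists(redoFile);
                Files.deleteIfExists(dataFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The ordering is what matters here: the recovery copy is forced to disk before the real page is touched, so at any moment at least one of the two copies is complete. The checksum lets recovery tell an intact recovery copy from one that was itself interrupted.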

