Things About the Kafka File Storage Mechanism

Last Update:2018-07-26 Source: Internet

Author: User

Tags crc32 message queue


What is Kafka?

Kafka, originally developed by LinkedIn, is a distributed, partitioned, multi-replica, multi-subscriber log system coordinated by ZooKeeper (it is also used as an MQ system). It can be used for web/nginx logs, access logs, messaging services, and so on. LinkedIn contributed it to the Apache Foundation, where it became a top-level open source project.

1. Preface

The file storage mechanism of a message queue largely determines how well it performs, and its design is one of the most important measures of the technical level of a message queue service.
The following sections start from Kafka's file storage mechanism and physical structure, analyze how Kafka achieves efficient file storage, and look at how it behaves in practice.

2. Kafka file storage mechanism

Some Kafka terms are explained as follows:
Broker: A message middleware processing node. One Kafka node is one broker, and multiple brokers form a Kafka cluster.
Topic: A category of messages. Page view logs, click logs, and so on can each exist as a topic, and a Kafka cluster can handle the distribution of multiple topics at the same time.
Partition: A physical grouping within a topic. A topic can be divided into multiple partitions, and each partition is an ordered queue.
Segment: A partition is physically made up of multiple segments, described in detail in 2.2 and 2.3 below.
Offset: Each partition consists of a sequence of ordered, immutable messages that are appended to it one after another. Each message in a partition has a consecutive sequence number called the offset, which uniquely identifies the message within that partition.

The analysis is divided into the following 4 steps:
1. Partition storage distribution in a topic
2. Partition file storage method
3. Segment file storage structure in a partition
4. How to find a message via offset in a partition

A detailed analysis of these 4 steps makes the Kafka file storage mechanism clear.

2.1 Partition storage distribution in a topic

Suppose the Kafka cluster in the experimental environment has only one broker, and xxx/message-folder is the root directory for data files, set in the broker's server.properties file (parameter log.dirs=xxx/message-folder). For example, create 2 topics named report_push and launch_info, each with partitions=4.
The storage paths and directory rules are:
xxx/message-folder

              |--report_push-0
              |--report_push-1
              |--report_push-2
              |--report_push-3
              |--launch_info-0
              |--launch_info-1
              |--launch_info-2
              |--launch_info-3

In Kafka's file storage, one topic has several partitions, and each partition is a directory. A partition directory is named with the topic name plus an ordinal number: the first partition is numbered 0, and the largest number is the partition count minus 1.
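The naming rule above can be sketched in a few lines of Python. This only illustrates the rule as stated here; the function name `partition_dirs` is made up for the example and is not part of Kafka.

```python
def partition_dirs(topic, num_partitions):
    """Return the directory names for one topic: <topic>-0 ... <topic>-(n-1)."""
    return [f"{topic}-{i}" for i in range(num_partitions)]


# The two example topics from this section, 4 partitions each.
for name in partition_dirs("report_push", 4) + partition_dirs("launch_info", 4):
    print(name)
```

The printed names match the directory listing shown above, from report_push-0 through launch_info-3.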
For distribution across multiple brokers, refer to the analysis of the Kafka cluster partition distribution principle.

2.2 Partition file storage method

The following schematic image illustrates how the file is stored in the partition:

                              Figure 1

Each partition (a directory) is equivalent to one huge file that is split evenly into multiple segment data files of equal size. The number of messages in each segment file is not necessarily equal, however. This design makes it quick to delete old segment files. Each partition only needs to support sequential reads and writes, and the lifecycle of segment files is determined by server configuration parameters.

The advantage of this is that useless files can be deleted quickly, which effectively improves disk utilization.

2.3 Segment file storage structure in a partition

Section 2.2 showed how a partition is stored in the Kafka file system. This section looks at the composition and physical structure of the segment files in a partition.
Segment file composition: a segment consists of 2 parts, an index file and a data file. The 2 files correspond one to one and always appear as a pair, with the suffixes ".index" and ".log" denoting the segment index file and the data file respectively.
Segment file naming rules: the first segment of a partition starts from 0, and each subsequent segment file is named after the offset of the last message in the previous segment file. The value can be as large as a 64-bit long, which is 19 decimal digits, and the name is left-padded with zeros.
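As a small sketch of the naming rule, the following Python function builds the pair of file names for one segment. The function name `segment_file_names` is hypothetical, and the default width of 20 characters is an assumption taken from the file names used as examples in this article, not from a stated rule.

```python
def segment_file_names(name_offset, width=20):
    """Build the (.index, .log) file names for one segment.

    name_offset: the number the segment is named after.
    width: zero-padded width, assumed from this article's example file names.
    """
    if not 0 <= name_offset < 2**63:
        raise ValueError("offset must fit in a signed 64-bit long")
    stem = str(name_offset).zfill(width)
    return stem + ".index", stem + ".log"


print(segment_file_names(0))
print(segment_file_names(368769))
```

Because the names are zero-padded to a fixed width, sorting them as plain strings gives the same order as sorting by number.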

The following file list comes from an experiment the author ran on a Kafka broker: create a topic topicxxx with 1 partition, set the size of each segment to 500MB, and start a producer that writes a large amount of data to the broker. The segment file list in Figure 2 below illustrates the 2 rules above:

            Figure 2

Taking one pair of segment files from Figure 2 above as an example, the physical structure of the index file <--> data file correspondence within a segment is as follows:

            Figure 3

In Figure 3 above, the index file stores a large amount of metadata and the data file stores a large number of messages. Each metadata entry in the index file points to the physical offset address of a message in the corresponding data file.
Take the metadata entry 3,497 in the index file as an example: it refers to the 3rd message in the data file (message No. 368772 in the partition as a whole), and the physical offset address of that message is 497.
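The arithmetic behind an index entry can be sketched as follows. The on-disk layout used here, two 4-byte big-endian integers per entry, is an assumption made for illustration; this article does not state how the entries are encoded. The name `read_index` is likewise hypothetical.

```python
import struct

# Assumed entry layout: 4-byte relative message number, 4-byte physical position.
ENTRY = struct.Struct(">ii")


def read_index(raw, base):
    """Decode index bytes into (partition-wide message number, position) pairs.

    base: the number in the segment file name, e.g. 368769.
    """
    return [(base + rel, pos) for rel, pos in ENTRY.iter_unpack(raw)]


raw = ENTRY.pack(3, 497)          # the entry "3,497" from the example above
print(read_index(raw, 368769))    # [(368772, 497)]
```

Storing the number relative to the file name keeps each entry small, since the full 64-bit value can be rebuilt from the file name.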

Figure 3 above shows that a segment data file consists of many messages. The physical structure of a message is as follows:

           Figure 4

Parameter description:

Keyword	Explanation
8 byte offset	Each message within the partition has an ordered ID number called the offset, which uniquely determines the position of the message within the partition. That is, the offset says which message of the partition this is
4 byte message size	The size of the message
4 byte CRC32	CRC32 checksum used to verify the message
1 byte "magic"	Indicates the protocol version number of this Kafka release
1 byte "attributes"	Indicates a standalone version, or identifies the compression type or the encoding type
4 byte key length	The length of the key. When the key length is -1, the K byte key field is not filled
K byte key	Optional
value bytes payload	The actual message data
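To make the field layout concrete, here is a minimal Python sketch that packs and unpacks one message using only the fields listed in the table. It follows the table as given in this article rather than any particular Kafka release. Several details are assumptions, not stated in the table: big-endian byte order, the message size counting every byte after the size field, and the CRC32 covering every byte after the CRC field. The function names are made up for the example.

```python
import struct
import zlib


def encode_message(offset, payload, key=None, magic=0, attributes=0):
    """Pack one message following the field order in the table above."""
    key_len = -1 if key is None else len(key)      # -1 means "no key field"
    body = struct.pack(">bbi", magic, attributes, key_len) + (key or b"") + payload
    crc = zlib.crc32(body) & 0xFFFFFFFF            # assumed: CRC covers bytes after it
    message = struct.pack(">I", crc) + body
    return struct.pack(">qi", offset, len(message)) + message


def decode_message(buf, pos=0):
    """Unpack one message starting at pos; return (fields, next position)."""
    offset, size = struct.unpack_from(">qi", buf, pos)
    start, end = pos + 12, pos + 12 + size
    (crc,) = struct.unpack_from(">I", buf, start)
    body = buf[start + 4:end]
    if zlib.crc32(body) & 0xFFFFFFFF != crc:
        raise ValueError("CRC32 mismatch")
    magic, attributes, key_len = struct.unpack_from(">bbi", body, 0)
    cursor = 6
    key = None
    if key_len >= 0:
        key = body[cursor:cursor + key_len]
        cursor += key_len
    fields = {"offset": offset, "magic": magic, "attributes": attributes,
              "key": key, "payload": body[cursor:]}
    return fields, end


raw = encode_message(368772, b"hello", key=b"k1")
print(decode_message(raw))
```

Because every message starts with its offset and size, a reader can walk a data file message by message, which is what the sequential scan in the lookup procedure relies on.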

2.4 How to find a message via offset in a partition

For example, reading the message with offset=368776 takes the following 2 steps.

The first step is to find the segment file.
In Figure 2 above, 00000000000000000000.index is the first file, and its starting offset is 0. The second file, 00000000000000368769.index, starts at message offset 368770 = 368769 + 1. Similarly, the third file, 00000000000000737337.index, starts at offset 737338 = 737337 + 1. The remaining files are named and sorted in the same way, so a binary search of the file list by offset quickly locates the specific file.
For offset=368776, this locates 00000000000000368769.index|log.
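Since the file list is sorted, the first step can be sketched as a binary search with Python's `bisect` module. The sketch follows the naming rule described in this article, where a file named N starts at message N + 1 and the first file starts at 0; `find_segment` is a hypothetical helper name.

```python
import bisect


def find_segment(segment_names, offset):
    """Return the file-name number of the segment that holds `offset`.

    segment_names: sorted list of the numbers used as segment file names.
    A file named N holds the messages after N, so pick the largest name
    strictly below the offset; offset 0 falls back to the first file.
    """
    i = bisect.bisect_left(segment_names, offset) - 1
    return segment_names[max(i, 0)]


names = [0, 368769, 737337]           # the three files named in the text
print(find_segment(names, 368776))    # 368769
```

The search takes O(log n) comparisons in the number of segment files, so it stays cheap even when a partition has many segments.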

The second step is to find the message within the segment file.
With the segment file located in the first step, for offset=368776 look up the metadata in 00000000000000368769.index to get a physical offset address in 00000000000000368769.log, then scan 00000000000000368769.log sequentially from that address until the message with offset=368776 is found.
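The second step can be sketched as a search of the sparse index followed by a forward scan. In this sketch the index and the data file are plain in-memory lists, and all entries and positions other than (3, 497) are invented for illustration; `find_position` is a hypothetical name.

```python
import bisect


def find_position(index_entries, log_records, target_relative):
    """Return the byte position of the message numbered target_relative.

    index_entries: sorted, sparse (relative number, position) pairs.
    log_records:   (relative number, position) for every message, in file order.
    """
    keys = [rel for rel, _ in index_entries]
    # Nearest index entry at or before the target.
    i = bisect.bisect_right(keys, target_relative) - 1
    start = index_entries[i][1] if i >= 0 else 0
    for rel, pos in log_records:
        if pos < start:
            continue                  # a real reader would seek straight to `start`
        if rel == target_relative:
            return pos
    return None


BASE = 368769                                     # number in the segment file name
index_entries = [(1, 0), (3, 497), (6, 1407)]     # only (3, 497) is from the article
log_records = [(1, 0), (2, 250), (3, 497), (4, 800),
               (5, 1100), (6, 1407), (7, 1686), (8, 1900)]
print(find_position(index_entries, log_records, 368776 - BASE))  # 1686
```

With these made-up numbers the index jumps to position 1407, and the scan then reads only two messages instead of seven.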

As Figure 3 above shows, the segment index file uses sparse index storage, which reduces the size of the index file and lets it be operated on directly in memory through mmap. A sparse index keeps a metadata pointer for only some of the messages in the data file rather than for every message, so it saves more storage space than a dense index, at the cost of more time per lookup.

3. Kafka file storage mechanism: actual running results

Experimental environment:
Kafka cluster: 2 virtual machines
CPU: 4 cores
Physical memory: 8GB
Network card: Gigabit NIC
JVM heap: 4GB
For detailed Kafka server configuration and tuning, refer to the detailed explanation of the Kafka server.properties configuration.

                              Figure 5

As Figure 5 above shows, a running Kafka broker rarely performs large amounts of disk reads. Its disk activity is mainly regular batched writes, so disk operations are very efficient. This is closely related to how reading and writing messages is designed in Kafka's file storage. Reading and writing messages in Kafka has the following characteristics:

Writing messages: A message is transferred from the Java heap to the page cache (that is, physical memory). An asynchronous thread then flushes the message from the page cache to disk.
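The split between "write to the page cache" and "flush to disk later" can be imitated in Python as a rough analogy. This is not Kafka's code: the class `AppendLog` is made up for the example, and the background thread that would call `sync()` periodically is left out to keep the sketch short.

```python
import os
import tempfile


class AppendLog:
    """Append-only file whose writes land in the OS page cache first."""

    def __init__(self, path):
        # buffering=0: each write() is handed straight to the operating system,
        # which keeps the bytes in its page cache until they are flushed.
        self._file = open(path, "ab", buffering=0)

    def append(self, data):
        # Normally returns as soon as the kernel has the bytes.
        self._file.write(data)

    def sync(self):
        # Ask the OS to push cached pages for this file to the device.
        os.fsync(self._file.fileno())

    def close(self):
        self._file.close()


with tempfile.TemporaryDirectory() as folder:
    log = AppendLog(os.path.join(folder, "demo.log"))
    log.append(b"message-1\n")
    log.append(b"message-2\n")
    log.sync()      # in the design described above, a separate thread does this
    log.close()
```

Keeping `sync()` off the append path is what lets the writer proceed at memory speed while the disk sees larger, batched writes.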

Reading messages: A message is sent directly from the page cache to the socket. When the data is not found in the page cache, disk IO occurs: the message is loaded from disk into the page cache and then sent directly to the socket.

4. Summary

Features of Kafka's efficient file storage design:
Kafka splits the one large file of a partition in a topic into many small segment files, which makes it easy to periodically clear or delete files that have already been consumed and reduces disk usage.
The index information makes it possible to locate a message quickly and to determine the maximum size of the response.
Mapping all the index metadata into memory avoids disk IO operations on the segment file.
Sparse storage of the index file greatly reduces the space that the index metadata occupies.

Reference

1. Linux page cache mechanism
2. Kafka official documentation

From:http://tech.meituan.com/kafka-fs-design-theory.html

