Kafka Partition segment Log relationship

Source: Internet
Author: User

Introduction

Messages in Kafka are organized by topic, and topics are independent of one another. Each topic is divided into one or more partitions (the number of partitions is specified when the topic is created), and each partition stores a portion of the topic's messages. The picture borrowed from the official documentation shows the relationship between topics and partitions.

Each partition is stored on the file system as a directory. For example, if you create a topic named page_visits with 5 partitions, the Kafka data directory (specified by log.dirs in the configuration file) contains 5 directories: page_visits-0, page_visits-1, page_visits-2, page_visits-3, and page_visits-4. The naming convention is <topic_name>-<partition_id>, and each directory stores the data of the corresponding partition.
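
The naming convention can be illustrated with a few lines of Python. This is only a sketch: partition_dirs is a hypothetical helper written for this article, not a Kafka API.

```python
def partition_dirs(topic, num_partitions):
    """Return the directory names for a topic, following <topic_name>-<partition_id>."""
    return [f"{topic}-{partition_id}" for partition_id in range(num_partitions)]


# Partition IDs start at 0, so 5 partitions give the suffixes 0 through 4.
print(partition_dirs("page_visits", 5))
```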

Next, this article analyzes the storage format of the files in the partition directory and where the associated code resides.

Partition data files

Each message in a partition is identified by an offset. The offset is not the physical storage location of the message in the partition data file but a logical value that uniquely identifies the message within the partition. You can therefore think of the offset as the ID of the message within the partition. Each message in a partition has the following three properties:

    • Offset

    • MessageSize

    • Data

Offset is a long (a 64-bit integer), MessageSize is an int32 that gives the size of Data in bytes, and Data is the actual content of the message. This format is consistent with the MessageSet format described in the Kafka communication protocol.
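
As a rough illustration of this layout, the sketch below packs and unpacks one entry with Python's struct module. The names encode_entry and decode_entry are hypothetical, and big-endian byte order is an assumption made here for the example.

```python
import struct

# 8-byte offset (int64) followed by a 4-byte message size (int32).
# Big-endian byte order is an assumption of this sketch.
HEADER = struct.Struct(">qi")


def encode_entry(offset, data):
    """Serialize one entry as offset + message size + data."""
    return HEADER.pack(offset, len(data)) + data


def decode_entry(buf, position=0):
    """Read the entry that starts at `position` and return (offset, size, data)."""
    offset, size = HEADER.unpack_from(buf, position)
    start = position + HEADER.size
    return offset, size, buf[start:start + size]


entry = encode_entry(25, b"hello")
print(HEADER.size, decode_entry(entry))
```

The fixed header is 12 bytes, which is the figure the searchFor description below relies on.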

The partition data file contains a sequence of messages in the above format, ordered by offset from smallest to largest. Its implementation class is FileMessageSet, and the class diagram is as follows:

Its main methods are as follows:

    • append: Writes the messages in the given ByteBufferMessageSet to this data file.

    • searchFor: Starting from the specified startingPosition, finds the first message whose offset is greater than or equal to the specified offset and returns its position in the file. It works by reading 12 bytes at the current position, which hold the offset and size of the current message. If the current offset is less than the specified offset, the position is advanced by LogOverhead + MessageSize (where LogOverhead is the combined size of the Offset and MessageSize fields, 12 bytes) and the check is repeated.

    • read: A more accurate name would be slice. It takes a portion of this file and returns it as a new FileMessageSet. It does not guarantee that the slice boundaries fall on complete messages.

    • sizeInBytes: Indicates how many bytes of space this FileMessageSet occupies.

    • truncateTo: Truncates this file. This method does not guarantee that the truncation point falls on a complete message.

    • readInto: Reads the contents of the file into the given ByteBuffer, starting at the specified relative position.
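
The searchFor logic described above can be sketched over an in-memory byte string. This is an illustration under the assumptions of this article (12-byte header, big-endian fields), not the actual FileMessageSet code; search_for and build_log are hypothetical names.

```python
import struct

HEADER = struct.Struct(">qi")   # 8-byte offset + 4-byte message size
LOG_OVERHEAD = HEADER.size      # 12 bytes


def build_log(entries):
    """Concatenate (offset, data) pairs into one data-file image."""
    return b"".join(HEADER.pack(offset, len(data)) + data for offset, data in entries)


def search_for(log, target_offset, starting_position=0):
    """Return the position of the first entry with offset >= target_offset, or None."""
    position = starting_position
    while position + LOG_OVERHEAD <= len(log):
        offset, size = HEADER.unpack_from(log, position)
        if offset >= target_offset:
            return position
        # Skip the header and the payload to reach the next entry.
        position += LOG_OVERHEAD + size
    return None


log = build_log([(0, b"a"), (1, b"bcd"), (2, b"ef")])
print(search_for(log, 2))  # 28: the first two entries occupy 13 + 15 bytes
```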

Let's think about what happens if a partition has only one data file.

    1. New data is appended to the end of the file (by calling FileMessageSet's append method). Regardless of the size of the data file, this operation is always O(1).

    2. Finding the message with a given offset (by calling FileMessageSet's searchFor method) requires a sequential scan. Therefore, if the data file is large, lookups are slow.

How does Kafka solve the lookup efficiency problem? It relies on two techniques: 1) segments and 2) indexes.

Segmenting data files

One way Kafka improves lookup efficiency is to split the data file into segments. For example, suppose there are 100 messages with offsets from 0 to 99 and the data file is divided into 5 segments: the first segment holds offsets 0-19, the second holds offsets 20-39, and so on. Each segment is stored in a separate data file, and the data file is named after the smallest offset in the segment. When looking for a message with a specified offset, a binary search can then determine which segment contains the message.
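
A minimal sketch of that segment lookup, using the 5-segment example above. It assumes the segments are represented only by their smallest offsets in a sorted list; find_segment is a hypothetical helper, not a Kafka function.

```python
from bisect import bisect_right


def find_segment(base_offsets, target_offset):
    """Return the base offset of the segment that should contain target_offset.

    base_offsets must be sorted in ascending order.
    """
    i = bisect_right(base_offsets, target_offset) - 1
    if i < 0:
        raise ValueError("offset is smaller than the first segment's base offset")
    return base_offsets[i]


base_offsets = [0, 20, 40, 60, 80]
print(find_segment(base_offsets, 25))  # 20
```

bisect_right finds the first base offset greater than the target, so the entry just before it is the segment whose range covers the target.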

Indexing data files

Segmenting the data file lets you look for a message in a smaller data file, but finding the message with a given offset still requires a sequential scan. To further improve lookup efficiency, Kafka creates an index file for each segment's data file. The index file has the same name as the data file, except that its extension is .index.
An index file contains a number of index entries, each of which indexes one message in the data file. An index entry has two parts (both 4-byte integers): a relative offset and a position.

    • Relative offset: Because the data file is segmented, the starting offset of each data file after the first is not 0. The relative offset is the difference between the message's offset and the smallest offset in the data file to which it belongs. For example, if a segment's data file starts at offset 20, then the message with offset 25 has a relative offset of 25 - 20 = 5 in the index file. Storing relative offsets reduces the space occupied by the index file.

    • Position: The absolute position of the message in the data file. Open the file and move the file pointer to this position to read the corresponding message.
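
The two fields of an index entry can be sketched as an 8-byte record. The helper names are hypothetical and big-endian byte order is an assumption of this example; only the two 4-byte fields come from the description above.

```python
import struct

# 4-byte relative offset + 4-byte position in the data file.
INDEX_ENTRY = struct.Struct(">ii")


def encode_index_entry(base_offset, offset, position):
    """Store the offset relative to the segment's smallest (base) offset."""
    return INDEX_ENTRY.pack(offset - base_offset, position)


def decode_index_entry(base_offset, raw):
    """Recover the absolute offset and the file position from one raw entry."""
    relative_offset, position = INDEX_ENTRY.unpack(raw)
    return base_offset + relative_offset, position


raw = encode_index_entry(base_offset=20, offset=25, position=1024)
print(INDEX_ENTRY.unpack(raw))       # (5, 1024): relative offset 25 - 20 = 5
print(decode_index_entry(20, raw))   # (25, 1024)
```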

Instead of indexing every message in the data file, the index file uses sparse storage: an index entry is created only once every certain number of bytes of data. This keeps the index file small enough to be held in memory. The downside is that a message without an index entry cannot be located in the data file in one step and requires a sequential scan, but the range of that scan is small.
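
The sparse strategy can be sketched as follows. This is a simplification written for this article: build_sparse_index and interval_bytes are hypothetical names, and always indexing the first message is a choice made for the example rather than a statement about Kafka's behavior.

```python
def build_sparse_index(entries, interval_bytes):
    """Build a sparse index from (offset, size_in_bytes) pairs given in file order.

    A new index entry is added once at least interval_bytes of data have been
    written since the previous entry. Returns a list of (offset, position).
    """
    index = []
    position = 0
    bytes_since_last_entry = 0
    for offset, size in entries:
        if not index or bytes_since_last_entry >= interval_bytes:
            index.append((offset, position))
            bytes_since_last_entry = 0
        position += size
        bytes_since_last_entry += size
    return index


# Ten 100-byte messages, one index entry roughly every 250 bytes.
entries = [(offset, 100) for offset in range(10)]
print(build_sparse_index(entries, 250))
# [(0, 0), (3, 300), (6, 600), (9, 900)]
```

With these numbers only 4 of the 10 messages are indexed, and any other message is at most two entries past an indexed one.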

In Kafka, the implementation class for the index file is OffsetIndex, and its class diagram is as follows:

The main methods are:

    • append: Adds a pair of offset and position to the index file; the offset is converted to a relative offset.

    • lookup: Uses binary search to find the largest indexed offset that is less than or equal to the given offset.
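
The lookup method's binary search can be sketched over a plain Python list. This is an illustration, not the OffsetIndex implementation: the index is assumed to be a sorted list of (offset, position) pairs, and returning None when no entry qualifies is a choice made for this example.

```python
def lookup(index, target_offset):
    """Return the (offset, position) entry with the largest offset <= target_offset.

    index must be sorted by offset. Returns None if every entry is larger.
    """
    lo, hi = 0, len(index) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if index[mid][0] <= target_offset:
            best = index[mid]   # candidate; keep looking to the right for a larger one
            lo = mid + 1
        else:
            hi = mid - 1
    return best


index = [(0, 0), (3, 300), (6, 600), (9, 900)]
print(lookup(index, 7))  # (6, 600)
```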

Summary

Let's summarize, with a few pictures, how messages are stored in Kafka and how to find the message with a specified offset.

Messages are organized by topic, and each topic can be divided into multiple partitions. For example, the directory structure of a topic named page_visits with 5 partitions is:

Each partition is segmented. Each segment is called a LogSegment and consists of a data file and an index file. The following shows the files in one partition directory:

As you can see, this partition has 4 LogSegments.

The following picture, borrowed from the blog of @lizhitao, shows how to find a message.

For example, to find a message with an absolute offset of 7:

    1. First, use binary search to determine which LogSegment the message is in; here it is the first segment.

    2. Open the index file for this segment and again use binary search to find the index entry with the largest offset that is less than or equal to the specified offset. Here that is the entry for offset 6, and the index file tells us that the message with offset 6 is at position 9807 in the data file.

    3. Open the data file and scan sequentially from position 9807 until you find the message with offset 7.
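
The three steps can be tied together in one in-memory sketch. Everything here is hypothetical scaffolding for illustration: build_segment and find_message are not Kafka functions, segments live in a dict instead of files, and index_every counts messages rather than bytes to keep the example short.

```python
import struct
from bisect import bisect_right

HEADER = struct.Struct(">qi")  # 8-byte offset + 4-byte message size


def build_segment(base_offset, messages, index_every=3):
    """Return (log_bytes, index) for one segment, indexing every index_every-th message."""
    log, index = bytearray(), []
    for i, data in enumerate(messages):
        if i % index_every == 0:
            index.append((base_offset + i, len(log)))
        log += HEADER.pack(base_offset + i, len(data)) + data
    return bytes(log), index


def find_message(segments, target_offset):
    """segments maps each segment's base offset to its (log_bytes, index)."""
    bases = sorted(segments)
    # Step 1: binary search for the segment.
    seg_i = bisect_right(bases, target_offset) - 1
    if seg_i < 0:
        return None
    log, index = segments[bases[seg_i]]
    # Step 2: binary search the sparse index for the largest offset <= target.
    entry_i = bisect_right([offset for offset, _ in index], target_offset) - 1
    position = index[entry_i][1] if entry_i >= 0 else 0
    # Step 3: sequential scan of the data from that position.
    while position + HEADER.size <= len(log):
        offset, size = HEADER.unpack_from(log, position)
        if offset == target_offset:
            start = position + HEADER.size
            return log[start:start + size]
        if offset > target_offset:
            return None
        position += HEADER.size + size
    return None


segments = {
    0: build_segment(0, [b"m%d" % i for i in range(10)]),
    10: build_segment(10, [b"m%d" % i for i in range(10, 20)]),
}
print(find_message(segments, 7))  # b'm7'
```

For offset 7 the sketch picks the first segment, lands on the index entry for offset 6, and scans one entry forward, mirroring the walk-through above.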

This whole mechanism relies on offsets being ordered. The index file is mapped into memory, so lookups are fast.

In short, Kafka's message store achieves high efficiency through partitioning (partition), segmentation (LogSegment), and sparse indexing.

Reprint: Kafka Log storage parsing
