Partition storage distribution in a topic
A topic can logically be thought of as a queue. Every message must specify its topic, which can be understood simply as indicating which queue the message is put into. So that Kafka's throughput can scale horizontally, a topic is physically divided into one or more partitions. Each partition corresponds physically to a folder, which stores all the messages and index files of that partition. The partition naming rule is the topic name plus an ordinal: the first partition's ordinal is 0, and the largest ordinal is the number of partitions minus 1.
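The naming rule above can be sketched in a few lines. This is only an illustration: the topic name `orders` and the helper `partition_dir_names` are made up for the example, and the hyphen between the topic name and the ordinal is an assumption about how the two parts are joined.

```python
def partition_dir_names(topic: str, num_partitions: int) -> list[str]:
    """Return one folder name per partition: topic name plus ordinal.

    The "-" separator is an assumption made for this sketch.
    """
    if num_partitions < 1:
        raise ValueError("a topic has at least one partition")
    # Ordinals run from 0 up to num_partitions - 1.
    return [f"{topic}-{i}" for i in range(num_partitions)]


if __name__ == "__main__":
    for name in partition_dir_names("orders", 4):
        print(name)
```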
How files are stored in a partition
The following image illustrates how the file is stored in the partition:
- Each partition (directory) is equivalent to one huge file that is split evenly across multiple segment data files of equal size. The number of messages in each segment file is not necessarily equal, though: the size of each segment is fixed, but the size of each message may vary, so the message count differs from segment to segment.
- Each partition only needs to support sequential reads and writes. The lifecycle of a segment file is determined by server configuration parameters. The benefit is that useless files can be deleted quickly, which effectively improves disk utilization.
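To see why equally sized segments end up holding different numbers of messages, here is a minimal sketch. The function `roll_segments` and the byte sizes are invented for illustration; it simply starts a new segment whenever the next message would push the current one past a size limit, which is an assumed simplification of the real rolling rule.

```python
def roll_segments(message_sizes, segment_bytes):
    """Group message sizes into segments of at most segment_bytes each."""
    segments = [[]]
    used = 0
    for size in message_sizes:
        # Roll to a new segment when the current one would overflow.
        if segments[-1] and used + size > segment_bytes:
            segments.append([])
            used = 0
        segments[-1].append(size)
        used += size
    return segments


if __name__ == "__main__":
    segments = roll_segments([300, 300, 300, 900, 500, 400], segment_bytes=1000)
    # Same byte limit per segment, different message counts.
    print([len(s) for s in segments])
```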
Segment file storage structure in a partition
As mentioned earlier, each topic is divided into multiple partitions distributed across the brokers, and each partition folder is made up of several smaller files.
- Segment file composition: a segment consists of two parts, the index file and the data file. The two files correspond one to one and always appear in pairs; the suffixes ".index" and ".log" denote the segment index file and the segment data file respectively.
- Segment file naming rules: the first segment of a partition starts at 0, and each subsequent segment file is named after the maximum offset (message number) of the previous segment file. The value is at most a 64-bit long (up to 19 decimal digits), and the file name is a fixed-width number left-padded with 0 (20 characters in the file names used in this article).
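As a small sketch of the naming rule, the helper below (a hypothetical name, `segment_file_names`) zero-pads a number to a fixed width and appends the two suffixes. The default width of 20 is taken from the file names that appear in this article.

```python
def segment_file_names(name_offset: int, width: int = 20) -> tuple[str, str]:
    """Return the (.index, .log) file names for one segment."""
    if not 0 <= name_offset < 2**63:
        raise ValueError("the value must fit in a signed 64-bit long")
    stem = str(name_offset).zfill(width)  # left-pad with 0
    return f"{stem}.index", f"{stem}.log"


if __name__ == "__main__":
    for n in (0, 368769, 737337):
        print(segment_file_names(n))
```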
Taking one pair of segment files as an example, the following illustrates the physical structure of the index <--> data file correspondence within a segment.
The index file stores a large amount of metadata and the data file stores a large number of messages; each metadata entry in the index file points to the physical offset address of a message in the corresponding data file. Take the metadata entry 3,497 in the index file as an example: it refers to the 3rd message in the data file (message No. 368772 in the partition as a whole), and the physical offset of that message is 497.
As described above, a segment data file consists of many messages. The physical structure of a message is detailed below:
| Keyword | Explanatory notes |
| --- | --- |
| 8 byte offset | Each message within the partition has an ordered ID number called the offset, which uniquely determines the location of the message within the partition. That is, the offset is the message's sequence number within the partition. |
| 4 byte message size | Message size. |
| 4 byte CRC32 | Verifies the message with CRC32. |
| 1 byte "magic" | Indicates the Kafka service protocol version number of this release. |
| 1 byte "attributes" | Indicates a standalone version, or identifies the compression type or the encoding type. |
| 4 byte key length | Indicates the length of the key; when the key length is -1, the K byte key field is not filled. |
| K byte key | Optional. |
| value bytes payload | Represents the actual message data. |
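The layout in the table can be tried out with a small round-trip sketch using Python's `struct` module. Several details here are assumptions that the table does not state: fields are big-endian, the message size counts every byte after the size field, the CRC32 covers everything after the CRC field, and the payload runs to the end of the message. The function names are made up for the example, and this is not meant to match any particular Kafka release byte for byte.

```python
import struct
import zlib

HEADER = struct.Struct(">qi")   # 8 byte offset, 4 byte message size
FIXED = struct.Struct(">IBBi")  # CRC32, magic, attributes, key length


def encode_message(offset, key, payload, magic=0, attributes=0):
    """Serialize one message; key=None is written as key length -1."""
    key_len = -1 if key is None else len(key)
    body = struct.pack(">BBi", magic, attributes, key_len) + (key or b"") + payload
    crc = zlib.crc32(body) & 0xFFFFFFFF  # assumed to cover magic..payload
    message = struct.pack(">I", crc) + body
    return HEADER.pack(offset, len(message)) + message


def decode_message(buf, pos=0):
    """Parse one message at pos; return (fields, position of next message)."""
    offset, size = HEADER.unpack_from(buf, pos)
    start = pos + HEADER.size
    end = start + size
    crc, magic, attributes, key_len = FIXED.unpack_from(buf, start)
    if zlib.crc32(buf[start + 4:end]) & 0xFFFFFFFF != crc:
        raise ValueError("CRC32 mismatch")
    cursor = start + FIXED.size
    key = None
    if key_len >= 0:  # -1 means the key field is not filled
        key = buf[cursor:cursor + key_len]
        cursor += key_len
    fields = {"offset": offset, "magic": magic, "attributes": attributes,
              "key": key, "payload": buf[cursor:end]}
    return fields, end


if __name__ == "__main__":
    raw = encode_message(368772, b"k1", b"hello")
    print(decode_message(raw))
```

Because `decode_message` returns the position of the next message, calling it in a loop walks a buffer of consecutive messages one after another.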
How to find a message by offset in a partition
For example, reading the message with offset=368776 takes the following 2 steps.
The first step is to find the segment file
The file 00000000000000000000.index is the first file, with a starting offset of 0. The second file, 00000000000000368769.index, has a starting message offset of 368770 = 368769 + 1. Similarly, the third file, 00000000000000737337.index, has a starting offset of 737338 = 737337 + 1. All subsequent files are named and sorted in the same way, so a binary search of the file list by offset quickly locates the specific file.
For offset=368776, this locates 00000000000000368769.index|log.
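The first step can be sketched with a binary search over the sorted numbers in the file names. The helper name `find_segment` is hypothetical, and the sketch follows the convention described in this article: a file named N holds the messages that come after offset N, while the first file, named 0, starts at offset 0.

```python
from bisect import bisect_left


def find_segment(segment_names, target_offset):
    """Return the file-name number of the segment holding target_offset.

    segment_names must be sorted ascending, e.g. [0, 368769, 737337].
    """
    # Index of the last name that is strictly smaller than the target.
    i = bisect_left(segment_names, target_offset) - 1
    # Offset 0 itself lives in the first file, so clamp at 0.
    return segment_names[max(i, 0)]


if __name__ == "__main__":
    names = [0, 368769, 737337]
    print(find_segment(names, 368776))
```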
The second step is to find the message within the segment file located in the first step. For offset=368776, look up the metadata entry in 00000000000000368769.index to get the physical offset address in 00000000000000368769.log, then scan 00000000000000368769.log sequentially from that address until offset=368776 is reached.
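The second step can be sketched as a lookup in a sparse index followed by a sequential scan. Everything in the sample data except the entry 3,497 is invented for illustration, the data file is modelled as a plain list of (offset, physical position) pairs rather than real bytes, and `find_position` is a hypothetical helper name.

```python
from bisect import bisect_right


def find_position(base, index, log, target_offset):
    """Return the physical position of target_offset in one segment.

    base:  number in the segment file name (368769 here)
    index: sorted sparse entries (relative message number, position)
    log:   (offset, position) pairs in file order, standing in for the data file
    """
    relative = target_offset - base
    keys = [rel for rel, _ in index]
    i = bisect_right(keys, relative) - 1  # last entry at or before the target
    if i < 0:
        raise KeyError(target_offset)
    _, start = index[i]
    # Sequential scan from the indexed position onward.
    for offset, position in log:
        if position >= start and offset == target_offset:
            return position
    raise KeyError(target_offset)


if __name__ == "__main__":
    index = [(1, 0), (3, 497), (6, 1407), (8, 1686)]  # only 3,497 is from the text
    log = [(368770, 0), (368771, 250), (368772, 497), (368773, 800),
           (368774, 1100), (368775, 1407), (368776, 1550), (368777, 1686)]
    print(find_position(368769, index, log, 368776))
```

In this made-up data the lookup for 368776 lands on the index entry for relative number 6 and then only has to step over one message before reaching the target.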
As Figure 3 above shows, the advantage of this design is that the segment index file uses sparse index storage, which reduces the size of the index file so that it can be operated on directly in memory through mmap. A sparse index sets a metadata pointer for only some of the messages in the data file, so it saves more storage space than a dense index, but a lookup takes more time.
The guarantee of high efficiency
Each message is appended to its partition, which is a sequential write to disk. Random writes on a mechanical disk are very inefficient, but sequential writes are very efficient; this is a very important guarantee of Kafka's high throughput.
When a message is sent to the broker, the partition rule decides which partition it is stored in. If the partition rule is set up properly, all messages are distributed evenly across the partitions, which allows horizontal scaling. (If a topic corresponded to a single file, the I/O of the machine holding that file would become the performance bottleneck for the topic; partitions solve this problem.) The number of partitions used when a topic is created can be specified in $KAFKA_HOME/config/server.properties (see below), and the number of partitions can also be modified after the topic is created.
    # The default number of log partitions per topic. More partitions allow greater
    # parallelism for consumption, but this will also result in more files across
    # the brokers.
    num.partitions=3
When sending a message, you can specify the message's key; the producer uses the key and the partition mechanism to decide which partition to send the message to. The partition mechanism can be chosen through the producer's partitioner.class parameter, and the specified class must implement the kafka.producer.Partitioner interface. In this example, if the key can be parsed as an integer, that integer is taken modulo the total number of partitions and the message is sent to the partition with the resulting number. (Each partition has a sequence number.)
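The rule described above can be sketched as follows. This is a plain Python illustration of the arithmetic, not the Kafka interface itself; the function name `choose_partition` is made up, and the CRC32 fallback for keys that are not integers is an assumption added so the sketch handles every key.

```python
import zlib


def choose_partition(key: str, num_partitions: int) -> int:
    """Pick a partition number in the range 0 .. num_partitions - 1."""
    try:
        number = int(key)  # the key can be parsed as an integer
    except ValueError:
        # Assumed fallback for this sketch: hash the key bytes instead.
        number = zlib.crc32(key.encode("utf-8"))
    # Python's % is non-negative for a positive modulus, even for negative keys.
    return number % num_partitions


if __name__ == "__main__":
    for key in ("0", "1", "7", "user-a"):
        print(key, "->", choose_partition(key, 3))
```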
In a traditional message queue, messages that have been consumed are generally deleted, whereas a Kafka cluster retains all messages whether or not they have been consumed. Of course, because of disk limits it is not possible (and not really necessary) to keep all data permanently, so Kafka provides two strategies for deleting old data: one based on time and one based on partition file size. For example, you can configure $KAFKA_HOME/config/server.properties to have Kafka delete data older than one week, or to delete old data when the partition file exceeds 1GB, as shown below.
    ############################# Log Retention Policy #############################

    # The following configurations control the disposal of log segments. The policy can
    # be set to delete segments after a period of time, or after a given size has accumulated.
    # A segment will be deleted whenever *either* of these criteria is met. Deletion always happens
    # from the end of the log.

    # The minimum age of a log file to be eligible for deletion
    log.retention.hours=168

    # A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
    # segments don't drop below log.retention.bytes.
    #log.retention.bytes=1073741824

    # The maximum size of a log segment file. When this size is reached a new log segment will be created.
    log.segment.bytes=1073741824

    # The interval at which log segments are checked to see if they can be deleted according
    # to the retention policies
    log.retention.check.interval.ms=300000

    # By default the log cleaner is disabled and the log retention policy will default to
    # just delete segments after their retention expires.
    # If log.cleaner.enable=true is set the cleaner will be enabled and individual logs
    # can then be marked for log compaction.
    log.cleaner.enable=false
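The two deletion criteria can be sketched as a function that decides which segments to drop. This is a simplified model under stated assumptions, not the broker's implementation: segments are given oldest first as (name, last-modified time, size) tuples, times are plain seconds, and the function name `segments_to_delete` and the sample values are invented.

```python
def segments_to_delete(segments, now, retention_seconds, retention_bytes=None):
    """Return names of segments eligible for deletion, oldest first.

    A segment is deleted when either criterion is met:
    - it is older than retention_seconds, or
    - removing it still leaves at least retention_bytes in the log.
    """
    doomed = set()
    # Time-based criterion.
    for name, last_modified, _ in segments:
        if now - last_modified > retention_seconds:
            doomed.add(name)
    # Size-based criterion: prune from the oldest end while the remaining
    # segments do not drop below retention_bytes.
    if retention_bytes is not None:
        excess = sum(size for _, _, size in segments) - retention_bytes
        for name, _, size in segments:
            if excess - size < 0:
                break
            excess -= size
            doomed.add(name)
    return [name for name, _, _ in segments if name in doomed]


if __name__ == "__main__":
    week = 168 * 3600  # log.retention.hours=168 expressed in seconds
    segs = [("a", 0, 600), ("b", 5 * 86400, 600), ("c", 8 * 86400, 600)]
    print(segments_to_delete(segs, now=9 * 86400, retention_seconds=week))
```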
It is important to note that Kafka reads a specific message with a time complexity of O(1), independent of file size, so deleting files here has nothing to do with Kafka's performance; the choice of deletion policy depends only on the disk and on specific requirements. In addition, Kafka retains some metadata for each consumer group: the position of the messages currently consumed, that is, the offset. This offset is controlled by the consumer. Under normal circumstances, the consumer increases the offset linearly after consuming a message. Of course, the consumer can also set the offset to a smaller value and re-consume some messages. Because the offset is controlled by the consumer, the Kafka broker is stateless: it does not need to mark which messages have been consumed, and it does not need to ensure that only one consumer in a consumer group consumes a given message, so no locking mechanism is needed. This also provides a strong guarantee for Kafka's high throughput.
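The idea of a consumer-controlled offset can be sketched with a toy class. The log here is just a Python list standing in for one partition, and the class name `Consumer` and its `poll` and `seek` methods are names invented for this sketch rather than a real client API.

```python
class Consumer:
    """Toy consumer that keeps its own position in a shared, read-only log."""

    def __init__(self, log, offset=0):
        self.log = log        # the log is never modified by consuming
        self.offset = offset  # position of the next message to read

    def poll(self):
        """Return the next message and advance the offset, or None at the end."""
        if self.offset >= len(self.log):
            return None
        message = self.log[self.offset]
        self.offset += 1  # linear increase after consuming
        return message

    def seek(self, offset):
        """Move to an earlier (or later) position to re-consume messages."""
        if not 0 <= offset <= len(self.log):
            raise ValueError("offset out of range")
        self.offset = offset


if __name__ == "__main__":
    log = ["m0", "m1", "m2"]
    consumer = Consumer(log)
    print(consumer.poll(), consumer.poll())
    consumer.seek(0)        # rewind and read the first message again
    print(consumer.poll())
```

Since each consumer only moves its own offset, two consumers can read the same list independently and nothing on the log side has to record who has read what.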
Kafka Study (iv)-topic & Partition