High throughput of Kafka

Source: Internet
Author: User
Tags: sendfile

As one of the most popular open-source message systems, Kafka is widely used for data buffering, asynchronous communication, log collection, and system decoupling. Compared with other common message systems such as RocketMQ, Kafka provides most of the same functions and features while delivering superb read/write performance.

This article will analyze Kafka's performance. First, a brief introduction to the Kafka architecture and related terms:

1. Topic: a logical concept used to classify messages. A Topic can be distributed across multiple Brokers.
2. Partition: This is the basis for horizontal scaling and all parallelism in Kafka. Each Topic is divided into at least one Partition.
3. Offset: the sequence number of a message within a Partition. Message order within a Partition follows the Offset order.
4. Consumer: used to retrieve/consume messages from the Broker.
5. Producer: used to send/produce messages to the Broker.
6. Replication: Kafka supports redundant backup of messages at the Partition level. Each Partition can be configured with one or more Replicas (when only one Replica exists, the Partition itself is its sole copy).
7. Leader: a unique Leader is elected for each Partition's Replica set. All read/write requests are handled by the Leader; the other Replicas synchronize data updates from the Leader to their local machines, a process similar to Binlog replication in MySQL.
8. Broker: accepts requests from Producers and Consumers and persists Messages to the local disk. In each Cluster, one Broker is elected as the Controller, which handles Partition Leader Election and coordinates Partition migration.
9. ISR (In-Sync Replica): the subset of Replicas that are currently alive and can "catch up" with the Leader. Because reads and writes go to the Leader first, a Replica that pulls data from the Leader through the synchronization mechanism lags behind it to some degree (measured both in time and in number of messages); when either threshold is exceeded, the Replica is kicked out of the ISR. Each Partition maintains its own independent ISR.
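As a toy illustration of the last concept (illustrative only, not Kafka's actual implementation or configuration names), the ISR membership rule — a replica is evicted when its lag exceeds either threshold — can be sketched as:

```python
# Toy sketch of ISR membership (illustrative only, not Kafka's code).
# A replica stays in the ISR while its lag behind the leader, measured
# both in messages and in seconds, is within the configured thresholds.

def in_sync_replicas(leader_offset, now, replicas,
                     max_lag_messages=4000, max_lag_seconds=10):
    """Return the set of replica ids that currently qualify for the ISR."""
    isr = set()
    for r in replicas:  # each r: {"id": ..., "offset": ..., "last_fetch": ...}
        msg_lag = leader_offset - r["offset"]
        time_lag = now - r["last_fetch"]
        if msg_lag <= max_lag_messages and time_lag <= max_lag_seconds:
            isr.add(r["id"])
    return isr

replicas = [
    {"id": 1, "offset": 1000, "last_fetch": 100.0},  # the leader itself
    {"id": 2, "offset": 990,  "last_fetch": 99.5},   # small lag: in sync
    {"id": 3, "offset": 100,  "last_fetch": 80.0},   # stale fetch: evicted
]
print(in_sync_replicas(leader_offset=1000, now=100.0, replicas=replicas))
# → {1, 2}
```

Replica 3 is within the message-lag threshold but has not fetched for 20 seconds, so it drops out of the ISR; both thresholds must hold.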

These are almost all the terms you may encounter when using Kafka, and at the same time none of them is a superfluous concept or component; in terms of design, Kafka is concise enough. This article focuses on Kafka's excellent throughput performance and describes the various "black technologies" used in its design and implementation.

Broker
Unlike in-memory message queues such as Redis and MemcacheQ, Kafka is designed to write all messages to low-speed, high-capacity hard disks in exchange for much greater storage capacity. In fact, Kafka's use of hard disks does not bring much performance loss, because it takes a legitimate "shortcut".

The first reason is that Kafka performs only sequential I/O on the disk; given the read/write pattern of a message system, this poses no problem. Regarding disk I/O performance, here is a set of test data officially provided by Kafka (RAID-5, 7200 rpm):

Sequence I/O: 600 MB/s
Random I/O: 100KB/s

By restricting itself to sequential I/O, Kafka avoids most of the performance impact of slow disk access.
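The gap between the two numbers above comes from seek time, and an append-only log never seeks. A minimal sketch of this style of storage (a hypothetical length-prefixed layout, not Kafka's actual on-disk format):

```python
import os
import struct
import tempfile

# Minimal append-only log: each record is a 4-byte big-endian length prefix
# followed by the payload, so every write is purely sequential and a record
# is addressed by its byte position (a crude stand-in for Kafka's offset).

class AppendLog:
    def __init__(self, path):
        self.f = open(path, "ab+")

    def append(self, payload: bytes) -> int:
        pos = self.f.seek(0, os.SEEK_END)        # always append at the tail
        self.f.write(struct.pack(">I", len(payload)) + payload)
        return pos                               # byte offset of the record

    def read(self, pos: int) -> bytes:
        self.f.seek(pos)
        (length,) = struct.unpack(">I", self.f.read(4))
        return self.f.read(length)

path = os.path.join(tempfile.mkdtemp(), "00000000.log")
log = AppendLog(path)
offsets = [log.append(m) for m in (b"msg-0", b"msg-1", b"msg-2")]
print(log.read(offsets[1]))   # → b'msg-1'
```

Writes only ever touch the tail of the file, so the disk head (or the SSD's write path) moves forward monotonically; reads by a consumer that keeps pace are also sequential.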

Next, let's look at how Kafka takes this "shortcut".

First, Kafka relies heavily on the PageCache feature of the underlying operating system. When an upper-layer write occurs, the OS merely writes the data into PageCache and marks the page as Dirty. When a read occurs, the OS searches PageCache first; only on a miss does it schedule a disk read and return the required data. PageCache in effect turns all free memory into a disk cache, and because reclaiming PageCache pages is cheap when other processes request memory, modern operating systems all embrace it.
Using PageCache also avoids caching data inside the JVM. The JVM gives us powerful GC capabilities, but it also introduces problems that do not suit Kafka:
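The write-back behavior is easy to observe from user space. In the sketch below (illustrative, not Kafka code), `write()` returns as soon as the data is in PageCache, and a second, independent reader immediately sees it even though `fsync()` has not yet been called:

```python
import os
import tempfile

# Illustration of write-back caching: write() returns once the data is in
# the kernel's PageCache (the page is marked dirty); an independent reader
# sees the data immediately, because the read is served from PageCache,
# not from whatever has actually reached the disk platter.

path = os.path.join(tempfile.mkdtemp(), "pagecache-demo")
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
os.write(fd, b"buffered in PageCache")   # in PageCache, not yet durable

with open(path, "rb") as reader:          # independent file descriptor
    print(reader.read())                  # → b'buffered in PageCache'

os.fsync(fd)                              # only now is durability guaranteed
os.close(fd)
```

This is exactly the trade-off Kafka leans on: writes complete at memory speed, and the kernel flushes dirty pages to disk in the background.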

• If the cache is managed in the Heap, JVM GC threads scan the Heap space frequently, causing unnecessary overhead; and if the Heap is too large, a single Full GC poses a great challenge to system availability.
• Every object in the JVM carries an Object Overhead (never underestimate it), which reduces the effective use of memory.
• An in-process cache duplicates data that already sits in the OS PageCache, so caching only in PageCache at least doubles the available cache space.
• If Kafka restarts, all in-process caches are lost, while the OS-managed PageCache remains usable.
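The per-object overhead point holds in any managed runtime. The JVM's exact numbers differ, but Python shows the same effect and is easy to measure:

```python
import sys

# Per-object overhead in a managed runtime: a 100-byte payload costs
# noticeably more than 100 bytes once the object header is counted.
# (CPython numbers shown; JVM headers differ in size but not in kind.)

payload = b"x" * 100
total = sys.getsizeof(payload)            # header + length + data
overhead = total - len(payload)
print(overhead > 0)   # → True
```

Multiply that overhead by millions of cached messages and the effective cache capacity of a heap shrinks considerably, which is one more reason Kafka leaves caching to PageCache.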

PageCache is only the first step; Kafka also uses Sendfile to further optimize performance. Before explaining Sendfile, let's review the traditional network I/O procedure, which consists of roughly four steps:

1. The OS reads data from the disk into PageCache in the kernel.
2. The application copies the data from kernel space into user space.
3. The application writes the data to a Socket, copying it into the Socket Buffer in kernel space.
4. The OS copies the data from the Socket Buffer to the NIC buffer to complete the send.

The entire process involves four context switches and two system calls (read and write), and the same data is copied back and forth between kernel buffers and the user buffer, which is inefficient. Steps 2 and 3 are unnecessary: the data could be copied directly inside the kernel. This is exactly the problem Sendfile solves; with Sendfile, the data travels from PageCache to the Socket Buffer without ever entering user space.
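The shortened path can be tried directly: Linux exposes sendfile(2), which Python wraps as `os.sendfile` (the JVM equivalent, `FileChannel.transferTo`, is what Kafka actually calls). A minimal sketch, copying between two descriptors entirely inside the kernel — here file to file for easy verification, since recent Linux kernels allow a regular file as the destination:

```python
import os
import tempfile

# sendfile(2): the kernel moves data from one descriptor to another with a
# single system call and no copy through a user-space buffer. Linux-only.

tmp = tempfile.mkdtemp()
src_path = os.path.join(tmp, "src")
dst_path = os.path.join(tmp, "dst")

with open(src_path, "wb") as f:
    f.write(b"zero-copy payload " * 1000)

src = os.open(src_path, os.O_RDONLY)
dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT, 0o644)

total = os.fstat(src).st_size
sent = 0
while sent < total:                       # sendfile may send less than asked
    sent += os.sendfile(dst, src, sent, total - sent)

os.close(src)
os.close(dst)
print(os.path.getsize(dst_path) == total)   # → True
```

In the real Kafka broker the destination descriptor is the consumer's socket, so a message that is already in PageCache reaches the NIC without a single user-space copy.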

From the above it is not hard to see that Kafka's design strives to complete data exchange in memory, whether facing external clients as a message system or interacting internally with the underlying operating system. If the production and consumption rates of Producers and Consumers are well matched, data can be exchanged with almost no disk I/O at all. This is why I say that Kafka's use of "hard disks" does not cause excessive performance loss. Below are some metrics I collected in a production environment.
(20 Brokers, 75 Partitions per Broker, 110 k msg/s)
