DataPipeline | Hu Xi, author of "Apache Kafka in Practice": Apache Kafka Monitoring and Tuning

Hu Xi, "Apache Kafka actual Combat" author, Beihang University Master of Computer Science, is currently a mutual gold company computing platform director, has worked in IBM, Sogou, Weibo and other companies. Domestic active Kafka code contributor.

Objective
Although Apache Kafka has by now evolved into a full stream processing platform, most users still rely on its core function: message queuing. How to monitor and tune Kafka effectively is a big topic and one that puzzles many users, so that is what we will discuss today.

I. Kafka Overview

Before discussing monitoring and tuning in detail, I would like to use one slide to briefly illustrate the components of the current Kafka ecosystem. As mentioned earlier, Kafka has evolved into a stream processing platform; in addition to the core message queuing component, Kafka Core, it has introduced two new components: Kafka Connect, which handles data transfer between Kafka and external systems, and Kafka Streams, which handles real-time stream processing. The slide also lists some key Kafka concepts.

II. Kafka Monitoring

I will discuss Kafka monitoring along five dimensions. The first is monitoring the hosts on which the Kafka cluster runs; the second is monitoring the JVM of each Kafka broker; the third is monitoring the performance of the Kafka brokers themselves; the fourth is monitoring the performance of Kafka clients, and by clients I mean clients in the broad sense: the producers and consumers we write ourselves, as well as the producers and consumers the community provides for us, such as Kafka Connect sinks/sources or Kafka Streams. Finally, we need to monitor the interaction between the brokers themselves.

1. Host Monitoring

In my view, host monitoring is the most important dimension, because many production problems first show up as obvious performance symptoms on the host. It is usually the Ops team who notice them first and tell us that something is wrong with a machine, so for Kafka, host monitoring is usually the first step in discovering a problem. This slide lists the common metrics, including CPU, memory, bandwidth, and so on. CPU usage deserves special attention. You may have heard a question like this: my Kafka broker's CPU usage is 400%, what is going on? To answer it, we must first understand how that figure was measured. Many people take the per-process CPU percentage reported by the top command as the overall utilization, but it is not: it is simply the sum of the time slices the Kafka process consumes across all CPUs. For example, if the machine has 16 CPUs, then as long as that value is not above or close to 1600, your actual CPU utilization is quite low. So make sure you correctly understand what each field in these commands means.

The right side of this slide shows a book: if you want to monitor host performance, I personally think Brendan Gregg's "Systems Performance" is enough. It is a very authoritative book, recommended for everyone.

2. JVM Monitoring

Kafka itself is an ordinary Java process, so any method suitable for JVM monitoring applies to monitoring Kafka as well. The first step is to understand your Kafka application: for example, know the GC frequency and pause times of the Kafka broker JVM, and how large the set of live objects is after each GC. With this information, we have a clear direction for the tuning that follows. Of course, most of us are not senior JVM experts, so there is no need to pursue overly elaborate JVM monitoring and tuning; just focus on the big items. In addition, if your time is limited but you want to get up to speed on JVM monitoring and tuning quickly, I recommend reading "Java Performance".
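
As a minimal illustration of the kind of GC visibility described above, the sketch below polls the JVM's garbage-collector MXBeans using only the standard java.lang.management API. It is not Kafka-specific tooling: the 10-second polling interval is an arbitrary choice, and for a broker you would normally read the same MXBeans remotely over JMX rather than running code inside the broker process.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class GcProbe {
    public static void main(String[] args) throws InterruptedException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        while (true) {
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                // Cumulative collection count and time; diff successive samples to get GC frequency and pause cost.
                System.out.printf("%s: collections=%d, totalTimeMs=%d%n",
                        gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            }
            // Heap usage sampled shortly after GCs approximates the size of the live object set.
            System.out.println("heapUsedBytes=" + memory.getHeapMemoryUsage().getUsed());
            Thread.sleep(10_000); // arbitrary 10-second sampling interval
        }
    }
}
```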

3. Per-Broker Monitoring

First, make sure the broker process is actually up. That may sound funny, but I have really seen it happen: for example, when deploying Kafka on Docker, it is easy to end up with the container started but the service not started successfully. Under a normal startup, a Kafka server should listen on two ports: one is the regular port, 9092 by default, on which TCP connections are established, and the other is the port used for JMX monitoring. Also, if there is more than one broker, the controller will maintain a TCP connection to every other broker. These are things you can consciously verify in day-to-day monitoring.
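
As a quick sanity check on the two ports mentioned above, the sketch below simply attempts TCP connections to them. The host name is a placeholder, 9092 is the default listener port, and 9999 stands in for whatever JMX port you configured (for example via the JMX_PORT environment variable).

```java
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    static boolean isListening(String host, int port) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 3_000); // 3-second connect timeout
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String host = "broker1.example.com"; // placeholder broker host
        System.out.println("broker port 9092 listening: " + isListening(host, 9092));
        System.out.println("JMX port 9999 listening:    " + isListening(host, 9999)); // use your actual JMX port
    }
}
```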

For broker monitoring, we rely mainly on JMX metrics. Anyone who has used Kafka knows that the community exposes a particularly large number of JMX metrics, many of which are of little use. I have listed some of the more important ones. The first is the broker's bytes-in and bytes-out rates, which are similar to monitoring NIC traffic. You should always monitor these and compare them in real time with your network card's bandwidth: if the values get close to the bandwidth, it means the broker is overloaded, and you should either add new broker machines or move some of the load onto other machines.

There are also two thread pool idle-ratio metrics to watch: the network processor average idle percentage and the request handler average idle percentage. It is best to keep both from dropping below 30%; otherwise the broker is already very busy and the corresponding thread pool sizes need to be increased.
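
A hedged sketch of reading these broker metrics over JMX follows. The MBean names are the standard Kafka broker metric names; the host and JMX port are placeholders, and depending on the Kafka version the request handler idle metric is exposed as a meter (OneMinuteRate) rather than a plain gauge, so it is only noted in a comment here.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class BrokerJmxProbe {
    public static void main(String[] args) throws Exception {
        // Assumes the broker exposes JMX on port 9999; host and port are placeholders.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://broker1.example.com:9999/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection conn = connector.getMBeanServerConnection();

            // Inbound/outbound byte rates -- compare these against the NIC bandwidth.
            Object bytesIn = conn.getAttribute(
                    new ObjectName("kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec"), "OneMinuteRate");
            Object bytesOut = conn.getAttribute(
                    new ObjectName("kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec"), "OneMinuteRate");

            // Network processor idle ratio -- alert if it drops below roughly 0.3.
            Object netIdle = conn.getAttribute(
                    new ObjectName("kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent"), "Value");
            // Request handler idle ratio: kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent
            // (its attribute may be OneMinuteRate depending on the Kafka version).

            System.out.printf("bytesIn/s=%s bytesOut/s=%s networkIdle=%s%n", bytesIn, bytesOut, netIdle);
        } finally {
            connector.close();
        }
    }
}
```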

Next is monitoring the broker's logs. The logs contain very rich information. By logs I mean not only the broker's server log but also the Kafka controller log. We should check the logs frequently for OOM errors and keep an eye on any error messages thrown in them.

We also need to monitor the running state of some critical background threads. Personally I think there are two especially important threads to watch. One is the log cleaner thread, which performs data compaction; if it dies, users are usually unaware of it, and then discover that every topic using the compact policy keeps growing until it fills all of the disk space. The other is the replica fetcher thread, which follower brokers use to pull data from the leader in real time. If it dies, users are usually unaware of it either, but they will find that followers no longer pull data. Therefore we must periodically check the state of these two threads, and if we find they have terminated unexpectedly, go to the logs and locate the corresponding error messages.

4. Client Monitoring

For client monitoring I will split the discussion into two parts, producers and consumers. Producers send messages to Kafka; before monitoring anything else, we should at least know the RTT between the client machines and the broker machines. In cross-datacenter or geographically remote deployments the RTT is large, and without special tuning it is impossible to reach very high TPS. The current Kafka producer uses a dual-thread design, with a user main thread and a sender thread. When the sender thread dies, the user-facing code is not directly aware of it; it only shows up as failed sends. So it is best for users to monitor the state of the sender thread.

We should also monitor the processing latency of produce requests: the total time from when a message leaves the producer until the broker finishes processing it and the response returns to the producer. It is best to know how much time each part of this path consumes, because in many cases, if you want to raise the producer's TPS, understanding where the bottleneck sits in the end-to-end path lets you tune in a targeted way. I will discuss how to break this path down in a later slide.
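
One way to watch this from the client side is to read the producer's own metric registry. The sketch below prints request-latency-avg and request-latency-max, which approximate the producer-to-broker round trip; the bootstrap address and topic name are placeholders.

```java
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerLatencyProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1.example.com:9092"); // placeholder address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test-topic", "key", "value")); // hypothetical topic
            producer.flush(); // make sure at least one request has completed

            for (Map.Entry<MetricName, ? extends Metric> entry : producer.metrics().entrySet()) {
                String name = entry.getKey().name();
                // request-latency-avg / request-latency-max approximate the producer-to-broker round trip.
                if (name.equals("request-latency-avg") || name.equals("request-latency-max")) {
                    System.out.println(name + " = " + entry.getValue().metricValue());
                }
            }
        }
    }
}
```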

Now let's talk about consumers. The consumers discussed here are the new consumer, the Java consumer.

The community now strongly discourages continued use of the old consumer. The new consumer also uses a dual-thread design, with a background heartbeat thread; if that thread dies, the foreground thread is not aware of it. Therefore it is best for users to regularly monitor whether the heartbeat thread is alive. The heartbeat thread periodically sends heartbeat requests to the Kafka server, telling Kafka that this consumer instance is still alive, so that the coordinator does not mistakenly conclude the instance has died and trigger a rebalance. Kafka provides a number of JMX metrics for monitoring consumers, the most important of which is consumption lag, known as consumer lag.

Suppose the producer has produced 100 messages and the consumer has read 80; the lag is then 20. Obviously the smaller the lag the better, since it shows the consumer is keeping up. Users can check lag with the command-line tools or even through the Java API. There is also a counterpart to lag called lead, which characterizes how far ahead of the earliest message the consumer is. For example, if the earliest available offset is 1 and the consumer is currently consuming offset 10, the lead is 9. For lead, the bigger the better; a small lead means the consumer is close to the log start, which suggests it may be at a standstill or consuming very slowly. Essentially lead and lag measure the same thing from two ends; I list lead separately because I developed the lead metric myself, so consider this a small advertisement.
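
For programmatic checks, a minimal sketch of computing lag and lead per partition is shown below, assuming the standard Java clients; the group id and bootstrap address are placeholders, and consumer lag can also be checked with the kafka-consumer-groups.sh command-line tool.

```java
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class LagAndLead {
    public static void main(String[] args) throws Exception {
        String groupId = "my-group"; // hypothetical group to inspect
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1.example.com:9092"); // placeholder address
        props.put("group.id", groupId + "-probe");
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());

        try (AdminClient admin = AdminClient.create(props);
             KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {

            // The group's committed offsets, i.e. how far it has consumed.
            Map<TopicPartition, OffsetAndMetadata> committed =
                    admin.listConsumerGroupOffsets(groupId).partitionsToOffsetAndMetadata().get();

            Map<TopicPartition, Long> endOffsets = consumer.endOffsets(committed.keySet());
            Map<TopicPartition, Long> startOffsets = consumer.beginningOffsets(committed.keySet());

            for (Map.Entry<TopicPartition, OffsetAndMetadata> entry : committed.entrySet()) {
                if (entry.getValue() == null) {
                    continue; // no committed offset for this partition
                }
                TopicPartition tp = entry.getKey();
                long consumedOffset = entry.getValue().offset();
                long lag = endOffsets.get(tp) - consumedOffset;     // distance behind the log end
                long lead = consumedOffset - startOffsets.get(tp);  // distance ahead of the log start
                System.out.printf("%s lag=%d lead=%d%n", tp, lag, lead);
            }
        }
    }
}
```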

Beyond that, we also need to monitor the partition distribution within the consumer group, to avoid one instance being assigned far too many partitions and the load becoming severely imbalanced. In general, if all consumers in a group subscribe to the same topics, there is usually no obvious assignment skew. The imbalance appears very easily once instances subscribe to different topics and the topics have uneven partition counts. Kafka currently provides three strategies to perform partition assignment; the newest is the sticky assignment strategy, which keeps the assignment as balanced as possible while minimizing partition movement, and it is worth trying (a configuration sketch follows below).
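
A minimal configuration sketch for switching a consumer group to the sticky strategy, assuming the standard Java client (the bootstrap address and group id are placeholders):

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.StickyAssignor;

public class StickyAssignorConfig {
    public static Properties consumerProps() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1.example.com:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                          // hypothetical group
        // Use the sticky strategy instead of the default range assignor.
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG, StickyAssignor.class.getName());
        return props;
    }
}
```

Note that the group only adopts a strategy that every member advertises, so all instances in the group should be updated to list it.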

The last item is monitoring the rebalance time. At present, rebalance performance for groups with very many instances is quite poor and can take on the order of hours, and unfortunately there is no good solution yet. So if your consumer group is especially large, you should monitor how long the rebalance phases take and whether that meets your needs; if not, see whether some consumers can be removed to reduce the number of instances.

5. Inter-Broker Monitoring

The last dimension is monitoring the interaction between brokers, which mainly means replica fetching. Follower replicas pull data from the leader in real time, and naturally we want that pull process to be as fast as possible. Kafka provides a particularly important JMX metric here: the number of under-replicated partitions. For example, if I have specified that a message should be stored on three brokers but only one or two of them actually hold it, then the partition that message belongs to is an under-replicated partition. This situation deserves special attention because it can lead to data loss. As "Kafka: The Definitive Guide" puts it: if you can monitor only one Kafka JMX metric, monitor this one, and make sure its value stays 0 across your Kafka cluster. As soon as it goes above 0, deal with it quickly.

Another important metric is the active controller count. Across the whole cluster, exactly one broker should report 1 for this metric and all the others should report 0. If you find that the values add up to 2 or 3, there must be a split-brain situation, and you should check whether a network partition has occurred. Kafka itself cannot defend against split brain; it relies entirely on ZooKeeper for this, but if a real network partition happens there is no good way to handle it, so it is better to fail fast.
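
The following sketch reads both of these metrics from each broker's JMX endpoint: under-replicated partitions should stay at 0, and the active controller counts should sum to exactly 1 across the cluster. The broker host names and the 9999 JMX port are placeholders.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class InterBrokerHealth {
    public static void main(String[] args) throws Exception {
        // One probe per broker; host names and the JMX port are placeholders.
        String[] jmxEndpoints = {"broker1.example.com:9999", "broker2.example.com:9999", "broker3.example.com:9999"};
        int activeControllers = 0;
        for (String endpoint : jmxEndpoints) {
            JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://" + endpoint + "/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                Number underReplicated = (Number) conn.getAttribute(
                        new ObjectName("kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions"), "Value");
                Number controller = (Number) conn.getAttribute(
                        new ObjectName("kafka.controller:type=KafkaController,name=ActiveControllerCount"), "Value");
                activeControllers += controller.intValue();
                System.out.printf("%s underReplicatedPartitions=%d activeController=%d%n",
                        endpoint, underReplicated.intValue(), controller.intValue());
            } finally {
                connector.close();
            }
        }
        // Cluster-wide expectation: 0 under-replicated partitions and exactly 1 active controller.
        System.out.println("total active controllers = " + activeControllers + " (expected: 1)");
    }
}
```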

III. Monitoring Tools

At present no Kafka monitoring tool is universally recognized as excellent; each has its own strengths but also some serious flaws. Let us go through some of the common monitoring tools one by one.

1. Kafka Manager

Among all the free monitoring frameworks, Kafka Manager is probably the most popular. It was originally open-sourced by Yahoo; its functionality is very complete and the data it displays is rich. In addition, users can perform some simple cluster management operations from its interface. Even better, the project is still being maintained, so using Kafka Manager to monitor Kafka is a good choice.

2. Burrow

Burrow was open-sourced in the second half of last year and focuses specifically on monitoring consumer information. When it first came out I had high hopes for it; after all, it was written personally by a Kafka committer. The problem with Burrow is that it has no UI, which is inconvenient for operations. In addition, it is written in Go, so to use it you must set up a Go environment and then compile and deploy it; in short, it is not very convenient to use. It is also not updated very frequently and feels somewhat semi-abandoned, but it may still be worth a try.

3. Kafka Monitor

Strictly speaking, Kafka Monitor is not a monitoring tool; it is specialized for system testing of Kafka clusters. The metrics to monitor can be configured by the user, and it mainly performs end-to-end tests. For example, after you set up a Kafka cluster, you can test end-to-end performance from sending a message to a consumer reading it. Its advantage is that it is also written by the Kafka team, so the quality is assured, but updates are not frequent; it seems not to have been updated for several months.

4. Kafka Offset Monitor

KafkaOffsetMonitor was one of the earliest Kafka monitoring tools I used, and it also monitors consumer offsets, from the time when Kafka still kept offsets in ZooKeeper. Its interface is very attractive and it has been used by many people in China. But there are problems now: we use the new consumer, which the framework does not support particularly well at the moment, and it is no longer maintained; there may have been no updates for one or two years.

5. Kafka Eagle

Kafka Eagle is developed domestically; I do not know exactly which expert wrote it, but it is highly praised in the Kafka QQ groups, because the interface is very clean and attractive and it presents the data well.

6. Confluent Control Center

Control Center is the most complete Kafka monitoring framework I have come across, but it is only available with the Confluent Enterprise edition, which means it is a paid product.

In general, if you are operating a Kafka cluster, I recommend using Kafka Manager for monitoring, and then developing custom tools or frameworks on top according to your actual monitoring needs.

IV. System Tuning

One of the main purposes of monitoring Kafka is to tune the Kafka cluster. Here are some common tuning items at the operating system level.

The first is to guarantee enough page cache: at a minimum, the page cache should be able to hold one log segment. Kafka relies heavily on the page cache, so as long as it is large enough, consumers have a high probability of hitting the data directly in the page cache when reading messages, without having to read from the underlying disk. So at the very least, make sure the page cache can accommodate one log segment.

The second is tuning the number of open file descriptors. Many people are a bit afraid of this resource, but it is actually a very cheap one, and setting a large initial value is usually not a problem.

The third is tuning the vm.max_map_count parameter. This mainly applies when a Kafka broker hosts a very large number of topics. Kafka's log segment index files are implemented as memory-mapped files, so if there are many topics with many segments, the number of such index files is bound to be large and it is easy to hit this resource limit; in that situation the parameter should generally be raised appropriately.

The fourth is the swap setting. Many articles say to set this value to 0, that is, to disable swap completely. I personally do not recommend that, because with swap disabled, once your memory is exhausted, Linux triggers the OOM killer and picks a process to kill. That is not the outcome we want. Instead, I recommend setting the value to a small number close to 0, so that when memory runs out a small amount of swapping kicks in; this will make the broker very slow, but at least it gives users a chance to notice the problem and deal with it.

Fifth, the JVM heap size. First of all, given that new Kafka versions no longer support Java 7, and that Java 8 public updates are winding down while Java 9 was only a short-lived release superseded by Java 10, I suggest running Kafka on at least Java 8. As for the heap size, I personally think 6-10 GB is sufficient. If you hit a heap overflow, file a JIRA with the community and let them look at the actual problem, because in that situation simply enlarging the heap only postpones the OOM and is unlikely to solve the problem at its root.

Finally, it is recommended to build Kafka clusters on dedicated, multiple disks. Since the 1.1 release, Kafka officially supports JBOD, so there is no need to use RAID underneath.

V. Four Levels of Kafka Tuning

Kafka tuning can typically be approached along four dimensions: throughput, latency, persistence, and availability. Before going into each of them, I strongly recommend keeping the client and broker versions consistent. If the versions differ, down-conversion can occur: for example, the broker stores messages in a newer format, and when an older-version consumer requests data, the broker has to convert the messages to the older format before sending them to the consumer, which is very, very inefficient. Many articles discuss why Kafka is fast, and it comes down to zero-copy: data does not need to be copied back and forth between the page cache and the heap.

In short, the messages the producer sends end up in the page cache, and if both sides run the same version, those bytes can be pushed to (or pulled directly by) the consumer without ever being copied onto the heap. But if a down-conversion or version mismatch is involved, the data has to be pulled onto the heap first, converted, and then sent on to the consumer, which is very slow.

1. Kafka Tuning – Throughput

Tuning for throughput means doing more work in less time. The slide lists the parameters the client needs to adjust. As I said before, the producer puts messages into a buffer, and the background sender thread takes them out of the buffer and sends them to the broker. This involves batching: messages are sent in batches rather than one at a time, so the batch size is closely related to TPS. Raising batch.size generally increases TPS, though not without limit, and the drawback is increased message latency. Besides adjusting batch.size, enabling compression also boosts TPS, because it reduces network I/O. At the moment LZ4 gives the best compression performance, so if the client machines have spare CPU it is recommended to enable LZ4 compression.
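
A hedged producer configuration sketch along these lines is shown below; the concrete values are illustrative starting points rather than recommendations, and the bootstrap address is a placeholder.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class ThroughputProducerConfig {
    public static Properties props() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1.example.com:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);           // bigger batches -> higher TPS (example value)
        props.put(ProducerConfig.LINGER_MS_CONFIG, 10);                   // wait briefly to fill batches (adds latency)
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");         // trade CPU for less network I/O
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 64 * 1024 * 1024); // buffer drained by the sender thread
        return props;
    }
}
```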

On the consumer side there are not many levers for TPS; the main one is fetch.min.bytes. Increasing this parameter appropriately can raise consumer-side TPS. On the broker side, the usual bottleneck is that replica fetching takes too long, so you can increase num.replica.fetchers appropriately and pull data with multiple threads in parallel to speed the process up.
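
A matching consumer-side sketch is below; the 64 KB value is only illustrative, and the broker-side num.replica.fetchers knob is a server.properties setting, so it is not part of this client configuration.

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class ThroughputConsumerConfig {
    public static Properties props() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1.example.com:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                          // hypothetical group
        // Let the broker wait until at least ~64 KB have accumulated before answering a fetch (example value).
        props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 64 * 1024);
        return props;
    }
}
```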

2. Kafka Tuning – Latency

Latency here refers to the time it takes for a message to be processed, and in some scenarios we naturally want it to be as short as possible. On the consumer side there is not much to do: simply keep fetch.min.bytes at its default so the broker returns data to the consumer immediately. People sometimes assume TPS and latency are simply two sides of the same coin: if sending one message takes 2 ms, then TPS must be 500, because only 500 such sends fit into one second. In fact the relationship is not that simple. If sending one message takes 2 ms but I am willing to wait, say, 8 ms, I may accumulate a large batch of messages, perhaps 10,000, within those 8 ms and send them together; then roughly 10,000 messages go out in about 10 ms, and you can work out how much the TPS improves. This batching principle is exactly how the Kafka producer is designed.
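
To make the contrast with the throughput settings explicit, here is a latency-oriented sketch that simply keeps the relevant parameters at their immediate-send defaults (addresses are placeholders):

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.producer.ProducerConfig;

public class LatencyOrientedConfig {
    // Producer side: send as soon as possible instead of waiting to fill batches.
    public static Properties producerProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1.example.com:9092"); // placeholder
        props.put(ProducerConfig.LINGER_MS_CONFIG, 0); // the default: do not hold messages back for batching
        return props;
    }

    // Consumer side: let the broker reply immediately, even if little data is available.
    public static Properties consumerProps() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1.example.com:9092"); // placeholder
        props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1); // the default: do not wait for data to accumulate
        return props;
    }
}
```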

3. Kafka Tuning – Message Persistence

Message persistence essentially means not losing messages. Kafka's promise not to lose messages is conditional. I have met many people who say: I sent a message to Kafka, the send failed, and the message was lost, what should I do? Strictly speaking, Kafka does not consider this message loss, because the message never made it into Kafka. Kafka only provides its conditional no-loss guarantee for messages that have been committed.

If you want to tune for persistence, then on the producer side, first configure retries to prevent sends from failing due to transient network jitter. Once retries are enabled, you also need to guard against reordering: for example, I send messages 1 and 2, message 2 succeeds, message 1 fails and is retried, so message 1 enters Kafka after message 2, that is, out of order. If that is not acceptable, you also need to explicitly set max.in.flight.requests.per.connection to 1.

The other parameters listed on this slide are general ones, such as unclean.leader.election.enable, which is best set to false, meaning "unclean" replicas are not allowed to be elected leader.
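
Putting the producer-side persistence settings together, a hedged sketch might look like the following; acks=all is a commonly paired setting that is assumed here rather than quoted from the slide, and unclean.leader.election.enable=false is a broker/topic setting that lives in server.properties rather than in this client code.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class DurabilityProducerConfig {
    public static Properties props() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1.example.com:9092"); // placeholder
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);                    // survive transient network jitter
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);             // keep retried messages in order
        props.put(ProducerConfig.ACKS_CONFIG, "all"); // assumed companion setting: wait for all in-sync replicas
        return props;
    }
}
```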

4. Kafka Tuning – Availability

Finally, availability, which is the opposite of persistence: I can tolerate losing some messages as long as the system stays highly available. For that, I need to set the consumer's session timeout to a smaller value: if a consumer does not report in within the given time, the instance may be kicked out of the consumer group, and I want the other consumers to learn of that decision faster, so I turn this parameter down.
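
A small availability-oriented consumer sketch follows; the timeout values are illustrative only, and the heartbeat interval must stay well below the session timeout.

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class AvailabilityConsumerConfig {
    public static Properties props() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1.example.com:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                          // hypothetical group
        // Detect failed instances sooner so the group reacts faster (example values, not recommendations).
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 6_000);
        // Heartbeats must fire several times within one session timeout.
        props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 2_000);
        return props;
    }
}
```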

VI. Locating Performance Bottlenecks

Finally, locating performance bottlenecks, which strictly speaking is not tuning but solving performance problems. For producers, if the symptom is that sending messages is slow, we need to break the send path down step by step. As the figure shows, sending a message involves six steps: first, the producer sends the message to the broker; second and third, the broker writes the message to the local disk; fourth, the follower broker pulls the message from the leader; fifth, the broker creates the response; and sixth, the broker sends the response back, telling the producer the work is done.

For these six steps you need to determine where the bottleneck is. How? Through different JMX metrics. For example, if step 1 is slow you will often hit timeouts; if you frequently see request timeouts in the logs, it means step 1 is slow, and it is appropriate to increase the timeout. If steps 2 and 3 are slow, disk I/O is probably very high, making writes to disk very slow. If step 4 is slow, look at the JMX metric named remote-time, and you can increase the number of fetcher threads. If step 5 is slow, responses are staying in the response queue too long, and you can increase the network thread pool size. Step 6 is like step 1: if steps 1 and 6 are frequently a problem, check your network. So the approach is to break the whole path down by where the time goes, determine which step the bottleneck is in, look at the corresponding metrics, and tune accordingly.
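
The per-phase timings mentioned here are exposed by the broker's RequestMetrics MBeans. The sketch below reads the mean of each phase for Produce requests over JMX; the JMX endpoint is a placeholder, and the attribute name follows the Yammer histogram convention (Mean), which may need adjusting for your metrics setup.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ProduceRequestBreakdown {
    // Per-phase timings of Produce requests, roughly mapping onto the six steps above.
    private static final String[] PHASES = {
            "RequestQueueTimeMs",  // waiting to be picked up (related to step 1)
            "LocalTimeMs",         // leader writing to its local log (steps 2-3)
            "RemoteTimeMs",        // waiting for follower fetches (step 4)
            "ResponseQueueTimeMs", // response waiting in the queue (step 5)
            "ResponseSendTimeMs",  // sending the response back (step 6)
            "TotalTimeMs"
    };

    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://broker1.example.com:9999/jmxrmi"); // placeholder JMX endpoint
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            for (String phase : PHASES) {
                ObjectName name = new ObjectName(
                        "kafka.network:type=RequestMetrics,name=" + phase + ",request=Produce");
                System.out.printf("%-20s mean=%s ms%n", phase, conn.getAttribute(name, "Mean"));
            }
        } finally {
            connector.close();
        }
    }
}
```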

VII. Tuning the Java Consumer

Finally, a few words about consumer tuning. There are two common ways to use the current consumer. In the first, the same thread that polls also processes the messages directly; in the second, a dedicated thread only fetches messages, and the real message-processing logic is handed to a separate thread pool. The two approaches suit different scenarios. The first is simpler to implement, because your processing logic sits right in the polling thread, but its drawback is that TPS may not be very high, especially when the client machine is powerful, because single-threaded processing cannot take full advantage of the CPU resources. The advantage of the second approach is that it can fully utilize the underlying server hardware and reach very high throughput, but committing offsets correctly becomes difficult.
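
A minimal sketch of the second approach, a polling thread handing records to a worker pool, is shown below. The topic, group id, and bootstrap address are placeholders, and the tricky part the text mentions, committing offsets only after the workers have finished, is deliberately left out.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class PollThenDispatch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1.example.com:9092"); // placeholder
        props.put("group.id", "my-group");                          // hypothetical group
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        props.put("enable.auto.commit", "false"); // offsets must be managed carefully in this model

        ExecutorService workers = Executors.newFixedThreadPool(8); // sized to the client machine
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test-topic")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Hand real processing to the pool so the polling thread stays responsive.
                    workers.submit(() -> process(record));
                }
                // NOTE: committing offsets safely (only after the workers have finished) is the hard
                // part mentioned above and is deliberately omitted from this sketch.
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.println(record.value()); // stand-in for real business logic
    }
}
```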

Finally, the parameters that generate the most questions online: what exactly do they do? The first, max.poll.interval.ms, controls the maximum time the consumer may spend processing a single batch of messages. For example, if it is set to 600 seconds, the consumer has 10 minutes to process; if it cannot finish within 10 minutes, the coordinator considers the consumer dead and triggers a rebalance.

The coordinator is the component that manages the consumer group. How does it determine, within a bounded time, that a consumer instance has died so that its partitions can be handed to other consumers? It relies on heartbeat requests, so you can set heartbeat.interval.ms to a smaller value, such as 5 seconds.
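
A configuration sketch matching the numbers used above (10 minutes of processing time, 5-second heartbeats), assuming the standard Java consumer; the address and group id are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class LivenessConsumerConfig {
    public static Properties props() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1.example.com:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                          // hypothetical group
        // Up to 10 minutes allowed between polls before the coordinator declares the instance dead.
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 600_000);
        // Heartbeat every 5 s so failures are noticed quickly.
        props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 5_000);
        return props;
    }
}
```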

VIII. Q & A

Q1: You mentioned earlier that there are problems when lower and higher versions are mixed. I would like to ask: does the problem also exist with a higher-version client and a lower-version broker?

A1: Yes, there will be.

Q2: About the two consumption modes: in the mode where one consumer thread handles all of its partitions, some partitions are consumed more slowly than others because everything is tied to one thread, so a slow partition makes the faster ones wait. One plan is to temporarily pause the slower partitions, but if pausing and resuming happen frequently, is the overhead significant?

A2: Yes, one thread handles all the partitions, and what you describe does happen: if a consumer has 100 partitions, the effect I see now is that some partitions may be starved for a period of time, for example some partitions have no data for a long time while others do. But the pause and resume operations themselves are not expensive, because they only change in-memory state, essentially the position bookkeeping; pausing a partition just switches it off temporarily and does not involve any complex data structure changes.

Q3: How is the order in which partitions are served decided?

A3: This is currently done on the broker side with simple round-robin. For example, with 100 partitions, the first fetch returns a batch of partitions, then those partitions are moved to the end of the queue and the next fetch starts from the other partitions, so that it is as fair as possible.

Q4: Data skew can appear during consumption; how should we understand that?

A4: Data skew is especially likely when each consumer's subscription is different. For example, I subscribe to topics 1, 2, and 3 and you subscribe to topics 4, 5, and 6, and we are in the same group; if these topics' partition counts differ a lot, I might end up with 10 partitions while you get only 2. If you use the sticky assignment strategy, it keeps the number of partitions assigned to different instances as close as possible, so that extreme skew does not occur. The strategy has been around for a while; it was introduced in 0.11.
