A question that comes up often: is the Kafka broker really stateless? Here is a claim you can find on the Internet:

Under normal circumstances, the consumer advances this offset linearly as it consumes messages. Of course, the consumer can also reset the offset to a smaller value and re-consume earlier messages. Because the offset is controlled by the consumer, the Kafka broker is stateless ...
I suspect the author's point is that the broker does not store consumer state, and in that narrow sense the claim is fine. In practice, however, the broker is a stateful service: each broker maintains in memory the state of every node and every topic partition in the cluster. In Kafka this state is called the metadata cache. This article discusses the design and implementation of this cache.
1. What is stored in the cache?
First, let's look at what is stored in the cache, using Kafka 1.0.0 as the reference version. The information in the metadata cache covers almost every aspect of the Kafka cluster, including:
- The ID of the broker where the controller resides, i.e., which broker is currently the controller of the cluster
- Information about all brokers in the cluster: for example, each broker's ID, its rack information, and its configured sets of endpoints (for example, a broker with both a PLAINTEXT and a SASL listener has two sets of connection information, each with its own security protocol and port, and possibly even different host names)
- Information about all nodes in the cluster: strictly speaking, this partially duplicates the previous item, but here the data is keyed by broker ID and listener name. For very large clusters, this index lets you look up a given node's information quickly without scanning the previous structure, so it is an optimization
- Information about all partitions in the cluster: for each partition, its leader, ISR, AR, and the set of replicas currently offline. This data is keyed by topic and partition ID so the current state of any partition can be found quickly. (Note: AR stands for assigned replicas, the replica set assigned to a partition when its topic is created)
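The four categories above can be modeled with a small sketch. All names here are hypothetical; the real cache lives inside the broker (the Scala class `kafka.server.MetadataCache`) and holds richer structures, but the shape of the data is the same:

```python
from dataclasses import dataclass, field

@dataclass
class PartitionState:
    leader: int            # broker ID of the current leader
    isr: list              # in-sync replica set
    ar: list               # assigned replicas, fixed at topic creation
    offline_replicas: list # replicas currently offline

@dataclass
class MetadataCache:
    controller_id: int = -1
    # broker ID -> {rack, endpoints keyed by listener name}
    brokers: dict = field(default_factory=dict)
    # (broker ID, listener name) -> connection info, for fast point lookup
    nodes: dict = field(default_factory=dict)
    # (topic, partition ID) -> PartitionState
    partitions: dict = field(default_factory=dict)

cache = MetadataCache(controller_id=0)
cache.brokers[1] = {"rack": "r1",
                    "endpoints": {"PLAINTEXT": "b1:9092", "SASL_SSL": "b1:9093"}}
cache.nodes[(1, "PLAINTEXT")] = "b1:9092"
cache.partitions[("orders", 0)] = PartitionState(
    leader=1, isr=[1, 2], ar=[1, 2, 3], offline_replicas=[3])
```

The `nodes` map illustrates the optimization mentioned above: given a broker ID and listener name, the lookup is a single dictionary access instead of a scan over `brokers`.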
2. Does each broker save the same cache?
Yes, at least that was the design intent: every Kafka broker maintains the same cache, so a client program can send a metadata request to any broker and get the same answer. That is why any broker can serve metadata requests from clients: the data lives on every broker! Considering that Kafka currently defines 38 request types, very few of them can be handled by any broker like this. Letting every broker serve these requests shortens the latency of processing them and thus improves overall client throughput, so trading some space for time is worthwhile.
3. How is the cache updated?
As mentioned earlier, trading space for time reduces latency and increases throughput, but the cost is that the caches must be updated and kept consistent. How does Kafka do this? In a nutshell, consistency is maintained by sending asynchronous update requests (UpdateMetadataRequest). Because the updates are asynchronous, the caches on different brokers may not be identical at any given instant. In practice, this weak consistency is rarely a problem, for three reasons: 1. clients do not request metadata all the time, and they cache it locally; 2. even if a client obtains stale or invalid metadata, it usually has a retry mechanism and can fetch fresh metadata from another broker; 3. updating the cache is lightweight, just modifying some in-memory data structures, which costs little. It is therefore generally safe to assume that every broker holds the same cache.
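Reason 2 above is worth a quick illustration. This is a toy sketch with hypothetical names, not real client code: the client asks brokers in turn and simply moves on when one returns a stale answer.

```python
def fetch_metadata(brokers, topic, is_stale):
    """Try each broker in turn until one returns fresh metadata."""
    for broker in brokers:
        metadata = broker["cache"].get(topic)
        if metadata is not None and not is_stale(metadata):
            return metadata  # the client would now cache this locally
    raise RuntimeError("all brokers returned stale metadata")

# One broker lags behind (version 1); the other has already caught up (version 2).
brokers = [
    {"cache": {"orders": {"leader": 1, "version": 1}}},
    {"cache": {"orders": {"leader": 2, "version": 2}}},
]
latest_version = 2
fresh = fetch_metadata(brokers, "orders",
                       lambda m: m["version"] < latest_version)
```

Real clients detect staleness indirectly (for example, a request to a supposed leader fails and triggers a metadata refresh), but the recovery pattern is the same: retry against another broker.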
The update itself is driven by the controller. In certain scenarios, the controller sends an UpdateMetadataRequest to specific brokers, ordering them to update their caches. Once a broker receives the request, it performs a full update: it empties its current cache and repopulates it with the data carried in the UpdateMetadataRequest.
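The "empty, then repopulate" step can be sketched as follows. This is an illustrative model with hypothetical names, not the broker's actual code: the new snapshot is built off to the side and then swapped in with one reference assignment, so readers never observe a half-updated cache.

```python
import threading

class BrokerMetadataCache:
    def __init__(self):
        self._lock = threading.Lock()
        self._snapshot = {}  # (topic, partition) -> partition state

    def handle_update_metadata(self, payload):
        # Full update: discard everything, rebuild from the request payload.
        new_snapshot = dict(payload)
        with self._lock:
            self._snapshot = new_snapshot  # atomic replace; old data is dropped

    def get(self, topic_partition):
        return self._snapshot.get(topic_partition)

cache = BrokerMetadataCache()
cache.handle_update_metadata({("a", 0): {"leader": 1}})
cache.handle_update_metadata({("b", 0): {"leader": 2}})  # ("a", 0) is now gone
```

Note that nothing from the old snapshot survives the second update; that is the "full-scale" semantics described above, as opposed to merging incremental changes.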
4. When is the cache updated?
This question is really the same as: when does the controller send an UpdateMetadataRequest to specific brokers? If you analyze it from the source code, there are too many scenarios to enumerate: when the controller starts, when a new broker starts, when a broker goes down, when a partition reassignment occurs, and so on. It is enough to remember that these caches must be updated whenever broker or partition data in the cluster changes.
A frequently asked question is how a newly added broker obtains this cache, and how the other brokers learn of its existence. When a new broker starts, it registers itself in ZooKeeper. The controller, which watches ZooKeeper, immediately senses the new broker's arrival and updates its own cache (note: this is the controller's own cache, not the metadata cache discussed in this article), adding the broker to the current broker list. It then sends an UpdateMetadataRequest to all brokers in the cluster, including the newly added one, telling them to update their metadata caches. Once those brokers finish updating, they know of the new broker's existence; and because the new broker also updates its cache, it now holds the full state of the cluster.
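The flow above can be simulated in a few lines. All names here are hypothetical; this toy omits ZooKeeper and models only the broadcast pattern: the controller updates its own broker list, then pushes the new view to every broker, including the newcomer.

```python
class Broker:
    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.metadata_cache = []  # this broker's view of the cluster

    def handle_update_metadata(self, payload):
        self.metadata_cache = list(payload)  # full replacement, as before

class Controller:
    def __init__(self):
        self.live_brokers = {}  # the controller's own cache

    def on_broker_registered(self, broker):
        # Step 1: controller updates its own broker list.
        self.live_brokers[broker.broker_id] = broker
        # Step 2: broadcast the new view to ALL brokers, newcomer included.
        payload = sorted(self.live_brokers)
        for b in self.live_brokers.values():
            b.handle_update_metadata(payload)

controller = Controller()
b1, b2 = Broker(1), Broker(2)
controller.on_broker_registered(b1)
controller.on_broker_registered(b2)  # b1 learns about b2; b2 gets the full state
```

After the second registration, both brokers hold the same broker list, which is exactly the property section 2 described.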
5. Current issues?
As mentioned earlier, cache updates are driven entirely by the controller, so the load on the broker acting as controller greatly affects this operation (in fact, it affects all controller operations). Under the current design, the controller broker still handles client requests like an ordinary broker, so if it is busy with client traffic (such as producing or consuming messages), update requests back up, delaying or even effectively canceling the cache update. The root cause is that the controller currently handles data-plane and control-plane requests without any prioritization, treating them indiscriminately, whereas we would prefer that control-plane requests get higher priority. The community has begun reworking this design, and the issue should be resolved in future releases.
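The prioritization the text asks for can be sketched with a priority queue. This is an illustrative model, not actual Kafka code (the community work referenced above took a different concrete shape): control-plane requests jump ahead of queued data-plane requests, while a sequence number preserves FIFO order within each priority class.

```python
import heapq
import itertools

CONTROL, DATA = 0, 1  # lower number = higher priority
_seq = itertools.count()

def enqueue(queue, priority, request):
    # The sequence number breaks ties so equal-priority requests stay FIFO.
    heapq.heappush(queue, (priority, next(_seq), request))

def drain(queue):
    order = []
    while queue:
        _, _, request = heapq.heappop(queue)
        order.append(request)
    return order

q = []
enqueue(q, DATA, "Produce")
enqueue(q, DATA, "Fetch")
enqueue(q, CONTROL, "UpdateMetadata")  # arrives last, but is served first
processed = drain(q)
```

With this scheme, a controller broker saturated with Produce and Fetch traffic would still dequeue UpdateMetadata promptly instead of letting it languish behind data requests.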
This article has explored part of the metadata cache; due to limited time it does not cover every aspect, but it should help in understanding how the cache works.
Kafka Metadata Cache