Analysis of the Kafka Producer Code
A Kafka producer sends the user's messages to the Kafka cluster, or more precisely to a broker. This article analyzes how the producer-related code is implemented.
Class kafka.producer.Producer
If you write your own Kafka client to send messages, you send them through the interface this class provides. (If you are not familiar with sending messages through the Producer API, see the official example.) The class supports both synchronous and asynchronous sending.
Asynchronous sending is built on top of the synchronous send interface, and its implementation is simple: when the client sends a message, the message is put into a queue and the call returns immediately. A separate producer thread (ProducerSendThread) continuously takes messages off the queue and calls the synchronous send interface to deliver them to the broker.
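The queue-plus-thread arrangement described above can be sketched in a few lines of Java. This is only an illustration of the pattern, not the Kafka source: the class AsyncOverSyncSketch, the SyncSender interface, and the shutdown sentinel are all hypothetical names introduced here, and messages are plain strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: asynchronous sending layered on a synchronous sender.
class AsyncOverSyncSketch {
    /** Stand-in for the synchronous send path (assumption, not a Kafka type). */
    interface SyncSender {
        void send(String message);
    }

    // Sentinel object, compared by identity, that tells the send thread to stop.
    private static final String SHUTDOWN = new String("shutdown");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread sendThread;

    AsyncOverSyncSketch(SyncSender sender) {
        // The background thread drains the queue and calls the synchronous sender.
        sendThread = new Thread(() -> {
            try {
                while (true) {
                    String message = queue.take();
                    if (message == SHUTDOWN) {
                        break;
                    }
                    sender.send(message);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "producer-send-thread");
        sendThread.start();
    }

    /** Enqueues the message and returns without waiting for delivery. */
    void sendAsync(String message) throws InterruptedException {
        queue.put(message);
    }

    /** Lets queued messages drain, then stops the send thread. */
    void close() throws InterruptedException {
        queue.put(SHUTDOWN);
        sendThread.join();
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> delivered = Collections.synchronizedList(new ArrayList<>());
        AsyncOverSyncSketch producer = new AsyncOverSyncSketch(delivered::add);
        producer.sendAsync("a");
        producer.sendAsync("b");
        producer.close();
        System.out.println(delivered); // prints [a, b]
    }
}
```

Because a single thread drains a FIFO queue, messages reach the synchronous sender in the order they were enqueued.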
Producer delegates synchronous sending to EventHandler. EventHandler is an interface; its concrete implementation is DefaultEventHandler. Their simplified class diagram is as follows:
As you can see, the Producer members ProducerSendThread and queue exist to support asynchronous sending, while EventHandler performs synchronous sending, which asynchronous sending of course also relies on. KeyedMessage encapsulates a message sent by the user. Seq is Scala's sequence type and can be thought of as a List in Java.
The simplified class diagram for KeyedMessage is as follows:
Class DefaultEventHandler
DefaultEventHandler is the only implementation of the EventHandler interface. As the previous section shows, Producer hands messages to EventHandler as KeyedMessage objects. Let's look at what has to happen before a KeyedMessage can be sent to a broker.
Serialization
The key and value in a KeyedMessage are of user-specified types, but messages travel to the broker as binary streams, so the user-defined data must first be converted into bytes. This is the job of serializer.class, which you configure when initializing the producer. In addition, multiple messages are grouped into a MessageSet and compressed with the compression method the user specified.
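To make the serialization step concrete, here is a minimal sketch of an encoder that turns a user-defined type into bytes. The Encoder interface, the User type, and the byte layout (a 4-byte big-endian id followed by the UTF-8 name) are assumptions made for illustration and are not taken from the Kafka source.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of turning user-defined values into byte arrays.
class SerializationSketch {
    /** Illustrative encoder contract (assumption): one value in, bytes out. */
    interface Encoder<T> {
        byte[] toBytes(T value);
    }

    /** Encodes a string as its UTF-8 bytes. */
    static final class StringEncoder implements Encoder<String> {
        @Override
        public byte[] toBytes(String value) {
            return value.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** A made-up user type standing in for "user-defined data". */
    static final class User {
        final int id;
        final String name;

        User(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Encodes a User as a 4-byte big-endian id followed by the UTF-8 name. */
    static final class UserEncoder implements Encoder<User> {
        @Override
        public byte[] toBytes(User user) {
            byte[] name = user.name.getBytes(StandardCharsets.UTF_8);
            return ByteBuffer.allocate(4 + name.length)
                    .putInt(user.id)
                    .put(name)
                    .array();
        }
    }

    public static void main(String[] args) {
        // prints [104, 105]
        System.out.println(Arrays.toString(new StringEncoder().toBytes("hi")));
        // prints [0, 0, 0, 7, 97, 108]
        System.out.println(Arrays.toString(new UserEncoder().toBytes(new User(7, "al"))));
    }
}
```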
Find the corresponding broker
A KeyedMessage specifies the topic of the message. A topic can have multiple partitions, and each partition has multiple replicas hosted on different brokers, one of which is the leader. Only the leader broker serves client read and write requests.
Therefore, before sending a KeyedMessage to a broker, the handler must find the leader broker for that message, in the following steps:
- Find all the partitions of the topic.
- Determine which partition the KeyedMessage should be sent to. The partitioner.class configured when initializing the producer makes this decision.
- Find the leader broker of that partition.
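The three steps above can be sketched as a single lookup function. Everything here is hypothetical: PartitionInfo, Partitioner, findLeader, and the hash-modulo partitioner are illustrative stand-ins, the metadata is a plain in-memory map, and the partition list is assumed to be ordered by partition id.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of "topic -> partition -> leader broker" resolution.
class LeaderLookupSketch {
    /** One partition of a topic and the broker currently leading it. */
    static final class PartitionInfo {
        final int partitionId;
        final int leaderBrokerId; // -1 means no leader is currently known

        PartitionInfo(int partitionId, int leaderBrokerId) {
            this.partitionId = partitionId;
            this.leaderBrokerId = leaderBrokerId;
        }
    }

    /** Illustrative partitioner contract (assumption). */
    interface Partitioner {
        int partition(Object key, int numPartitions);
    }

    /** Hash-modulo partitioner; floorMod keeps the result non-negative. */
    static final Partitioner HASH_PARTITIONER =
            (key, numPartitions) -> Math.floorMod(key.hashCode(), numPartitions);

    static int findLeader(Map<String, List<PartitionInfo>> metadata,
                          String topic, Object key, Partitioner partitioner) {
        // Step 1: find all partitions of the topic.
        List<PartitionInfo> partitions = metadata.get(topic);
        if (partitions == null || partitions.isEmpty()) {
            throw new IllegalStateException("no partitions known for topic " + topic);
        }
        // Step 2: let the partitioner pick a partition for this key.
        int index = partitioner.partition(key, partitions.size());
        // Step 3: find the leader broker of the chosen partition.
        PartitionInfo chosen = partitions.get(index);
        if (chosen.leaderBrokerId < 0) {
            throw new IllegalStateException("no leader for partition " + chosen.partitionId);
        }
        return chosen.leaderBrokerId;
    }

    public static void main(String[] args) {
        Map<String, List<PartitionInfo>> metadata = new HashMap<>();
        metadata.put("orders", Arrays.asList(
                new PartitionInfo(0, 100),
                new PartitionInfo(1, 101),
                new PartitionInfo(2, 102)));
        // Integer key 5 hashes to 5, and 5 mod 3 = 2, so partition 2 is chosen.
        System.out.println(findLeader(metadata, "orders", 5, HASH_PARTITIONER)); // prints 102
    }
}
```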
Finally, DefaultEventHandler wraps the serialized messages in a ProducerRequest. It contains no logic of its own for sending a ProducerRequest to the broker; instead it hands the request to SyncProducer, which carries out the rest of the send.
ProducerPool, SyncProducer and BlockingChannel
Together they carry out the final task of sending the data. First, look at their class diagram:
ProducerPool holds a HashMap whose key is the broker ID and whose value is the SyncProducer connected to that broker. A more accurate name for ProducerPool would therefore be SyncProducerPool.
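The pool is essentially a lookup table from broker ID to connection object. A minimal sketch, assuming a stub in place of the real SyncProducer and hypothetical method names (register, getProducer) chosen for this illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a pool keyed by broker ID.
class ProducerPoolSketch {
    /** Stub standing in for a per-broker synchronous producer (assumption). */
    static final class SyncProducerStub {
        final String host;
        final int port;

        SyncProducerStub(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    // Key: broker ID. Value: the producer connected to that broker.
    private final Map<Integer, SyncProducerStub> syncProducers = new HashMap<>();

    /** Adds or replaces the producer for a broker. */
    void register(int brokerId, String host, int port) {
        syncProducers.put(brokerId, new SyncProducerStub(host, port));
    }

    /** Returns the producer for a broker, failing loudly if none is registered. */
    SyncProducerStub getProducer(int brokerId) {
        SyncProducerStub producer = syncProducers.get(brokerId);
        if (producer == null) {
            throw new IllegalArgumentException("no sync producer for broker " + brokerId);
        }
        return producer;
    }

    public static void main(String[] args) {
        ProducerPoolSketch pool = new ProducerPoolSketch();
        pool.register(100, "broker-a.example", 9092);
        System.out.println(pool.getProducer(100).host); // prints broker-a.example
    }
}
```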
BlockingChannel can be viewed as a socket client. Its two key member variables are the host name and the port number. Its connect method opens a socket to that machine, and its send method sends a RequestOrResponse; this is where the data is actually sent.
SyncProducer provides two send methods, one for ProducerRequest and one for TopicMetadataRequest. Internally it calls BlockingChannel to send the data.
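As a rough illustration of "a socket client with a host, a port, a connect method and a send method", here is a self-contained sketch. It is not the Kafka class: BlockingChannelSketch is a hypothetical name, the payload is a raw byte array rather than a RequestOrResponse, and the 4-byte length prefix written before each payload is an assumption of this sketch. The main method talks to a throwaway server socket on the loopback interface so the example runs without a broker.

```java
import java.io.BufferedOutputStream;
import java.io.Closeable;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a blocking socket client that sends framed payloads.
class BlockingChannelSketch implements Closeable {
    private final String host;
    private final int port;
    private Socket socket;
    private DataOutputStream out;

    BlockingChannelSketch(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Opens a blocking TCP connection to host:port. */
    void connect() throws IOException {
        socket = new Socket(host, port);
        out = new DataOutputStream(new BufferedOutputStream(socket.getOutputStream()));
    }

    /** Writes a 4-byte length prefix followed by the payload, then flushes. */
    void send(byte[] payload) throws IOException {
        if (out == null) {
            throw new IllegalStateException("not connected");
        }
        out.writeInt(payload.length);
        out.write(payload);
        out.flush();
    }

    @Override
    public void close() throws IOException {
        if (socket != null) {
            socket.close();
        }
        socket = null;
        out = null;
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the OS for any free port; the server is local and temporary.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             BlockingChannelSketch channel = new BlockingChannelSketch(
                     server.getInetAddress().getHostAddress(), server.getLocalPort())) {
            channel.connect();
            channel.send("hello".getBytes(StandardCharsets.UTF_8));
            try (Socket peer = server.accept();
                 DataInputStream in = new DataInputStream(peer.getInputStream())) {
                byte[] received = new byte[in.readInt()];
                in.readFully(received);
                System.out.println(new String(received, StandardCharsets.UTF_8)); // prints hello
            }
        }
    }
}
```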
Summary
The simplified sequence diagram of Producer sending data is as follows:
As you can see, each class has a clear responsibility. BlockingChannel sends the data at the lowest level. SyncProducer sends a request to a designated broker. DefaultEventHandler converts the data and chooses the right broker. Producer, the class that clients use directly, offers both synchronous and asynchronous sending.