Recently I have been researching producer load-balancing strategies. I implemented round-robin partition selection in the librdkafka code, but in field testing its load balancing did not work, so I went looking for the cause. The following article describes Kafka's producer processing logic; it is reproduced here for study.

Apache Kafka series: producer processing logic
2014-05-23
Copyright notice: this is an original article by the blog author; do not reproduce without permission.
Apache Kafka China Community QQ Group: 162272557
Reprinted from a wiki post written by a colleague (Dong Zhong).
Kafka producer Processing Logic
The Kafka producer sends data to Kafka servers; the specific distribution and load-balancing logic is maintained entirely by the producer.
[Figure: Kafka structure diagram]
Kafka producer Default Call logic
Default partition logic
1. Distribution Logic without key
Every topic.metadata.refresh.interval.ms, a partition is selected at random, and all keyless messages within that window are sent to it.
A new partition is also re-selected after a send error.
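The keyless strategy above can be sketched as follows. This is an illustrative model, not Kafka's actual code; the class name, the monotonic-clock bookkeeping, and the interval default are assumptions of this sketch.

```python
import random
import time

class StickyRandomPartitioner:
    """Pick a random partition, stick to it for one refresh window,
    then pick again; also re-pick after a send error."""

    def __init__(self, num_partitions, refresh_interval_s=600.0):
        # topic.metadata.refresh.interval.ms defaults to 10 minutes
        self.num_partitions = num_partitions
        self.refresh_interval_s = refresh_interval_s
        self.current = None       # currently chosen partition
        self.chosen_at = 0.0      # when it was chosen

    def partition(self, now=None):
        now = time.monotonic() if now is None else now
        if self.current is None or now - self.chosen_at >= self.refresh_interval_s:
            # window expired (or error occurred): choose a new partition
            self.current = random.randrange(self.num_partitions)
            self.chosen_at = now
        return self.current

    def on_send_error(self):
        # force re-selection on the next send, as described above
        self.current = None
```

Note the consequence the author observed in the field: within one refresh window, all keyless traffic lands on a single partition, so load only balances out over many windows.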
2. Distribution according to Key
Hash the key, then take the result modulo the number of partitions:
Utils.abs(key.hashCode) % numPartitions
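A sketch of that keyed rule in Python. Kafka's Utils.abs masks off the sign bit (n & 0x7fffffff) rather than using Math.abs, so that Integer.MIN_VALUE cannot produce a negative index; Python's built-in hash() stands in for Java's hashCode here and is an assumption of this sketch (the two produce different values for the same key).

```python
def partition_for_key(key, num_partitions):
    # Mask the sign bit, as Kafka's Utils.abs does, then take the modulo.
    return (hash(key) & 0x7FFFFFFF) % num_partitions
```

Because the mapping is a pure function of the key, all messages with the same key always land in the same partition, which is what gives Kafka per-key ordering.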
How to obtain partition leader information (metadata)
After deciding which partition to send to, the producer must determine which broker is the leader for that partition in order to know where to send the data.
Specific implementation Location
kafka.client.ClientUtils#fetchTopicMetadata
Implementation scenarios
1. Fetch partition metadata from a broker. Because every Kafka broker holds all of the metadata, any broker can return all of it.
2. Broker selection strategy: randomly shuffle the broker list, start with the first broker, and on error move to the next one.
3. Error handling: after an error, request the metadata from the next broker.
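The three steps above can be sketched as a shuffle-and-failover loop. This is a model of the behavior, not ClientUtils' actual code; fetch_fn stands in for the real network request and is an assumption of this sketch.

```python
import random

def fetch_topic_metadata(brokers, fetch_fn):
    """Try brokers in random order; return the first successful response."""
    candidates = list(brokers)
    random.shuffle(candidates)            # randomly sort the broker list
    last_error = None
    for broker in candidates:
        try:
            # any broker can return all of the metadata
            return fetch_fn(broker)
        except Exception as err:
            last_error = err              # on error, try the next broker
    raise RuntimeError("all brokers in the list failed") from last_error
```

The shuffle spreads metadata requests across the configured brokers instead of hammering the first entry in metadata.broker.list.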
Attention
- The producer fetches metadata from brokers and does not talk to ZooKeeper.
- When brokers change, the broker list the producer uses to fetch metadata does not change dynamically.
- That list is determined by metadata.broker.list in the producer configuration; as long as one machine in the list is serving normally, the producer can fetch the metadata.
- Once it has the metadata, the producer may write data to brokers that are not in metadata.broker.list.
Error handling
The producer's send function does not return a value by default; error handling is delegated to an EventHandler implementation.
DefaultEventHandler handles errors as follows:
- Collect the data that failed to send
- Wait for an interval whose length is determined by the retry.backoff.ms configuration
- Re-fetch the metadata
- Resend the data

The number of retries is determined by the message.send.max.retries configuration.
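The retry loop described above can be sketched like this. It is a model of DefaultEventHandler's behavior, not its actual code; send_fn, refresh_metadata_fn, and sleep_fn stand in for the real calls, and the defaults shown (3 retries, 100 ms backoff) are Kafka 0.8's documented defaults.

```python
import time

def send_with_retries(messages, send_fn, refresh_metadata_fn,
                      max_retries=3, backoff_ms=100, sleep_fn=time.sleep):
    """send_fn takes a batch and returns the messages that failed."""
    remaining = list(messages)
    for attempt in range(max_retries + 1):
        remaining = send_fn(remaining)          # collect the failed data
        if not remaining:
            return                              # everything sent
        if attempt < max_retries:
            sleep_fn(backoff_ms / 1000.0)       # wait retry.backoff.ms
            refresh_metadata_fn()               # re-fetch metadata, then resend
    # all retries exhausted: give up, as the Scala snippet below does
    raise RuntimeError("Failed to send messages after %d tries." % max_retries)
```

Re-fetching metadata before each resend matters because the most common send failure is a leader change, which only a fresh metadata fetch can reveal.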
DefaultEventHandler throws an exception when all retries fail. The code is as follows:

    if (outstandingProduceRequests.size > 0) {
      producerStats.failedSendRate.mark()
      val correlationIdEnd = correlationId.get()
      error("Failed to send requests for topics %s with correlation ids in [%d,%d]"
        .format(outstandingProduceRequests.map(_.topic).toSet.mkString(","),
                correlationIdStart, correlationIdEnd - 1))
      throw new FailedToSendMessageException(
        "Failed to send messages after " + config.messageSendMaxRetries + " tries.", null)
    }
Please cite the source when reproducing: http://write.blog.csdn.NET/postedit/26687109