When you write a Kafka producer, you create a KeyedMessage object.
KeyedMessage<K, V> keyedMessage = new KeyedMessage<>(topicName, key, message);
The key can be null. In that case, which partition does Kafka send the message to? According to Kafka's official documentation, the default partitioner picks a partition at random:
The third property "partitioner.class" defines what class to use to determine which Partition in the Topic the message is to be sent to. This is optional, but for any non-trivial implementation you are going to want to implement a partitioning scheme. More about the implementation of this class later. If you include a value for the key but haven't defined a partitioner.class Kafka will use the default partitioner. If the key is null, then the Producer will assign the message to a random Partition.
But the last sentence is quite misleading.
Taken literally, it is not wrong, but "random" here means that a partition is picked at random each time the topic metadata is refreshed (the interval is set by "topic.metadata.refresh.interval.ms"), and that same partition is then used for every null-key message until the next refresh. By default, a new partition may be selected only once every 10 minutes. I believe most programmers, like me, assumed that a partition is picked at random for each message. You can see this in the relevant code:
private def getPartition(topic: String, key: Any, topicPartitionList: Seq[PartitionAndLeader]): Int = {
  val numPartitions = topicPartitionList.size
  if (numPartitions <= 0)
    throw new UnknownTopicOrPartitionException("Topic " + topic + " doesn't exist")
  val partition =
    if (key == null) {
      // If the key is null, we don't really need a partitioner
      // So we look up in the send partition cache for the topic to decide the target partition
      val id = sendPartitionPerTopicCache.get(topic)
      id match {
        case Some(partitionId) =>
          // directly return the partitionId without checking availability of the leader,
          // since we want to postpone the failure until the send operation anyways
          partitionId
        case None =>
          val availablePartitions = topicPartitionList.filter(_.leaderBrokerIdOpt.isDefined)
          if (availablePartitions.isEmpty)
            throw new LeaderNotAvailableException("No leader for any partition in topic " + topic)
          val index = Utils.abs(Random.nextInt) % availablePartitions.size
          val partitionId = availablePartitions(index).partitionId
          sendPartitionPerTopicCache.put(topic, partitionId)
          partitionId
      }
    } else
      partitioner.partition(key, numPartitions)
  if (partition < 0 || partition >= numPartitions)
    throw new UnknownTopicOrPartitionException("Invalid partition id: " + partition + " for topic " + topic +
      "; Valid values are in the inclusive range of [0, " + (numPartitions - 1) + "]")
  trace("Assigning message of topic %s and key %s to a selected partition %d".format(topic, if (key == null) "[none]" else key.toString, partition))
  partition
}
If the key is null, the producer first looks for a cached partition for the topic in sendPartitionPerTopicCache. If one is cached, it is used as is. If not, the producer picks a partition at random from those that currently have a leader, caches it, and uses it.
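The caching behaviour is easier to see in isolation. The following is a minimal Java sketch, not Kafka's actual code: the class StickyPartitionSketch and its methods partitionForNullKey and onMetadataRefresh are hypothetical names, and the metadata refresh is modelled simply as clearing the cache.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical stand-in for the producer's null-key path; not a Kafka class.
class StickyPartitionSketch {
    // One cached partition per topic, as in the Scala code's sendPartitionPerTopicCache.
    private final Map<String, Integer> sendPartitionPerTopicCache = new HashMap<>();
    private final Random random;

    StickyPartitionSketch(long seed) {
        this.random = new Random(seed);
    }

    // Mirrors the key == null branch: reuse the cached partition,
    // otherwise pick one at random and cache it.
    int partitionForNullKey(String topic, List<Integer> availablePartitions) {
        Integer cached = sendPartitionPerTopicCache.get(topic);
        if (cached != null) {
            return cached;
        }
        if (availablePartitions.isEmpty()) {
            throw new IllegalStateException("No leader for any partition in topic " + topic);
        }
        int index = random.nextInt(availablePartitions.size());
        int partitionId = availablePartitions.get(index);
        sendPartitionPerTopicCache.put(topic, partitionId);
        return partitionId;
    }

    // Assumption: a metadata refresh is modelled here as emptying the cache.
    void onMetadataRefresh() {
        sendPartitionPerTopicCache.clear();
    }

    public static void main(String[] args) {
        StickyPartitionSketch producer = new StickyPartitionSketch(42L);
        List<Integer> partitions = Arrays.asList(0, 1, 2, 3);
        // Every message before the refresh lands on the same partition.
        for (int i = 0; i < 5; i++) {
            System.out.println("message " + i + " -> partition "
                + producer.partitionForNullKey("events", partitions));
        }
        producer.onMetadataRefresh();
        // Only now can a different partition be chosen.
        System.out.println("after refresh -> partition "
            + producer.partitionForNullKey("events", partitions));
    }
}
```

Running it prints the same partition for all five messages. A different partition can only appear after onMetadataRefresh is called, and even then the random pick may land on the same one again.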
LinkedIn engineer Guozhang Wang explained this behaviour on the mailing list.
Originally, Kafka picked a random partition for every message, as most users expect. It was later changed to pick a partition periodically, in order to reduce the number of sockets on the server side. He acknowledged that this is misleading, and reportedly version 0.8.2 goes back to picking a random partition per message. However, I don't see any such change in the 0.8.2 code.
So, if possible, set a key on your KeyedMessage.
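To illustrate why a key helps, here is a small Java sketch of a hash-based partitioner. It is an assumption for illustration only: the class KeyedPartitionSketch and its partition method are hypothetical, and the hashing scheme shown (hashCode modulo the partition count) is not claimed to match Kafka's default partitioner exactly.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical hash partitioner for illustration; not Kafka's own implementation.
class KeyedPartitionSketch {
    // Maps a non-null key to a partition in [0, numPartitions).
    static int partition(Object key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // Mask the sign bit so a negative hashCode cannot yield a negative index.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 4;
        List<String> keys = Arrays.asList("user-1", "user-2", "user-3", "user-1");
        for (String key : keys) {
            System.out.println(key + " -> partition " + partition(key, numPartitions));
        }
    }
}
```

With a scheme like this, the partition depends only on the key, so messages with different keys are spread across partitions immediately rather than once per refresh interval, and messages with the same key always go to the same partition.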