When the key is null, which partition does Kafka send the message to?

Source: Internet
Author: User

When you write a Kafka producer, you create a KeyedMessage object.

KeyedMessage<K, V> keyedMessage = new KeyedMessage<>(topicName, key, message);

The key here can be null. In that case, which partition does Kafka send the message to? According to Kafka's official documentation, the default partitioner picks a partition at random:

"The third property "partitioner.class" defines what class to use to determine which Partition in the Topic the message is to be sent to. This is optional, but for any non-trivial implementation you are going to want to implement a partitioning scheme. More about the implementation of this class later. If you include a value for the key but haven't defined a partitioner.class Kafka will use the default partitioner. If the key is null, then the Producer will assign the message to a random Partition."

But the last sentence is quite misleading.

Read literally, it is not wrong, but the randomness works differently from what most people expect: the producer picks a random partition only after the metadata is refreshed (controlled by the parameter "topic.metadata.refresh.interval.ms") and then keeps using that same partition until the next refresh. By default, a new partition may be selected every 10 minutes. I believe most programmers, like me, assumed that a partition is chosen at random for each message. You can see this in the relevant code:

    private def getPartition(topic: String, key: Any, topicPartitionList: Seq[PartitionAndLeader]): Int = {
      val numPartitions = topicPartitionList.size
      if (numPartitions <= 0)
        throw new UnknownTopicOrPartitionException("Topic " + topic + " doesn't exist")
      val partition =
        if (key == null) {
          // If the key is null, we don't really need a partitioner
          // So we look up in the send partition cache for the topic to decide the target partition
          val id = sendPartitionPerTopicCache.get(topic)
          id match {
            case Some(partitionId) =>
              // directly return the partitionId without checking availability of the leader,
              // since we want to postpone the failure until the send operation anyways
              partitionId
            case None =>
              val availablePartitions = topicPartitionList.filter(_.leaderBrokerIdOpt.isDefined)
              if (availablePartitions.isEmpty)
                throw new LeaderNotAvailableException("No leader for any partition in topic " + topic)
              val index = Utils.abs(Random.nextInt) % availablePartitions.size
              val partitionId = availablePartitions(index).partitionId
              sendPartitionPerTopicCache.put(topic, partitionId)
              partitionId
          }
        } else
          partitioner.partition(key, numPartitions)
      if (partition < 0 || partition >= numPartitions)
        throw new UnknownTopicOrPartitionException("Invalid partition id: " + partition + " for topic " + topic +
          "; Valid values are in the inclusive range of [0, " + (numPartitions - 1) + "]")
      trace("Assigning message of topic %s and key %s to a selected partition %d".format(topic, if (key == null) "[None]" else key.toString, partition))
      partition
    }

If the key is null, the producer first looks up the topic in sendPartitionPerTopicCache. If a partition is cached, it is used directly; otherwise a partition is chosen at random from those that currently have a leader, and that choice is stored in the cache.
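To make the caching behavior concrete, here is a minimal Java sketch that models the null-key path described above. It is not Kafka's code: the class StickyNullKeyPartitioner and its methods are hypothetical names, and onMetadataRefresh() assumes, as the article describes, that the cache is emptied when the metadata is refreshed.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical model of the null-key path; not Kafka's actual class. */
final class StickyNullKeyPartitioner {
    // Plays the role of sendPartitionPerTopicCache: topic -> chosen partition.
    private final Map<String, Integer> sendPartitionPerTopicCache = new HashMap<>();
    private final Random random;

    StickyNullKeyPartitioner(Random random) {
        this.random = random;
    }

    /** Returns the cached partition for the topic, choosing one at random on a cache miss. */
    int partitionForNullKey(String topic, List<Integer> availablePartitions) {
        Integer cached = sendPartitionPerTopicCache.get(topic);
        if (cached != null) {
            // Cache hit: reuse the partition without checking leader availability.
            return cached;
        }
        if (availablePartitions.isEmpty()) {
            throw new IllegalStateException("No leader for any partition in topic " + topic);
        }
        // Cache miss: pick a random available partition and remember it.
        int chosen = availablePartitions.get(random.nextInt(availablePartitions.size()));
        sendPartitionPerTopicCache.put(topic, chosen);
        return chosen;
    }

    /** Assumption: models the periodic metadata refresh, which empties the cache. */
    void onMetadataRefresh() {
        sendPartitionPerTopicCache.clear();
    }
}
```

In this model, every call to partitionForNullKey for the same topic returns the same partition until onMetadataRefresh() is called, so the choice is random only once per refresh interval and not once per message.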

LinkedIn engineer Guozhang Wang explained the issue on the mailing list. Originally, Kafka selected a random partition for every message, as most users assume. It was later changed to select a partition periodically in order to reduce the number of sockets on the server side. That behavior is misleading, however, and it was said that version 0.8.2 would change back to random selection for every message. But I don't see any such change in the 0.8.2 code.

So, if possible, set a key on your KeyedMessage.
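Why a key helps can be illustrated with a small sketch. It assumes the partitioner maps a non-null key to a partition by taking a non-negative hash of the key modulo the number of partitions. That is my understanding of the default behavior in this producer generation, not something the code above shows, and the class and method names here are hypothetical.

```java
/** Hypothetical sketch of hash-based partitioning for non-null keys. */
final class KeyedPartitionSketch {
    private KeyedPartitionSketch() {
    }

    /** Maps a non-null key to a partition in the range [0, numPartitions - 1]. */
    static int partitionFor(Object key, int numPartitions) {
        if (key == null) {
            throw new IllegalArgumentException("key must not be null");
        }
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // Clear the sign bit so the hash is non-negative, then take the modulo.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }
}
```

Under this assumption, messages with the same key always go to the same partition, while different keys spread across partitions according to their hash, so the distribution no longer depends on when the metadata was last refreshed.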

