Kafka provides a large number of configuration parameters for brokers, producers, and consumers. Understanding these parameters is essential for using Kafka well.
This article lists some of the important configuration parameters.
The official configuration documentation is dated: many parameters have changed and some have been renamed. I made corrections against the 0.8.2 code while compiling this list.
Broker Configuration Parameters
The following table lists the important configuration parameters for the broker. For more configuration options, refer to kafka.server.KafkaConfig.
| Name | Default Value | Description |
|------|---------------|-------------|
| broker.id | none | Each broker has a unique id that serves as its name. This allows a broker to be moved to a different host/port while consumers still recognize it. |
| enable.zookeeper | true | Whether to register with ZooKeeper. |
| log.flush.interval.messages | Long.MaxValue | The maximum number of messages accumulated before the data is flushed to disk and becomes available to consumers. |
| log.flush.interval.ms | Long.MaxValue | The maximum time a message can wait before being flushed to disk. |
| log.flush.scheduler.interval.ms | Long.MaxValue | The interval at which the broker checks whether data needs to be flushed to disk. |
| log.retention.hours | 168 | How long a log is retained, in hours. |
| log.retention.bytes | -1 | The maximum size of a log before old segments are deleted. |
| log.cleaner.enable | false | Whether log cleaning is enabled. |
| log.cleanup.policy | delete | Either delete or compact. Related tuning parameters include log.cleaner.threads, log.cleaner.io.max.bytes.per.second, log.cleaner.dedupe.buffer.size, log.cleaner.io.buffer.size, log.cleaner.io.buffer.load.factor, log.cleaner.backoff.ms, log.cleaner.min.cleanable.ratio and log.cleaner.delete.retention.ms. |
| log.dir | /tmp/kafka-logs | The root directory for log files. |
| log.segment.bytes | 1 * 1024 * 1024 * 1024 | The size of a single log segment file. |
| log.roll.hours | 24 * 7 | The maximum time before a new log segment is rolled out. |
| message.max.bytes | 1000000 + MessageSet.LogOverhead | The maximum number of bytes in a message the broker will accept. |
| num.network.threads | 3 | The number of threads handling network requests. |
| num.io.threads | 8 | The number of threads handling I/O. |
| background.threads | 10 | The number of background processing threads. |
| num.partitions | 1 | The default number of partitions per topic. |
| socket.send.buffer.bytes | 102400 | The socket SO_SNDBUF size. |
| socket.receive.buffer.bytes | 102400 | The socket SO_RCVBUF size. |
| zookeeper.connect | localhost:2182/kafka | The ZooKeeper connection string, in the form hostname:port/chroot, where the chroot is a namespace. |
| zookeeper.connection.timeout.ms | 6000 | The maximum time a client waits when establishing a connection to ZooKeeper. |
| zookeeper.session.timeout.ms | 6000 | The session timeout for the ZooKeeper connection. |
| zookeeper.sync.time.ms | 2000 | The longest a ZooKeeper follower may lag behind the ZooKeeper leader. |
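To make the table concrete, here is a minimal sketch of starting an embedded broker from a handful of these parameters with the 0.8.x Scala API. The property values are illustrative only, not recommendations:

```scala
import java.util.Properties
import kafka.server.{KafkaConfig, KafkaServerStartable}

object BrokerExample extends App {
  val props = new Properties
  props.put("broker.id", "0")                            // unique id for this broker
  props.put("log.dir", "/tmp/kafka-logs")                // root directory for log files
  props.put("zookeeper.connect", "localhost:2181/kafka") // hostname:port/chroot
  props.put("num.partitions", "2")                       // default partitions per topic
  props.put("log.retention.hours", "168")                // keep logs for one week

  // KafkaConfig validates the properties; KafkaServerStartable boots the broker.
  val server = new KafkaServerStartable(new KafkaConfig(props))
  server.startup()
}
```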
High-level Consumer Configuration Parameters
The following table lists the important configuration parameters for the high-level consumer. For more configuration options, refer to kafka.consumer.ConsumerConfig.
| Name | Default Value | Description |
|------|---------------|-------------|
| group.id | groupid | A string that identifies the group of consumer processes this consumer belongs to. |
| socket.timeout.ms | 30000 | The socket timeout. |
| socket.buffersize | 64 * 1024 | The socket receive buffer size. |
| fetch.size | 300 * 1024 | The number of bytes of messages fetched in one request. In 0.8.x this parameter is replaced by fetch.message.max.bytes and fetch.min.bytes. |
| backoff.increment.ms | 1000 | Avoids repeatedly polling a broker that has no new data: if a fetch returns empty, the next fetch is postponed by this amount. |
| queued.max.message.chunks | 2 | The consumer caches fetched message chunks in an internal queue; this value controls the size of that queue. |
| auto.commit.enable | true | If true, the consumer periodically writes the offset of each partition to ZooKeeper. |
| auto.commit.interval.ms | 10000 | How often offsets are written to ZooKeeper. |
| auto.offset.reset | largest | What to do when the offset is out of range. smallest: automatically reset the offset to the smallest offset; largest: automatically reset it to the largest offset. Any other value causes an exception to be thrown. |
| consumer.timeout.ms | -1 | With the default of -1, the consumer blocks indefinitely when no new message is available. With a positive value, a timeout exception is thrown after that interval. |
| rebalance.retries.max | 4 | The maximum number of attempts during a rebalance. |
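As an illustration of how these parameters fit together, here is a minimal sketch of a high-level consumer using the 0.8.x Scala API; the group name and topic are hypothetical:

```scala
import java.util.Properties
import kafka.consumer.{Consumer, ConsumerConfig}

object ConsumerExample extends App {
  val props = new Properties
  props.put("zookeeper.connect", "localhost:2181") // ZK ensemble the consumer registers with
  props.put("group.id", "example-group")           // consumer group name (hypothetical)
  props.put("auto.commit.enable", "true")          // periodically write offsets to ZooKeeper
  props.put("auto.offset.reset", "largest")        // start from the newest offset when out of range

  val connector = Consumer.create(new ConsumerConfig(props))
  // One stream for the topic; iterating a stream blocks until messages arrive
  // (or throws after consumer.timeout.ms if that is set to a positive value).
  val streams = connector.createMessageStreams(Map("example-topic" -> 1))
  for (stream <- streams("example-topic"); msg <- stream)
    println(new String(msg.message))
}
```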
Producer Configuration Parameters
The following table lists the important configuration parameters for the producer. For more configuration options, refer to kafka.producer.ProducerConfig.
| Name | Default Value | Description |
|------|---------------|-------------|
| serializer.class | kafka.serializer.DefaultEncoder | Must implement the kafka.serializer.Encoder interface, which encodes an object of type T into a Kafka message. |
| key.serializer.class | serializer.class | The serializer class for the key object; defaults to the value of serializer.class. |
| partitioner.class | kafka.producer.DefaultPartitioner | Must implement kafka.producer.Partitioner; provides the key-based partitioning strategy. |
| producer.type | sync | Whether messages are sent synchronously or asynchronously. Asynchronous (async) sending batches messages with kafka.producer.async.AsyncProducer; synchronous (sync) sending uses kafka.producer.SyncProducer. |
| metadata.broker.list | none | Static broker and partition information, in the form host1:port1,host2:port2. The list may be a subset of all brokers. |
| compression.codec | NoCompressionCodec | The message compression codec; no compression by default. |
| compressed.topics | null | When compression is enabled, restricts compression to the listed topics; if the list is empty, all topics are compressed. |
| message.send.max.retries | 3 | The maximum number of attempts to send a message. |
| retry.backoff.ms | 300 | The extra time to wait before each retry. |
| topic.metadata.refresh.interval.ms | 600000 | The interval at which metadata is refreshed periodically. The producer also fetches metadata proactively when a partition is lost or its leader becomes unavailable. If 0, metadata is fetched after every message sent, which is not recommended. If negative, metadata is fetched only on failure. |
| queue.buffering.max.ms | 5000 | The maximum time data is buffered in the producer queue. Only for async. |
| queue.buffering.max.messages | 10000 | The maximum number of messages the producer buffers. Only for async. |
| queue.enqueue.timeout.ms | -1 | Behavior when the queue is full: 0 drops the message immediately, a negative value blocks indefinitely, and a positive value blocks for that many milliseconds. Only for async. |
| batch.num.messages | 200 | The number of messages sent in one batch. Only for async. |
| request.required.acks | 0 | 0: the producer does not wait for acknowledgment from the leader. 1: the leader acknowledges after writing the message to its local log. -1: acknowledgment happens only after all replicas have received the message. Only for sync. |
| request.timeout.ms | 10000 | The acknowledgment timeout. |
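To show how these parameters are used in practice, here is a minimal sketch of a synchronous producer with the 0.8.x Scala API; the broker list, topic, and key are placeholders:

```scala
import java.util.Properties
import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

object ProducerExample extends App {
  val props = new Properties
  props.put("metadata.broker.list", "host1:9092,host2:9092")      // a subset of brokers is enough
  props.put("serializer.class", "kafka.serializer.StringEncoder") // encode String payloads
  props.put("producer.type", "sync")                              // synchronous send via SyncProducer
  props.put("request.required.acks", "1")                         // leader must write to its local log

  val producer = new Producer[String, String](new ProducerConfig(props))
  // The key ("user-42" here) drives partitioner.class; the last argument is the payload.
  producer.send(new KeyedMessage[String, String]("example-topic", "user-42", "hello"))
  producer.close()
}
```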
kafka.serializer.DefaultEncoder
This default encoder actually does no processing: it returns whatever byte[] it receives:

```scala
class DefaultEncoder(props: VerifiableProperties = null) extends Encoder[Array[Byte]] {
  override def toBytes(value: Array[Byte]): Array[Byte] = value
}
```
NullEncoder
Returns null regardless of what it receives:

```scala
class NullEncoder[T](props: VerifiableProperties = null) extends Encoder[T] {
  override def toBytes(value: T): Array[Byte] = null
}
```
StringEncoder
Encodes a string, using UTF-8 by default:

```scala
class StringEncoder(props: VerifiableProperties = null) extends Encoder[String] {
  val encoding =
    if (props == null)
      "UTF8"
    else
      props.getString("serializer.encoding", "UTF8")

  override def toBytes(s: String): Array[Byte] =
    if (s == null)
      null
    else
      s.getBytes(encoding)
}
```
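Writing your own encoder means implementing the same one-method interface. As a sketch, a hypothetical IntEncoder (not part of Kafka) could serialize Int payloads like this:

```scala
import java.nio.ByteBuffer
import kafka.serializer.Encoder
import kafka.utils.VerifiableProperties

// Hypothetical example: encode an Int as 4 big-endian bytes.
class IntEncoder(props: VerifiableProperties = null) extends Encoder[Int] {
  override def toBytes(value: Int): Array[Byte] =
    ByteBuffer.allocate(4).putInt(value).array()
}
```

To use it, set serializer.class to the encoder's fully qualified class name in the producer properties.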
kafka.producer.DefaultPartitioner
The default partitioner is DefaultPartitioner, which derives the partition from the hashcode of the key and the number of partitions:

```scala
class DefaultPartitioner(props: VerifiableProperties = null) extends Partitioner {
  private val random = new java.util.Random

  def partition(key: Any, numPartitions: Int): Int = {
    Utils.abs(key.hashCode) % numPartitions
  }
}
```
But which partition does a message go to when its key is null? For a certain period of time it is sent to one particular partition; after that period, another partition is chosen at random (see "Which partition will Kafka send a message to when the key is null?"). It is therefore recommended that you always specify a key when sending Kafka messages, so that messages are distributed evenly across partitions.
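As a sketch of the difference (topic and key are hypothetical):

```scala
import kafka.producer.KeyedMessage

// With an explicit key, DefaultPartitioner computes
// Utils.abs(key.hashCode) % numPartitions, so the same key always
// lands in the same partition.
val keyed = new KeyedMessage[String, String]("events", "user-42", "payload")

// Without a key, the 0.8.x producer sticks with one randomly chosen
// partition for a period (roughly the metadata refresh interval),
// then may switch to another.
val unkeyed = new KeyedMessage[String, String]("events", "payload")
```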