Kafka configuration optimization mostly comes down to tuning parameter values in the server.properties file.
1. Network and I/O thread configuration
# Maximum number of threads the broker uses to process network requests
num.network.threads=xxx
# Number of threads the broker uses for disk I/O
num.io.threads=xxx
Recommended configuration:
num.network.threads mainly handles network I/O, reading and writing buffered data, with essentially no I/O wait, so set it to the number of CPU cores plus 1.
num.io.threads mainly handles disk I/O and may see some I/O wait at peak times, so it should be set higher: twice the number of CPU cores, and no more than three times.
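The sizing rule above can be sketched in a few lines of Python. This only illustrates the arithmetic; the function name recommended_threads and the io_factor parameter are assumptions made for this example and are not part of Kafka.

```python
import os


def recommended_threads(cpu_cores, io_factor=2):
    """Return (num.network.threads, num.io.threads) for a given core count.

    io_factor is the multiplier for I/O threads. The rule above suggests 2
    with 3 as the upper bound, so the value is clamped to the range 1..3.
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    io_factor = min(max(io_factor, 1), 3)
    network_threads = cpu_cores + 1
    io_threads = cpu_cores * io_factor
    return network_threads, io_threads


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    network, io = recommended_threads(cores)
    print(f"num.network.threads={network}")
    print(f"num.io.threads={io}")
```

For example, on an 8-core broker this rule gives 9 network threads and 16 I/O threads.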
2. Log flush policy
To significantly increase producer write throughput, data should be flushed to disk periodically and in batches.
Recommended configuration:
# Flush data to disk every time producers have written 10,000 messages
log.flush.interval.messages=10000
# Flush data to disk every 1 second
log.flush.interval.ms=1000
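To make the interaction between the two settings concrete, here is a minimal sketch of the "whichever threshold is reached first" behaviour. It models the semantics only and is not Kafka's actual implementation; should_flush and its parameters are hypothetical names.

```python
FLUSH_INTERVAL_MESSAGES = 10000  # mirrors log.flush.interval.messages
FLUSH_INTERVAL_MS = 1000         # mirrors log.flush.interval.ms


def should_flush(unflushed_messages, ms_since_last_flush):
    """Return True when either of the two thresholds has been reached."""
    return (unflushed_messages >= FLUSH_INTERVAL_MESSAGES
            or ms_since_last_flush >= FLUSH_INTERVAL_MS)
```

With these values a busy partition is flushed by message count, while a quiet one is still flushed about once per second.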
3. Log retention policy
When a Kafka broker receives a large volume of messages, it generates many data files that take up a lot of disk space. If they are not cleaned up in time, the disk may run out of space. By default Kafka retains data for 7 days.
Recommended configuration:
# Retain data for three days; this can be shorter
log.retention.hours=72
# Set the segment file size to 1 GB. This helps reclaim disk space quickly and also speeds up loading when Kafka restarts (if segment files are too small there are many more of them,
# and on startup Kafka scans all data files under the log directory (log.dir) with a single thread)
log.segment.bytes=1073741824
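A rough way to check whether a retention setting fits the available disk is to multiply the write rate by the retention window. The sketch below does that arithmetic. The function names, the example write rate and the replication factor are assumptions chosen for illustration, not measurements, and the estimate ignores compression and index files.

```python
import math

SEGMENT_BYTES = 1073741824  # mirrors log.segment.bytes (1 GB)


def retained_bytes(write_bytes_per_sec, retention_hours, replication_factor=1):
    """Approximate cluster-wide bytes kept on disk for the retention window."""
    return write_bytes_per_sec * retention_hours * 3600 * replication_factor


def segment_count(total_bytes, segment_bytes=SEGMENT_BYTES):
    """Lower bound on the number of segment files needed for total_bytes."""
    return math.ceil(total_bytes / segment_bytes)


if __name__ == "__main__":
    # Hypothetical workload: 10 MB/s of producer traffic, 3 replicas, 72 hours.
    total = retained_bytes(10 * 1024 * 1024, 72, replication_factor=3)
    print(f"approx. disk needed: {total / 1024 ** 3:.1f} GiB")
    print(f"approx. segment files: {segment_count(total)}")
```

Because segments are created per partition, the real file count is usually higher than this lower bound.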
4. Replica replication configuration
Each follower pulls messages from the leader to synchronize data. Follower synchronization performance is determined by these parameters: the number of fetcher threads (num.replica.fetchers), the minimum fetch size in bytes (replica.fetch.min.bytes), the maximum fetch size in bytes (replica.fetch.max.bytes), and the maximum wait time (replica.fetch.wait.max.ms).
Recommended configuration:
num.replica.fetchers: increasing this value raises follower I/O concurrency, but the leader then has to serve more requests per unit of time and its load rises accordingly, so weigh it against the machine's hardware resources.
replica.fetch.min.bytes=1: the default is 1 byte. Keep it, because a larger value means messages are not replicated in a timely manner.
replica.fetch.max.bytes=5242880 (5 * 1024 * 1024): the default is 1 MB, which is too small. 5 MB is a reasonable value; adjust it to the workload.
replica.fetch.wait.max.ms: controls how often followers fetch. If the frequency is too high, CPU usage soars, because when there is no new data to synchronize the leader accumulates a large number of empty fetch requests. In addition, a bug in the 0.8.2.x releases makes the timer's timeout check CPU-intensive, so this value needs to be balanced.
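The replica settings discussed above can be collected in one place and written out as server.properties lines. The sketch below does this in Python. The values 1 for num.replica.fetchers and 500 for replica.fetch.wait.max.ms are starting points assumed for this example (the text above gives no specific numbers for them), and render_properties is a hypothetical helper.

```python
def render_properties(settings):
    """Format a dict of settings as server.properties lines."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


replica_settings = {
    "num.replica.fetchers": 1,                   # assumed starting point
    "replica.fetch.min.bytes": 1,                # 1 byte, as recommended above
    "replica.fetch.max.bytes": 5 * 1024 * 1024,  # 5 MB, as recommended above
    "replica.fetch.wait.max.ms": 500,            # assumed starting point
}

if __name__ == "__main__":
    print(render_properties(replica_settings))
```

Keeping the values in one structure makes it easier to compare settings between brokers before editing the file by hand.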
5. Configuring the JMX service
The Kafka server does not open a JMX port by default; the user has to configure one.
[kafka_2.10-0.8.1]$ vim bin/kafka-run-class.sh
# Add the following line at the very top
JMX_PORT=8060
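After restarting the broker, one quick way to confirm that the JMX port is listening is a plain TCP connection test. The sketch below assumes the broker runs on localhost and uses port 8060 from the setting above. is_port_open is a hypothetical helper, and a successful connection only shows that something is listening on the port, not that it is a JMX endpoint.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("JMX port reachable:", is_port_open("localhost", 8060))
```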