Build a Kafka cluster environment in Docker containers
Kafka relies on ZooKeeper for cluster management and state storage, so a ZooKeeper cluster must be set up first.
Zookeeper Cluster Construction
I. Software Environment
A zookeeper cluster needs more than half of its nodes alive in order to keep serving clients, so the number of servers should be odd, i.e. 2*N + 1. Here three nodes are used to build the zookeeper cluster.
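The majority rule above is simple arithmetic; a quick sketch (the node count 3 matches this setup):

```shell
# Majority quorum for a zookeeper ensemble of n nodes:
# more than half must be alive, i.e. floor(n/2) + 1 servers.
n=3                          # ensemble size used in this article (2*N+1 with N=1)
quorum=$(( n / 2 + 1 ))      # nodes that must survive
tolerated=$(( n - quorum ))  # failures the ensemble can absorb
echo "quorum=$quorum tolerated=$tolerated"   # -> quorum=2 tolerated=1
```

With 3 nodes the cluster survives one failure; note that a 4-node ensemble still tolerates only one, which is why odd sizes are preferred.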
1. All three Linux servers are created as Docker containers, with the following IP addresses:
NodeA: 172.17.0.10
NodeB: 172.17.0.11
NodeC: 172.17.0.12
2. The zookeeper Docker image is built from a Dockerfile with the following content:
##################################################################
FROM docker.zifang.com/centos7-base
MAINTAINER chicol "chicol@yeah.net"

# Copy install package files from localhost.
ADD ./zookeeper-3.4.9.tar.gz /opt/

# Create zookeeper data and log directories
RUN mkdir -p /opt/zkcluster/zkconf && \
    mv /opt/zookeeper-3.4.9 /opt/zkcluster/zookeeper && \
    yum install -y java-1.7.0-openjdk*

CMD /usr/sbin/init
##################################################################
3. Build the zookeeper image
[root@localhost zookeeper-3.4.9]# ll
total 22196
-rw-r--r-- 1 root root      361 Feb 8 14:58 Dockerfile
-rw-r--r-- 1 root root 22724574 Feb 4 zookeeper-3.4.9.tar.gz
[root@localhost zookeeper-3.4.9]# docker build -t zookeeper:3.4.9 .
4. Create three containers on docker
# docker run -d -p 12888:2888 -p 13888:3888 --privileged=true -v /home/data/zookeeper/:/opt/zkcluster/zkconf/ --name zkNodeA zookeeper:3.4.9
# docker run -d -p 12889:2889 -p 13889:3889 --privileged=true -v /home/data/zookeeper/:/opt/zkcluster/zkconf/ --name zkNodeB zookeeper:3.4.9
# docker run -d -p 12890:2890 -p 13890:3890 --privileged=true -v /home/data/zookeeper/:/opt/zkcluster/zkconf/ --name zkNodeC zookeeper:3.4.9
II. Modify the zookeeper configuration file
1. Generate zoo.cfg and modify the configuration (run the following steps on each of the three nodes)
cd /opt/zkcluster/zookeeper/
mkdir zkdata zkdatalog
cp conf/zoo_sample.cfg conf/zoo.cfg
vi /opt/zkcluster/zookeeper/conf/zoo.cfg
Modify the following settings in the zoo.cfg file:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zkcluster/zookeeper/zkdata
dataLogDir=/opt/zkcluster/zookeeper/zkdatalog
clientPort=12181
server.1=172.17.0.10:2888:3888
server.2=172.17.0.11:2889:3889
server.3=172.17.0.12:2890:3890
# In server.1, the "1" is the server id (any unique number works); it identifies this server in the cluster and must also be written to the myid file under the snapshot directory (dataDir).
# 172.17.0.x is the server's IP address in the cluster. The first port (default 2888) is used for communication between followers and the leader; the second port (default 3888) is used for leader election, both at cluster startup and when the current leader fails and a new election is held.
2. Create the myid file
NodeA>
# echo "1" > /opt/zkcluster/zookeeper/zkdata/myid
NodeB>
# echo "2" > /opt/zkcluster/zookeeper/zkdata/myid
NodeC>
# echo "3" > /opt/zkcluster/zookeeper/zkdata/myid
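The per-node id assignment above can also be scripted; a sketch that writes the id from a variable (a scratch directory stands in for /opt/zkcluster/zookeeper/zkdata purely for illustration):

```shell
# Write this node's id into the myid file; NODE_ID would be 1, 2 or 3
# depending on the node, matching the server.N entries in zoo.cfg.
ZK_DATA=$(mktemp -d)   # stand-in for /opt/zkcluster/zookeeper/zkdata
NODE_ID=1              # on NodeA; use 2 on NodeB, 3 on NodeC
echo "$NODE_ID" > "$ZK_DATA/myid"
cat "$ZK_DATA/myid"    # -> 1
```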
3. Directory structure
All files of the zookeeper cluster live under /opt/zkcluster.
[root@e18a2b8eefc7 zkcluster]# pwd
/opt/zkcluster
[root@e18a2b8eefc7 zkcluster]# ls
zkconf  zookeeper
zkconf: holds scripts and other files; when starting a container, mount the host directory here with -v.
zookeeper: the zookeeper installation directory
zookeeper contains two manually created directories, zkdata and zkdatalog:
zkdata     # stores snapshot logs
zkdatalog  # stores transaction logs
4. Configuration file explanation
#tickTime:
The basic time unit, in milliseconds, for heartbeats between Zookeeper servers and between clients and servers; a heartbeat is sent every tickTime.
#initLimit:
The maximum number of tick intervals Zookeeper tolerates while a connection is initialized (the "client" here is not a user client but a Follower connecting to the Leader within the cluster). If the server receives no response after that many heartbeats, the connection is considered failed. With the values above the total timeout is 10*2000 = 20 seconds.
#syncLimit:
The maximum number of tick intervals allowed for a request/response exchange between the Leader and a Follower: here 5*2000 = 10 seconds.
#dataDir:
The path where snapshot logs are stored.
#dataLogDir:
The path where transaction logs are stored. If it is not configured, transaction logs go into the dataDir directory by default, which seriously hurts zk performance: under high throughput, too many transaction and snapshot logs accumulate in the same directory.
#clientPort:
The port clients use to connect to the Zookeeper server; Zookeeper listens on it and accepts client requests. Here it is changed from the default 2181 to 12181.
III. Start the zookeeper service
1. Start the service:
# Go to the zookeeper bin directory
cd /opt/zkcluster/zookeeper/bin
# Start the service
./zkServer.sh start
2. Check the service status
# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zkcluster/zookeeper/bin/../conf/zoo.cfg   # configuration file in use
Mode: follower   # this node's role (leader or follower)
3. Stop the zookeeper service
# ./zkServer.sh stop
ZooKeeper JMX enabled by default
Using config: /opt/zkcluster/zookeeper/bin/../conf/zoo.cfg
Usage: ./zkServer.sh {start|start-foreground|stop|restart|status|upgrade|print-cmd}
Build a kafka Cluster
I. Software Environment
1. Create the servers
All three Linux servers are created as Docker containers, with the following IP addresses:
NodeA: 172.17.0.13
NodeB: 172.17.0.14
NodeC: 172.17.0.15
2. The kafka Docker image is also built from a Dockerfile, with the following content:
##################################################################
FROM docker.zifang.com/centos7-base
MAINTAINER chicol "chicol@yeah.net"

# Copy install package files from localhost.
ADD ./kafka_2.11-0.10.1.1.tgz /opt/

# Create kafka and log directories
RUN mkdir -p /opt/kafkacluster/kafkalog && \
    mkdir -p /opt/kafkacluster/kafkaconf && \
    mv /opt/kafka_2.11-0.10.1.1 /opt/kafkacluster/kafka && \
    yum install -y java-1.7.0-openjdk*

CMD /usr/sbin/init
##################################################################
3. Build the kafka image
[root@localhost kafka-2.11]# ll
total 33624
-rw-r--r-- 1 root root      407 Feb 8 :03 Dockerfile
-rw-r--r-- 1 root root 34424602 Feb 4 kafka_2.11-0.10.1.1.tgz
[root@localhost kafka-2.11]# docker build -t kafka:2.11 .
4. Start three containers
# docker run -d -p 19092:9092 -v /home/data/kafka:/opt/kafkacluster/kafkaconf --name kafkaNodeA a1d17a0000676
# docker run -d -p 19093:9093 -v /home/data/kafka:/opt/kafkacluster/kafkaconf --name kafkaNodeB a1d17a0000676
# docker run -d -p 19094:9094 -v /home/data/kafka:/opt/kafkacluster/kafkaconf --name kafkaNodeC a1d17a0000676
II. Modify the kafka configuration file
1. Modify server.properties (run on each of the three servers; note that the IP address and port number must be changed per node)
# cd /opt/kafkacluster/kafka/config
# vi server.properties
broker.id=1
host.name=172.17.0.13
port=9092
log.dirs=/opt/kafkacluster/kafkalog
zookeeper.connect=172.17.0.10:12181,172.17.0.11:12181,172.17.0.12:12181
Add the following three lines to server.properties:
message.max.bytes=5242880
default.replication.factor=2
replica.fetch.max.bytes=5242880
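For example, on NodeB the per-node values would differ roughly as follows (broker id 2, that node's own IP, and port 9093 to match the 19093:9093 mapping; this exact port scheme is an assumption, not spelled out in the original):

```properties
broker.id=2
host.name=172.17.0.14
port=9093
log.dirs=/opt/kafkacluster/kafkalog
zookeeper.connect=172.17.0.10:12181,172.17.0.11:12181,172.17.0.12:12181
```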
2. Configuration file explanation
broker.id=0  # unique id of this machine in the cluster, analogous to zookeeper's myid
port=9092  # default port on which kafka serves clients
host.name=172.17.0.13  # commented out by default; 0.8.1 had a bug with DNS resolution that caused failures, so set it explicitly
num.network.threads=3  # number of threads the broker uses for network processing
num.io.threads=8  # number of threads the broker uses for disk I/O
log.dirs=/opt/kafkacluster/kafkalog/  # directory where messages are stored; can be a comma-separated list of directories, and num.io.threads should not be smaller than the number of directories; with several directories configured, each newly created partition is placed in the directory currently holding the fewest partitions
socket.send.buffer.bytes=102400  # send buffer size; data is not sent immediately but buffered and sent in batches, which improves performance
socket.receive.buffer.bytes=102400  # receive buffer size; data is flushed to disk once a certain amount accumulates
socket.request.max.bytes=104857600  # maximum size of a request to kafka; must not exceed the Java heap size
num.partitions=1  # default number of partitions; one topic defaults to one partition
log.retention.hours=168  # default maximum message retention time: 168 hours, i.e. 7 days
message.max.bytes=5242880  # maximum size of a single message: 5 MB
default.replication.factor=2  # number of replicas kafka keeps of each message; if one replica fails, another can continue serving
replica.fetch.max.bytes=5242880  # maximum number of bytes a replica fetches per request
log.segment.bytes=1073741824  # kafka appends messages to segment files; when a segment exceeds this size, a new file is created
log.retention.check.interval.ms=300000  # every 300000 ms, check the segments against the configured retention (log.retention.hours=168) and delete any expired messages
log.cleaner.enable=false  # whether to enable log compaction; generally left disabled, though enabling it can improve performance
zookeeper.connect=192.168.7.100:12181,192.168.7.101:12181,192.168.7.107:12181  # zookeeper connection string (host:port list)
III. Start the kafka service
1. Start the service
# Start the kafka broker in the background (all three servers need to be started)
# cd /opt/kafkacluster/kafka/
# bin/kafka-server-start.sh -daemon config/server.properties
2. Check the service status
# Run jps to check that the kafka process is up
[root@2edb888df34f config]# jps
9497 Jps
1273 Kafka
3. Stop the kafka service
# bin/kafka-server-stop.sh
4. Cluster Test
...
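One way to flesh out the cluster test is with the stock Kafka 0.10 command-line tools: create a replicated topic, then produce and consume a message. The sketch below only writes these commands into a helper script (the topic name "test" and the choice of broker are assumptions); run the script from /opt/kafkacluster/kafka on any node:

```shell
# Write a smoke-test script; it is not executed here because it needs the live cluster.
cat > /tmp/kafka-smoke-test.sh <<'EOF'
#!/bin/sh
ZK=172.17.0.10:12181,172.17.0.11:12181,172.17.0.12:12181
# 1. Create a topic replicated across 2 brokers
bin/kafka-topics.sh --create --zookeeper "$ZK" --replication-factor 2 --partitions 1 --topic test
# 2. Describe it: the partition should report a leader and its in-sync replicas
bin/kafka-topics.sh --describe --zookeeper "$ZK" --topic test
# 3. Produce a message (type a line, then Ctrl-C)
bin/kafka-console-producer.sh --broker-list 172.17.0.13:9092 --topic test
# 4. Consume it back from the beginning
bin/kafka-console-consumer.sh --zookeeper "$ZK" --topic test --from-beginning
EOF
chmod +x /tmp/kafka-smoke-test.sh
```

If the message typed into the producer comes back out of the consumer, replication and leader election across the three brokers are working.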
References:
http://www.cnblogs.com/luotianshuai/p/5206662.html#top