Build a kafka cluster environment in a docker container



Kafka relies on ZooKeeper for cluster management and state storage, so a ZooKeeper cluster must be set up first.

ZooKeeper Cluster Construction

I. Software environment

A ZooKeeper cluster requires more than half of its nodes to be alive in order to serve clients, so the number of servers should be 2 * N + 1. Here, three nodes are used to build the ZooKeeper cluster.
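The majority rule can be checked with a quick calculation: a quorum is floor(n/2) + 1 live nodes, and even ensemble sizes buy no extra fault tolerance, which is why 2 * N + 1 servers are recommended. A minimal shell sketch (the `quorum` helper is illustrative only):

```shell
# Quorum size for a ZooKeeper ensemble of n nodes: floor(n/2) + 1
quorum() { echo $(( $1 / 2 + 1 )); }

quorum 3   # prints 2 -> a 3-node ensemble tolerates 1 failure
quorum 4   # prints 3 -> a 4-node ensemble still tolerates only 1 failure
quorum 5   # prints 3 -> a 5-node ensemble tolerates 2 failures
```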

1. All three Linux servers are created as Docker containers, with the following IP addresses:
NodeA: 172.17.0.10

NodeB: 172.17.0.11

NodeC: 172.17.0.12

2. The ZooKeeper Docker image is built from a Dockerfile with the following content:

###################################################################

FROM docker.zifang.com/centos7-base

MAINTAINER chicol "chicol@yeah.net"

# Copy the install package from the local host
ADD ./zookeeper-3.4.9.tar.gz /opt/

# Create zookeeper data and log directories
RUN mkdir -p /opt/zkcluster/zkconf && \
    mv /opt/zookeeper-3.4.9 /opt/zkcluster/zookeeper && \
    yum install -y java-1.7.0-openjdk*

CMD /usr/sbin/init

###################################################################

3. Build the ZooKeeper image

[root@localhost zookeeper-3.4.9]# ll
total 22196
-rw-r--r-- 1 root root      361 Feb 8 14:58 Dockerfile
-rw-r--r-- 1 root root 22724574 Feb 4       zookeeper-3.4.9.tar.gz

# docker build -t zookeeper:3.4.9 .

4. Create three containers on docker

# docker run -d -p 12888:2888 -p 13888:3888 --privileged=true -v /home/data/zookeeper/:/opt/zkcluster/zkconf/ --name zkNodeA zookeeper:3.4.9

# docker run -d -p 12889:2889 -p 13889:3889 --privileged=true -v /home/data/zookeeper/:/opt/zkcluster/zkconf/ --name zkNodeB zookeeper:3.4.9

# docker run -d -p 12890:2890 -p 13890:3890 --privileged=true -v /home/data/zookeeper/:/opt/zkcluster/zkconf/ --name zkNodeC zookeeper:3.4.9

II. Modify the ZooKeeper configuration file

1. Generate zoo.cfg and modify the configuration (run the following steps on each of the three nodes)

cd /opt/zkcluster/zookeeper/

mkdir zkdata zkdatalog

cp conf/zoo_sample.cfg conf/zoo.cfg

vi /opt/zkcluster/zookeeper/conf/zoo.cfg

Modify the following settings in zoo.cfg:

tickTime=2000

initLimit=10

syncLimit=5

dataDir=/opt/zkcluster/zookeeper/zkdata

dataLogDir=/opt/zkcluster/zookeeper/zkdatalog

clientPort=12181

server.1=172.17.0.10:2888:3888

server.2=172.17.0.11:2889:3889

server.3=172.17.0.12:2890:3890

# In server.1, the 1 is the server id (any number can be used) that identifies this server in the ensemble; the same id must be written to the myid file under the snapshot directory (dataDir).

# 172.17.0.x is the node's IP address in the cluster. The first port is used for leader-follower communication (default 2888); the second is the leader-election port (default 3888), used when the cluster starts or when the current leader fails and a new election is held.
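Each server.N entry therefore packs the server id plus the node's IP, peer-communication port, and election port into a single line. The sketch below splits one entry (`parse_server` is a hypothetical helper for illustration, not part of ZooKeeper):

```shell
# Split a zoo.cfg "server.N=ip:peerPort:electionPort" entry into its parts.
parse_server() {
    local entry=$1                       # e.g. server.1=172.17.0.10:2888:3888
    local id=${entry%%=*}; id=${id#server.}
    local value=${entry#*=}
    local ip=${value%%:*}
    local rest=${value#*:}
    echo "id=$id ip=$ip peer=${rest%%:*} election=${rest#*:}"
}

parse_server "server.1=172.17.0.10:2888:3888"
# prints: id=1 ip=172.17.0.10 peer=2888 election=3888
```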

2. Create the myid file

NodeA>

# echo "1" > /opt/zkcluster/zookeeper/zkdata/myid

NodeB>

# echo "2" > /opt/zkcluster/zookeeper/zkdata/myid

NodeC>

# echo "3" > /opt/zkcluster/zookeeper/zkdata/myid
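The three commands above can also be written as one loop. The sketch below runs against a throwaway local directory instead of the real containers, so it can be tried anywhere; only the zkdata/myid layout mirrors the setup above:

```shell
# Simulate writing each node's myid file; inside the real containers the
# target would be /opt/zkcluster/zookeeper/zkdata/myid.
base=$(mktemp -d)
id=1
for node in NodeA NodeB NodeC; do
    mkdir -p "$base/$node/zkdata"
    echo "$id" > "$base/$node/zkdata/myid"
    id=$((id + 1))
done

cat "$base/NodeB/zkdata/myid"   # prints 2
```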

3. Directory structure

All files of the ZooKeeper cluster live under /opt/zkcluster.

[root@e18a2b8eefc7 zkcluster]# pwd

/opt/zkcluster

[root@e18a2b8eefc7 zkcluster]# ls

zkconf zookeeper

zkconf: stores scripts and other files; when starting a container, mount the host directory here with -v.

zookeeper: the ZooKeeper project directory.

zookeeper contains two manually created directories, zkdata and zkdatalog:

zkdata # stores snapshot logs
zkdatalog # stores transaction logs

4. Configuration file explanation

#tickTime:
The heartbeat interval between ZooKeeper servers, and between clients and servers; one heartbeat is sent every tickTime milliseconds.
#initLimit:
The maximum heartbeat interval, in ticks, that ZooKeeper tolerates while a follower establishes its initial connection to the leader (the "client" here is a follower server in the ensemble connecting to the leader, not a user client). If the leader has not received a response after initLimit ticks, the connection attempt is considered failed. With the values above this is 10 * 2000 ms = 20 seconds.
#syncLimit:
The maximum time, in ticks, allowed for a request and response exchanged between the leader and a follower: 5 * 2000 ms = 10 seconds.
#dataDir:
Path where snapshot logs are stored.
#dataLogDir:
Path for transaction logs. If this path is not configured, transaction logs are stored under dataDir by default, which seriously hurts ZooKeeper's performance: under high throughput, too many transaction logs and snapshots accumulate in one place.
#clientPort:
The port on which ZooKeeper listens for and accepts client connections. It is changed here from the default 2181 to 12181.
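With the values used above (tickTime=2000, initLimit=10, syncLimit=5), the effective timeouts are simple products of tick count and tick length, which can be verified directly:

```shell
tickTime=2000   # milliseconds per tick
initLimit=10    # ticks allowed for a follower's initial sync with the leader
syncLimit=5     # ticks allowed for a leader-follower request/response

echo $(( tickTime * initLimit ))   # prints 20000 (ms), i.e. 20 seconds
echo $(( tickTime * syncLimit ))   # prints 10000 (ms), i.e. 10 seconds
```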

III. Start the ZooKeeper service

1. Follow these steps to start the service:
# go to the zookeeper bin directory
cd /opt/zkcluster/zookeeper/bin
# start the service
./zkServer.sh start
2. Check the service status
# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zkcluster/zookeeper/bin/../conf/zoo.cfg   # configuration file in use
Mode: follower   # whether this node is the leader or a follower
3. Stop the ZooKeeper service
# ./zkServer.sh stop
ZooKeeper JMX enabled by default
Using config: /opt/zkcluster/zookeeper/bin/../conf/zoo.cfg
Usage: ./zkServer.sh {start|start-foreground|stop|restart|status|upgrade|print-cmd}
 

Build a kafka Cluster

I. Software Environment

1. Create the servers

All three Linux servers are created as Docker containers, with the following IP addresses:
NodeA: 172.17.0.13

NodeB: 172.17.0.14

NodeC: 172.17.0.15

2. The Kafka Docker image is also built from a Dockerfile with the following content:

###################################################################

FROM docker.zifang.com/centos7-base

MAINTAINER chicol "chicol@yeah.net"

# Copy the install package from the local host
ADD ./kafka_2.11-0.10.1.1.tgz /opt/

# Create kafka and log directories
RUN mkdir -p /opt/kafkacluster/kafkalog && \
    mkdir -p /opt/kafkacluster/kafkaconf && \
    mv /opt/kafka_2.11-0.10.1.1 /opt/kafkacluster/kafka && \
    yum install -y java-1.7.0-openjdk*

CMD /usr/sbin/init

###################################################################

3. Build the Kafka image

[root@localhost kafka-2.11]# ll
total 33624
-rw-r--r-- 1 root root      407 Feb 8 Dockerfile
-rw-r--r-- 1 root root 34424602 Feb 4 kafka_2.11-0.10.1.1.tgz

# docker build -t kafka:2.11 .

4. Start three containers

# docker run -d -p 19092:9092 -v /home/data/kafka:/opt/kafkacluster/kafkaconf --name kafkaNodeA a1d17a0000676

# docker run -d -p 19093:9093 -v /home/data/kafka:/opt/kafkacluster/kafkaconf --name kafkaNodeB a1d17a0000676

# docker run -d -p 19094:9094 -v /home/data/kafka:/opt/kafkacluster/kafkaconf --name kafkaNodeC a1d17a0000676

II. Modify the Kafka configuration file

1. Modify server.properties (run on each of the three servers; note that the IP address and port number must be changed per node)

# cd /opt/kafkacluster/kafka/config

# vi server.properties

broker.id=1

host.name=172.17.0.13

port=9092

log.dirs=/opt/kafkacluster/kafkalog

zookeeper.connect=172.17.0.10:12181,172.17.0.11:12181,172.17.0.12:12181

Add the following three lines to server.properties:

message.max.bytes=5242880

default.replication.factor=2

replica.fetch.max.bytes=5242880
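For the other two brokers only the per-node values change. As an illustration (assuming, per the node list above, that NodeB is 172.17.0.14 and listens on 9093 to match its container's port mapping), its server.properties would look like:

```properties
# server.properties on NodeB -- a sketch; only broker.id, host.name and
# port differ from NodeA's configuration shown above
broker.id=2
host.name=172.17.0.14
port=9093
log.dirs=/opt/kafkacluster/kafkalog
zookeeper.connect=172.17.0.10:12181,172.17.0.11:12181,172.17.0.12:12181
message.max.bytes=5242880
default.replication.factor=2
replica.fetch.max.bytes=5242880
```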

2. Configuration file explanation

broker.id=0  # unique id of this broker in the cluster, analogous to zookeeper's myid
port=9092  # default port on which kafka serves clients
host.name=172.17.0.13  # disabled by default; versions around 0.8.1 had a bug here involving DNS resolution and failures
num.network.threads=3  # number of threads the broker uses for network processing
num.io.threads=8  # number of threads the broker uses for disk I/O
log.dirs=/opt/kafkacluster/kafkalog/  # directory where messages are stored; may be a comma-separated list of directories, and num.io.threads should not be smaller than the number of directories; with multiple directories, each newly created partition is placed in the directory currently holding the fewest partitions
socket.send.buffer.bytes=102400  # send buffer size; data is buffered and sent once a certain amount accumulates rather than immediately, which improves performance
socket.receive.buffer.bytes=102400  # receive buffer size; data is serialized to disk once a certain amount accumulates
socket.request.max.bytes=104857600  # maximum size of a request sent to kafka; must not exceed the JVM heap size
num.partitions=1  # default number of partitions; a topic gets one partition by default
log.retention.hours=168  # default maximum message retention time: 168 hours (7 days)
message.max.bytes=5242880  # maximum size of a stored message: 5 MB
default.replication.factor=2  # number of replicas kafka keeps per message; if one replica fails, the other can continue serving
replica.fetch.max.bytes=5242880  # maximum number of bytes fetched per replication request
log.segment.bytes=1073741824  # kafka appends messages to segment files; when a segment exceeds this size, a new file is created
log.retention.check.interval.ms=300000  # every 300000 ms, check segments against the configured retention (log.retention.hours=168) and delete expired messages
log.cleaner.enable=false  # whether to enable log compaction; generally left off, though enabling it can improve performance
zookeeper.connect=192.168.7.100:12181,192.168.7.101:12181,192.168.7.107:12181  # zookeeper connection string

III. Start the Kafka service

1. Start the service

# start the kafka broker in the background (run on all three servers)

# cd /opt/kafkacluster/kafka/

# bin/kafka-server-start.sh -daemon config/server.properties

2. Check the service status

# run jps to check the kafka process

[root@2edb888df34f config]# jps

9497 Jps

1273 Kafka

3. Stop the kafka service

# bin/kafka-server-stop.sh

4. Cluster Test

...

References:

http://www.cnblogs.com/luotianshuai/p/5206662.html#top
