Build a Kafka Cluster Environment in Linux
This article describes only how to build a Kafka cluster environment; other Kafka-related topics will be covered in future articles.
1. Preparations
Linux server: 3 (this article creates three folders on a single Linux server to simulate three Linux servers and build a pseudo-cluster)
JDK 1.8: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
ZooKeeper: http://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.4.10/
Kafka: https://www.apache.org/dyn/closer.cgi?path=/kafka/1.0.0/kafka_2.11-1.0.0.tgz
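If you prefer to fetch the archives from the command line, these releases should still be available from Apache's long-term archive (URLs assumed here; the mirrors listed above may have moved):

wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz
wget https://archive.apache.org/dist/kafka/1.0.0/kafka_2.11-1.0.0.tgz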
2. Start building
Install JDK (there are many detailed installation guides online, so this article does not repeat them).
Configure and install ZooKeeper
# I put my directory under /opt
# First create the ZooKeeper project directory
cd /opt
mkdir zookeeper
# Then go into the newly created zookeeper directory and create the project directories of the three simulated servers (server1, server2, server3)
cd zookeeper
mkdir server1
mkdir server2
mkdir server3
# Unpack zookeeper-3.4.10.tar.gz into each server directory
tar -zxvf zookeeper-3.4.10.tar.gz
# Create the following two folders in server1, server2, and server3 respectively
mkdir data      # stores snapshot logs
mkdir datalog   # stores transaction logs
Modify the configuration file
Go to the conf directory of each unzipped zookeeper-3.4.10:

# The three conf directories
/opt/zookeeper/server1/zookeeper-3.4.10/conf
/opt/zookeeper/server2/zookeeper-3.4.10/conf
/opt/zookeeper/server3/zookeeper-3.4.10/conf
zoo_sample.cfg in the conf directory is the template file ZooKeeper ships with. Copy it to a file named zoo.cfg in the same directory (the same step for server1, server2, and server3); zoo.cfg is the file name ZooKeeper expects.
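For example, inside each server's conf directory:

cd /opt/zookeeper/server1/zookeeper-3.4.10/conf
cp zoo_sample.cfg zoo.cfg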
Modify the /opt/zookeeper/server1/zookeeper-3.4.10/conf/zoo.cfg file as follows; these are the values that need to change (the original article marked them in red):
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/server1/data
dataLogDir=/opt/zookeeper/server1/datalog
clientPort=2181
server.1=127.0.0.1:2888:3888
server.2=127.0.0.1:2889:3889
server.3=127.0.0.1:2890:3890
The zoo.cfg files under server2's and server3's zookeeper-3.4.10/conf are almost identical to server1's. Note that the dataDir and dataLogDir values must point to server2's and server3's own data and datalog directories, and clientPort=2181 must become clientPort=2182 for server2 and clientPort=2183 for server3.
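For example, server2's zoo.cfg would look like this (derived from the server1 file above; server3 follows the same pattern with its own paths and clientPort=2183):

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/server2/data
dataLogDir=/opt/zookeeper/server2/datalog
clientPort=2182
server.1=127.0.0.1:2888:3888
server.2=127.0.0.1:2889:3889
server.3=127.0.0.1:2890:3890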
Configuration file explanation:
# tickTime: the heartbeat interval, in milliseconds, between ZooKeeper servers or between a client and a server; a heartbeat is sent every tickTime.
# initLimit: the maximum number of heartbeat intervals tolerated while a connection is being initialized. The "client" here is not a user client connecting to the ZooKeeper server, but a Follower connecting to the Leader within the cluster. If the Leader receives no response within initLimit heartbeats, the connection has failed; the total time here is 10 * 2000 ms = 20 seconds.
# syncLimit: the maximum time for sending a request and receiving a response between the Leader and a Follower, measured in tickTimes; the total here is 5 * 2000 ms = 10 seconds.
# dataDir: storage path for snapshot logs.
# dataLogDir: storage path for transaction logs. If this is not configured, transaction logs are stored in the dataDir directory by default, which seriously hurts ZooKeeper performance: under high throughput, too many transaction and snapshot logs compete for the same disk.
# clientPort: the port clients connect to; ZooKeeper listens on it and accepts client requests. Each instance in this pseudo-cluster must use a different clientPort (2181, 2182, 2183).
# server.N=host:port1:port2: N matches the number in each server's myid file; port1 (2888/2889/2890) is used by Followers to connect to the Leader, and port2 (3888/3889/3890) is used for leader election. The ports differ per instance here only because all three instances share one host.
Create a myid file:
# Create a myid file in each server's data folder
# server1: the content of /opt/zookeeper/server1/data/myid is 1
# server2: the content of /opt/zookeeper/server2/data/myid is 2
# server3: the content of /opt/zookeeper/server3/data/myid is 3
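For example, the three files can be written in one go:

echo "1" > /opt/zookeeper/server1/data/myid
echo "2" > /opt/zookeeper/server2/data/myid
echo "3" > /opt/zookeeper/server3/data/myid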
Start and view the service
1. Start the service
# Enter the bin directory of each ZooKeeper instance and start the service
# server1
cd /opt/zookeeper/server1/zookeeper-3.4.10/bin
./zkServer.sh start
# server2
cd /opt/zookeeper/server2/zookeeper-3.4.10/bin
./zkServer.sh start
# server3
cd /opt/zookeeper/server3/zookeeper-3.4.10/bin
./zkServer.sh start

# Output after a successful start (server1, for example):
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/server1/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
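A quick way to confirm that all three instances are actually running is the JDK's jps tool, which should list three QuorumPeerMain processes (example output; the process ids will differ):

jps
# 2345 QuorumPeerMain
# 2378 QuorumPeerMain
# 2412 QuorumPeerMain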
2. Check the service status
# Check the server status
./zkServer.sh status
# The following status indicates the instance started successfully:
ZooKeeper JMX enabled by default
Using config: /home/user/zookeeper/server3/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
# There are two modes: leader (the master node) and follower (a slave node).
# A ZooKeeper cluster generally has exactly one leader and multiple followers. The leader handles clients' write requests, the followers serve reads and synchronize data from the leader, and when the leader fails a new one is elected from among the followers.
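Beyond zkServer.sh status, you can also connect to any instance with the bundled CLI client as a quick sanity check (not part of the original steps; the output shown is what a fresh ensemble returns):

# Connect to the server1 instance
./zkCli.sh -server 127.0.0.1:2181
# Inside the shell, list the root znode
ls /
# Expected on a fresh cluster: [zookeeper]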
The ZooKeeper cluster is now set up. Next, we build Kafka.
Configure and install Kafka
# Create the project directory
cd /opt/
mkdir kafka
cd kafka
# Create the Kafka message directories, one per simulated server
mkdir kafkalogs    # stores Kafka messages for the server1 broker
mkdir kafkalogs1   # stores Kafka messages for the server2 broker
mkdir kafkalogs2   # stores Kafka messages for the server3 broker
# Unpack kafka_2.11-1.0.0.tgz into the kafka directory
tar -zxvf kafka_2.11-1.0.0.tgz
# With three real Linux servers, simply extract kafka_2.11-1.0.0.tgz into /opt/kafka on each server and create a single kafkalogs directory there.
Modify the kafka configuration file
# Go to the config directory
cd /opt/kafka/kafka_2.11-1.0.0/config/
We can see some ZooKeeper-related files in this directory; they belong to the ZooKeeper instance bundled with Kafka. It could be used to start Kafka directly, but an independent ZooKeeper cluster, as built above, is recommended.
-rw-r--r--. 1 root root  906 Oct 27 08:56 connect-console-sink.properties
-rw-r--r--. 1 root root  909 Oct 27 08:56 connect-console-source.properties
-rw-r--r--. 1 root root 5807 Oct 27 08:56 connect-distributed.properties
-rw-r--r--. 1 root root  883 Oct 27 08:56 connect-file-sink.properties
-rw-r--r--. 1 root root  881 Oct 27 08:56 connect-file-source.properties
-rw-r--r--. 1 root root 1111 Oct 27 08:56 connect-log4j.properties
-rw-r--r--. 1 root root 2730 Oct 27 08:56 connect-standalone.properties
-rw-r--r--. 1 root root 1221 Oct 27 08:56 consumer.properties
-rw-r--r--. 1 root root 4727 Oct 27 08:56 log4j.properties
-rw-r--r--. 1 root root 1919 Oct 27 08:56 producer.properties
-rw-r--r--. 1 root root  173 Jan  7 05:54 server-1.properties
-rw-r--r--. 1 root root  173 Jan  7 05:56 server-2.properties
-rw-r--r--. 1 root root  172 Jan  7 05:55 server.properties
-rw-r--r--. 1 root root 1032 Oct 27 08:56 tools-log4j.properties
-rw-r--r--. 1 root root 1023 Oct 27 08:56 zookeeper.properties
Modify server.properties: overwrite its contents with the following and save. (These are only the main parameters; adjust the others as your setup requires.)
broker.id=0
listeners=PLAINTEXT://127.0.0.1:9092
port=9092
host.name=127.0.0.1
log.dirs=/opt/kafka/kafkalogs
zookeeper.connect=127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183
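A brief note on these values: broker.id must be unique for every broker in the cluster; listeners, port, and host.name define the address clients connect to; log.dirs points at the message directory created earlier; and zookeeper.connect lists all three ZooKeeper instances built above.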
Because I am building everything on a single server, I copy server.properties twice, as server-1.properties and server-2.properties, to simulate three servers. On three real servers the copies are unnecessary; just configure server.properties on each one.
# Copy server.properties twice in the config directory, naming the copies server-1.properties and server-2.properties
cp server.properties server-1.properties
cp server.properties server-2.properties
Modify server-1.properties as follows:
broker.id=1
listeners=PLAINTEXT://127.0.0.1:9093
port=9093
host.name=127.0.0.1
log.dirs=/opt/kafka/kafkalogs1
zookeeper.connect=127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183
Modify server-2.properties as follows:
broker.id=2
listeners=PLAINTEXT://127.0.0.1:9094
port=9094
host.name=127.0.0.1
log.dirs=/opt/kafka/kafkalogs2
zookeeper.connect=127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183
Start the Kafka cluster and test it
1. Start the service
# Start the Kafka cluster in the background (all three brokers need to be started)
# Enter the Kafka root directory
cd /opt/kafka/kafka_2.11-1.0.0
# Start with the three configuration files, representing three servers
./bin/kafka-server-start.sh -daemon config/server.properties
./bin/kafka-server-start.sh -daemon config/server-1.properties
./bin/kafka-server-start.sh -daemon config/server-2.properties
# The -daemon flag in the start command runs the broker as a daemon (in the background).
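Once Kafka is running, each broker registers its id in ZooKeeper, so a quick way to confirm that all three came up is to list /brokers/ids with the ZooKeeper CLI (a sanity check added here; not in the original steps):

# From any ZooKeeper bin directory
./zkCli.sh -server 127.0.0.1:2181
# Inside the shell:
ls /brokers/ids
# Expected once all three brokers are up: [0, 1, 2]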
2. Test kafka
Create a topic:
# Create a "test" topic with 3 partitions and 3 replicas
# Run the following command in the root directory of Kafka
bin/kafka-topics.sh --create --zookeeper 127.0.0.1:2181 --replication-factor 3 --partitions 3 --topic test
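To verify the topic and see which broker leads each partition, kafka-topics.sh can also describe it (run from the Kafka root directory):

bin/kafka-topics.sh --describe --zookeeper 127.0.0.1:2181 --topic test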
Start a producer:
# Run in the Kafka root directory to start a console producer
bin/kafka-console-producer.sh --broker-list 127.0.0.1:9092 --topic test
Do not close the producer window. Open a new window, enter the Kafka root directory, and start a consumer:
# Start a console consumer
bin/kafka-console-consumer.sh --zookeeper 127.0.0.1:2181 --topic test --from-beginning
Type a message in the producer window and check whether the consumer receives it. Once the message arrives, the Kafka cluster environment is set up.
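One closing note: in Kafka 1.0.0 the console consumer's --zookeeper option is deprecated in favor of connecting to the brokers directly, so the same test can also be run with the new consumer:

bin/kafka-console-consumer.sh --bootstrap-server 127.0.0.1:9092 --topic test --from-beginning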