Installing a Kafka Cluster on CentOS 6.5
1. Install ZooKeeper
Reference:
2. Download: https://www.apache.org/dyn/closer.cgi?path=/kafka/0.9.0.1/kafka_2.10-0.9.0.1.tgz
kafka_2.10-0.9.0.1.tgz # 2.10 is the Scala version; 0.9.0.1 is the Kafka version.
3. Installation and configuration
Unzip: tar xzf kafka_2.10-0.9.0.1.tgz
Configure config/server.properties:
[root@Hadoop-NN-01 config]# vim server.properties
broker.id=1                          # must be unique for every broker
port=9092                            # listening port
host.name=Hadoop-NN-01               # this server's own hostname/IP, unique per node
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dir=/home/hadoopuser/kafka-logs  # on-disk path for the log (data) files
num.partitions=1                     # default number of partitions per topic
num.io.threads=8                     # set according to the number of machines
zookeeper.connect=Zookeeper-01:2181,Zookeeper-02:2181,Zookeeper-03:2181  # ZooKeeper server list, comma-separated
A rough sizing rule for num.partitions: a topic taking on the order of 100 million rows a day might use eight partitions; a few hundred thousand rows a day fits comfortably in one.
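Every other broker in the cluster gets the same file with only the per-node values changed. A sketch for the second broker (the hostname Hadoop-DN-01 is an assumption following this article's naming, not something the article specifies):

```properties
# config/server.properties on the second broker (illustrative)
broker.id=2                # must differ on every broker
host.name=Hadoop-DN-01     # this node's own hostname/IP (assumed name)
port=9092
log.dir=/home/hadoopuser/kafka-logs
# the ZooKeeper list is identical on every broker
zookeeper.connect=Zookeeper-01:2181,Zookeeper-02:2181,Zookeeper-03:2181
```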
4. Configure environment variables (not applicable when running multiple brokers on a single node)
[root@Hadoop-NN-01 ~]# vim /etc/profile
export KAFKA_HOME=/home/hadoopuser/kafka_2.10-0.9.0.1
export PATH=$PATH:$KAFKA_HOME/bin
[root@Hadoop-NN-01 ~]# source /etc/profile   # make the environment variables take effect
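The same profile edit as a copy-pasteable sketch, with a quick check that the variables took effect (the install path is the one this article uses; adjust it to your own):

```shell
# Lines appended to /etc/profile (path assumed from the steps above).
export KAFKA_HOME=/home/hadoopuser/kafka_2.10-0.9.0.1
export PATH=$PATH:$KAFKA_HOME/bin

# Verify: both should show the Kafka install location.
echo "$KAFKA_HOME"
echo "$PATH" | grep -o 'kafka_2.10-0.9.0.1/bin'
```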
5. Start Kafka
[root@Hadoop-NN-01 kafka_2.10-0.9.0.1]$ bin/kafka-server-start.sh config/server.properties &
6. Verify
Run jps and check that a Kafka process is listed.
7. Create a topic:
[root@Hadoop-NN-01 kafka_2.10-0.9.0.1]$ bin/kafka-topics.sh --create --zookeeper Zookeeper-01:2181,Zookeeper-02:2181,Zookeeper-03:2181 --replication-factor 3 --partitions 1 --topic mykafka
8. View the topic:
[root@Hadoop-NN-01 kafka_2.10-0.9.0.1]$ bin/kafka-topics.sh --list --zookeeper Zookeeper-01:2181,Zookeeper-02:2181,Zookeeper-03:2181
mykafka
View Details:
[root@Hadoop-NN-01 kafka_2.10-0.9.0.1]$ bin/kafka-topics.sh --describe --zookeeper Zookeeper-01:2181,Zookeeper-02:2181,Zookeeper-03:2181
Topic: mykafka  PartitionCount:1  ReplicationFactor:3  Configs:
Topic: mykafka  Partition: 0  Leader: 3  Replicas: 3,1,2  Isr: 3,1,2
9. Send a message:
[root@Hadoop-NN-01 kafka_2.10-0.9.0.1]$ bin/kafka-console-producer.sh --broker-list Hadoop-NN-01:9092 --topic mykafka
10. Receive messages:
[root@Hadoop-NN-01 kafka_2.10-0.9.0.1]$ bin/kafka-console-consumer.sh --zookeeper Zookeeper-01:2181 --topic mykafka --from-beginning
NOTE: To consume only new messages, omit the --from-beginning parameter.
11. Possible Errors
ERROR: Failed to load class "org.slf4j.impl.StaticLoggerBinder"
Solution:
Download slf4j-1.7.6.zip: wget http://www.slf4j.org/dist/slf4j-1.7.6.zip
Unzip it and copy slf4j-nop-1.7.6.jar into Kafka's libs directory.
12. About Kafka:
A core idea of Kafka is that data need not be cached in application memory: the operating system's file cache is good enough on its own, and as long as no random writes are required, sequential read/write performance is extremely efficient. Kafka only ever appends data sequentially; deletion happens wholesale, once data has accumulated past a size threshold or aged past a retention period.
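The append-plus-retention idea can be sketched in shell. Nothing here is Kafka's actual implementation — the segment file names and the 7-day window are illustrative stand-ins for its log segments and time-based retention:

```shell
# Illustrative only: mimic append-only segments plus time-based retention.
DIR=$(mktemp -d)

# "Produce": data is only ever appended, never overwritten.
echo "event-1" >> "$DIR/00000000.log"
echo "event-2" >> "$DIR/00000000.log"

# Artificially age the first segment so it falls outside retention.
touch -t 201601010000 "$DIR/00000000.log"
echo "event-3" >> "$DIR/00000001.log"   # a fresh segment

# "Retention": whole segments older than 7 days are dropped;
# there are no per-message deletes.
find "$DIR" -name '*.log' -mtime +7 -delete
ls "$DIR"   # only the fresh segment survives
```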
Another distinctive choice in Kafka is storing consumption state on the client rather than on the MQ server. The server does not record the message delivery process at all; each client knows where to resume reading next time. Delivery uses a client-driven pull model, which greatly reduces the burden on the server.
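A toy illustration of that pull model, with the client holding its own offset. Plain files stand in for one partition's log and the client's offset store; this is not Kafka's protocol, just the idea that the server keeps no delivery state:

```shell
LOG=$(mktemp)          # stands in for one partition's log
OFFSET_FILE=$(mktemp)  # the *client* records how far it has read
echo 0 > "$OFFSET_FILE"

# "Broker" side: messages are only appended.
printf 'msg-1\nmsg-2\n' >> "$LOG"

# "Client" side: pull everything after the stored byte offset,
# then advance the offset locally -- the server tracks nothing.
pull() {
  local off
  off=$(cat "$OFFSET_FILE")
  tail -c +"$((off + 1))" "$LOG"
  wc -c < "$LOG" | tr -d ' ' > "$OFFSET_FILE"
}

pull                       # first pull returns msg-1 and msg-2
printf 'msg-3\n' >> "$LOG"
pull                       # second pull returns only msg-3
```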
Kafka also emphasizes reducing serialization and copying overhead. It groups messages into message sets for batched storage and sending, and when a client pulls data it transmits in zero-copy fashion where possible, using sendfile (exposed in Java through advanced I/O calls such as FileChannel.transferTo/transferFrom) to cut copy costs. Kafka is thus an MQ system carefully designed around a specific class of applications; my guess is that more and more MQ systems will likewise specialize for particular domains rather than pursue general-purpose designs.
Appendix: a few recommended real-time monitoring tools.
ZooKeeper -> zooinspector
Kafka -> KafkaOffsetMonitor
Storm -> Storm UI