1. Install ZooKeeper
Reference: http://www.cnblogs.com/hunttown/p/5452138.html
2. Download: https://www.apache.org/dyn/closer.cgi?path=/kafka/0.9.0.1/kafka_2.10-0.9.0.1.tgz
kafka_2.10-0.9.0.1.tgz # 2.10 is the Scala version; 0.9.0.1 is the Kafka version.
3. Installation and configuration
Unzip: tar xzf kafka_2.10-0.9.0.1.tgz
Configure config/server.properties:
[root@Hadoop-NN-01 config]# vim server.properties
broker.id=1                  # must be unique per broker
port=9092                    # listener port
host.name=Hadoop-NN-01       # this broker's hostname/IP, unique per node
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dir=/home/hadoopuser/Kafka-logs   # on-disk log (data) directory
num.partitions=1             # default number of partitions per topic
num.io.threads=8             # set according to the number of machines/disks
zookeeper.connect=Zookeeper-01:2181,Zookeeper-02:2181,Zookeeper-03:2181   # ZooKeeper servers, comma-separated
About num.partitions: a topic receiving around 100 million rows a day can use eight partitions; one receiving only a few hundred thousand rows a day can use a single partition.
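To extend this to a cluster, only the per-node values change. A minimal sketch, assuming a second node named Hadoop-NN-02 (the hostname is an assumption for illustration; the rest of server.properties stays identical on every node):

```properties
# Hypothetical second broker (hostname assumed); all other settings unchanged.
broker.id=2
host.name=Hadoop-NN-02
```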
4. Configure environment variables (this setup does not cover multiple brokers on a single node)
[root@Hadoop-NN-01 ~]# vim /etc/profile
export KAFKA_HOME=/home/hadoopuser/kafka_2.10-0.9.0.1
export PATH=$PATH:$KAFKA_HOME/bin
[root@Hadoop-NN-01 ~]# source /etc/profile   # make the environment variables take effect
5. Start Kafka
[hadoopuser@Hadoop-NN-01 kafka_2.10-0.9.0.1]$ bin/kafka-server-start.sh config/server.properties &
6. Verification
Run jps to check whether the Kafka process has started.
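As a sketch of what to look for, jps lists the running JVM processes and the broker appears as "Kafka". The sample output below is illustrative, not from a real run; in practice pipe the real jps output:

```shell
# Filter for the broker in (sample) jps output; replace the sample with `jps`.
sample_jps='12345 Kafka
23456 QuorumPeerMain
34567 Jps'
echo "$sample_jps" | grep -q 'Kafka' && echo "Kafka broker is running"
```

QuorumPeerMain is the ZooKeeper process; seeing both confirms the stack is up.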
7. Create a topic:
[hadoopuser@Hadoop-NN-01 kafka_2.10-0.9.0.1]$ bin/kafka-topics.sh --create --zookeeper Zookeeper-01:2181,Zookeeper-02:2181,Zookeeper-03:2181 --replication-factor 3 --partitions 1 --topic mykafka
8. View topics:
[hadoopuser@Hadoop-NN-01 kafka_2.10-0.9.0.1]$ bin/kafka-topics.sh --list --zookeeper Zookeeper-01:2181,Zookeeper-02:2181,Zookeeper-03:2181
mykafka
View details:
[hadoopuser@Hadoop-NN-01 kafka_2.10-0.9.0.1]$ bin/kafka-topics.sh --describe --zookeeper Zookeeper-01:2181,Zookeeper-02:2181,Zookeeper-03:2181
Topic: mykafka  PartitionCount:1  ReplicationFactor:3  Configs:
Topic: mykafka  Partition: 0  Leader: 3  Replicas: 3,1,2  Isr: 3,1,2
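When scripting against this output, the leader's broker id can be pulled out with standard text tools. A minimal sketch that parses a hard-coded copy of the line above (working on the saved string, not querying the cluster):

```shell
# Extract the broker id that follows the "Leader:" field.
describe_line='Topic: mykafka Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2'
leader=$(echo "$describe_line" | awk '{for (i = 1; i <= NF; i++) if ($i == "Leader:") print $(i + 1)}')
echo "$leader"   # prints 3
```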
9. Send messages:
[hadoopuser@Hadoop-NN-01 kafka_2.10-0.9.0.1]$ bin/kafka-console-producer.sh --broker-list Hadoop-NN-01:9092 --topic mykafka
10. Receive messages:
[hadoopuser@Hadoop-NN-01 kafka_2.10-0.9.0.1]$ bin/kafka-console-consumer.sh --zookeeper Zookeeper-01:2181 --topic mykafka --from-beginning
NOTE: to consume only new messages, omit the --from-beginning parameter.
11. Possible errors
Error: Failed to load class "org.slf4j.impl.StaticLoggerBinder"
Solution:
Download slf4j-1.7.6.zip: wget http://www.slf4j.org/dist/slf4j-1.7.6.zip
Unzip it and copy slf4j-nop-1.7.6.jar into Kafka's libs directory.
12. About Kafka:
At the core of Kafka's design, data does not need to be cached in application memory: the operating system's file cache is powerful enough, and as long as writes are sequential rather than random, sequential read/write performance is very high. Kafka only ever appends data sequentially; its deletion policy is to remove data once it accumulates beyond a size threshold or after a retention period expires.
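The size/time-based deletion policy described above is configured in server.properties. A sketch with illustrative values (the property names are standard Kafka broker settings; the specific numbers are assumptions, not recommendations):

```properties
log.retention.hours=168          # delete log segments older than 7 days
log.retention.bytes=1073741824   # or once a partition's log exceeds ~1 GB
log.segment.bytes=536870912      # segment file size; deletion works per segment
```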
Another distinctive feature of Kafka is that consumer state is stored on the client rather than on the MQ server, so the server does not need to track message delivery: each client knows where to resume reading. Delivery uses a client-driven pull model, which greatly reduces the load on the server.
Kafka also emphasizes reducing serialization and copy overhead. It batches messages into message sets for storage and transmission, and when a client pulls data it transfers them in zero-copy mode where possible, using sendfile (in Java, the advanced FileChannel.transferTo/transferFrom I/O calls) to cut copy costs. Kafka is clearly an MQ system carefully designed for specific applications; I expect more and more MQ systems will likewise specialize in particular domains and move toward vertical products.
Appendix: some recommended real-time monitoring tools.
ZooKeeper -> ZooInspector
Kafka -> KafkaOffsetMonitor
Storm -> Storm UI
(Original post: installing a Kafka cluster on CentOS 6.5)