I. Overview of Kafka
Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all of the activity stream data of a consumer-scale website. This activity (page views, searches, and other user actions) is a key ingredient of many social features on the modern web. Because of the throughput required, such data is usually handled by processing and aggregating logs. That approach works for offline analysis systems such as Hadoop that operate on log data, but it is limiting when real-time processing is required. Kafka aims to unify online and offline message processing through Hadoop's parallel loading mechanism, and to provide real-time consumption across a cluster of machines.
II. Kafka terminology
- Broker: a Kafka cluster contains one or more servers, each of which is called a broker.
- Topic: every message published to the Kafka cluster has a category, which is called its topic. (Physically, messages of different topics are stored separately. Logically, the messages of one topic may be stored on one or more brokers, but users only need to specify the topic to produce or consume data, without worrying about where the data is stored.)
- Partition: a partition is a physical concept. Each topic contains one or more partitions.
- Producer: the client responsible for publishing messages to a Kafka broker.
- Consumer: the message consumer, that is, the client that reads messages from a Kafka broker.
- Consumer Group: each consumer belongs to a specific consumer group. A group name can be specified for each consumer; a consumer with no group name belongs to the default group.
III. Downloading and installing Kafka
1. Download
wget https://archive.apache.org/dist/kafka/0.9.0.1/kafka_2.11-0.9.0.1.tgz
2. Installation
tar zxvf kafka_2.11-0.9.0.1.tgz
cd kafka_2.11-0.9.0.1
3. Cluster configuration
This setup uses two servers, 192.168.1.237 and 192.168.1.238. ZooKeeper is installed on both servers and listens on port 2181 (ZooKeeper setup is not covered here). Each server is configured with three Kafka brokers.
3.1. server.properties configuration
broker.id=10
port=9090
host.name=192.168.1.237
advertised.host.name=192.168.1.237
log.dirs=/tmp/kafka-logs/server0
zookeeper.connect=192.168.1.237:2181,192.168.1.238:2181
Note: host.name and advertised.host.name should both be set to the server's IP address; otherwise various connection problems can occur.
3.2. server1.properties configuration
cp config/server.properties config/server1.properties
vim config/server1.properties
broker.id=11
port=9091
host.name=192.168.1.237
advertised.host.name=192.168.1.237
log.dirs=/tmp/kafka-logs/server1
zookeeper.connect=192.168.1.237:2181,192.168.1.238:2181
3.3. server2.properties configuration
cp config/server.properties config/server2.properties
vim config/server2.properties
broker.id=12
port=9092
host.name=192.168.1.237
advertised.host.name=192.168.1.237
log.dirs=/tmp/kafka-logs/server2
zookeeper.connect=192.168.1.237:2181,192.168.1.238:2181
Note: brokers on the same server must not share a port or a log.dirs directory, and broker.id must be unique across the whole cluster, including brokers on different servers.
3.4. On the other server, configure server.properties, server1.properties, and server2.properties the same way, with broker.id set to 20, 21, and 22 and port set to 9090, 9091, and 9092 respectively. The other values that change are host.name=192.168.1.238 and advertised.host.name=192.168.1.238.
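The broker files on each server differ only in broker.id, port, and log.dirs, so they can also be generated rather than edited by hand. The following is a minimal sketch, not the procedure used in this article: gen_broker_configs is a hypothetical helper, and the files it writes contain only the keys discussed in this section (every other setting falls back to the broker defaults). It writes into a separate directory so that the stock files under config/ are not overwritten.

```shell
#!/bin/sh
# gen_broker_configs is a hypothetical helper (not part of Kafka).
# Usage: gen_broker_configs HOST_IP FIRST_BROKER_ID OUT_DIR
gen_broker_configs() {
    host_ip=$1
    first_id=$2
    out_dir=$3
    mkdir -p "$out_dir" || return 1
    i=0
    while [ "$i" -lt 3 ]; do
        # File names follow the article:
        # server.properties, server1.properties, server2.properties
        if [ "$i" -eq 0 ]; then
            file="$out_dir/server.properties"
        else
            file="$out_dir/server$i.properties"
        fi
        # Only broker.id, port, and log.dirs vary between brokers on one host.
        cat > "$file" <<EOF
broker.id=$((first_id + i))
port=$((9090 + i))
host.name=$host_ip
advertised.host.name=$host_ip
log.dirs=/tmp/kafka-logs/server$i
zookeeper.connect=192.168.1.237:2181,192.168.1.238:2181
EOF
        i=$((i + 1))
    done
}
```

For example, `gen_broker_configs 192.168.1.238 20 generated` would write the three files for the second server, with broker.id 20, 21, and 22.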
3.5. Start
bin/kafka-server-start.sh config/server.properties &
bin/kafka-server-start.sh config/server1.properties &
bin/kafka-server-start.sh config/server2.properties &
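Starting the three brokers can also be wrapped in a small loop. This is a sketch under assumptions: start_brokers is a hypothetical helper, and it relies on the -daemon option of kafka-server-start.sh to detach each broker instead of a trailing &. It stops at the first missing config file so that a typo in a file name does not go unnoticed.

```shell
#!/bin/sh
# start_brokers is a hypothetical helper (not part of Kafka).
# Usage: start_brokers KAFKA_DIR
start_brokers() {
    kafka_dir=$1
    for cfg in server.properties server1.properties server2.properties; do
        if [ ! -f "$kafka_dir/config/$cfg" ]; then
            echo "missing config: $kafka_dir/config/$cfg" >&2
            return 1
        fi
        # -daemon asks the start script to run the broker in the background.
        "$kafka_dir/bin/kafka-server-start.sh" -daemon "$kafka_dir/config/$cfg" || return 1
    done
}
```

From inside the unpacked Kafka directory, this would be called as `start_brokers .`.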
3.6. Check the listening ports
netstat -tunpl | grep 2181
netstat -tunpl | grep 9090
netstat -tunpl | grep 9091
netstat -tunpl | grep 9092
Check that all four ports are listening, and check that iptables allows traffic to these four ports (or turn iptables off); otherwise Java clients will not be able to connect.
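The four separate checks can be folded into one pass. The sketch below assumes the usual `netstat -tunpl` output layout, in which the local address column ends with `:PORT` and listening TCP sockets are marked LISTEN; missing_ports is a hypothetical helper name. It reads the listing from standard input and prints every expected port that has no listening socket.

```shell
#!/bin/sh
# missing_ports is a hypothetical helper (not part of Kafka or netstat).
# Usage: netstat -tunpl | missing_ports PORT...
missing_ports() {
    listing=$(cat)
    for port in "$@"; do
        # A listening socket shows ":PORT" followed by whitespace
        # in the local address column of a LISTEN line.
        if ! printf '%s\n' "$listing" | grep LISTEN | grep -Eq ":$port[[:space:]]"; then
            echo "$port"
        fi
    done
}
```

Running `netstat -tunpl | missing_ports 2181 9090 9091 9092` prints nothing when all four ports are up. Note that this only shows whether the processes are listening; it does not show whether iptables lets remote clients through.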
IV. Testing
4.1. Create a topic
bin/kafka-topics.sh --create --zookeeper 192.168.1.237:2181 --replication-factor 3 --partitions 1 --topic testTopic
4.2. View the topic details
bin/kafka-topics.sh --describe --zookeeper 192.168.1.237:2181 --topic testTopic
4.3. Send messages from the producer
bin/kafka-console-producer.sh --broker-list 192.168.1.237:9090 --topic testTopic
4.4. Receive all messages with the consumer
bin/kafka-console-consumer.sh --zookeeper 192.168.1.237:2181 --from-beginning --topic testTopic
4.5. Check the consumer offset position
bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --zkconnect 192.168.1.237:2181 --group testtopic
V. The next chapter covers using spring-integration-kafka. Stay tuned!
High-throughput distributed publish-subscribe messaging system Kafka: installation and testing