I. Kafka concepts and terminology
Kafka is a distributed streaming platform that provides high-performance messaging built on a distinctive log file format. It can also be used to build pipelines for large data streams.
Kafka maintains feeds of messages in categories called topics.
A process that publishes messages to a topic is called a producer.
A process that subscribes to topics and processes the published messages is called a consumer.
Kafka runs as a cluster of one or more servers, each of which is called a broker.
Kafka clients and servers communicate over TCP. A Java client is provided, and clients are available in many other languages.
For each topic, the Kafka cluster maintains a partitioned log (partition 1, partition 2, partition 3). Each partition is an ordered, immutable sequence of messages to which new messages are continually appended, called the commit log. Each message in a partition is assigned a sequence number called the offset, which uniquely identifies that message within the partition.
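As a rough illustration of the commit log idea, here is a minimal Python sketch of a single partition. The PartitionLog class and its method names are invented for this example and are not part of any Kafka client; real partitions are stored on disk by the brokers.

```python
class PartitionLog:
    """Toy in-memory stand-in for one partition's commit log."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        # The offset is the position of the message in the log; it only grows.
        offset = len(self._messages)
        self._messages.append(message)
        return offset

    def read(self, offset):
        # Reading does not remove anything, so an offset can be read again later.
        return self._messages[offset]


log = PartitionLog()
first = log.append("This is a message")
second = log.append("This is another message")
print(first, second)    # 0 1
print(log.read(first))  # This is a message
```

Because existing entries are never rewritten, a consumer only has to remember one number, its current offset, to know where it is in the partition.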
The Kafka cluster retains all published messages, whether or not they have been consumed, for a configurable retention period. Kafka's performance is effectively constant with respect to data size, so retaining large amounts of data is not a problem.
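The point that retention depends on age rather than on consumption can be sketched as follows. This is a simplification made for illustration: the apply_retention function is hypothetical, and Kafka applies retention to whole log segments rather than to individual messages.

```python
def apply_retention(log, now, retention_seconds):
    """Keep only messages younger than the retention period.

    `log` is a list of (timestamp, message) pairs. Whether a message has
    been consumed plays no part in the decision.
    """
    return [(ts, msg) for ts, msg in log if now - ts < retention_seconds]


log = [(0, "old"), (90, "recent"), (100, "newest")]
print(apply_retention(log, now=120, retention_seconds=60))
# [(90, 'recent'), (100, 'newest')]
```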
Messaging systems traditionally follow one of two models: queuing and publish-subscribe (broadcast). In queuing, many consumers compete for the data, but each message is delivered to only one of them. In the broadcast model, each message is delivered to all consumers. Kafka unifies the two models through the consumer group abstraction.
Consumers label themselves with a consumer group name (ID), and each message published to a topic is delivered to exactly one consumer within each subscribing consumer group. Consumers can run in separate processes or on separate servers.
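A small Python sketch may make the delivery rule easier to picture. The deliver function, the group IDs, and the consumer names are all made up for this example, and choosing the member with a modulo is an assumption of the sketch, not how Kafka decides.

```python
def deliver(partition, groups):
    """Return, for each group, the single member that receives a message
    coming from `partition`.

    `groups` maps a group ID to its list of member names. Picking the member
    with `partition % len(members)` is a simplification for this sketch only.
    """
    return {group_id: members[partition % len(members)]
            for group_id, members in groups.items()}


# Two consumers share group-a (queue-like); c3 is alone in group-b.
groups = {"group-a": ["c1", "c2"], "group-b": ["c3"]}
for partition in range(3):
    print(partition, deliver(partition, groups))
```

If every consumer uses the same group ID, the topic behaves like a queue. If every consumer uses its own group ID, each of them receives every message, which is the broadcast case.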
Relationship between messages, partitions, and consumers
1. A message is distributed to one partition of the topic according to a hash-based rule;
2. A consumer can connect to multiple partitions;
3. Every partition has a consumer thread connected to it. When consumers subscribe as a group, this allocation is automatic, and you do not specify which partition a consumer connects to;
4. The partitions a consumer is connected to stay fixed while the membership of the group is unchanged. For example, if consumer1 is connected to partition1 and partition3 and consumer2 to partition2, that allocation does not change midway; it is recomputed only when a rebalance occurs, such as when a consumer joins or leaves the group;
5. If there are more consumers than partitions, the extra consumers are not connected to any partition and sit idle.
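The rules in this list can be sketched in a few lines of Python. Both functions are hypothetical helpers written for this example: crc32 is only a stand-in hash (the Java client uses a different one), and the round-robin assignment is one simple strategy, not necessarily the one a given Kafka version applies.

```python
import zlib


def choose_partition(key, num_partitions):
    # Stand-in hash: crc32 is stable across runs, unlike Python's built-in hash().
    # Returns a zero-based partition index.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign(partitions, consumers):
    """Round-robin sketch: every partition gets exactly one consumer of the group."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        assignment[consumers[index % len(consumers)]].append(partition)
    return assignment


partitions = [1, 2, 3]
print(assign(partitions, ["consumer1", "consumer2"]))
# {'consumer1': [1, 3], 'consumer2': [2]}
print(assign(partitions, ["c1", "c2", "c3", "c4"]))
# {'c1': [1], 'c2': [2], 'c3': [3], 'c4': []}
print(choose_partition("user-42", len(partitions)))
```

The second call shows the idle case from item 5: with four consumers and three partitions, c4 ends up with an empty list.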
Common Kafka server script commands
Start Kafka:
bin/kafka-server-start.sh config/server.properties &
Stop Kafka:
bin/kafka-server-stop.sh
1. Topic operations
Create a topic:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic TEST2
Delete a topic:
bin/kafka-topics.sh --delete --zookeeper localhost:2181 --topic topicname
List all topics:
bin/kafka-topics.sh --list --zookeeper localhost:2181
View the details of a topic:
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic topic_name
Modify a topic:
bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic TEST2 --partitions 2
2. Consume messages:
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
3. Produce messages:
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
This is a message
This is another message
Press CTRL + C to end (^C)
Consumer groups
1. List the consumer groups
./kafka-consumer-groups.sh --bootstrap-server 172.16.1.170:9092,172.16.1.171:9092,172.16.172:9092 --list --new-consumer
2. View the consumption status of a specified consumer group (you can see the offsets for each topic)
./kafka-consumer-groups.sh --bootstrap-server 172.16.1.170:9092,172.16.1.171:9092,172.16.172:9092 --describe --group pushconsumer_qaba7b --new-consumer
group, topic, partition, current offset, log end offset, lag, owner
ztest-group, ztest2, 6, 4987, 4987, 0, consumer-7_/172.19.15.113
ztest-group, ztest2, 0, 4876, 4936, 60, consumer-1_/172.19.15.113
ztest-group, ztest2, 3, 5008, 5062, 54, consumer-4_/172.19.15.113
ztest-group, ztest2, 4, 4963, 4992, 29, consumer-5_/172.19.15.113
ztest-group, ztest2, 1, 4900, 4949, 49, consumer-2_/172.19.15.113
ztest-group, ztest2, 2, 5046, 5046, 0, consumer-3_/172.19.15.113
ztest-group, ztest2, 7, 5051, 5051, 0, consumer-8_/172.19.15.113
ztest-group, ztest2, 5, 5010, 5010, 0, consumer-6_/172.19.15.113
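The lag value in the output above is the log end offset minus the current offset, that is, the number of messages in the partition that the group has not consumed yet. A short Python sketch using three of the rows shown (the lag helper is a name chosen for this example):

```python
rows = [
    # (partition, current_offset, log_end_offset), copied from the output above
    (6, 4987, 4987),
    (0, 4876, 4936),
    (3, 5008, 5062),
]


def lag(current_offset, log_end_offset):
    # Messages written to the partition that the group has not yet consumed.
    return log_end_offset - current_offset


lags = {partition: lag(current, end) for partition, current, end in rows}
print(lags)                # {6: 0, 0: 60, 3: 54}
print(sum(lags.values()))  # 114
```

A lag of 0 means the consumer has caught up on that partition. A lag that keeps growing suggests the consumers are slower than the producers.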
The official documentation describes this as follows: Managing Consumer Groups
With the ConsumerGroupCommand tool, we can list, delete, or describe consumer groups. For example, to list all consumer groups across all topics:
> bin/kafka-consumer-groups.sh --zookeeper localhost:2181 --list
test-consumer-group
To view offsets as in the previous example with the ConsumerOffsetChecker, we "describe" the consumer group like this:
> bin/kafka-consumer-groups.sh --zookeeper localhost:2181 --describe --group test-consumer-group
GROUP                TOPIC     PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  OWNER
test-consumer-group  test-foo  0          1               3               2    test-consumer-group_postamac.local-1456198719410-29ccd54f-0
When you're using the new consumer API