Principles and practice of a distributed high-performance messaging system (Kafka MQ)

Source: Internet
Author: User
Tags: message queue, zookeeper

I. Some concepts and understandings about Kafka


Kafka is a distributed streaming platform that provides high-performance messaging built on a distinctive log file format. It can also serve as a pipeline for large data streams.


Kafka maintains feeds of messages in categories called topics.

A process that publishes messages to a topic is called a producer.

A process that subscribes to topics and processes the published messages is called a consumer.

Kafka runs as a cluster of one or more servers, each of which is called a broker.


Kafka clients and servers communicate over TCP. A Java client is provided with Kafka, and clients are available for many other languages.


For each topic, the Kafka cluster maintains a partitioned log (partition 1, partition 2, partition 3). Each partition is an ordered, immutable sequence of messages that is continually appended to, known as a commit log. Each message in a partition is assigned a sequential number called the offset, which uniquely identifies the message within that partition.
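The idea of an append-only partition with sequential offsets can be sketched in a few lines of Python. This is a toy in-memory model for illustration only; the class name `PartitionLog` and its methods are hypothetical and are not part of any Kafka client API.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy in-memory model of one partition: an append-only list of messages."""
    messages: list = field(default_factory=list)

    def append(self, message: str) -> int:
        """Append a message at the end and return the offset assigned to it."""
        self.messages.append(message)
        return len(self.messages) - 1

    def read(self, offset: int, max_messages: int = 10) -> list:
        """Return (offset, message) pairs starting at the given offset."""
        chunk = self.messages[offset:offset + max_messages]
        return list(enumerate(chunk, start=offset))


log = PartitionLog()
first = log.append("This is a message")
second = log.append("This is another message")
print(first, second)   # offsets increase by one per append
print(log.read(1))     # reading does not remove anything from the log
```

Because existing entries are never modified, an offset keeps pointing at the same message for as long as that message is retained.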


The Kafka cluster retains all published messages, whether or not they have been consumed, for a configurable retention period. Kafka's performance is effectively constant with respect to data size, so retaining large amounts of data is not a problem.
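A minimal sketch of time-based retention, assuming each record carries a timestamp in seconds. The function `apply_retention` and the record layout are hypothetical. A real broker works on whole log segment files rather than individual records, so this only illustrates the rule that age, not consumption, decides what is kept.

```python
def apply_retention(records, now, retention_seconds):
    """Keep only records that are younger than the retention window.

    records: list of (offset, timestamp_seconds, message) tuples in offset order.
    Whether a record has been consumed plays no part in the decision.
    """
    cutoff = now - retention_seconds
    return [record for record in records if record[1] >= cutoff]


records = [(0, 100, "a"), (1, 200, "b"), (2, 300, "c")]
kept = apply_retention(records, now=400, retention_seconds=250)
print(kept)   # the surviving records keep their original offsets
```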


Messaging systems typically follow one of two models: queuing and broadcast. In queuing, many consumers compete for the data, and each message is delivered to only one of them. In broadcast, each message is sent to all consumers, so every consumer receives it. Kafka unifies the two models through the consumer group.


Each consumer labels itself with a consumer group name (ID), and each message published to a topic is delivered to exactly one consumer within each subscribing consumer group. Consumers can run in separate processes or on separate servers.
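How consumer groups cover both models can be illustrated with a small simulation. The function `deliver` and the group and member names are hypothetical. Members are picked round-robin per message purely to keep the sketch short, whereas Kafka assigns whole partitions to group members.

```python
from collections import defaultdict


def deliver(messages, groups):
    """Deliver every message to exactly one member of each consumer group.

    groups maps a group name to the list of its member names.
    """
    received = defaultdict(list)
    for index, message in enumerate(messages):
        for group, members in groups.items():
            member = members[index % len(members)]  # illustrative choice only
            received[(group, member)].append(message)
    return dict(received)


messages = ["m0", "m1", "m2", "m3"]

# All consumers share one group: they split the messages (queuing).
queue_like = deliver(messages, {"workers": ["c1", "c2"]})

# Each consumer has its own group: everyone sees every message (broadcast).
broadcast_like = deliver(messages, {"g1": ["c1"], "g2": ["c2"]})

print(queue_like)
print(broadcast_like)
```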



Relationship between messages, partitions, and consumers

1. Each message is routed to one partition of the topic according to a hash-based rule.

2. A consumer can read from multiple partitions.

3. Every partition is read by a consumer thread. The assignment is made automatically within the consumer group; you do not specify which partition a given consumer reads.

4. The partitions assigned to a consumer stay fixed while the membership of the group is stable. For example, if consumer1 reads partition1 and partition3 and consumer2 reads partition2, that assignment does not change on its own; it is only recomputed when a consumer joins or leaves the group.

5. If there are more consumers than partitions, the extra consumers are not assigned any partition and sit idle.
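The rules above can be sketched as two small functions. Both are simplified stand-ins: `partition_for` uses CRC32 as a stable hash, which is an assumption for illustration and not the hash Kafka's own partitioner uses, and `assign_partitions` shows one possible contiguous-range strategy rather than Kafka's actual assignor.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition using a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(partitions, consumers):
    """Spread partitions over consumers in contiguous ranges.

    Consumers beyond the number of partitions get an empty list (idle).
    """
    consumers = sorted(consumers)
    base, extra = divmod(len(partitions), len(consumers))
    assignment, start = {}, 0
    for i, consumer in enumerate(consumers):
        count = base + (1 if i < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment


# The same key always lands on the same partition.
print(partition_for("user-42", 3) == partition_for("user-42", 3))

# Two consumers share three partitions; a fourth consumer would be idle.
print(assign_partitions([0, 1, 2], ["c1", "c2"]))
print(assign_partitions([0, 1, 2], ["c1", "c2", "c3", "c4"]))
```

Calling `assign_partitions` again with a different consumer list models what happens when group membership changes and the partitions are redistributed.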



Common Kafka server script commands


Start Kafka:

bin/kafka-server-start.sh config/server.properties &


Stop Kafka:

bin/kafka-server-stop.sh


1. Topic operations

Create topic:

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic TEST2

Delete topic:

bin/kafka-topics.sh --delete --zookeeper localhost:2181 --topic topicname

List all topics:

bin/kafka-topics.sh --list --zookeeper localhost:2181

View the details of a topic:

bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic topic_name

Modify a topic:

bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic TEST2 --partitions 2


2. Consume messages:

bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning


3. Produce messages:

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test

This is a message

This is another message

Press Ctrl+C (^C) to exit.


Consumer groups

1. List the consumer groups

./kafka-consumer-groups.sh --bootstrap-server 172.16.1.170:9092,172.16.1.171:9092,172.16.1.172:9092 --list --new-consumer

2. View the consumption status of a specific consumer group (this shows the offsets for each topic partition)

./kafka-consumer-groups.sh --bootstrap-server 172.16.1.170:9092,172.16.1.171:9092,172.16.1.172:9092 --describe --group pushconsumer_qaba7b --new-consumer

GROUP, TOPIC, PARTITION, CURRENT OFFSET, LOG END OFFSET, LAG, OWNER
ztest-group, ztest2, 6, 4987, 4987, 0, consumer-7_/172.19.15.113
ztest-group, ztest2, 0, 4876, 4936, 60, consumer-1_/172.19.15.113
ztest-group, ztest2, 3, 5008, 5062, 54, consumer-4_/172.19.15.113
ztest-group, ztest2, 4, 4963, 4992, 29, consumer-5_/172.19.15.113
ztest-group, ztest2, 1, 4900, 4949, 49, consumer-2_/172.19.15.113
ztest-group, ztest2, 2, 5046, 5046, 0, consumer-3_/172.19.15.113
ztest-group, ztest2, 7, 5051, 5051, 0, consumer-8_/172.19.15.113
ztest-group, ztest2, 5, 5010, 5010, 0, consumer-6_/172.19.15.113
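In this output, the lag of a partition is its log end offset minus the group's current offset. A small Python sketch that recomputes it from comma-separated rows is shown here; the helper `lag_by_partition` is hypothetical, and the sample rows are copied from the output above.

```python
SAMPLE = """\
ztest-group, ztest2, 6, 4987, 4987, 0, consumer-7_/172.19.15.113
ztest-group, ztest2, 0, 4876, 4936, 60, consumer-1_/172.19.15.113
ztest-group, ztest2, 3, 5008, 5062, 54, consumer-4_/172.19.15.113
"""


def lag_by_partition(text):
    """Return {partition: log_end_offset - current_offset} for each data row."""
    lags = {}
    for line in text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        partition = int(fields[2])
        current_offset = int(fields[3])
        log_end_offset = int(fields[4])
        lags[partition] = log_end_offset - current_offset
    return lags


lags = lag_by_partition(SAMPLE)
total_lag = sum(lags.values())
print(lags)
print(total_lag)
```

A lag of 0 means the group has read everything currently in that partition; a growing lag suggests the consumers are falling behind the producers.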


The official documentation describes this as follows: Managing Consumer Groups

With the ConsumerGroupCommand tool, we can list, delete, or describe consumer groups. For example, to list all consumer groups across all topics:

> bin/kafka-consumer-groups.sh --zookeeper localhost:2181 --list

test-consumer-group

To view offsets as in the previous example with the ConsumerOffsetChecker, we "describe" the consumer group like this:

> bin/kafka-consumer-groups.sh --zookeeper localhost:2181 --describe --group test-consumer-group

GROUP                TOPIC     PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  OWNER
test-consumer-group  test-foo  0          1               3               2    test-consumer-group_postamac.local-1456198719410-29ccd54f-0

When you're using the new consumer API
