Objective
A recent project of mine uses a message queue for message delivery. Kafka was chosen because the project has to work alongside other Java projects, so I learned a bit about Kafka and am recording it here as notes.
This article does not compare Kafka with other message queues, whether in performance or in usage.
Brief introduction
Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of messaging middleware through its own distinctive design. In short, it is a publish-subscribe messaging system.
Some terms
To use Kafka you need to know a handful of its terms. This article does not discuss whether these terms mean the same thing in other message queues; every term here refers to Kafka only.
Message
A message is the content to be sent, usually wrapped in a message object.
Topic
In plain terms, a topic is where messages are placed, that is, a container for delivering messages. If you think of a message as an envelope, then the topic is the mailbox.
Partition && Log
A partition can be understood as a logical division, like the C:, D:, and E: drives on a computer's disk. Kafka maintains a log file for each partition.
Each partition is an ordered, immutable sequence of messages. Incoming messages are appended to the end of the log file, in the manner of a commit log.
Each message in a partition has a sequence number called the offset, which is unique within that partition and increases monotonically.
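The append-only behavior and the per-partition offset can be sketched in a few lines of C#. This is only an in-memory illustration: PartitionLog and its methods are hypothetical names invented for this note, not part of Kafka or of any client library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model of a single partition (illustration only).
public sealed class PartitionLog
{
    private readonly List<string> _messages = new List<string>();

    // Appends a message to the end of the log and returns its offset.
    public long Append(string message)
    {
        _messages.Add(message);
        return _messages.Count - 1;
    }

    // Reads the message at an offset; existing entries are never modified.
    public string Read(long offset)
    {
        return _messages[(int)offset];
    }
}

public static class PartitionLogDemo
{
    public static void Main()
    {
        var log = new PartitionLog();
        long first = log.Append("order-created");
        long second = log.Append("order-paid");
        Console.WriteLine($"{first}: {log.Read(first)}");   // prints "0: order-created"
        Console.WriteLine($"{second}: {log.Read(second)}"); // prints "1: order-paid"
    }
}
```

Offsets here start at 0 and grow by one per message, and each PartitionLog instance counts independently, which mirrors the point that an offset is only unique inside its own partition.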
The log records the messages received by a partition. Because the messages of a topic can be spread over one or more partitions, each partition gets its own log directory on disk, usually named <topic_name>-<partition_id>. That directory holds the partition's commit log files.
The Kafka cluster retains all published messages for a configurable period of time, whether or not they have been consumed. For example, with retention set to 2 days, a message can be consumed for 2 days after it is published; after that it is discarded to free up space. Kafka's performance is effectively constant with respect to data size, so retaining a large amount of message data does not cause performance problems.
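A rough sketch of time-based retention follows. On a real broker the window is set in the broker configuration (for example with log.retention.hours), and Kafka removes whole log segment files rather than individual messages; the StoredMessage and Retention types below are hypothetical and only illustrate the rule that age, not consumption, decides what is kept.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record of a message held in a partition log (illustration only).
public sealed class StoredMessage
{
    public StoredMessage(long offset, DateTime timestampUtc)
    {
        Offset = offset;
        TimestampUtc = timestampUtc;
    }

    public long Offset { get; }
    public DateTime TimestampUtc { get; }
}

public static class Retention
{
    // Keeps every message that is no older than the retention window,
    // regardless of whether it has been consumed.
    public static List<StoredMessage> Prune(
        IEnumerable<StoredMessage> log, TimeSpan retention, DateTime nowUtc)
    {
        DateTime cutoff = nowUtc - retention;
        return log.Where(m => m.TimestampUtc >= cutoff).ToList();
    }
}

public static class RetentionDemo
{
    public static void Main()
    {
        DateTime now = new DateTime(2024, 1, 10, 0, 0, 0, DateTimeKind.Utc);
        var log = new List<StoredMessage>
        {
            new StoredMessage(0, now.AddDays(-3)),  // older than 2 days, dropped
            new StoredMessage(1, now.AddDays(-1)),  // kept
            new StoredMessage(2, now.AddHours(-1)), // kept
        };

        List<StoredMessage> kept = Retention.Prune(log, TimeSpan.FromDays(2), now);
        Console.WriteLine(string.Join(", ", kept.Select(m => m.Offset))); // prints "1, 2"
    }
}
```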
Partitioning the log serves two main purposes. First, it lets the log grow beyond what fits on a single server: each individual partition is limited by the capacity of the server that hosts it, but a topic can have many partitions, so it can handle an arbitrary amount of data. Second, a partition acts as the unit of parallelism.
Producer
As with other message queues, the producer is the source of messages.
In Kafka, the producer decides which partition of the specified topic each message is sent to.
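One common way for a producer to pick a partition is to hash the message key and take the result modulo the number of partitions, so that messages with the same key always land in the same partition. The sketch below assumes that approach and uses an FNV-1a hash purely for illustration; real Kafka clients ship their own partitioners with their own hash functions, and the Partitioner class here is a hypothetical name.

```csharp
using System;
using System.Text;

// Hypothetical key-based partitioner (illustration only, not a client API).
public static class Partitioner
{
    // Maps a key to a partition index in the range [0, partitionCount).
    public static int ChoosePartition(string key, int partitionCount)
    {
        if (partitionCount <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(partitionCount));
        }

        unchecked
        {
            // FNV-1a over the UTF-8 bytes of the key: stable across runs,
            // unlike string.GetHashCode(), which may differ per process.
            uint hash = 2166136261u;
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= 16777619u;
            }
            return (int)(hash % (uint)partitionCount);
        }
    }
}

public static class PartitionerDemo
{
    public static void Main()
    {
        // The same key is always routed to the same partition.
        int first = Partitioner.ChoosePartition("user-42", 3);
        int second = Partitioner.ChoosePartition("user-42", 3);
        Console.WriteLine(first == second); // prints "True"
    }
}
```

Keeping one key in one partition matters because ordering is only guaranteed inside a partition, not across the whole topic.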
Consumer
Consumers are the users of messages, and a few terms need to be distinguished on the consumer side.
Message queues generally offer two consumption modes: queue mode and subscription mode.
Queue mode: one-to-one. A message can be consumed by only one consumer and cannot be consumed again. A queue usually supports multiple consumers, but any given message is consumed by just one of them.
Subscription mode: one-to-many. A message may be consumed multiple times. The producer publishes the message to a topic, and every consumer subscribed to that topic can consume it.
Consumer && Subscriber
Group: a group is a collection of consumers, and each group contains one or more consumers. In Kafka, a message is consumed only once within a group.
In the publish-subscribe model, consumers subscribe as groups, that is, consumer groups, and the relationship is as follows:
Each message published to a topic is delivered to exactly one consumer in each consumer group subscribed to that topic. In other words, every group receives the message, but only one consumer in each group consumes it.
Kafka was designed from the start as a publish-subscribe message queue, so in Kafka the queue mode is implemented with a single consumer group: when the whole system has only one consumer group, messages are load-balanced across the consumers in it.
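The delivery rule (every subscribed group gets the message, but only one consumer inside each group handles it) can be simulated as below. Note the simplification: this sketch rotates through the consumers of a group message by message, whereas Kafka itself balances a group by assigning partitions to its consumers. ConsumerGroup and Router are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of a consumer group (illustration only).
public sealed class ConsumerGroup
{
    private readonly string[] _consumers;
    private int _next;

    public ConsumerGroup(string name, params string[] consumers)
    {
        if (consumers.Length == 0)
        {
            throw new ArgumentException("A group needs at least one consumer.");
        }
        Name = name;
        _consumers = consumers;
    }

    public string Name { get; }

    // Picks exactly one consumer of this group for the next message.
    public string NextConsumer()
    {
        string chosen = _consumers[_next];
        _next = (_next + 1) % _consumers.Length;
        return chosen;
    }
}

public static class Router
{
    // For one message, returns the single receiving consumer of every group.
    public static Dictionary<string, string> Route(IEnumerable<ConsumerGroup> groups)
    {
        var receivers = new Dictionary<string, string>();
        foreach (ConsumerGroup group in groups)
        {
            receivers[group.Name] = group.NextConsumer();
        }
        return receivers;
    }
}

public static class RouterDemo
{
    public static void Main()
    {
        var groups = new List<ConsumerGroup>
        {
            new ConsumerGroup("billing", "billing-1", "billing-2"),
            new ConsumerGroup("audit", "audit-1"),
        };

        // Two messages: both groups see each one, one consumer per group.
        for (int message = 0; message < 2; message++)
        {
            foreach (KeyValuePair<string, string> pair in Router.Route(groups))
            {
                Console.WriteLine($"message {message} -> {pair.Key}/{pair.Value}");
            }
        }
    }
}
```

With a single group in the list the simulation behaves like queue mode; with several groups it behaves like subscription mode, which is the distinction the text draws.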
Kafka Cluster
Broker: a Kafka cluster consists of multiple servers, each of which is called a broker. The messages of one topic are partitioned across different brokers according to a key and a partitioning algorithm.
Source: http://blog.csdn.net/lizhitao
A Kafka cluster works by distributing partitions across its servers, so every server in the cluster shares the data and the requests. Each partition's log is replicated a configurable number of times across the machines in the cluster to provide failover.
Each partition has one server acting as its "leader" and zero or more servers acting as "followers". The leader handles all read and write requests for the partition, while the followers passively replicate the leader. If the leader fails, one of the followers is automatically elected as the new leader. A server that leads some partitions is also a follower for others, which balances the load across the cluster.
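The failover behavior can be pictured with a small model. It is deliberately naive: it promotes the first remaining follower, whereas a real cluster only elects a replica that is sufficiently caught up with the leader. PartitionReplicas and the broker ids are hypothetical and exist only for this sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of the replicas of one partition (illustration only).
public sealed class PartitionReplicas
{
    private readonly List<int> _followers;

    public PartitionReplicas(int leader, IEnumerable<int> followers)
    {
        Leader = leader;
        _followers = new List<int>(followers);
    }

    // Broker id currently serving all reads and writes for the partition.
    public int Leader { get; private set; }

    public IReadOnlyList<int> Followers => _followers;

    // Reacts to a broker failure: a failed follower is simply dropped,
    // a failed leader is replaced by promoting the first remaining follower.
    public void OnBrokerFailed(int brokerId)
    {
        if (brokerId != Leader)
        {
            _followers.Remove(brokerId);
            return;
        }

        if (_followers.Count == 0)
        {
            throw new InvalidOperationException("No follower left to take over.");
        }

        Leader = _followers[0];
        _followers.RemoveAt(0);
    }
}

public static class FailoverDemo
{
    public static void Main()
    {
        var replicas = new PartitionReplicas(leader: 1, followers: new[] { 2, 3 });
        replicas.OnBrokerFailed(1);
        Console.WriteLine(replicas.Leader);          // prints "2"
        Console.WriteLine(replicas.Followers.Count); // prints "1"
    }
}
```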
.NET Core Kafka Clients
For .NET Core there is an open source Kafka SDK project, RdKafka. It also supports .NET 4.5 and is cross-platform: it runs on Linux, macOS, and Windows.
RdKafka GitHub: https://github.com/ah-/rdkafka-dotnet
RdKafka NuGet: Install-Package RdKafka
Producer API
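// Requires: using System; using System.Text; using RdKafka;
// Run inside an async method, because Produce is awaited.

// The Producer accepts a list of one or more brokers
using (Producer producer = new Producer("127.0.0.1:9092"))
// Send to a topic named "testtopic"; it is created if it does not exist
using (Topic topic = producer.Topic("testtopic"))
{
    // Convert the message to a byte[]
    byte[] data = Encoding.UTF8.GetBytes("Hello RdKafka");
    DeliveryReport deliveryReport = await topic.Produce(data);
    Console.WriteLine($"Sent to partition: {deliveryReport.Partition}, offset: {deliveryReport.Offset}");
}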
Consumer API
Because Kafka consumes in consumer groups, a GroupId must be specified.
Internally, the consumer watches the topic for messages through a polling mechanism, which is also the approach Kafka recommends. In RdKafka the polling interval is 1 second.
// Requires: using System; using System.Text; using RdKafka;

// Configure the consumer group
var config = new Config() { GroupId = "example-csharp-consumer" };
using (var consumer = new EventConsumer(config, "127.0.0.1:9092"))
{
    // Register an event handler that runs for every received message
    consumer.OnMessage += (obj, msg) =>
    {
        string text = Encoding.UTF8.GetString(msg.Payload, 0, msg.Payload.Length);
        Console.WriteLine($"Topic: {msg.Topic} Partition: {msg.Partition} Offset: {msg.Offset} {text}");
    };

    // Subscribe to one or more topics
    consumer.Subscribe(new[] { "testtopic" });

    // Start consuming
    consumer.Start();

    Console.WriteLine("Started consumer, press ENTER to stop consuming");
    Console.ReadLine();
}