What is Kafka used for
Consider the following scenario:
1. There is an interface that lets a user create a draft comment.
2. The draft is checked: if it is legal, it is published; if forbidden words are found, it is deleted.
3. If it is deleted, the user's violation count is incremented by 1.
4. If it is published, the article's comment count is incremented by 1, and the article's author is notified that they have received a comment.
5. .....
The follow-up operations can be extended indefinitely, and it would be far too long-winded to write them all inside a single "create comment" operation. From an event-driven point of view, it is like a drop of water falling into a lake and setting off ripple after ripple. So is there a way for each operation to focus only on its own changes, while the subsequent operations discover the change and start by themselves? The answer is to use a message queue.
Kafka is such a message queue. Using Kafka, we can change the scenario above to:
1. After creating a draft comment, send a comment_create message.
2. A consumer finds the comment_create message, checks whether the comment is legal, and chooses to delete or pass the comment. If the comment passes, the consumer sends a comment_created message; if not, it sends a comment_illegal message.
3. A consumer finds the comment_created message, notifies the article's author, and increments the article's comment count by 1.
4. A consumer finds the comment_illegal message and records the violation against the user.
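The message flow in this list can be sketched with a tiny in-memory bus. This is only an illustration of the event-driven idea, not Kafka client code: the `MessageBus` class, the forbidden-word list, and the handler names are all hypothetical, and a real deployment would replace the bus with Kafka producers and consumers.

```python
from collections import defaultdict, deque

FORBIDDEN_WORDS = {"spam"}  # hypothetical forbidden-word list


class MessageBus:
    """Tiny in-memory stand-in for a message queue (not Kafka itself)."""

    def __init__(self):
        self.handlers = defaultdict(list)  # topic -> consumer callbacks
        self.queue = deque()               # pending (topic, payload) pairs

    def subscribe(self, topic, handler):
        self.handlers[topic].append(handler)

    def send(self, topic, payload):
        self.queue.append((topic, payload))

    def run(self):
        # Keep delivering until no consumer produces a follow-up message.
        while self.queue:
            topic, payload = self.queue.popleft()
            for handler in self.handlers[topic]:
                handler(payload)


bus = MessageBus()
comment_counts = defaultdict(int)    # article -> published comment count
violation_counts = defaultdict(int)  # user -> violation count
notifications = []                   # (author, text) pairs


def check_comment(comment):
    # Step 2: decide whether the draft is legal and emit the matching message.
    words = set(comment["text"].lower().split())
    if words & FORBIDDEN_WORDS:
        bus.send("comment_illegal", comment)
    else:
        bus.send("comment_created", comment)


def on_created(comment):
    # Step 3: bump the article's comment count and notify its author.
    comment_counts[comment["article"]] += 1
    notifications.append((comment["author"], "You received a new comment"))


def on_illegal(comment):
    # Step 4: record the violation against the commenting user.
    violation_counts[comment["user"]] += 1


bus.subscribe("comment_create", check_comment)
bus.subscribe("comment_created", on_created)
bus.subscribe("comment_illegal", on_illegal)


def create_draft_comment(user, article, author, text):
    # Step 1: the "create comment" code only announces what it did.
    bus.send("comment_create",
             {"user": user, "article": article, "author": author, "text": text})


create_draft_comment("alice", "post-1", "bob", "Nice article")
create_draft_comment("mallory", "post-1", "bob", "buy spam now")
bus.run()
```

Note that `create_draft_comment` knows nothing about checking, counting, or notifying. New follow-up behavior can be added by subscribing another handler, without touching the code that creates the comment.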
As you can see, these operations spread like ripples, from 1 to 2 and then on to 3 or 4, and each step is asynchronous.
What are producers and consumers
Producers produce messages, which are then consumed by the designated consumers.
What is a topic
Every Kafka message must specify a topic: each producer names a topic as the target when it sends a message, and consumers likewise consume from the corresponding topic (in fact, there is more to it than that).
What is a group
Each consumer belongs to a consumer group, and each group can have more than one consumer. A group can subscribe to a topic, and every message sent to that topic is delivered to each group subscribed to it.
Each message a group receives is consumed by exactly one consumer in that group; which one depends on how the topic's partitions are assigned to the group's consumers.
[Figure: schematic of a topic delivering messages to consumer groups]
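The group delivery rule can also be sketched in code. The `Topic` class and the group and consumer names below are hypothetical, and the sketch picks a group member round-robin purely as a simplification; it is a stand-in for, not a model of, how Kafka actually assigns work inside a group.

```python
from itertools import cycle


class Topic:
    """In-memory illustration of group delivery (not a Kafka API)."""

    def __init__(self, name):
        self.name = name
        self.groups = {}  # group name -> iterator cycling over its consumers

    def subscribe(self, group, consumers):
        self.groups[group] = cycle(consumers)

    def publish(self, message):
        # Every subscribed group gets the message,
        # but only one consumer inside each group handles it.
        deliveries = []
        for group, members in self.groups.items():
            consumer = next(members)
            deliveries.append((group, consumer, message))
        return deliveries


comment_topic = Topic("comment_created")
comment_topic.subscribe("notifier", ["notifier-1", "notifier-2"])
comment_topic.subscribe("counter", ["counter-1"])

first = comment_topic.publish("comment 1")
second = comment_topic.publish("comment 2")
```

Both groups see both messages, while within the `notifier` group the two messages land on different members.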
What is a partition
Now we get to some deeper concepts. Each topic is divided into multiple partitions, and each message is stored in one partition chosen by an algorithm that the producer specifies. In other words, the data is stored separately, which avoids piling large amounts of data onto a single Kafka instance. In addition, more partitions means more consumers can be accommodated, which effectively increases the capacity for concurrent consumption (see below for details).
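One common way for a producer to choose a partition is to hash a message key and take the result modulo the number of partitions. The sketch below assumes that approach; `PartitionedTopic` is a hypothetical class, and CRC32 is used only because it is deterministic and in the standard library, not because it is the hash Kafka itself uses.

```python
import zlib


class PartitionedTopic:
    """In-memory illustration of key-based partitioning (not a Kafka API)."""

    def __init__(self, num_partitions):
        # Each partition is modeled as its own append-only list.
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # crc32 gives the same value on every run,
        # unlike Python's built-in hash() for strings.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def send(self, key, value):
        index = self.partition_for(key)
        self.partitions[index].append(value)
        return index


comments = PartitionedTopic(num_partitions=3)
users = ["alice", "bob", "carol", "alice", "bob", "alice"]
chosen = [comments.send(user, f"comment by {user}") for user in users]
```

Because the partition depends only on the key, all messages with the same key end up in the same partition, while different keys are spread over the available partitions.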
Any message posted to a partition is appended directly to the end of that partition's log file. Each message's position in the file is called its offset; the offset is a long integer that uniquely identifies a message within the partition.
What is a replica (backup)
Kafka lets you specify the number of replicas per partition, so that if a machine crashes, a machine holding a replica of the data can take over immediately.
If each partition has multiple replicas, which one handles reads and writes? This is where the concept of a leader comes in: for each partition, Kafka selects one replica as the leader, and all the others are followers.
The leader is responsible for all reads and writes; if it fails, one of the followers takes over as the new leader. Under normal circumstances, followers simply keep requesting the latest data from the leader (synchronizing data).
The server acting as leader therefore carries all the request load for its partition. Looking at the cluster as a whole, more partitions means more leaders, which spreads that load across servers and helps keep the whole cluster stable.
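The leader and follower roles can be illustrated with a small in-memory model. Everything here is hypothetical (the `Partition` class, the broker names, and especially the rule of promoting the follower with the longest log); Kafka's real leader election is more involved, so treat this only as a sketch of the idea of failover.

```python
class Partition:
    """In-memory illustration of leader/follower replicas (not a Kafka API)."""

    def __init__(self, replica_ids):
        self.logs = {rid: [] for rid in replica_ids}  # replica -> its log copy
        self.leader = replica_ids[0]

    def followers(self):
        return [rid for rid in self.logs if rid != self.leader]

    def write(self, message):
        # All writes go through the leader.
        self.logs[self.leader].append(message)

    def read(self):
        # All reads are served by the leader as well.
        return list(self.logs[self.leader])

    def sync_followers(self):
        # Each follower copies whatever it is missing from the leader's log.
        leader_log = self.logs[self.leader]
        for rid in self.followers():
            log = self.logs[rid]
            log.extend(leader_log[len(log):])

    def fail_leader(self):
        # Simplified failover: drop the leader and promote the follower
        # that holds the most data.
        del self.logs[self.leader]
        self.leader = max(self.logs, key=lambda rid: len(self.logs[rid]))


partition = Partition(["broker-1", "broker-2", "broker-3"])
partition.write("m1")
partition.write("m2")
partition.sync_followers()

partition.fail_leader()   # broker-1 crashes
partition.write("m3")     # the new leader keeps accepting writes
```

After the failover, reads and writes continue against the promoted replica, and the data written before the crash is still available because the followers had synchronized it.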