Why might Kafka lose messages?

Source: Internet
Author: User
Dear friends, I have recently been studying Kafka and keep reading that Kafka may lose messages. I don't understand in which scenarios a log system can tolerate losing messages.

For example, in a real-time log analysis system, the log information I see may be incomplete. If the error logs are missing, how can the problem be located?

I have also read that when a node in a Kafka cluster crashes, the messages on that node may be lost (a comparison between Kafka and RabbitMQ claimed that RabbitMQ does not have this problem).

If Kafka is so unreliable, why are so many companies using it?

Reply content:

I don't know how RabbitMQ works, so I will only talk about Kafka.

Kafka can lose messages mainly at two points.

  1. When messages are flushed to disk

Messages can be flushed to disk either asynchronously or synchronously. Synchronous flushing is significantly more reliable than asynchronous flushing. However, some scenarios pursue performance and accept lower reliability, and in those cases asynchronous flushing can be enabled.

  2. Message storage maintenance

This is not about whether the storage is persistent, but about how well it can be maintained. Oracle and MySQL have been storing data for so long that their disaster recovery tools are very complete and form a whole ecosystem (if something goes wrong, you can find people who can solve the problem). But who really knows Kafka's storage? There are very few tools!

In addition, there is the storage medium itself: disks. Without RAID, a single damaged disk can lose data. With RAID, the cost increases. If you keep multiple replicas, network synchronization latency causes momentary data inconsistency between them.
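The point about replicas and synchronization latency can be illustrated with a toy model. The sketch below is not Kafka's real replication protocol; `ToyPartition` and `lost_after_crash` are hypothetical names invented for this example. It only shows why a message acknowledged before the follower has copied it can disappear when the leader crashes, and why waiting for the follower avoids that.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPartition:
    """Toy model: one leader and one follower (not Kafka's actual protocol)."""
    leader: list = field(default_factory=list)
    follower: list = field(default_factory=list)

    def replicate(self):
        # The follower catches up with everything the leader currently holds.
        self.follower = list(self.leader)

    def produce(self, msg, wait_for_follower):
        # wait_for_follower=True: acknowledge only after the follower has the message.
        # wait_for_follower=False: acknowledge as soon as the leader has it.
        self.leader.append(msg)
        if wait_for_follower:
            self.replicate()
        return msg  # returning means "acknowledged to the producer"

    def crash_leader(self):
        # The leader's copy is gone; the follower takes over with what it had.
        self.leader = list(self.follower)


def lost_after_crash(wait_for_follower):
    """Return the acknowledged messages that are missing after a leader crash."""
    p = ToyPartition()
    acked = [p.produce(f"m{i}", wait_for_follower) for i in range(3)]
    p.replicate()  # background replication catches up once
    acked += [p.produce(f"m{i}", wait_for_follower) for i in range(3, 5)]
    p.crash_leader()  # crash before the follower copied the last two messages
    return [m for m in acked if m not in p.leader]
```

In this model, `lost_after_crash(False)` reports the two messages written after the last background copy, while `lost_after_crash(True)` reports none, at the price of waiting for the follower on every write.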

Conclusion: if you require that Kafka lose no data at all, that is achievable (barring major disasters, such as the data center being hit by an atomic bomb, or misconfiguration, such as a flush interval that is too long or a RAID level that is too low). The cost is some lost performance.
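As a rough sketch of that trade-off, the snippet below groups a few real Kafka configuration keys into two profiles. The keys (`acks`, `enable.idempotence`, `min.insync.replicas`, `flush.messages`, `default.replication.factor`, `unclean.leader.election.enable`) exist in Kafka, but the chosen values and the helper names `kafka_settings` and `to_properties` are assumptions made for illustration, not a recommendation from the original answer. Note that Kafka's own documentation generally suggests relying on replication rather than forcing a flush on every message.

```python
def kafka_settings(durable: bool) -> dict:
    """Return illustrative settings; the values are assumptions, not tuned advice."""
    if durable:
        return {
            # Producer waits for all in-sync replicas before a send counts as done.
            "producer": {"acks": "all", "enable.idempotence": "true"},
            # Topic needs 2 in-sync replicas and fsyncs after every message.
            "topic": {"min.insync.replicas": "2", "flush.messages": "1"},
            # Broker keeps 3 copies and never elects an out-of-sync leader.
            "broker": {
                "default.replication.factor": "3",
                "unclean.leader.election.enable": "false",
            },
        }
    return {
        # Producer waits for the leader only; faster, but a leader crash can lose data.
        "producer": {"acks": "1", "enable.idempotence": "false"},
        # Flushing is left to the operating system (no flush.messages override).
        "topic": {"min.insync.replicas": "1"},
        "broker": {"default.replication.factor": "1"},
    }


def to_properties(section: dict) -> str:
    """Render one section as lines in .properties format, sorted by key."""
    return "\n".join(f"{key}={value}" for key, value in sorted(section.items()))


if __name__ == "__main__":
    print(to_properties(kafka_settings(durable=True)["producer"]))
```

The durable profile pays for safety with extra network round trips and disk syncs; the fast profile is closer to what a high-volume log collection pipeline might accept.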

Therefore, Kafka is generally used in scenarios where losing a small amount of data is acceptable but the overall throughput is very high, such as log collection and statistical analysis (losing a few hundred records does not affect a sample space of hundreds of millions).

Kafka can also be used for data synchronization between two reliable stores, such as MySQL (write) -> MySQL (read). Because MySQL (write) can replay its data, both recovery speed and reliability can be ensured when Kafka fails.
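The replay idea can be sketched as follows. This is a simplified model under stated assumptions: the reliable writer is represented by a plain list indexed by position, the lossy channel by the list of `(position, row)` pairs that actually arrived, and the function name `sync_with_replay` is hypothetical. A real setup would read positions from something like a database change log and would need its own way to learn the writer's latest position.

```python
def sync_with_replay(source_log, delivered):
    """Build the replica from delivered messages, replaying any gaps from the source.

    source_log: rows held by the reliable writer; the list index is the position.
    delivered:  (position, row) pairs that arrived through the lossy channel, in order.
    """
    replica = []
    for position, row in delivered:
        if position < len(replica):
            continue  # already applied, e.g. a duplicate delivery
        # Positions skipped by the channel are re-read from the reliable source.
        replica.extend(source_log[len(replica):position])
        replica.append(row)
    # Messages lost at the tail are recovered the same way.
    replica.extend(source_log[len(replica):])
    return replica
```

For example, if the writer holds five rows and only positions 0 and 3 arrive through the channel, the replica still ends up identical to the writer, because every missing position is filled in from the source.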
