Why Kafka (II)

Last Update: 2015-06-12 Source: Internet

Author: User

This post answers a few questions raised by readers. If anything is unclear, see the previous article.

1. How should Kafka's deletion policy be configured? To improve performance, should consumed data be deleted after 1 hour?

Configure it according to the size of the disk. As long as there is enough disk space, there is no need to worry about deleting data. Kafka's throughput does not drop as the data volume grows, because Kafka reads and writes data sequentially and only records the offset, so the time complexity is O(1). I have tested this with data at the terabyte scale, and throughput was completely unaffected. On the contrary, deleting data too quickly makes data loss more likely.
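As a rough illustration of sizing retention from the disk rather than from a fixed "1 hour" rule, the sketch below estimates how many hours of data a broker's disk can hold. The function name, the 70% headroom figure, and the sample numbers are assumptions made for this example; the result would then be applied through broker settings such as `log.retention.hours` or `log.retention.bytes`.

```python
def retention_hours(disk_bytes, ingest_bytes_per_sec, headroom_percent=70):
    """Estimate how many whole hours of incoming data fit on one broker's disk.

    headroom_percent is the share of the disk we are willing to fill with
    log segments (an assumed safety margin, not a Kafka default).
    """
    if ingest_bytes_per_sec <= 0:
        raise ValueError("ingest rate must be positive")
    usable_bytes = disk_bytes * headroom_percent // 100
    bytes_per_hour = ingest_bytes_per_sec * 3600
    return usable_bytes // bytes_per_hour


# Example: a 2 TB disk receiving 10 MB/s can keep roughly a day and a half.
hours = retention_hours(2 * 10**12, 10 * 10**6)
print(f"log.retention.hours={hours}")
```

With these sample figures the estimate is far larger than one hour, which is the point of the answer above: retention is bounded by disk space, not by throughput.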

2. What should be done when sending a message fails and the specified number of retries has been reached?

The client can set the number of retries and the retry interval. Because Kafka generally runs as a cluster, retries that keep failing usually mean the application has lost its network connection to the Kafka cluster. If the application crashes while it is retrying, the message is lost. To avoid this, you need to persist the message, either locally or remotely. Local persistence is not very safe, because the application server is now likely to be a virtual machine or a container; remote persistence is relatively safe. But remote means you need a network, so what if remote persistence also fails? For this kind of problem, the last lifeline is the log.

This type of problem is not specific to MQ. It also arises in storage and is common in distributed scenarios, but developers often overlook it because the probability is small. This is why settlement accounts never quite reconcile. It is often worth weighing how to handle such low-probability events. Important systems usually have a scheduled reconciliation check as a compensation mechanism for low-probability events.
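The fallback chain described here (retry, then remote persistence, then a log entry as the last resort) can be sketched as follows. This is a minimal illustration, not a Kafka client API: `send` and `persist_remote` are hypothetical callables supplied by the application, and the logger name is made up for the example.

```python
import logging
import time

logger = logging.getLogger("mq.fallback")  # hypothetical logger name


def send_with_fallback(message, send, persist_remote, retries=3, retry_interval=0.0):
    """Try to send a message; fall back to remote storage, then to the log.

    Returns "sent", "persisted", or "logged" so the caller knows which
    layer finally accepted the message.
    """
    for attempt in range(1, retries + 1):
        try:
            send(message)
            return "sent"
        except Exception as exc:
            logger.warning("send attempt %d failed: %s", attempt, exc)
            time.sleep(retry_interval)

    # Retries exhausted: hand the message to remote storage for later replay.
    try:
        persist_remote(message)
        return "persisted"
    except Exception as exc:
        # Last lifeline: record the message so a scheduled check can recover it.
        logger.error("UNDELIVERED message=%r reason=%s", message, exc)
        return "logged"
```

A scheduled reconciliation job could then replay whatever ended up in remote storage or in the log.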

3. If the total number of replicas is f, how many replicas can be lost?

At most f-1 replicas can be lost; that is, it is enough for one replica to survive. This, of course, depends on the broker's configuration. From the server's point of view, distributing updated data to the whole system as quickly as possible, and so shrinking the time window before eventual consistency is reached, is an important part of improving the system's availability and user experience. For distributed data systems:

a) N: the number of copies of the data

b) W: the number of nodes that must complete the write when data is updated

c) R: the number of nodes that must be read when data is read

For any distributed system to maintain strong consistency on the server side, it must satisfy W+R>N. For example, suppose there are 3 nodes in total and a write returns only after all three nodes have been written successfully (N=3, W=3, so R=1 is enough). Then, as long as one node survives, the data read is guaranteed to be up to date.
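The W+R>N rule can be checked mechanically. The small helper below is only an illustration of the inequality, with a function name chosen for this example. It shows that the three-node case above (write to all three, read from one) qualifies, while writing to one node and reading from one does not.

```python
def is_strongly_consistent(n, w, r):
    """Return True when every read set must overlap every write set (W + R > N)."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must each be between 1 and N")
    return w + r > n


print(is_strongly_consistent(3, 3, 1))  # write all, read one
print(is_strongly_consistent(3, 2, 2))  # majority write, majority read
print(is_strongly_consistent(3, 1, 1))  # read may miss the latest write
```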

4. Are messages in Kafka ordered?

Within a single partition, messages are strictly ordered. The producer can set the partitioning policy, including a custom one, so that messages are partitioned according to the business. For example, if the messages relate to users, they can be partitioned by user ID: all operations of the same user go to the same partition, and ordering is achieved.
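The idea of partitioning by user ID can be sketched as a deterministic key-to-partition mapping. This is an illustration only: it uses CRC32 as a stand-in hash, which is not the hash Kafka's built-in partitioner uses, and the function name is an assumption for this example.

```python
import zlib


def partition_for(user_id, num_partitions):
    """Map a user ID to a partition so the same user always lands in the same one."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    key = str(user_id).encode("utf-8")
    # CRC32 is stable across processes, unlike Python's built-in hash() for strings.
    return zlib.crc32(key) % num_partitions


# Every event for the same user goes to the same partition, preserving its order.
events = [("user-42", "login"), ("user-7", "login"), ("user-42", "purchase")]
for user_id, action in events:
    print(user_id, action, "-> partition", partition_for(user_id, 8))
```

Note that changing the number of partitions changes the mapping, so ordering per user only holds while the partition count stays fixed.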

Of course, ordering also has a cost: ordering means blocking. If consumption of a message keeps failing, the consuming process is blocked. A flexible approach is to retry a certain number of times, then persist the message to remote storage, skip it, and continue consuming. This means giving up the ordering.
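The "retry, then set aside and move on" approach can be sketched as below. The names `handle` and `park` are hypothetical callables standing in for the business logic and for the remote store; this is not a Kafka consumer API.

```python
def consume(messages, handle, park, max_retries=3):
    """Process messages in order; set aside any that fail max_retries times.

    Returns the list of messages that were set aside, i.e. the ones whose
    ordering guarantee was given up so the rest could proceed.
    """
    parked = []
    for message in messages:
        for _attempt in range(max_retries):
            try:
                handle(message)
                break  # success: move on to the next message
            except Exception:
                continue  # failure: try again
        else:
            # All retries failed: persist the message elsewhere and skip it.
            park(message)
            parked.append(message)
    return parked
```

A set-aside message will be processed later than the messages that followed it, which is exactly the loss of ordering described above.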

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
