Kafka version differences

Kafka 0.8.2 new features

The producer no longer distinguishes between sync and async: all requests are sent asynchronously, improving client efficiency. Each producer request returns a response object containing the offset or an error message. Messages are batched asynchronously to the Kafka broker, which reduces server-side resource overhead. Because all network communication between the new producer and the servers is asynchronous, waiting for all replicas to acknowledge under acks = -1 has drastically lower latency than before.
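As an illustrative sketch (the broker address and values are placeholders, not taken from the original text), the batching and durability trade-offs above map onto the new producer's configuration like this:

```properties
# Hypothetical producer configuration sketch for the 0.8.2 new producer
bootstrap.servers=broker1:9092   # placeholder broker address
acks=-1            # wait for all in-sync replicas to acknowledge each write
batch.size=16384   # batch messages per partition to cut per-request overhead
linger.ms=5        # wait briefly to fill batches before sending
```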

Prior to 0.8.2, there was a bug in Kafka's topic deletion feature.

Prior to 0.8.2, the consumer periodically committed the offsets of consumed messages to ZooKeeper. For ZooKeeper, every write operation is expensive, and a ZooKeeper cluster cannot scale its write capacity. Beginning with 0.8.2, consumer offsets can instead be recorded in a compacted topic (__consumer_offsets), which is configured with the highest persistence guarantee, acks = -1. Each entry in __consumer_offsets is keyed by the triple <consumer group, topic, partition> with the offset as the value, and an up-to-date view is maintained in memory, so reads are fast.
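As a small illustration of how a group's offsets land in __consumer_offsets: Kafka picks the partition by taking a non-negative hash of the group id modulo the topic's partition count (50 by default, per offsets.topic.num.partitions). A minimal sketch, with a hypothetical group id:

```java
public class OffsetsPartition {
    // Default partition count of __consumer_offsets
    // (broker config offsets.topic.num.partitions).
    static final int NUM_PARTITIONS = 50;

    // Mirrors the broker's approach: mask the hash to a non-negative int,
    // then take it modulo the partition count.
    static int partitionFor(String groupId) {
        return (groupId.hashCode() & 0x7fffffff) % NUM_PARTITIONS;
    }

    public static void main(String[] args) {
        String group = "example-group"; // hypothetical group id
        System.out.println("group '" + group + "' -> __consumer_offsets-"
                + partitionFor(group));
    }
}
```

Because the mapping is deterministic, all offsets for one group live in a single partition, so the in-memory view for a group is held on one broker.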

Consumers can therefore checkpoint offsets frequently at low cost, even committing once for every message consumed.

This functionality was added experimentally in 0.8.1 and can be used broadly in 0.8.2. Auto rebalancing mainly resolves the uneven distribution of leader partitions across brokers after a broker restarts; for example, some nodes' NIC traffic and load become far higher than other nodes'. The main auto rebalancing configuration is as follows:

controlled.shutdown.enable: whether to automatically migrate leader partitions when a broker is shut down. The basic idea is that each time Kafka receives a request to shut down a broker process, it proactively migrates that broker's leader partitions to surviving nodes, that is, a follower replica is promoted to the new leader. If this parameter is not enabled, the cluster waits until the replica session times out before the controller elects new leaders, and during that window those partitions are neither readable nor writable. If the cluster is very large or has many partitions, partitions can be unavailable for a long time.
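A sketch of the broker-side settings involved (the values shown are illustrative, matching Kafka's usual defaults, not prescriptions):

```properties
# server.properties sketch: controlled shutdown and auto rebalancing
controlled.shutdown.enable=true         # migrate leaders away before shutdown
auto.leader.rebalance.enable=true       # periodically move leaders back to preferred replicas
leader.imbalance.per.broker.percentage=10    # rebalance when imbalance exceeds this ratio
leader.imbalance.check.interval.seconds=300  # how often to check for imbalance
```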

1) You can turn off unclean leader election, so that a replica not in the ISR (in-sync replica) list is never promoted to the new leader. With unclean.leader.election.enable = false, the Kafka cluster favors persistence over availability: if no other replica is in the ISR, the partition becomes unreadable and unwritable.

2) Set min.insync.replicas (default 1) and have the producer use acks = -1 to improve the persistence of data writes. When the producer sets acks = -1, if the broker finds that the number of replicas in the ISR is below min.insync.replicas, it rejects the producer's write request. max.connections.per.ip limits the number of connections each client IP may open, preventing a broker's file handles from being exhausted.
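Putting the two recommendations together, a sketch of the relevant server.properties entries (values are illustrative assumptions, not defaults):

```properties
# Durability-oriented broker settings (sketch)
unclean.leader.election.enable=false  # never promote an out-of-ISR replica
min.insync.replicas=2                 # reject acks=-1 writes with fewer ISR members
max.connections.per.ip=100            # cap connections per client IP
```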

Kafka 0.9 new features

First, security features

Prior to 0.9, Kafka had almost no security features; for transport over external networks you had to rely on Linux firewalls or other network-level measures. I believe this made many users wary of using Kafka for message exchange over the public Internet.

Client connections to brokers can be authenticated with SSL or SASL

Brokers can authenticate their connections to ZooKeeper for rights management

Data transmission can be encrypted (the performance impact needs to be considered)

Client read and write operations can be subject to authorization

Authorization can be delegated to external pluggable modules

Of course, the security configuration is optional and can be mixed: brokers with and without security configuration can coexist in the same cluster, as can authorized and unauthorized clients. See the official documentation for the specific configuration.
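As a hedged sketch of such a mixed setup (paths, ports, and passwords are placeholders), a broker can expose secured and unsecured listeners side by side:

```properties
# server.properties sketch: plaintext and secured listeners on one broker
listeners=PLAINTEXT://:9092,SSL://:9093,SASL_SSL://:9094
ssl.keystore.location=/path/to/kafka.server.keystore.jks     # placeholder path
ssl.keystore.password=changeit                               # placeholder
ssl.truststore.location=/path/to/kafka.server.truststore.jks # placeholder path
ssl.truststore.password=changeit                             # placeholder
security.inter.broker.protocol=SSL   # brokers talk to each other over SSL
```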

Second, Kafka Connect

This functional module did not exist in earlier versions. As the name suggests, it establishes data-stream connections between Kafka and external systems for data input and output. It has the following features:

A common framework for developing and managing Kafka Connect connectors

Runs in either distributed or standalone mode

A REST interface: a Kafka Connect cluster can be managed, and jobs submitted, through the REST API

Automatic offset management

The official documentation also gives examples: with some configuration, data appended to a text file can be streamed to a topic in real time. For both streaming and bulk transfer, it is a viable option.
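The file-to-topic example mentioned above corresponds roughly to the FileStreamSource sample configuration that ships with Kafka; a sketch with placeholder file and topic names:

```properties
# connect-file-source.properties sketch (standalone mode)
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=/tmp/test.txt    # placeholder input file to tail
topic=connect-test    # placeholder destination topic
```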

Third, the new Consumer API

Instead of the old high-level and low-level split, the new Consumer API maintains its own offsets. The benefit is avoiding the case where the application hits an exception, the data is not successfully consumed, but the position has already been committed, so the message is effectively lost. Looking at the API, the new Consumer has the following features:

Kafka can maintain the offset (the consumer's position) itself; developers can also maintain offsets themselves to meet business requirements.

When consuming, you can subscribe to only the specified partitions.

You can store offsets in external storage, such as a database.

Consumers can control their own consumption position.

You can consume with multiple threads.
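A minimal consumer configuration sketch for the manual-offset style described above (the address and group id are placeholders); with enable.auto.commit=false, the application calls commitSync() only after processing succeeds:

```properties
# consumer.properties sketch for the new (0.9) consumer
bootstrap.servers=broker1:9092   # placeholder broker address
group.id=example-group           # placeholder consumer group
enable.auto.commit=false         # commit offsets manually after processing
auto.offset.reset=earliest       # where to start when no committed offset exists
```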

Kafka 0.10 new features

Kafka now has built-in rack awareness so that replicas are isolated across racks. This allows Kafka to guarantee that replicas span multiple racks or availability zones, dramatically increasing Kafka's resiliency and availability. This feature was contributed by Netflix.
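Rack awareness is enabled with a single per-broker setting; a sketch with a placeholder rack identifier:

```properties
# server.properties sketch: tag each broker with its rack / availability zone
broker.rack=us-east-1a   # placeholder rack id; set a different value per broker
```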

All Kafka messages now contain a timestamp field recording when the message was produced. This allows Kafka Streams to do event-based stream processing; it also enables looking up messages by time and garbage-collecting them based on event timestamps.
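The timestamp behavior is controlled on the broker (or per topic); a sketch using the 0.10 property names, with the default values shown:

```properties
# server.properties sketch: message timestamp handling (0.10+)
log.message.timestamp.type=CreateTime   # producer-supplied time; or LogAppendTime
log.message.timestamp.difference.max.ms=9223372036854775807  # default: effectively unlimited skew
```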

Apache Kafka 0.9.0.0 introduced new security features, including support for Kerberos over SASL. Apache Kafka 0.10.0.0 now supports more SASL features, including external authorization servers, multiple types of SASL authentication on one server, and other enhancements.
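A sketch of enabling more than one SASL mechanism on a single broker (0.10-era property names; the mechanism choices are illustrative):

```properties
# server.properties sketch: multiple SASL mechanisms on one broker (0.10+)
sasl.enabled.mechanisms=GSSAPI,PLAIN        # Kerberos and PLAIN side by side
sasl.mechanism.inter.broker.protocol=GSSAPI # mechanism used between brokers
```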

Kafka Connect has been continuously improved. Previously, users had to monitor the logs to see the status of individual connectors and their tasks; Kafka now supports status APIs that make monitoring easier. Control-related APIs have also been added, which allow users to stop a connector during maintenance or manually restart failed tasks. These statuses can also be displayed and managed visually in the Control Center user interface.

Kafka consumer max records: in Kafka 0.9.0.0, developers had almost no control over the number of messages returned when calling poll() on the new consumer. The good news is that this release introduces the max.poll.records parameter, which lets developers control the number of messages returned.
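A sketch of the new knob (the value is illustrative):

```properties
# consumer.properties sketch: cap records per poll() (0.10+)
max.poll.records=500   # upper bound on records returned by a single poll()
```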

Improved protocol versioning: Kafka brokers now support a request API that returns all protocol versions they support. The benefit of this feature is that it will allow a single client to support multiple broker versions in the future.
