Kafka 0.9 + ZooKeeper 3.4.6 Cluster Setup, Configuration, New Java Client Usage Essentials, High Availability Testing, and Various Pits (Part 2)


In the previous section (see Part 1), we finished setting up the Kafka cluster. In this section we will introduce the new client API in version 0.9 and test the Kafka cluster's high availability.

1. Using Kafka's producer API to push messages

1) Kafka 0.9.0.1 Java Client dependency:

<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>0.9.0.1</version>
</dependency>

2) Write a KafkaUtil utility class to construct the Kafka client

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

public class KafkaUtil {
    private static KafkaProducer<String, String> kp;

    public static KafkaProducer<String, String> getProducer() {
        if (kp == null) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "10.0.0.100:9092,10.0.0.101:9092");
            props.put("acks", "1");
            props.put("retries", 0);
            props.put("batch.size", 16384);
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            kp = new KafkaProducer<String, String>(props);
        }
        return kp;
    }
}

In KafkaProducer<K, V>, K is the type of each message's key and V is the type of the message value. The key decides which partition a message is written to, so if you want messages spread across the partitions, make sure each message gets a different key.
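To make the key-to-partition mapping concrete, here is a rough conceptual sketch. It is not the client's actual partitioner (the 0.9 DefaultPartitioner hashes the serialized key with murmur2); it only illustrates the general idea of hashing a key onto the partition count:

import java.util.Arrays;

public class PartitionSketch {
    // Conceptual sketch only: hash the key bytes and map the hash onto the number of partitions.
    // The real client uses its own hash function, but the principle is the same.
    public static int choosePartition(byte[] keyBytes, int numPartitions) {
        return (Arrays.hashCode(keyBytes) & 0x7fffffff) % numPartitions;
    }
}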

Common configuration of the producer side

    • bootstrap.servers: the Kafka cluster connection string, which can consist of multiple host:port pairs

    • acks: the broker's message-acknowledgement mode; there are three options:
      0: no acknowledgement; the client does not wait for the broker's confirmation after sending
      1: acknowledged by the leader; the leader returns a confirmation as soon as it receives the message
      all: acknowledged by the whole cluster; the leader waits for all in-sync follower replicas to confirm receipt before returning a confirmation
      We can pick a mode based on how important the messages are. The default is 1.

    • retries: how many times the producer retries a failed send; default is 0

    • batch.size: when many messages are sent to the same partition at the same time, the producer packs them together and sends them as one batch. If set to 0, every message is sent on its own. Default is 16384 bytes.

    • linger.ms: how many milliseconds to wait before sending, used together with batch.size. Under low message load, setting linger.ms lets the producer wait a little so it can accumulate more messages into one batch, saving network resources. Default is 0.

    • key.serializer / value.serializer: the serializer classes for the message key and value, determined by the key and value types

    • buffer.memory: the size of the message buffer pool. Messages that have not yet been sent are kept in the producer's memory; if messages are produced faster than they are sent, further send requests block once the buffer pool is full. Default is 33554432 bytes (32 MB).

For more producer configuration, see the official documentation: http://kafka.apache.org/documentation.html#producerconfigs
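As a quick illustration of the options above, here is a minimal sketch of a producer configured for durability rather than latency. The class name and the specific values are illustrative assumptions, not recommendations:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

public class DurableProducerExample {
    public static KafkaProducer<String, String> create() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "10.0.0.100:9092,10.0.0.101:9092");
        props.put("acks", "all");             // wait for all in-sync replicas to confirm
        props.put("retries", 3);              // retry transient send failures
        props.put("batch.size", 16384);       // max bytes per batch, per partition
        props.put("linger.ms", 5);            // wait up to 5 ms to fill a batch
        props.put("buffer.memory", 33554432); // 32 MB of buffer for unsent messages
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return new KafkaProducer<String, String>(props);
    }
}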

3) Write a simple producer that sends a message to the Kafka cluster every second:

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class KafkaTest {
    public static void main(String[] args) throws Exception {
        Producer<String, String> producer = KafkaUtil.getProducer();
        int i = 0;
        while (true) {
            ProducerRecord<String, String> record =
                    new ProducerRecord<String, String>("test", String.valueOf(i), "This is message" + i);
            producer.send(record, new Callback() {
                public void onCompletion(RecordMetadata metadata, Exception e) {
                    if (e != null)
                        e.printStackTrace();
                    System.out.println("Message send to partition " + metadata.partition()
                            + ", offset:" + metadata.offset());
                }
            });
            i++;
            Thread.sleep(1000);
        }
    }
}

When calling KafkaProducer's send method, you can register a callback that is triggered once the send completes on the producer side. In the callback's metadata object we can read the offset of the sent message and the partition it landed in. Note that if acks is set to 0, the callback is still triggered, but the offset and partition information will not be available.
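If you would rather block until the broker acknowledges each message instead of using a callback, send() also returns a Future you can wait on. A minimal sketch follows; the sendSync helper is a hypothetical name for illustration, not part of the Kafka API:

import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class SyncSendExample {
    // Hypothetical helper: blocks on the Future returned by send(), making the call synchronous.
    public static RecordMetadata sendSync(KafkaProducer<String, String> producer,
                                          String topic, String key, String value)
            throws InterruptedException, ExecutionException {
        RecordMetadata meta = producer.send(new ProducerRecord<String, String>(topic, key, value)).get();
        System.out.println("Message send to partition " + meta.partition() + ", offset:" + meta.offset());
        return meta;
    }
}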

Running it, the output looks like this:

Message send to partition 0, offset:28
Message send to partition 1, offset:26
Message send to partition 0, offset:29
Message send to partition 1, offset:27
Message send to partition 1, offset:28
Message send to partition 0, offset:30
Message send to partition 0, offset:31
Message send to partition 1, offset:29
Message send to partition 1, offset:30
Message send to partition 1, offset:31
Message send to partition 0, offset:32
Message send to partition 0, offset:33
Message send to partition 0, offset:34
Message send to partition 1, offset:32

At first glance the offsets look out of order, but that is only because the messages are spread across two partitions; within each partition the offset actually increases monotonically.

4) Write a consumer to consume the messages

  First extend the KafkaUtil class so it can also construct the consumer client.

import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;

public class KafkaUtil {
    private static KafkaProducer<String, String> kp;
    private static KafkaConsumer<String, String> kc;

    public static KafkaProducer<String, String> getProducer() {
        if (kp == null) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "10.0.0.100:9092,10.0.0.101:9092");
            props.put("acks", "1");
            props.put("retries", 0);
            props.put("batch.size", 16384);
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            kp = new KafkaProducer<String, String>(props);
        }
        return kp;
    }

    public static KafkaConsumer<String, String> getConsumer() {
        if (kc == null) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "10.0.0.100:9092,10.0.0.101:9092");
            props.put("group.id", "1");
            props.put("enable.auto.commit", "true");
            props.put("auto.commit.interval.ms", "1000");
            props.put("session.timeout.ms", "30000");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            kc = new KafkaConsumer<String, String>(props);
        }
        return kc;
    }
}

Next, let's go over the common consumer configuration

    • bootstrap.servers / key.deserializer / value.deserializer: same meaning as on the producer side, not repeated here

    • fetch.min.bytes: the minimum amount of data (in bytes) per fetch. The consumer waits until at least this much data has accumulated before pulling a batch. The default is 1, i.e. pull as soon as a single message is available.

    • max.partition.fetch.bytes: the maximum amount of data (in bytes) pulled from a single partition per fetch; default is 1 MB

    • group.id: the consumer group ID. Multiple consumers in the same group will not pull duplicate messages, while consumers in different groups are each guaranteed to see every message. Note that the number of consumers in one group should not exceed the number of partitions.

    • enable.auto.commit: whether to automatically commit the offsets that have been pulled. Committing an offset means the message is considered successfully consumed, and consumers in that group can no longer pull it again (unless the offset is manually changed). Default is true.

    • auto.commit.interval.ms: the interval in milliseconds between automatic offset commits; default 5000.

See the official documentation for all consumer configurations: http://kafka.apache.org/documentation.html#newconsumerconfigs
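As an illustration of the fetch-related options above, here is a minimal sketch of a consumer tuned to pull larger batches. The class name and the specific values are illustrative assumptions only:

import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class BatchingConsumerExample {
    public static KafkaConsumer<String, String> create() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "10.0.0.100:9092,10.0.0.101:9092");
        props.put("group.id", "1");
        props.put("fetch.min.bytes", 1024);              // wait until at least 1 KB is available
        props.put("max.partition.fetch.bytes", 1048576); // at most 1 MB per partition per fetch
        props.put("enable.auto.commit", "true");
        props.put("auto.commit.interval.ms", "1000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        return new KafkaConsumer<String, String>(props);
    }
}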

Next write the consumer side:

import java.util.Arrays;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class KafkaTest {
    public static void main(String[] args) throws Exception {
        KafkaConsumer<String, String> consumer = KafkaUtil.getConsumer();
        consumer.subscribe(Arrays.asList("test"));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records) {
                System.out.println("fetched from partition " + record.partition()
                        + ", offset:" + record.offset() + ", message:" + record.value());
            }
        }
    }
}

Run output:

fetched from partition 0, offset:28, message:This is message0
fetched from partition 0, offset:29, message:This is message2
fetched from partition 0, offset:30, message:This is message5
fetched from partition 0, offset:31, message:This is message6
fetched from partition 0, offset:32, message:This is message10
fetched from partition 0, offset:33, message:This is message11
fetched from partition 0, offset:34, message:This is message12
fetched from partition 1, offset:26, message:This is message1
fetched from partition 1, offset:27, message:This is message3
fetched from partition 1, offset:28, message:This is message4
fetched from partition 1, offset:29, message:This is message7
fetched from partition 1, offset:30, message:This is message8
fetched from partition 1, offset:31, message:This is message9
fetched from partition 1, offset:32, message:This is message13

Notes:

    • KafkaConsumer's poll method pulls messages from the broker; you must first subscribe to a topic with the subscribe method before calling poll.

    • The argument to poll is a timeout in milliseconds: if no new messages are available, the consumer waits up to that many milliseconds and returns an empty result set when the timeout expires.

    • If the topic has multiple partitions, KafkaConsumer balances its polling across them. If multiple consumer threads are started, Kafka also coordinates the consumers (in the 0.9 client this is done by the broker-side group coordinator rather than ZooKeeper), guaranteeing that consumers in the same group do not consume the same message twice. Note that the number of consumers must not exceed the number of partitions; any extra consumers will not receive any data.

    • As you can see, the pulled messages are not fully ordered. Kafka only guarantees FIFO order within a single partition, so across partitions the message order is not guaranteed.

    • This example uses auto-committed offsets: the Kafka client starts a thread that periodically commits offsets to the broker. If a failure occurs within the auto-commit interval (for example, the whole JVM process dies), some messages will be consumed again after restart. To avoid this, you can commit offsets manually: set enable.auto.commit to false when constructing the consumer and call consumer.commitSync() in your code (see the sketch after this list).
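A minimal sketch of the manual-commit variant described in the last point, assuming KafkaUtil.getConsumer() has been changed to set enable.auto.commit to false:

import java.util.Arrays;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ManualCommitExample {
    public static void main(String[] args) {
        // Assumes the consumer was configured with enable.auto.commit=false.
        KafkaConsumer<String, String> consumer = KafkaUtil.getConsumer();
        consumer.subscribe(Arrays.asList("test"));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records) {
                System.out.println("fetched from partition " + record.partition()
                        + ", offset:" + record.offset() + ", message:" + record.value());
            }
            consumer.commitSync(); // commit only after the whole batch has been processed
        }
    }
}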

If you do not want Kafka to balance partitions across consumers automatically when pulling data, you can also control the assignment manually:

public static void main(String[] args) throws Exception {
    KafkaConsumer<String, String> consumer = KafkaUtil.getConsumer();
    String topic = "test";
    TopicPartition partition0 = new TopicPartition(topic, 0);
    TopicPartition partition1 = new TopicPartition(topic, 1);
    consumer.assign(Arrays.asList(partition0, partition1));
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(100);
        for (ConsumerRecord<String, String> record : records) {
            System.out.println("fetched from partition " + record.partition()
                    + ", offset:" + record.offset() + ", message:" + record.value());
        }
        consumer.commitSync();
    }
}

Use the consumer.assign() method to assign one or more specific partitions to a consumer thread.

Here's the pit:

During testing I found that when messages are pulled by manually assigning partitions this way, Kafka's auto-commit offset mechanism stops working (I am not sure why), and you must commit the consumed offsets manually for them to be recorded correctly.

Off Topic:

In a real application, the consumer obviously does more with each pulled message than just printing it, and consuming a message may take quite some time. A single consumer thread may well fail to keep up with the producer, so we have to consider a multi-threaded consumer model.
However, KafkaConsumer is not thread-safe: having multiple threads operate the same KafkaConsumer instance leads to all kinds of problems. Kafka's official guidance on multi-threaded consumption is as follows:

1. Each thread holds its own KafkaConsumer instance
Benefits:

    • Simple to implement

    • Fastest: no coordination between threads is required

    • Easiest to implement sequential processing of messages within each partition

Disadvantages:

    • Each KafkaConsumer maintains its own TCP connections to the cluster

    • The number of threads cannot exceed the number of partitions

    • Each consumer pulls smaller batches, which has some impact on throughput

2. Decouple: one consumer thread pulls messages, and several worker threads process them
Benefits:

    • The number of worker threads can be chosen freely and is not limited by the number of partitions

Disadvantages:

    • The order of message consumption cannot be guaranteed

    • It is hard to control when to manually commit offsets

Personally I think the second approach is preferable. The limit that the number of consumers cannot exceed the number of partitions is quite deadly: to raise consumption throughput under the first model you would have to split the topic into more partitions, and the more partitions there are, the lower the cluster's availability.
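Here is a minimal sketch of the second model under the simplest possible assumptions: one polling thread feeding a fixed worker pool, with auto-commit left on, so it deliberately does not solve the offset-timing problem mentioned above. The class name and pool size are illustrative:

import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class DecoupledConsumerExample {
    public static void main(String[] args) {
        // One polling thread (this main thread) feeding a pool of worker threads.
        final ExecutorService workers = Executors.newFixedThreadPool(4);
        KafkaConsumer<String, String> consumer = KafkaUtil.getConsumer();
        consumer.subscribe(Arrays.asList("test"));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (final ConsumerRecord<String, String> record : records) {
                workers.submit(new Runnable() {
                    public void run() {
                        // Business logic goes here; processing order across records is not guaranteed.
                        System.out.println("processed offset " + record.offset()
                                + " from partition " + record.partition());
                    }
                });
            }
        }
    }
}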

2. Kafka Cluster High Availability test

1) View the status of the current topic:

bin/kafka-topics.sh --describe --zookeeper 10.0.0.100:2181,10.0.0.101:2181,10.0.0.102:2181 --topic test

Output:

Topic:test    PartitionCount:2    ReplicationFactor:2    Configs:
    Topic: test    Partition: 0    Leader: 1    Replicas: 1,0    Isr: 0,1
    Topic: test    Partition: 1    Leader: 0    Replicas: 0,1    Isr: 0,1

As you can see, the leader of partition 0 is broker 1, and the leader of partition 1 is broker 0.

2) Start producer Send message to Kafka cluster

Output:

Message send to partition 0, offset:35
Message send to partition 1, offset:33
Message send to partition 0, offset:36
Message send to partition 1, offset:34
Message send to partition 1, offset:35
Message send to partition 0, offset:37
Message send to partition 0, offset:38
Message send to partition 1, offset:36
Message send to partition 1, offset:37

3) SSH into broker 0 (the leader of partition 1) and kill it

Check the topic status again:

Topic:test    PartitionCount:2    ReplicationFactor:2    Configs:
    Topic: test    Partition: 0    Leader: 1    Replicas: 1,0    Isr: 1
    Topic: test    Partition: 1    Leader: 1    Replicas: 0,1    Isr: 1

As you can see, broker 1 is now the leader of both partition 0 and partition 1.

Now look at the producer's output again:

Message send to partition 1, offset:38
Message send to partition 0, offset:39
Message send to partition 0, offset:40
Message send to partition 0, offset:41
Message send to partition 1, offset:39
Message send to partition 1, offset:40
Message send to partition 0, offset:42
Message send to partition 0, offset:43

The producer keeps running smoothly without any exception (although the first message sent after broker 0 went down was delayed by a few seconds). You can see that Kafka's failover mechanism is quite robust.

4) Now let's start broker 0 again:

bin/kafka-server-start.sh -daemon config/server.properties

Then check the topic status again:

Topic:test    PartitionCount:2    ReplicationFactor:2    Configs:
    Topic: test    Partition: 0    Leader: 1    Replicas: 1,0    Isr: 1,0
    Topic: test    Partition: 1    Leader: 1    Replicas: 0,1    Isr: 1,0

We can see that broker 0 is back up and already in the in-sync state (note that the ISR has changed from 1 to 1,0), but at this point broker 1 is still the leader of both partitions, which means broker 1 is handling all send and fetch requests. That is obviously not what we want, so we need to bring the cluster back to a load-balanced state.

To do this, trigger an election with Kafka's preferred-replica election tool:

bin/kafka-preferred-replica-election.sh --zookeeper 10.0.0.100:2181,10.0.0.101:2181,10.0.0.102:2181

Once the election is complete, check the topic status again:

Topic:test    PartitionCount:2    ReplicationFactor:2    Configs:
    Topic: test    Partition: 0    Leader: 1    Replicas: 1,0    Isr: 1,0
    Topic: test    Partition: 1    Leader: 0    Replicas: 0,1    Isr: 1,0

As you can see, the cluster is back to the state it was in before broker 0 went down.

But at this point, the producer side produces an exception:

org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.

The reason is that when the producer tried to send a message for partition 1 to broker 1, the leader of partition 1 had already switched back to broker 0, so the send failed.

If you then consume the messages, you will find that the message numbers are not continuous: a message was indeed lost. That is because we set retries=0 when constructing the producer, so the producer does not retry when a send fails.

Change retries to 3 and try again: the same leader switch happens, but this time the producer's retry mechanism kicks in and the message is re-sent successfully. Starting a consumer to check confirms that all messages were delivered.

After every recovery from a single-broker failure, a new election is needed to fully restore the cluster's leader distribution. If triggering it manually every time is too much trouble, you can set auto.leader.rebalance.enable=true in the broker configuration file (server.properties), so that the broker automatically rebalances leadership after it starts up.

So far we have verified that the producer keeps working correctly while the cluster goes through a single-broker failure and recovery. Now let's look at how the consumer behaves:

5) Start the producer process and the consumer process at the same time

Now the producer is producing messages while the consumer is consuming them.

6) Kill broker 0 and observe the consumer's output:


As you can see, after broker 0 goes down, the consumer prints a series of INFO and WARN messages, but after a few seconds it recovers automatically; the consumed messages remain continuous with no gap.

7) Start broker 0 again, trigger the re-election, and observe the output:

fetched from partition 0, offset:418, message:This is message48
fetched from partition 0, offset:419, message:This is message49
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Offset commit for group 1 failed due to NOT_COORDINATOR_FOR_GROUP, will find new coordinator and retry
[main] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Marking the coordinator 2147483646 dead.
[main] WARN org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Auto offset commit failed: This is not the correct coordinator for this group.
fetched from partition 1, offset:392, message:This is message50
fetched from partition 0, offset:420, message:This is message51

As you can see, after the re-election the consumer also prints some log lines: when committing offsets it found that the current coordinator was no longer valid, but it quickly found a new coordinator and automatic offset commits resumed. Checking the committed offset values also confirms that the leader switch did not cause any commit errors.

With that, we have also verified that the consumer keeps working correctly when a single broker in the Kafka cluster fails.

Through testing, we fully validated the high availability of the Kafka cluster. This article ends here.


This article is from the "OCD Severe patients" blog, please be sure to keep this source http://kelgon.blog.51cto.com/1735342/1758999

