Kafka Producer and Consumer
Producer API
org.apache.kafka.clients.producer.KafkaProducer
import java.util.Properties;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

Properties props = new Properties();
props.put("bootstrap.servers", "192.168.1.128:9092");
props.put("acks", "all");
props.put("retries", 0);
props.put("batch.size", 16384);
props.put("linger.ms", 1);
props.put("buffer.memory", 33554432);
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

Producer<String, String> producer = new KafkaProducer<String, String>(props);
for (int i = 0; i < 10; i++) {
    // send() is asynchronous; the callback runs once the broker has answered
    producer.send(new ProducerRecord<String, String>("foo", Integer.toString(i), Integer.toString(i)), new Callback() {
        @Override
        public void onCompletion(RecordMetadata recordMetadata, Exception e) {
            if (null != e) {
                e.printStackTrace();
            } else {
                System.out.println("callback: " + recordMetadata.topic() + " " + recordMetadata.offset());
            }
        }
    });
}
// close() flushes any records still waiting in the buffer
producer.close();
A producer consists of a pool of buffer space that holds records not yet transmitted to the server, plus a background I/O thread that turns these records into requests and transmits them to the cluster.
The send() method is asynchronous. When called, it adds the record to a buffer of pending records and returns immediately. This lets the producer batch individual records together.
The acks setting controls the criteria under which a request is considered complete. In this example it is set to "all", which means the producer waits until the record is fully committed, that is, acknowledged by all in-sync replicas. This is the slowest setting but the most durable one.
If the request fails, the producer can retry automatically. Here we set retries to 0, so it does not retry.
The producer maintains a buffer of unsent records for each partition. The size of these buffers is controlled by the batch.size setting.
By default, a buffer can be sent immediately even if it still has unused space (that is, even if it is not full). To reduce the number of requests, set linger.ms to a value greater than 0. This tells the producer to wait up to that many milliseconds before sending a request, in the hope that more records arrive to fill the same batch. In this example linger.ms is 1, so a request may be delayed by up to 1 ms; after that it is sent even if the buffer is not full. (Note: send() only puts the record into the buffer; a background thread then transmits the buffered records to the server. "Request" here means that transmission from the buffer to the server.)
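The interplay of batch.size and linger.ms can be pictured with a small model. This is only a sketch, not Kafka's internal code: the class BatchModel and the method readyToSend are hypothetical names, and the sketch assumes a batch is sent as soon as it is full or has waited linger.ms, whichever happens first.

```java
/** Simplified, hypothetical model of when a per-partition batch becomes sendable. */
final class BatchModel {
    private BatchModel() {}

    /**
     * @param batchBytes bytes currently buffered for one partition
     * @param batchSize  value of batch.size (upper bound of one batch, in bytes)
     * @param waitedMs   how long the oldest buffered record has been waiting
     * @param lingerMs   value of linger.ms
     */
    static boolean readyToSend(int batchBytes, int batchSize, long waitedMs, long lingerMs) {
        if (batchBytes == 0) {
            return false; // nothing buffered, nothing to send
        }
        boolean full = batchBytes >= batchSize;
        boolean lingerExpired = waitedMs >= lingerMs;
        return full || lingerExpired;
    }

    public static void main(String[] args) {
        // linger.ms = 0: a non-empty batch is sendable right away
        System.out.println(readyToSend(100, 16384, 0, 0));
        // linger.ms = 1: the same batch waits until 1 ms has passed
        System.out.println(readyToSend(100, 16384, 0, 1));
        System.out.println(readyToSend(100, 16384, 1, 1));
        // a full batch does not wait for linger.ms
        System.out.println(readyToSend(16384, 16384, 0, 1));
    }
}
```

With linger.ms at 0 every non-empty batch qualifies at once, which matches the default behaviour described above.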
buffer.memory controls the total amount of memory available to the producer for buffering.
key.serializer and value.serializer specify how to convert the key and value objects into bytes.
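To make "converting into bytes" concrete, the following sketch mimics what a string serializer does, using only the JDK. SerializerSketch and its methods are hypothetical stand-ins, not the Kafka Serializer interface, and UTF-8 is assumed as the encoding.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in that shows the String <-> byte[] conversion a serializer performs. */
final class SerializerSketch {
    private SerializerSketch() {}

    /** Turns a key or value into the bytes that would travel over the wire. */
    static byte[] serialize(String value) {
        return value == null ? null : value.getBytes(StandardCharsets.UTF_8);
    }

    /** Reverse direction, as a deserializer on the consumer side would do. */
    static String deserialize(byte[] data) {
        return data == null ? null : new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = serialize("42");
        System.out.println(bytes.length);       // number of encoded bytes
        System.out.println(deserialize(bytes)); // round trip back to the original string
    }
}
```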
Since Kafka 0.11, KafkaProducer supports two additional modes: the idempotent producer and the transactional producer. The idempotent producer strengthens delivery semantics from at-least-once to exactly-once; in particular, producer retries no longer introduce duplicates. The transactional producer allows an application to send messages to multiple partitions (and topics) atomically.
To enable idempotence, set enable.idempotence to true. If it is set, retries defaults to Integer.MAX_VALUE and acks defaults to all. To take advantage of the idempotent producer, avoid application-level resends, because the producer cannot de-duplicate them.
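A configuration for the idempotent producer might look like the sketch below. The class IdempotentProducerConfig, the method build, and the broker address used in main are illustrative assumptions; only the property keys come from the Kafka configuration. retries and acks are deliberately left unset so that the defaults implied by enable.idempotence apply.

```java
import java.util.Properties;

/** Hypothetical helper that assembles producer settings for the idempotent mode. */
final class IdempotentProducerConfig {
    private IdempotentProducerConfig() {}

    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("enable.idempotence", "true");
        // retries and acks are not set here on purpose: with idempotence enabled
        // they default to Integer.MAX_VALUE and "all" respectively.
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder address
        Properties props = build("localhost:9092");
        System.out.println(props.getProperty("enable.idempotence"));
    }
}
```

The resulting Properties object would be passed to the KafkaProducer constructor in the same way as in the first example.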
To use the transactional producer, you must configure transactional.id. If transactional.id is set, idempotence is enabled automatically.
Properties props = new Properties();
props.put("bootstrap.servers", "192.168.1.128:9092");
props.put("transactional.id", "my-transactional-id");

Producer<String, String> producer = new KafkaProducer<String, String>(props, new StringSerializer(), new StringSerializer());

producer.initTransactions();

try {
    producer.beginTransaction();

    for (int i = 11; i < 20; i++) {
        producer.send(new ProducerRecord<String, String>("bar", Integer.toString(i), Integer.toString(i)));
    }
    // This method will flush any unsent records before actually committing the transaction
    producer.commitTransaction();
} catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
    // These errors are fatal for this producer instance, so the only option is to close it
    producer.close();
} catch (KafkaException e) {
    // By calling producer.abortTransaction() upon receiving a KafkaException we can ensure
    // that any successful writes are marked as aborted, hence keeping the transactional guarantees.
    producer.abortTransaction();
}

producer.close();
Consumer API
org.apache.kafka.clients.consumer.KafkaConsumer
Offsets and Consumer Position
Kafka maintains a numerical offset for each record in a partition. This offset is the unique identifier of a record within that partition and also denotes the position of the consumer in the partition. For example, a consumer at position 5 has consumed the records with offsets 0 through 4 and will next receive the record with offset 5. From the point of view of the consumer's user, there are actually two notions of position.
The consumer's position is the offset of the next record to be handed out. It advances automatically every time the consumer receives messages in a call to poll(long).
The committed position is the last offset that has been stored securely. If the process fails and restarts, this is the offset the consumer recovers to. The consumer can commit offsets automatically at regular intervals, or the application can commit them manually by calling one of the commit APIs (e.g. commitSync and commitAsync).
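The difference between the two positions can be illustrated with a toy model. OffsetTracker below is a hypothetical class written only for this explanation; it does not talk to Kafka and simply assumes that a restart resets the position to the last committed offset.

```java
/** Hypothetical toy model of a consumer's position and committed offset in one partition. */
final class OffsetTracker {
    private long position;  // offset of the next record to be handed out
    private long committed; // offset the consumer resumes from after a restart

    OffsetTracker(long committedOffset) {
        this.position = committedOffset;
        this.committed = committedOffset;
    }

    /** Simulates poll(): hands out `count` records and advances the position. */
    void poll(int count) {
        position += count;
    }

    /** Simulates a commit: stores the current position. */
    void commit() {
        committed = position;
    }

    /** Simulates a crash followed by a restart: uncommitted progress is lost. */
    void restart() {
        position = committed;
    }

    long position() {
        return position;
    }

    long committed() {
        return committed;
    }

    public static void main(String[] args) {
        OffsetTracker tracker = new OffsetTracker(0);
        tracker.poll(5);   // records 0..4 consumed, position is now 5
        tracker.commit();  // committed offset is now 5
        tracker.poll(3);   // records 5..7 consumed but not committed
        tracker.restart(); // records 5..7 will be delivered again
        System.out.println(tracker.position());
    }
}
```

Records consumed after the last commit are delivered again after a restart, which is why the timing of commits determines how much work may be repeated.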
Consumer Groups and Topic Subscriptions
Kafka uses the concept of "consumer groups" to allow a pool of processes to divide the work of consuming and processing records. These processes can run on the same machine or on different machines. Consumer instances in the same consumer group share the same group.id.
Each consumer in a group can dynamically set the list of topics it wants to subscribe to. Kafka delivers each message in the subscribed topics to one process in each consumer group. This is achieved by balancing the partitions among all members of the group, so that each partition is assigned to exactly one consumer in the group. For example, if a topic has four partitions and a group has two consumers, each consumer handles two partitions.
Membership in a consumer group is maintained dynamically: if a consumer fails, the partitions assigned to it are reassigned to other consumers in the same group.
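The four-partition, two-consumer example and the reassignment after a failure can be sketched as follows. This is a simplified illustration, not one of Kafka's real partition assignors: AssignmentSketch and assign are hypothetical names, and partitions are simply dealt out in round-robin order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical round-robin distribution of partitions over the members of one group. */
final class AssignmentSketch {
    private AssignmentSketch() {}

    /** Spreads partitions 0..partitionCount-1 over the given consumers. */
    static Map<String, List<Integer>> assign(int partitionCount, List<String> consumers) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (String consumer : consumers) {
            result.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return result; // no members, nothing can be assigned
        }
        for (int partition = 0; partition < partitionCount; partition++) {
            String owner = consumers.get(partition % consumers.size());
            result.get(owner).add(partition); // each partition gets exactly one owner
        }
        return result;
    }

    public static void main(String[] args) {
        // Two members: each one ends up with two of the four partitions
        System.out.println(assign(4, Arrays.asList("c1", "c2"))); // {c1=[0, 2], c2=[1, 3]}
        // c2 has failed: its partitions move to the remaining member
        System.out.println(assign(4, Arrays.asList("c1")));       // {c1=[0, 1, 2, 3]}
    }
}
```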
Conceptually, you can think of a consumer group as a single logical subscriber that happens to be made up of multiple processes. As a multi-subscriber system, Kafka naturally supports any number of consumer groups for a given topic.
Automatic Offset Committing
Properties props = new Properties();
props.put("bootstrap.servers", "192.168.1.128:9092");
props.put("group.id", "test");
props.put("enable.auto.commit", "true");
props.put("auto.commit.interval.ms", "1000");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        System.out.printf("offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
    }
}
Setting enable.auto.commit to true means that the offsets of consumed records are committed automatically, at the interval given by auto.commit.interval.ms (1000 ms in this example).
Manual Offset Control
Instead of relying on the consumer to commit consumed offsets periodically, the user can control when records are considered consumed and hence when their offsets are committed.
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "test");
props.put("enable.auto.commit", "false");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
final int minBatchSize = 200;
List<ConsumerRecord<String, String>> buffer = new ArrayList<>();
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        buffer.add(record);
    }
    // Commit only after the whole batch has been written to the database
    if (buffer.size() >= minBatchSize) {
        insertIntoDb(buffer);
        consumer.commitSync();
        buffer.clear();
    }
}
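The no-argument commitSync() in the loop above commits the offsets returned by the last poll. For finer control, offsets can also be committed explicitly per partition, in which case the committed value should be the offset of the next record to read, that is, the last processed offset plus one. The sketch below only computes those values with plain JDK types; CommitOffsets, Rec and offsetsToCommit are hypothetical names, and the sketch does not call Kafka.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that derives the offsets to commit from a batch of processed records. */
final class CommitOffsets {
    private CommitOffsets() {}

    /** Minimal stand-in for a consumed record: partition number and offset only. */
    static final class Rec {
        final int partition;
        final long offset;

        Rec(int partition, long offset) {
            this.partition = partition;
            this.offset = offset;
        }
    }

    /** Returns, per partition, the highest processed offset plus one. */
    static Map<Integer, Long> offsetsToCommit(List<Rec> processed) {
        Map<Integer, Long> next = new LinkedHashMap<>();
        for (Rec rec : processed) {
            // keep the largest "next offset" seen so far for this partition
            next.merge(rec.partition, rec.offset + 1, Long::max);
        }
        return next;
    }

    public static void main(String[] args) {
        List<Rec> batch = Arrays.asList(new Rec(0, 4), new Rec(1, 9), new Rec(0, 5));
        System.out.println(offsetsToCommit(batch)); // {0=6, 1=10}
    }
}
```

In a real consumer these values would be wrapped in the client's own types and passed to the commitSync overload that accepts a map of offsets.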
Reference
http://kafka.apache.org/10/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html
http://kafka.apache.org/10/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html