Introduction to librdkafka, Kafka's high-performance C/C++ client

Source: Internet
Author: User
Tags: message queue

librdkafka is a high-performance Apache Kafka client implemented in C. It provides an efficient and reliable client for producing and consuming Kafka messages, and also offers a C++ interface.

Performance:

librdkafka is a high-performance library designed for modern hardware. It keeps memory copying to a minimum and lets the user decide whether to optimize for high throughput or low latency. The two most important configuration properties for performance tuning are:

* 'batch.num.messages': the minimum number of messages to accumulate in the local queue before sending a message set.

* 'queue.buffering.max.ms': how long to wait for 'batch.num.messages' to accumulate in the local queue.

Usage:

The librdkafka API is documented in the rdkafka.h source file, and the configuration properties are documented in CONFIGURATION.md.

Initialization

The application needs to instantiate a top-level 'rd_kafka_t' object as the underlying container. It provides global configuration and shared state and is created by calling 'rd_kafka_new()'.

You also need to instantiate one or more topic objects ('rd_kafka_topic_t') for producing or consuming. A topic object holds topic-specific configuration and is internally populated with the mapping of all available partitions and their leader brokers; it is created by calling 'rd_kafka_topic_new()'.

Both 'rd_kafka_t' and 'rd_kafka_topic_t' come with an optional configuration API. If that API is not used, librdkafka falls back to the default configuration documented in CONFIGURATION.md.
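The initialization steps above can be sketched as follows. This is a minimal illustrative example, not code from the article: the broker address and topic name are placeholders, and building it requires the librdkafka headers and library plus a running broker.

```c
#include <stdio.h>
#include <librdkafka/rdkafka.h>

int main(void) {
    char errstr[512];

    /* Global configuration, applied before rd_kafka_new(). */
    rd_kafka_conf_t *conf = rd_kafka_conf_new();
    rd_kafka_conf_set(conf, "metadata.broker.list", "localhost:9092",
                      errstr, sizeof(errstr));

    /* Top-level container object; rd_kafka_new() takes ownership of conf. */
    rd_kafka_t *rk = rd_kafka_new(RD_KAFKA_PRODUCER, conf,
                                  errstr, sizeof(errstr));
    if (!rk) {
        fprintf(stderr, "rd_kafka_new() failed: %s\n", errstr);
        return 1;
    }

    /* Topic object; NULL means use the default topic configuration. */
    rd_kafka_topic_t *rkt = rd_kafka_topic_new(rk, "my_topic", NULL);

    /* ... produce or consume here ... */

    rd_kafka_topic_destroy(rkt);
    rd_kafka_destroy(rk);
    return 0;
}
```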

Attention

1. An application may create multiple 'rd_kafka_t' objects, and they do not share any state.

2. An 'rd_kafka_topic_t' object may only be used with the 'rd_kafka_t' object that created it.

Configuration

To simplify integration with the official Apache Kafka software and reduce the learning curve, librdkafka implements the same configuration properties as the official Apache Kafka clients.

Use 'rd_kafka_conf_set()' and 'rd_kafka_topic_conf_set()' to apply configuration before creating the corresponding object.

Attention:

An 'rd_kafka_conf_t' object cannot be reused after being passed to 'rd_kafka_new()', and the application does not need to free any configuration resources after that call.

Example

    rd_kafka_conf_t *conf;
    char errstr[512];

    conf = rd_kafka_conf_new();
    rd_kafka_conf_set(conf, "compression.codec", "snappy", errstr, sizeof(errstr));
    rd_kafka_conf_set(conf, "batch.num.messages", "100", errstr, sizeof(errstr));
    rd_kafka_new(RD_KAFKA_PRODUCER, conf, errstr, sizeof(errstr));



Threads and Callback functions

Librdkafka uses multiple threads internally to take full advantage of hardware resources.

The API is thread-safe; applications can call any API function from any thread at any time.

librdkafka provides signals to the application through a poll-based API. The application periodically calls 'rd_kafka_poll()', which in turn invokes the following callbacks:

* Message delivery report callback: signals the success or failure of a message delivery, allowing the application to release any application resources used for the message.

* Error callback: signals an error. These errors are usually informational in nature, such as a failed connection to a broker, and the application usually does not need to take any action. The error type is passed as an 'rd_kafka_resp_err_t' enumeration value and covers both remote broker errors and local errors.

Two optional callbacks are not triggered by poll and may be called from any thread:

* Logging callback: allows the application to output log messages generated by librdkafka.

* Partitioner callback: an application-provided message partitioner. It may be called at any time, from any thread, and may be called multiple times for the same key.
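A sketch of how the poll-served callbacks are registered, using librdkafka's standard registration functions 'rd_kafka_conf_set_dr_msg_cb()' and 'rd_kafka_conf_set_error_cb()'; the callback bodies here are illustrative assumptions, not code from the article.

```c
#include <stdio.h>
#include <librdkafka/rdkafka.h>

/* Delivery report callback: served from rd_kafka_poll() once a message
 * has been delivered or has permanently failed. */
static void dr_msg_cb(rd_kafka_t *rk,
                      const rd_kafka_message_t *rkmessage, void *opaque) {
    if (rkmessage->err)
        fprintf(stderr, "delivery failed: %s\n",
                rd_kafka_err2str(rkmessage->err));
}

/* Error callback: usually informational, e.g. a lost broker connection. */
static void error_cb(rd_kafka_t *rk, int err,
                     const char *reason, void *opaque) {
    fprintf(stderr, "error %s: %s\n",
            rd_kafka_err2str((rd_kafka_resp_err_t)err), reason);
}

/* Register the callbacks on the configuration before rd_kafka_new(). */
static void setup_callbacks(rd_kafka_conf_t *conf) {
    rd_kafka_conf_set_dr_msg_cb(conf, dr_msg_cb);
    rd_kafka_conf_set_error_cb(conf, error_cb);
}

/* In the application's main loop, serve the callbacks periodically:
 *     rd_kafka_poll(rk, 100);   // wait up to 100 ms for events
 */
```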

Brokers

librdkafka needs an initial list of at least one broker, called the 'bootstrap brokers'. They are specified via the 'metadata.broker.list' configuration property or by calling 'rd_kafka_brokers_add()'. librdkafka connects to all bootstrap brokers and queries each one for metadata, which contains the complete list of brokers, topics, partitions and their leaders in the Kafka cluster.

A broker is specified as 'host[:port]', where the port is optional (default 9092) and host is either a hostname or an IP address. If the host resolves to multiple addresses, librdkafka round-robins the addresses for each connection attempt. A DNS record containing all broker addresses can therefore be used to provide reliable bootstrap brokers.
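For illustration, bootstrap brokers can also be added after the 'rd_kafka_t' object has been created; the hostnames below are placeholders.

```c
#include <librdkafka/rdkafka.h>

/* Add bootstrap brokers from a comma-separated host[:port] list.
 * rd_kafka_brokers_add() returns the number of brokers successfully
 * added; the port defaults to 9092 when omitted. */
static int add_bootstrap_brokers(rd_kafka_t *rk) {
    return rd_kafka_brokers_add(rk,
            "broker1.example.com:9092,broker2.example.com");
}
```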

Producer API

After creating the 'rd_kafka_t' object with 'RD_KAFKA_PRODUCER' and one or more 'rd_kafka_topic_t' objects, librdkafka is ready to accept messages to be sent to the brokers.

The 'rd_kafka_produce()' function takes the following arguments:

* 'rkt' - the topic to produce to, previously created with 'rd_kafka_topic_new()'.

* 'partition' - the partition to produce to. If set to 'RD_KAFKA_PARTITION_UA' (UnAssigned), the configured partitioner function is used to select a target partition.

* 'msgflags' - 0, or one of:

* 'RD_KAFKA_MSG_F_COPY' - librdkafka immediately makes a copy of the payload. Use this when the payload is in non-persistent memory, such as the stack.

* 'RD_KAFKA_MSG_F_FREE' - librdkafka frees the payload with 'free(3)' when it is done with it.

These two flags are mutually exclusive; if neither copying nor freeing is desired, neither flag should be set.

If 'RD_KAFKA_MSG_F_COPY' is not set, the data is not copied, and librdkafka holds on to the payload pointer until the message has been successfully transmitted or transmission has failed.

The delivery report callback is called when librdkafka is done with the message, allowing the application to regain ownership of the payload memory.

If 'RD_KAFKA_MSG_F_FREE' is set, the delivery report callback must not free the payload.

* 'payload', 'len' - the message payload.

* 'key', 'keylen' - an optional message key, which can be used for message partitioning.

It is passed to the topic partitioner callback (if present) and attached to the message when it is sent to the broker.

* 'msg_opaque' - an optional application-provided opaque pointer per message. It is provided in the delivery report callback, allowing the application to reference a specific message.

'rd_kafka_produce()' is a non-blocking API: it enqueues the message in an internal queue and returns immediately. If the number of queued messages would exceed the 'queue.buffering.max.messages' configuration property, 'rd_kafka_produce()' returns -1 and sets errno to 'ENOBUFS', thus providing a backpressure mechanism.
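A produce call with backpressure handling might look like the sketch below (illustrative, with a placeholder payload): 'RD_KAFKA_MSG_F_COPY' is used because the payload lives on the stack, and 'ENOBUFS' is handled by polling to let the internal queue drain.

```c
#include <errno.h>
#include <string.h>
#include <librdkafka/rdkafka.h>

/* Produce one message on a topic created with rd_kafka_topic_new(). */
static void produce_one(rd_kafka_t *rk, rd_kafka_topic_t *rkt) {
    char payload[] = "hello kafka";   /* stack memory -> must be copied */

    if (rd_kafka_produce(rkt,
                         RD_KAFKA_PARTITION_UA,   /* let partitioner pick */
                         RD_KAFKA_MSG_F_COPY,
                         payload, strlen(payload),
                         NULL, 0,                 /* no key */
                         NULL) == -1) {           /* no msg_opaque */
        if (errno == ENOBUFS) {
            /* Internal queue is full (queue.buffering.max.messages):
             * serve events for a while so the queue can drain. */
            rd_kafka_poll(rk, 100);
        }
    }

    /* Serve delivery report callbacks without blocking. */
    rd_kafka_poll(rk, 0);
}
```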

Simple Consumer API

Note: for the high-level KafkaConsumer interface, see 'rd_kafka_subscribe()' (rdkafka.h) or the KafkaConsumer class (rdkafkacpp.h).

After creating the 'rd_kafka_t' object with 'RD_KAFKA_CONSUMER' and one or more 'rd_kafka_topic_t' instances, the application must also call 'rd_kafka_consume_start()' to start the consumer for a given partition.

'rd_kafka_consume_start()' arguments:

* 'rkt' - the topic to consume from, previously created with 'rd_kafka_topic_new()'.

* 'partition' - the partition to consume from.

* 'offset' - the message offset to start consuming from. This may be an absolute message offset or one of the special offsets:

'RD_KAFKA_OFFSET_BEGINNING': consume from the beginning of the partition queue (the oldest message).

'RD_KAFKA_OFFSET_END': start consuming from the next message produced to the partition.

'RD_KAFKA_OFFSET_STORED': use the stored offset.

After a consumer has been started for a topic+partition, librdkafka tries to keep 'queued.min.messages' messages in the local queue by repeatedly fetching batches of messages from the broker. The local message queue is then served to the application through three different consume APIs:

* 'rd_kafka_consume()' - consumes a single message.

* 'rd_kafka_consume_batch()' - consumes one or more messages.

* 'rd_kafka_consume_callback()' - consumes all messages in the local queue and invokes a callback function for each one.

These three APIs are listed in ascending order of performance: 'rd_kafka_consume()' is the slowest and 'rd_kafka_consume_callback()' the fastest.

A consumed message is represented by the 'rd_kafka_message_t' type, whose members are:

* 'err' - an error signal back to the application. If non-zero, the 'payload' member should be treated as an error message and 'err' is the error code ('rd_kafka_resp_err_t'); if zero, 'payload' contains the message data.

* 'rkt', 'partition' - the topic and partition of the message.

* 'payload', 'len' - the message payload, or the error message when 'err' is non-zero.

* 'key', 'key_len' - the optional message key as specified by the producer.

* 'offset' - the message offset.

The memory for 'payload' and 'key', as well as the message itself, belongs to librdkafka and must not be used after 'rd_kafka_message_destroy()' has been called. To avoid excessive copying, librdkafka shares the receive buffer of a message set among all message payloads of that set. This means that if the application holds on to a single 'rd_kafka_message_t', it prevents the backing memory of all other messages from the same message set from being released.

When the application is done consuming messages from a topic+partition, it should call 'rd_kafka_consume_stop()' to stop the consumer. This also purges any messages currently in the local queue.
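The start/consume/destroy/stop lifecycle can be sketched like this (illustrative; the partition number, iteration count, and timeout are arbitrary placeholder choices):

```c
#include <stdio.h>
#include <librdkafka/rdkafka.h>

/* Consume from partition 0 of a topic created on an RD_KAFKA_CONSUMER
 * instance, starting from the oldest message. */
static void consume_partition(rd_kafka_topic_t *rkt) {
    const int32_t partition = 0;

    if (rd_kafka_consume_start(rkt, partition,
                               RD_KAFKA_OFFSET_BEGINNING) == -1) {
        fprintf(stderr, "rd_kafka_consume_start() failed\n");
        return;
    }

    for (int i = 0; i < 100; i++) {
        /* Wait up to 1000 ms for a message from the local queue. */
        rd_kafka_message_t *msg = rd_kafka_consume(rkt, partition, 1000);
        if (!msg)
            continue;   /* timeout, no message ready */

        if (msg->err)
            fprintf(stderr, "consume error: %s\n",
                    rd_kafka_err2str(msg->err));
        else
            printf("offset %lld: %.*s\n", (long long)msg->offset,
                   (int)msg->len, (const char *)msg->payload);

        /* Return the message (and its shared buffer) to librdkafka. */
        rd_kafka_message_destroy(msg);
    }

    /* Stop the consumer and purge the local queue. */
    rd_kafka_consume_stop(rkt, partition);
}
```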

Offset Management

With broker version >= 0.9.0, broker-based offset management is available in combination with the high-level KafkaConsumer interface (see rdkafka.h or rdkafkacpp.h).

Offset management is also available through local file storage. With the following topic configuration properties, offsets are permanently written to a local file:

* 'auto.commit.enable'

* 'auto.commit.interval.ms'

* 'offset.store.path'

* 'offset.store.sync.interval.ms'

There is currently no support for ZooKeeper-based offset management.
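The file-based offset store above can be enabled through the topic configuration; in this sketch the store path and commit interval are placeholder values, not recommendations from the article.

```c
#include <librdkafka/rdkafka.h>

/* Build a topic configuration with file-based offset storage enabled. */
static rd_kafka_topic_conf_t *make_offset_store_conf(void) {
    char errstr[512];
    rd_kafka_topic_conf_t *tconf = rd_kafka_topic_conf_new();

    rd_kafka_topic_conf_set(tconf, "auto.commit.enable", "true",
                            errstr, sizeof(errstr));
    rd_kafka_topic_conf_set(tconf, "auto.commit.interval.ms", "60000",
                            errstr, sizeof(errstr));
    rd_kafka_topic_conf_set(tconf, "offset.store.path",
                            "/var/lib/myapp/offsets",
                            errstr, sizeof(errstr));

    /* Pass to rd_kafka_topic_new(), which takes ownership of tconf. */
    return tconf;
}
```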

Consumer groups

librdkafka supports broker-based consumer groups with Kafka broker version >= 0.9.

Topics

librdkafka supports automatic topic creation; the broker needs to be configured with 'auto.create.topics.enable=true'.
