Compilation, installation and usage of librdkafka, the Kafka C/C++ client library, on Linux



https://github.com/edenhill/librdkafka
librdkafka is an open-source C/C++ implementation of a Kafka client, providing producer and consumer interfaces.

I. Installing librdkafka
First download the librdkafka source from GitHub, then unpack it and build:
cd librdkafka-master
chmod 777 configure lds-gen.py
./configure
make
make install
On 64-bit Linux, make may fail with the following error:
/bin/ld: librdkafka.lds:1: syntax error in VERSION script
To fix it, comment out the line WITH_LDS=y (line 46 of Makefile.config).



Comment it out as #WITH_LDS=y, then run make again.



The header files and library files are then installed to, respectively:

/usr/local/include/librdkafka
/usr/local/lib






II. Linking your own application against librdkafka



When compiling your own application, add the options -lrdkafka -lz -lpthread -lrt to the link step.
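
For instance, a plain g++ build might look like the following sketch (the file name myapp.cpp is only an illustration; -lrdkafka++ is needed only if you use the C++ API):

g++ myapp.cpp -o myapp -lrdkafka++ -lrdkafka -lz -lpthread -lrt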



For example, I build with Qt Creator in qmake mode, and the .pro file contains:

QMAKE_LFLAGS += -lrdkafka -lrdkafka++ -lz -lpthread -lrt
# -lrdkafka is equivalent to LIBS += /usr/local/lib/librdkafka.so
The build then succeeds, but the program fails at runtime with: error while loading shared libraries: librdkafka.so.1: cannot open shared object file: No such file or directory
In this case, add the directory containing librdkafka.so (/usr/local/lib/) to /etc/ld.so.conf,
then run the following command in a terminal for the change to take effect:


[root@localhost etc]# ldconfig



Note that ldconfig must be re-run every time a library file under /usr/local/lib/ is added or updated.



III. Starting Kafka



For more information, see my article: My personal kafka_2.12-1.0.0 practice: installation and testing (★FIRECAT recommended★).






IV. Usage introduction, based on the source files /librdkafka-master/examples/rdkafka_example.cpp and rdkafka_consumer_example.cpp



How to use the producer:

Create a Kafka client configuration placeholder:
conf = rd_kafka_conf_new();
That is, create a configuration object (rd_kafka_conf_t), and then configure the broker list on it through rd_kafka_conf_set().
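
A minimal sketch of this step (the broker address is an example; bootstrap.servers is an alias of metadata.broker.list). This and the following C fragments assume #include <stdio.h>, <errno.h> and <librdkafka/rdkafka.h>:

char errstr[512];
rd_kafka_conf_t *conf = rd_kafka_conf_new();
if (rd_kafka_conf_set(conf, "bootstrap.servers", "localhost:9092",
                      errstr, sizeof(errstr)) != RD_KAFKA_CONF_OK)
    fprintf(stderr, "%s\n", errstr);  /* invalid property name or value */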



Set a delivery report callback:
The callback reports the success or failure of each sent message. It is registered via rd_kafka_conf_set_dr_msg_cb(conf, dr_msg_cb).
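
A minimal sketch of such a callback, using the name dr_msg_cb from the line above:

static void dr_msg_cb(rd_kafka_t *rk,
                      const rd_kafka_message_t *rkmessage, void *opaque) {
    if (rkmessage->err)
        fprintf(stderr, "delivery failed: %s\n",
                rd_kafka_err2str(rkmessage->err));
    else
        fprintf(stderr, "delivered %zu bytes to partition %d\n",
                rkmessage->len, (int)rkmessage->partition);
}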



Create a producer instance:
1) Initialization:
The application needs to initialize a top-level object (rd_kafka_t) as the base container for global configuration and shared state.
It is created by calling rd_kafka_new(). Once created, the instance takes over the conf object, so the conf object must not be reused after the rd_kafka_new() call, and the configuration resources do not need to be freed after the call either.
2) Create a topic:
The created topic object is reusable (the producer instance (rd_kafka_t) may likewise be reused, so there is no need to create them repeatedly).
Instantiate one or more topics (rd_kafka_topic_t) for production or consumption.
The topic object holds topic-level properties and maintains a mapping of all available partitions to their leader brokers.
It is created by calling rd_kafka_topic_new() (e.g. rd_kafka_topic_new(rk, topic, NULL)).
Note: both rd_kafka_t and rd_kafka_topic_t come with an optional configuration API.
Not using this API causes librdkafka to use the default configuration listed in the document CONFIGURATION.md. (A sketch combining these steps appears after the parameter list below.)


3) Producer API:
Having created an rd_kafka_t handle of the producer type and one or more rd_kafka_topic_t objects, the producer is ready to accept messages, assemble them and send them to the broker.
The rd_kafka_produce() function takes the following arguments:
rkt: the topic to produce to, previously created with rd_kafka_topic_new()
partition: the partition to produce to. If set to RD_KAFKA_PARTITION_UA (unassigned), the built-in partitioner selects a partition. Kafka calls back the partitioner for a balanced selection; the partitioner method can be implemented by the application itself, e.g. round-robin, or hashing a supplied key. If none is implemented, the default rd_kafka_msg_partitioner_random partitioner is used, which selects a partition at random.
You can design your own partition assignment through a custom partitioner.
msgflags: 0 or one of the following values:
RD_KAFKA_MSG_F_COPY means librdkafka makes a copy of the payload immediately, before the message is sent. Use this flag if the payload lives in unstable storage, such as the stack; it protects the message body by copying it up front.
RD_KAFKA_MSG_F_FREE means that once the payload has been used, librdkafka releases it with free(3); i.e. the message buffer is freed after the message has been handled.
The two flags are mutually exclusive. If neither is set, the payload is neither copied nor freed by librdkafka.
If RD_KAFKA_MSG_F_COPY is not set, no copy is made; librdkafka holds on to the payload pointer (the message body) until the message has been sent or has failed. Once librdkafka has processed the message, it invokes the delivery report callback, giving the application back ownership of the payload.
If RD_KAFKA_MSG_F_FREE is set, the application must not free the payload in the delivery report callback.
payload, len: the message payload (value) and its length
key, keylen: an optional message key and its length, used for partitioning. It is passed to the topic partitioner callback and, if present, is appended to the message sent to the broker.
msg_opaque: an optional untyped pointer supplied by the application for each message; it is passed to the delivery report callback for the application's reference.


rd_kafka_produce() is a non-blocking API: it enqueues the message on an internal queue and returns immediately.
If the number of queued messages exceeds the value of the queue.buffering.max.messages property, rd_kafka_produce() signals the error by returning -1 and setting errno to an error code such as ENOBUFS.
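
A minimal sketch in C combining steps 1) to 3), assuming conf and errstr from the snippets above, a payload buffer buf of len bytes, and the topic name used later in this article:

rd_kafka_t *rk = rd_kafka_new(RD_KAFKA_PRODUCER, conf, errstr, sizeof(errstr));
if (!rk)
    fprintf(stderr, "failed to create producer: %s\n", errstr);
rd_kafka_topic_t *rkt = rd_kafka_topic_new(rk, "helloworld_kugou", NULL);

if (rd_kafka_produce(rkt, RD_KAFKA_PARTITION_UA, RD_KAFKA_MSG_F_COPY,
                     buf, len,
                     NULL, 0,  /* no key */
                     NULL      /* no msg_opaque */) == -1)
    fprintf(stderr, "produce failed: %s\n",
            rd_kafka_err2str(rd_kafka_errno2err(errno)));
rd_kafka_poll(rk, 0);  /* serve queued delivery report callbacks */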
Hint: see examples/rdkafka_performance.c for producer usage.

How to use the consumer:



The consumer API is somewhat more stateful than the producer API. Create an rd_kafka_t object with the RD_KAFKA_CONSUMER type (the type argument passed to rd_kafka_new()), then add brokers to the new Kafka handle (rk) by calling rd_kafka_brokers_add(rk, brokers).
After creating the rd_kafka_topic_t object, the partition's low and high watermark offsets can also be queried with rd_kafka_query_watermark_offsets().



Create the topic:
rkt = rd_kafka_topic_new(rk, topic, topic_conf)
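
A sketch of this consumer setup in C, assuming conf, topic_conf, errstr, and the brokers and topic strings from the surrounding text:

rd_kafka_t *rk = rd_kafka_new(RD_KAFKA_CONSUMER, conf, errstr, sizeof(errstr));
if (rd_kafka_brokers_add(rk, brokers) == 0)
    fprintf(stderr, "no valid brokers specified\n");
rd_kafka_topic_t *rkt = rd_kafka_topic_new(rk, topic, topic_conf);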



Start consuming:
Call rd_kafka_consume_start() (rd_kafka_consume_start(rkt, partition, start_offset)) to start consuming the given partition (a sketch follows this list).
The parameters of rd_kafka_consume_start() are:
rkt: the topic to consume, previously created with rd_kafka_topic_new().
partition: the partition to consume.
offset: the message offset at which to start consuming. This can be an absolute value or one of three special offsets:
RD_KAFKA_OFFSET_BEGINNING starts consuming from the beginning of the partition queue (the oldest message).
RD_KAFKA_OFFSET_END starts consuming from the next message produced to the partition.
RD_KAFKA_OFFSET_STORED uses the offset store.
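
The sketch referenced above, starting from the oldest message:

if (rd_kafka_consume_start(rkt, partition, RD_KAFKA_OFFSET_BEGINNING) == -1)
    fprintf(stderr, "consume_start failed: %s\n",
            rd_kafka_err2str(rd_kafka_errno2err(errno)));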



Once consumption of a topic+partition has been started, librdkafka continually fetches batches of messages from the broker, trying to keep queued.min.messages messages in the local queue.
The local message queue is served to the application through three different consumer APIs:





    rd_kafka_consume() - consumes a single message
    rd_kafka_consume_batch() - consumes one or more messages
    rd_kafka_consume_callback() - consumes all messages in the local queue and invokes a callback for each one


The performance of these three APIs is in ascending order: rd_kafka_consume() is the slowest and rd_kafka_consume_callback() the fastest. The different types suit different application needs.
Messages consumed by the above functions are returned as the rd_kafka_message_t type.
Members of rd_kafka_message_t:





* err - error signal returned to the application. If non-zero, the payload field holds an error message and err is the error code (rd_kafka_resp_err_t).
* rkt, partition - the message's (or error's) topic and partition.
* payload, len - the message data, or the error message (when err != 0).
* key, key_len - optional message key, as specified by the producer.
* offset - the message offset.


The memory for payload and key, and indeed the whole message, is owned by librdkafka and must not be used after rd_kafka_message_destroy() has been called.
To avoid redundant copies of message sets, librdkafka shares the same memory buffer among all messages received from a message set, which means that if the application retains a single rd_kafka_message_t, it prevents that memory from being freed and reused for the other messages of the same message set.
When the application is done consuming messages from a topic+partition, it should call rd_kafka_consume_stop() to stop consumption. This also purges all messages currently in the local queue.
Hint: see examples/rdkafka_performance.c for consumer usage.
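
Putting the pieces together, a minimal consume loop might look like this sketch (rkt and partition are as above; run is an application-defined loop flag, as in the example code later in this article):

while (run) {
    rd_kafka_message_t *msg = rd_kafka_consume(rkt, partition, 1000 /* ms */);
    if (!msg)
        continue;  /* timed out waiting for a message */
    if (msg->err)
        fprintf(stderr, "consume error: %s\n", rd_kafka_err2str(msg->err));
    else
        printf("offset %lld: %.*s\n", (long long)msg->offset,
               (int)msg->len, (const char *)msg->payload);
    rd_kafka_message_destroy(msg);  /* hand the memory back to librdkafka */
}
rd_kafka_consume_stop(rkt, partition);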



The server.properties configuration of the Kafka broker (parameter log.dirs=/data2/logs/kafka/) determines the directory under which the topics of the message queue are stored, one subdirectory per partition. Each partition consists of segment files, and each segment file has two parts: an index file and a data file. The two files correspond one to one; the suffixes ".index" and ".log" denote the segment index file and the segment data file, respectively.
Segment file naming rule: the first segment of a partition starts from 0, and each subsequent segment file is named after the offset of the last message in the previous segment file. The value is a 64-bit long, rendered as a 19-digit string, left-padded with zeros. A specific example:
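
(Illustrative listing only; the offsets are hypothetical.)

00000000000000000000.index
00000000000000000000.log
00000000000000170410.index
00000000000000170410.log
00000000000000239430.index
00000000000000239430.log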



This article uses the C++ API; the only differences from the C functions discussed above are in how they are called, and the business logic is the same.
The producer can use PARTITION_UA directly, but on the consumer side the partition must not be PARTITION_UA, because that value is actually -1, which is meaningless for a consumer. From the source we can see that when no partitioner is specified, a default partitioner is in fact used: the so-called consistent-random partitioner. A consistent hash maps a keyword to a specific partition.
Function prototype:





int32_t rd_kafka_msg_partitioner_consistent_random (
           const rd_kafka_topic_t *rkt,
           const void *key, size_t keylen,
           int32_t partition_cnt,
           void *opaque, void *msg_opaque);
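
As a sketch, this partitioner can also be installed explicitly on a topic configuration before creating the topic object:

rd_kafka_topic_conf_t *topic_conf = rd_kafka_topic_conf_new();
rd_kafka_topic_conf_set_partitioner_cb(topic_conf,
        rd_kafka_msg_partitioner_consistent_random);
rd_kafka_topic_t *rkt = rd_kafka_topic_new(rk, topic, topic_conf);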


PARTITION_UA actually means "unassigned partition", i.e. a partition that has not been assigned. RD_KAFKA_PARTITION_UA (unassigned) in fact means the topic's partitioner function is used automatically; of course, a fixed partition value can also be passed directly.
The number of partitions, num.partitions (e.g. num.partitions=5), can be set in the configuration file config/server.properties.

Assigning partitions:



Take care when assigning partitions. For a topic that has already been created with a given partition count, simply modifying the partitioner part of the producer code to introduce a key for partition reassignment does not work: messages keep going to the previous partition (here, partition 0, the only one). If you inspect partition_cnt in the program at this point, you can see that its value does not change with edits to config/server.properties, because partition_cnt reflects the topic as it was created.
If the code still uses partition_cnt to compute the partition value, djb_hash(key->c_str(), key->size()) % 5 produces the following result: a hint that the partition does not exist.


We can view the number of partitions under a topic with rdkafka_example:
./rdkafka_example -L -t helloworld_kugou -b localhost:9092






From this we can see that the helloworld_kugou topic has only one partition, while the helloworld_kugou1 topic has 5 partitions, which matches what we expected.
We can change the number of partitions of an already-created topic:
./bin/kafka-topics.sh --zookeeper 127.0.0.1:2181 --alter --partitions 5 --topic helloworld_kugou
After the modification, we can see that helloworld_kugou now has 5 partitions.


Specific examples:




Create a topic helloworld_kugou_test with 5 partitions. Before anything is entered at the front end, we can already see 5 partitions in the producer log:




Producer-side code:


class ExampleDeliveryReportCb : public RdKafka::DeliveryReportCb
{
 public:
  void dr_cb (RdKafka::Message &message) {
    std::cout << "Message delivery for (" << message.len() << " bytes): " <<
        message.errstr() << std::endl;
    if (message.key())
      std::cout << "Key: " << *(message.key()) << ";" << std::endl;
  }
};


class ExampleEventCb : public RdKafka::EventCb {
 public:
  void event_cb (RdKafka::Event &event) {
    switch (event.type())
    {
      case RdKafka::Event::EVENT_ERROR:
        std::cerr << "ERROR (" << RdKafka::err2str(event.err()) << "): " <<
            event.str() << std::endl;
        if (event.err() == RdKafka::ERR__ALL_BROKERS_DOWN)
          run = false;
        break;

      case RdKafka::Event::EVENT_STATS:
        std::cerr << "\"STATS\": " << event.str() << std::endl;
        break;

      case RdKafka::Event::EVENT_LOG:
        fprintf(stderr, "LOG-%i-%s: %s\n",
                event.severity(), event.fac().c_str(), event.str().c_str());
        break;

      default:
        std::cerr << "EVENT " << event.type() <<
            " (" << RdKafka::err2str(event.err()) << "): " <<
            event.str() << std::endl;
        break;
    }
  }
};

/* Use of this partitioner is pretty pointless since no key is provided
 * in the produce() call, so you need to supply your own key. */
class MyHashPartitionerCb : public RdKafka::PartitionerCb {
    public:
        int32_t partitioner_cb (const RdKafka::Topic *topic, const std::string *key,
                                int32_t partition_cnt, void *msg_opaque)
        {
            std::cout << "partition_cnt=" << partition_cnt << std::endl;
            return djb_hash(key->c_str(), key->size()) % partition_cnt;
        }
    private:
        /* Bernstein (djb2) string hash. */
        static inline unsigned int djb_hash (const char *str, size_t len)
        {
            unsigned int hash = 5381;
            for (size_t i = 0; i < len; i++)
                hash = ((hash << 5) + hash) + str[i];
            std::cout << "hash1=" << hash << std::endl;

            return hash;
        }
};

void TestProducer()
{
    std::string brokers = "localhost";
    std::string errstr;
    std::string topic_str = "helloworld_kugou_test"; // create your own topic
    MyHashPartitionerCb hash_partitioner;
    int32_t partition = RdKafka::Topic::PARTITION_UA;
    int64_t start_offset = RdKafka::Topic::OFFSET_BEGINNING;