Zookeeper C Client Analysis

Source: Internet
Author: User

Every zookeeper API call needs a zhandle. When a zhandle is initialized (zookeeper_init), its fields are set up and two threads are created: do_io and do_completion. zookeeper_init returns only after both threads have finished initializing, and each thread waits for the other to be ready before it starts serving requests (notify_thread_ready).

1. IO thread/do_io

This thread handles IO requests using poll multiplexing. The main IO is the network connection to the zookeeper server (zh->fd). The thread also polls the read end of a pipe (adaptor_threads->self_pipe[0]). The only purpose of this pipe is to wake the thread up: other threads wake the IO thread (wakeup_io_thread) by writing one character to adaptor_threads->self_pipe[1].

For the network IO, each loop iteration first determines the events of interest and the wait time (zookeeper_interest) and then enters poll to wait for events. When poll returns, the thread determines which events are available, drains adaptor_threads->self_pipe[0], and finally calls zookeeper_process to handle the available events. For example, when the response to a synchronous request arrives, the result is saved directly into the application's buffer in this thread and the application is woken up. For an asynchronous request, the completion object from sent_requests is put into the completions_to_process queue for the completion thread to process.
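The self-pipe wake-up described above can be sketched as follows. This is a minimal illustration rather than the client's actual code: struct waker, waker_init, waker_wakeup, and io_loop_once are hypothetical names, and the server socket is passed as -1 where it is not needed (poll ignores negative descriptors).

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for adaptor_threads->self_pipe. */
struct waker {
    int self_pipe[2]; /* [0] = read end polled by the IO thread, [1] = write end */
};

/* Create the pipe and make both ends non-blocking. Returns 0 on success. */
int waker_init(struct waker *w)
{
    assert(w != NULL);
    if (pipe(w->self_pipe) != 0)
        return -1;
    for (int i = 0; i < 2; i++) {
        int flags = fcntl(w->self_pipe[i], F_GETFL, 0);
        if (flags < 0 || fcntl(w->self_pipe[i], F_SETFL, flags | O_NONBLOCK) < 0)
            return -1;
    }
    return 0;
}

/* Called by any other thread: one byte is enough to make poll() return. */
int waker_wakeup(struct waker *w)
{
    char c = 0;
    return write(w->self_pipe[1], &c, 1) == 1 ? 0 : -1;
}

/* One iteration of the IO loop.
 * Returns 1 if the loop was woken through the pipe, 0 if not, -1 on error. */
int io_loop_once(struct waker *w, int server_fd, int timeout_ms)
{
    struct pollfd fds[2];
    fds[0].fd = w->self_pipe[0];
    fds[0].events = POLLIN;
    fds[0].revents = 0;
    fds[1].fd = server_fd; /* a negative fd is ignored by poll() */
    fds[1].events = POLLIN;
    fds[1].revents = 0;

    if (poll(fds, 2, timeout_ms) < 0)
        return -1;

    int woken = 0;
    if (fds[0].revents & POLLIN) {
        char buf[128];
        /* Drain every pending wake-up byte; read() fails with EAGAIN when empty. */
        while (read(w->self_pipe[0], buf, sizeof buf) > 0)
            ;
        woken = 1;
    }
    /* A real client would now hand the events seen on fds[1] to zookeeper_process. */
    return woken;
}
```

Several wake-ups issued before the loop runs collapse into a single return from poll, because the loop drains the whole pipe in one pass.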

2. Completion thread/do_completion

This thread takes completions off the completions_to_process queue and processes them, which covers both completion callbacks and watch callbacks. The processing function is process_completions.
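The hand-off between the two threads is a producer/consumer queue. The sketch below assumes a plain linked list guarded by a mutex and a condition variable; struct completion, queue_init, queue_push, and queue_pop are hypothetical names, not the client's own.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, heavily simplified completion entry. */
struct completion {
    int xid;
    struct completion *next;
};

struct completion_queue {
    struct completion *head;
    struct completion *tail;
    pthread_mutex_t lock;
    pthread_cond_t cond;
};

void queue_init(struct completion_queue *q)
{
    assert(q != NULL);
    q->head = NULL;
    q->tail = NULL;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->cond, NULL);
}

/* Producer side (the IO thread): append an entry and signal the consumer. */
void queue_push(struct completion_queue *q, struct completion *c)
{
    c->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = c;
    else
        q->head = c;
    q->tail = c;
    pthread_cond_signal(&q->cond);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side (the completion thread): block until an entry is available. */
struct completion *queue_pop(struct completion_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->head == NULL)
        pthread_cond_wait(&q->cond, &q->lock);
    struct completion *c = q->head;
    q->head = c->next;
    if (q->head == NULL)
        q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    return c;
}
```

Entries come out in the order they were pushed, so callbacks run in the order the responses arrived.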

3. API

Zookeeper APIs come in three variants, for example zoo_get, zoo_wget, and zoo_awget. 'w' means the call takes a watch callback function: the caller passes the watch callback and its context parameter. 'a' means the call is asynchronous: the caller passes a completion callback function and its data parameter, and the completion thread invokes that callback. A call without 'a' is synchronous: it returns to the application only after the server has responded, and the application blocks in the meantime.

Internally, both get and wget are implemented through awget. get -> wget: if get asks for a watch, it calls wget with the default zh->watcher. wget -> awget: wget passes synchronous_marker as the completion callback, passes the application's buffer as the data, and calls awget. (In zookeeper_process, a completion whose function is synchronous_marker is handled directly in the IO thread, which puts the server's response into the completion's data.) wget then calls wait_sync_completion to wait for the IO thread to finish.

So wget is implemented with the asynchronous machinery, but it looks different to the calling process. In synchronous mode the call blocks. In asynchronous mode the call returns as soon as the request has been sent, without waiting for the response; the calling process decides in its own business logic when to wait for the result, and the asynchronous callback is run by the completion thread. Synchronous results are handled by the IO thread (there is in fact no callback function).
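The "synchronous call built on the asynchronous one" pattern can be sketched like this. All names here (sync_result, pending_request, sync_marker, sketch_aget, sketch_io_thread_response, sketch_wait_sync) are hypothetical simplifications, and the 64-byte value buffer is an arbitrary choice for illustration. The marker is an ordinary function that is never called; only its address is compared.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

typedef void (*data_completion_fn)(int rc, const char *value, void *data);

/* Hypothetical stand-in for the client's sync_completion. */
struct sync_result {
    int rc;
    char value[64];
    int complete;
    pthread_mutex_t lock;
    pthread_cond_t cond;
};

struct pending_request {
    data_completion_fn completion;
    void *data;
};

/* Marker in the role of synchronous_marker: identified by address, never invoked. */
void sync_marker(int rc, const char *value, void *data)
{
    (void)rc; (void)value; (void)data;
}

/* Example asynchronous callback; the completion thread would call it later. */
void sample_async_callback(int rc, const char *value, void *data)
{
    (void)value;
    *(int *)data = rc;
}

void sync_result_init(struct sync_result *sr)
{
    assert(sr != NULL);
    sr->rc = 0;
    sr->value[0] = '\0';
    sr->complete = 0;
    pthread_mutex_init(&sr->lock, NULL);
    pthread_cond_init(&sr->cond, NULL);
}

/* Asynchronous entry point: only records the request in this sketch. */
void sketch_aget(struct pending_request *req, data_completion_fn completion, void *data)
{
    req->completion = completion;
    req->data = data;
}

/* IO thread side. Returns 1 if the response was handled inline (synchronous),
 * 0 if it must be queued for the completion thread (asynchronous). */
int sketch_io_thread_response(struct pending_request *req, int rc, const char *value)
{
    if (req->completion != sync_marker)
        return 0;
    struct sync_result *sr = req->data;
    pthread_mutex_lock(&sr->lock);
    sr->rc = rc;
    strncpy(sr->value, value, sizeof sr->value - 1);
    sr->value[sizeof sr->value - 1] = '\0';
    sr->complete = 1;
    pthread_cond_broadcast(&sr->cond);
    pthread_mutex_unlock(&sr->lock);
    return 1;
}

/* Caller side of the synchronous API: block until the IO thread has filled in the result. */
int sketch_wait_sync(struct sync_result *sr)
{
    pthread_mutex_lock(&sr->lock);
    while (!sr->complete)
        pthread_cond_wait(&sr->cond, &sr->lock);
    int rc = sr->rc;
    pthread_mutex_unlock(&sr->lock);
    return rc;
}
```

A synchronous wrapper would call sketch_aget with sync_marker and a sync_result, then sketch_wait_sync; an asynchronous caller passes its own callback and returns immediately.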

Finally, how is awget implemented? It first serializes the request message and then constructs the watch object (create_watcher_registration: path, watch, watch_ctx, and checker = data_result_checker). Next, the watch object, the completion, the data, and h.xid are wrapped together into a _completion_list object (create_completion_entry), and that object is appended to the end of sent_requests (add_completion); the xid of a received response must match the xid that was sent. Finally, the serialized packet is put into the to_send queue and the IO thread is woken up to send it. Note that at this point the watch is not yet put into the hashtable; that happens later, in activatewatcher.
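The role of sent_requests as a FIFO whose head must match the next response can be sketched as below. The struct and function names (completion_entry, completion_list_head, sketch_add_completion, sketch_take_if_head) are hypothetical, and the entry carries only an xid and a data pointer instead of the full watch and completion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced version of a sent_requests entry. */
struct completion_entry {
    int xid;
    void *data;
    struct completion_entry *next;
};

struct completion_list_head {
    struct completion_entry *head;
    struct completion_entry *last;
};

/* Append a new entry at the tail, as requests are sent in order. Returns 0 on success. */
int sketch_add_completion(struct completion_list_head *list, int xid, void *data)
{
    assert(list != NULL);
    struct completion_entry *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->xid = xid;
    e->data = data;
    e->next = NULL;
    if (list->last != NULL)
        list->last->next = e;
    else
        list->head = e;
    list->last = e;
    return 0;
}

/* Remove and return the head entry only if its xid matches the response.
 * Returns NULL when the list is empty or the xid is out of order.
 * The caller owns the returned entry and must free() it. */
struct completion_entry *sketch_take_if_head(struct completion_list_head *list, int xid)
{
    struct completion_entry *e = list->head;
    if (e == NULL || e->xid != xid)
        return NULL;
    list->head = e->next;
    if (list->head == NULL)
        list->last = NULL;
    e->next = NULL;
    return e;
}
```

Because responses are expected in request order, only the head is ever compared; there is no search through the list.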

4. Zookeeper network protocol

Whatever the watch is, only a single flag travels between the client and the server. The actual watch callback functions are stored in hashtables on the client. When a node that is marked as watched changes, the server only tells the client the type of the change; the client then looks up all of its watch callbacks for that type and node and executes them.

The client assigns an xid to every packet, and the server returns that xid unchanged in its response. Several special xids smaller than 0 are predefined (for example watcher_event_xid, auth, ping, and set_watches_xid). A watcher_event_xid packet carries a type, which lets the client determine which hashtable (node, exist, or child) holds the watch callbacks; the callbacks themselves are then found by hashing the path. All other packets are normal data packets, which use a type field to mark the requested operation (such as getdata_op, exists_op, and create_op). The server returns the corresponding data based on this field, and the client processes the data based on completion_type (the type of the awaited result, saved in the completion object).
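Dispatching on the xid of an incoming packet can be sketched as a simple switch. The numeric values below (-1, -2, -4, -8) are what I recall from the C client's headers and should be treated as an assumption to verify against the version in use; the SKETCH_ prefix, enum packet_kind, and classify_xid are hypothetical names.

```c
#include <assert.h>

/* Assumed values for the reserved negative xids; check them against the client headers. */
#define SKETCH_WATCHER_EVENT_XID (-1)
#define SKETCH_PING_XID          (-2)
#define SKETCH_AUTH_XID          (-4)
#define SKETCH_SET_WATCHES_XID   (-8)

enum packet_kind {
    PKT_WATCHER_EVENT, /* server-initiated notification */
    PKT_PING,          /* heartbeat reply */
    PKT_AUTH,          /* authentication reply */
    PKT_SET_WATCHES,   /* reply to re-registering watches after reconnect */
    PKT_NORMAL_REPLY   /* reply to an ordinary request such as get or create */
};

/* Map the xid of a received packet to the kind of handling it needs. */
enum packet_kind classify_xid(int xid)
{
    switch (xid) {
    case SKETCH_WATCHER_EVENT_XID: return PKT_WATCHER_EVENT;
    case SKETCH_PING_XID:          return PKT_PING;
    case SKETCH_AUTH_XID:          return PKT_AUTH;
    case SKETCH_SET_WATCHES_XID:   return PKT_SET_WATCHES;
    default:                       return PKT_NORMAL_REPLY;
    }
}
```

Ordinary requests use xids that never collide with the reserved negative ones, so everything in the default branch is matched against sent_requests.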

Possible client state sequence: zoo_connecting_state --Prime req--> zoo_associating_state --Prime resp--> zoo_connected_state (or zoo_expired_session_state if the client_id is different) --Auth failed--> zoo_auth_failed_state. The labels on the arrows are messages: Prime req is sent by the client to the server; Prime resp and Auth failed are sent by the server to the client.
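The state sequence above can be written as a small transition function. This is a sketch with hypothetical enum names and values (they are not the client's real constants), and it models only the transitions listed in the text.

```c
#include <assert.h>

/* Hypothetical state and message identifiers for illustration only. */
enum client_state {
    ST_CONNECTING,
    ST_ASSOCIATING,
    ST_CONNECTED,
    ST_EXPIRED_SESSION,
    ST_AUTH_FAILED
};

enum client_msg {
    MSG_PRIME_REQ,  /* client -> server */
    MSG_PRIME_RESP, /* server -> client */
    MSG_AUTH_FAILED /* server -> client */
};

/* Return the state that follows `s` when message `m` is sent or received.
 * `client_id_changed` matters only for the prime response.
 * Any combination not listed in the text leaves the state unchanged. */
enum client_state next_state(enum client_state s, enum client_msg m, int client_id_changed)
{
    switch (s) {
    case ST_CONNECTING:
        if (m == MSG_PRIME_REQ)
            return ST_ASSOCIATING;
        break;
    case ST_ASSOCIATING:
        if (m == MSG_PRIME_RESP)
            return client_id_changed ? ST_EXPIRED_SESSION : ST_CONNECTED;
        break;
    case ST_CONNECTED:
        if (m == MSG_AUTH_FAILED)
            return ST_AUTH_FAILED;
        break;
    default:
        break;
    }
    return s;
}
```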

5. Main functions

zookeeper_interest:

The function first checks whether the connection has been established. If not, it initiates a connection to a zookeeper server, which may put the client in zoo_connecting_state. Once the connection succeeds, the first prime message is sent to the server (prime_connection, sent directly without going through poll) and the client enters zoo_associating_state.

If the connection is already established, the function calculates how long it can still wait for the next packet (recv_to). If this time is less than 0, the request has timed out (calculate_recv_to: more than 2/3 * recv_timeout has passed since the last message was received). If there is no timeout, it calculates when the next ping is due (send_to); if a ping is due (more than 1/3 * recv_timeout since the last ping), a ping packet is queued in zh->to_send. The final poll timeout is the smaller of recv_to and send_to, that is, poll waits until either a packet must have arrived or a ping must be sent. next_deadline is then updated. The event of interest for this IO is zookeeper_read (for example, the server's response to a ping); if zh->to_send holds packets waiting to be sent, zookeeper_write is of interest as well.
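The timeout arithmetic can be sketched as a pure function. The names (struct interest, sketch_interest, idle_recv_ms, idle_send_ms) are hypothetical, all times are in milliseconds, and two details are assumptions: a remaining time of exactly zero is treated as a timeout, and the ping timer restarts at recv_timeout / 3 once a ping has been queued.

```c
#include <assert.h>

struct interest {
    int timed_out;       /* 1 if nothing was received for 2/3 of recv_timeout */
    int send_ping;       /* 1 if a ping should be queued now */
    int poll_timeout_ms; /* how long poll() may sleep */
};

/* idle_recv_ms: time since the last packet was received.
 * idle_send_ms: time since the last packet (or ping) was sent. */
struct interest sketch_interest(int recv_timeout_ms, int idle_recv_ms, int idle_send_ms)
{
    struct interest r = { 0, 0, 0 };

    /* Remaining time before the connection counts as timed out. */
    int recv_to = recv_timeout_ms * 2 / 3 - idle_recv_ms;
    if (recv_to <= 0) {
        r.timed_out = 1;
        return r;
    }

    /* Remaining time before the next ping is due. */
    int send_to = recv_timeout_ms / 3 - idle_send_ms;
    if (send_to <= 0) {
        r.send_ping = 1;
        send_to = recv_timeout_ms / 3; /* assumed: the ping timer restarts */
    }

    /* Sleep until whichever deadline comes first. */
    r.poll_timeout_ms = recv_to < send_to ? recv_to : send_to;
    return r;
}
```

With a recv_timeout of 30000 ms and one second of idle time in both directions, the receive deadline is 19000 ms away and the ping deadline 9000 ms away, so poll sleeps for at most 9000 ms.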

zookeeper_process:

This function first checks the event.

If the client is in zoo_connecting_state and receives a zookeeper_write event, it sends the prime message and enters zoo_associating_state. Otherwise, for a zookeeper_write event with packets waiting in the to_send cache, all packets in to_send are sent; if sending would block, the function returns immediately (flush_send_queue, which can also be given a timeout).

For a zookeeper_read event, the network packet is read into the zh->input_buffer cache. If the packet is the response to the prime, zh->client_id is initialized (it is used for the next reconnection), the state changes to zoo_connected_state, and all auth messages and watch messages are sent to the server (the watch information comes from the active_node_watchers, exist, and child hashtables). Finally, a watch completion of type zoo_session_event is constructed and put into completions_to_process to report that the connection succeeded (queue_session_event). In other words, after the connection is established the client constructs a watch event itself: the callback is zh->watcher, the path is "", and process_completions is called directly to run the callback. If the received message is not the response to the prime, the network packet is put directly into the to_process queue.

Next, all packets in to_process are processed: each packet is parsed and its xid is examined.

If the xid is watcher_event_xid, a watch event has occurred. A watcher_event_xid watch completion is created; its callback functions are determined by the type (such as zoo_session_event, created_event_def, and so on) and the path specified by the server (collectwatchers). The completion is then put into the completions_to_process queue for the completion thread to process.

If the message is a response to set_watches_xid, it is not processed. If it is a response to auth_xid, auth_completion_func is called directly; note that the authentication result is not put into completions_to_process. If authentication fails, the client enters zoo_auth_failed_state and exits. If it is a ping response, the last_recv time is updated.

For any other packet (such as the response to a get), the function first checks whether the response is the one currently being waited for; if not, it is discarded. Otherwise the watch callback can now be put into the corresponding hashtable (activatewatcher: a watch callback is added to its hashtable only after the request has been answered and the check has succeeded). The function then determines whether the request was synchronous or asynchronous. If it was asynchronous (an a* interface), the completion of the request is put into completions_to_process. Otherwise it is handled in this thread (non-a* interfaces have no completion callback): the message is parsed according to completion_type, the result is stored in sync_completion, and notify_sync_completion tells the client that the response has arrived so that the API call can return.

process_completions:

This function is the processing function of the completion thread. It is also called directly by other functions such as queue_session_event, which is how the default zh->watcher callback gets called after a successful connection. It takes the pending completions off the completions_to_process queue and processes them one by one. It first determines whether the completion is for a watcher_event message. If so, it calls all callback functions registered for that watch (deliverwatchers). Otherwise the completion is for some other response packet, and the matching completion callback is called based on completion_type.
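The dispatch described above can be sketched as a loop over a list of completions. Everything here is a hypothetical simplification: the completion kinds, the fixed-size watcher array, and the callback signatures are made up for illustration and are not the client's real types. The two count_ callbacks exist only so the sketch can be exercised.

```c
#include <assert.h>
#include <stddef.h>

enum completion_kind { KIND_WATCHER_EVENT, KIND_STRING_RESULT, KIND_VOID_RESULT };

typedef void (*watcher_fn)(int type, const char *path, void *ctx);
typedef void (*string_completion_fn)(int rc, const char *value, void *data);
typedef void (*void_completion_fn)(int rc, void *data);

#define MAX_WATCHERS 4

struct completion {
    enum completion_kind kind;
    int rc;
    int event_type;
    const char *text; /* path for watch events, value for string results */
    void *data;       /* context handed back to the callback */
    watcher_fn watchers[MAX_WATCHERS];
    int n_watchers;
    string_completion_fn string_cb;
    void_completion_fn void_cb;
    struct completion *next;
};

/* Walk the pending list and invoke the callbacks. Returns the number of calls made. */
int process_pending(struct completion *head)
{
    int calls = 0;
    for (struct completion *c = head; c != NULL; c = c->next) {
        switch (c->kind) {
        case KIND_WATCHER_EVENT:
            /* A watch event fans out to every watcher collected for it. */
            assert(c->n_watchers <= MAX_WATCHERS);
            for (int i = 0; i < c->n_watchers; i++) {
                c->watchers[i](c->event_type, c->text, c->data);
                calls++;
            }
            break;
        case KIND_STRING_RESULT:
            if (c->string_cb != NULL) {
                c->string_cb(c->rc, c->text, c->data);
                calls++;
            }
            break;
        case KIND_VOID_RESULT:
            if (c->void_cb != NULL) {
                c->void_cb(c->rc, c->data);
                calls++;
            }
            break;
        }
    }
    return calls;
}

/* Example callbacks that count invocations through the context pointer. */
void count_watch(int type, const char *path, void *ctx)
{
    (void)type; (void)path;
    ++*(int *)ctx;
}

void count_string(int rc, const char *value, void *data)
{
    (void)rc; (void)value;
    ++*(int *)data;
}
```

The key design point is that one watch event may trigger several callbacks, while an ordinary response triggers exactly one callback chosen by its kind.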
