Heartbeat Communication Layer Structure Analysis
Preface
Heartbeat is the high-availability (HA) software released by the Linux-HA open-source project for critical application environments. Since 1999 it has gone through several release series, such as 1.2.x and 2.0.x. It is widely used in the open-source HA field around the world and is supported by some mainstream Linux distribution vendors.
The communication layer is the most basic underlying support for running cluster software. This article analyzes the heartbeat source code and describes the basic structure and mechanism of the communication layer, together with its main data structures and implementation flow.
All analyses are based on heartbeat 2.0.4.
Related Source: http://www.linux-ha.org/download/heartbeat-2.0.4.tar.gz
Heartbeat Communication Structure Overview
The communication layer has two main parts:
1. HBcomm plugins (communication between processes on different nodes)
Each communication medium is implemented as a plugin that is loaded dynamically through the PILS plugin loading library. Multicast, unicast, serial port, and other media are supported. All plugin modules for inter-node communication are in the lib/plugins/HBcomm/ directory.
2. Unix domain sockets (communication between processes on the same node)
include/clplumbing/ipc.h: data structure definitions of the IPC abstraction layer
lib/clplumbing/ocf_ipc.c: underlying implementation of the IPC abstraction
lib/clplumbing/ipcsocket.c: Unix domain socket implementation of the IPC layer
Intersection:
Inter-node and intra-node communication meet in heartbeat.c, in functions such as read_child() and write_child(), which forward messages between the two.
Heartbeat API
The Heartbeat API is built on the Unix domain socket implementation of the IPC abstraction layer. It serves the application-level communication between heartbeat and its client submodules:
client_lib.c implements the Heartbeat API client.
hb_api.c implements the Heartbeat API server.
http://www.gd-linux.org/liu_attach/4.11forum.jpg
Figure 1: Heartbeat communication structure Overview
The following steps describe how a client submodule sends a message to the same module on another node through the heartbeat communication mechanism.
1. The client submodule sends the message through the FIFO pipe to the FIFO child process, fifo_child. Why use a FIFO? Some processes cannot easily establish a Unix domain IPC channel with the heartbeat master process, for example scripts, cluster management programs, and cluster status query programs.
2. After reading a message from the FIFO with msgfromstream(), the FIFO child process forwards it to the heartbeat master process over the IPC channel established with heartbeat in advance.
3. If the master process determines that the message is addressed to itself, it calls process_msg() to handle it. Otherwise it calls send_to_all_media(), which sends the message to the write_child child process of each medium through that medium's wchan channel.
4. Each write_child child process receives the message from the master process through ipcmsgfromIPC() and calls the write function of its medium's hb_media structure to send the message to the other nodes in the cluster.
5. On the other nodes, the read_child child processes read the message through the read function of their medium's hb_media structure and send it to the heartbeat master process over the IPC channel established with it in advance.
6. The heartbeat master process receives the message through msgfromIPC() and calls process_clustermsg() to handle it. Messages that the master process handles itself go to HBDoMsgCallback(); all others are passed to heartbeat_monitor(), which sends them to the client child processes.
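Step 3 is the point where a message either stays on the local node or fans out to every medium. The decision can be sketched as below. This is a minimal illustration, not code from heartbeat.c: `toy_msg`, `toy_router`, and `route_msg` are hypothetical names, and counters stand in for the real delivery calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_MEDIA 4

/* Hypothetical message: destination node name plus payload. */
struct toy_msg {
    const char *dest;       /* NULL means "every node in the cluster" */
    const char *payload;
};

/* Counters standing in for real delivery; illustration only. */
struct toy_router {
    const char *self;               /* name of the local node */
    int nmedia;                     /* number of configured media */
    int local_deliveries;           /* messages handled locally */
    int queued[TOY_MAX_MEDIA];      /* messages handed to each writer */
};

/* Returns the number of media the message was queued on (0 if local). */
int route_msg(struct toy_router *r, const struct toy_msg *m)
{
    int j;

    assert(r != NULL && m != NULL && r->nmedia <= TOY_MAX_MEDIA);

    if (m->dest != NULL && strcmp(m->dest, r->self) == 0) {
        r->local_deliveries++;      /* plays the role of process_msg() */
        return 0;
    }
    /* Plays the role of send_to_all_media(): one copy per medium. */
    for (j = 0; j < r->nmedia; ++j) {
        r->queued[j]++;
    }
    return r->nmedia;
}
```

Sending one copy per medium is what gives the cluster redundancy: if one link fails, the same message still arrives over the others.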
Inter-node Communication Plugins
The code is in the lib/plugins/HBcomm/ directory.
bcast.c       /* broadcast */
mcast.c       /* multicast */
ucast.c       /* unicast */
openais.c     /* openais */
serial.c      /* serial port */
ping.c        /* ICMP ping */
ping_group.c  /* ping a group of hosts */
hbaping.c     /* Fibre Channel host bus adapter (HBA) ping */
/* Each function pointer in this structure corresponds to a concrete function in a plugin. */
struct hb_media_fns {
    struct hb_media *(*new)(const char *token);               /* create a medium */
    int   (*parse)(const char *options);                      /* read configuration file parameters */
    int   (*open)(struct hb_media *mp);                       /* open */
    int   (*close)(struct hb_media *mp);                      /* close */
    void *(*read)(struct hb_media *mp, int *len);             /* read */
    int   (*write)(struct hb_media *mp, void *msg, int len);  /* write */
    int   (*mtype)(char **buffer);                            /* get the media type */
    int   (*descr)(char **buffer);                            /* get the media description */
    int   (*isping)(void);                                    /* whether this is a ping medium */
};
Where the hb_media_fns functions are called:
new():    add_option() in config.c
parse():  parse_config() in config.c
open():   initialize_heartbeat() in heartbeat.c
close():  initialize_heartbeat() in heartbeat.c
read():   read_child() in heartbeat.c
write():  write_child() in heartbeat.c
mtype():  parse_config() in config.c
descr():  parse_config() in config.c
isping(): called in config.c and hb_api.c
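To make the function-table idea concrete, here is a minimal sketch of a medium that only loops data back to the caller. It is an illustration under stated assumptions: `toy_media`, `toy_media_fns`, and the `loop_*` functions are hypothetical, the table is cut down to four entries, and `write` takes a `const void *` for convenience, unlike the signature shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_media;                       /* forward declaration */

/* Cut-down function table: only open/close/read/write are modelled. */
struct toy_media_fns {
    int   (*open)(struct toy_media *mp);
    int   (*close)(struct toy_media *mp);
    void *(*read)(struct toy_media *mp, int *len);
    int   (*write)(struct toy_media *mp, const void *msg, int len);
};

struct toy_media {
    const char *name;
    const struct toy_media_fns *vf;     /* same role as hb_media.vf */
    char buf[256];                      /* loopback storage */
    int  buflen;                        /* bytes held in buf, 0 if empty */
    int  is_open;
};

static int loop_open(struct toy_media *mp)
{
    mp->is_open = 1;
    mp->buflen = 0;
    return 0;
}

static int loop_close(struct toy_media *mp)
{
    mp->is_open = 0;
    return 0;
}

static void *loop_read(struct toy_media *mp, int *len)
{
    if (!mp->is_open || mp->buflen == 0) {
        return NULL;                    /* nothing to read */
    }
    *len = mp->buflen;
    mp->buflen = 0;
    return mp->buf;
}

static int loop_write(struct toy_media *mp, const void *msg, int len)
{
    if (!mp->is_open || len < 0 || len > (int)sizeof(mp->buf)) {
        return -1;
    }
    memcpy(mp->buf, msg, (size_t)len);
    mp->buflen = len;
    return 0;
}

const struct toy_media_fns loopback_fns = {
    loop_open, loop_close, loop_read, loop_write
};

/* Caller side: this code never names a concrete medium. */
int toy_echo(struct toy_media *mp, const char *text, char *out, int outsz)
{
    int len = 0;
    void *pkt;

    assert(mp != NULL && mp->vf != NULL);
    if (mp->vf->write(mp, text, (int)strlen(text) + 1) != 0) {
        return -1;
    }
    pkt = mp->vf->read(mp, &len);
    if (pkt == NULL || len > outsz) {
        return -1;
    }
    memcpy(out, pkt, (size_t)len);
    return len;
}
```

Because the caller only goes through `mp->vf`, adding a new medium means adding a new table, not changing the caller.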
Intra-node IPC Communication
IPC communication abstraction layer (include/clplumbing/ipc.h)
Overview of the data structures in the IPC abstraction layer:
(Note: an indented entry is a member of the data structure above it)
IPC_AUTH                /* security authentication data structure */
IPC_WAIT_CONNECTION     /* wait-for-connection data structure */
    IPC_WAIT_OPS        /* wait-for-connection function set */
IPC_CHANNEL             /* communication channel data structure */
    IPC_OPS             /* channel function set */
    IPC_QUEUE           /* message queue */
    ipc_bufpool         /* receive buffer pool; its contents are processed into the receive queue */
IPC_MESSAGE             /* IPC message data structure */
    IPC_CHANNEL         /* channel the message belongs to */
SOCKET_MSG_HEAD         /* message header data structure */
There are two main abstract data structures:
/* The server side waits for client connections */
struct IPC_WAIT_CONNECTION {
    int ch_status;                  /* wait conn. status */
    void *ch_private;               /* wait conn. private data */
    IPC_WaitOps *ops;               /* wait conn. function table */
};
/* Structure of an active communication channel */
struct IPC_CHANNEL {
    int ch_status;                  /* channel status */
    pid_t farside_pid;              /* PID of the far side */
    void *ch_private;               /* channel private data (may contain conn. info) */
    IPC_Ops *ops;                   /* channel function set */
    unsigned int msgpad;            /* number of prefix bytes reserved in front of each message */
    unsigned int bytes_remaining;   /* number of bytes not yet sent */
    gboolean should_send_block;     /* whether sends block or are non-blocking */
    /* private: */
    IPC_Queue *send_queue;          /* send queue */
    IPC_Queue *recv_queue;          /* receive queue */
    /* receive buffer pool; its contents are processed into recv_queue */
    struct ipc_bufpool *pool;       /* buffer pool */
    /* send flow control */
    int high_flow_mark;
    int low_flow_mark;
    void *high_flow_userdata;
    void *low_flow_userdata;
    flow_callback_t high_flow_callback;
    flow_callback_t low_flow_callback;
    int conntype;
    char failreason[MAXFAILREASON];
};
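The flow-control fields come in pairs: a mark, a callback, and user data for each of the high and low sides. One plausible reading is a high/low watermark scheme on the send queue, sketched below. This is an assumption about the intent, not the behaviour of ipcsocket.c: `toy_chan`, `toy_enqueue`, `toy_dequeue`, and the exact comparison rules are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*toy_flow_cb)(void *userdata);

struct toy_chan {
    int qlen;               /* messages currently queued for sending */
    int high_flow_mark;     /* congestion starts at or above this */
    int low_flow_mark;      /* congestion ends at or below this */
    int congested;          /* 1 between the high and low callbacks */
    toy_flow_cb high_flow_callback;
    toy_flow_cb low_flow_callback;
    void *high_flow_userdata;
    void *low_flow_userdata;
};

/* Re-evaluate the marks after the queue length changed. */
static void toy_check_flow(struct toy_chan *ch)
{
    if (!ch->congested && ch->qlen >= ch->high_flow_mark) {
        ch->congested = 1;
        if (ch->high_flow_callback != NULL) {
            ch->high_flow_callback(ch->high_flow_userdata);
        }
    } else if (ch->congested && ch->qlen <= ch->low_flow_mark) {
        ch->congested = 0;
        if (ch->low_flow_callback != NULL) {
            ch->low_flow_callback(ch->low_flow_userdata);
        }
    }
}

void toy_enqueue(struct toy_chan *ch)
{
    assert(ch != NULL);
    ch->qlen++;
    toy_check_flow(ch);
}

void toy_dequeue(struct toy_chan *ch)
{
    assert(ch != NULL && ch->qlen > 0);
    ch->qlen--;
    toy_check_flow(ch);
}

/* Example callback: counts how often it fired. */
void toy_count(void *userdata)
{
    ++*(int *)userdata;
}
```

Keeping the two marks apart avoids a callback storm when the queue length hovers around a single threshold.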
Communication through the IPC abstraction layer
Server:
1. Call ipc_wait_conn_constructor() to create the endpoint that waits for connections. On success it returns an IPC_WaitConnection.
2. Wait for client requests with poll/select, then call accept_connection to accept the connection; it returns an IPC_Channel.
Client:
Call ipc_channel_constructor() to connect to the server; it returns an IPC_Channel.
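The server and client sequence above maps onto the usual Unix domain socket calls. The sketch below uses plain POSIX sockets rather than the heartbeat IPC library, so every function name in it (`toy_wait_conn`, `toy_connect`, `toy_roundtrip`) is hypothetical; it only shows the shape of the handshake that the abstraction layer wraps.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill in a Unix domain address; returns -1 if the path does not fit. */
static int toy_addr(struct sockaddr_un *addr, const char *path)
{
    if (strlen(path) >= sizeof(addr->sun_path)) {
        return -1;
    }
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
    return 0;
}

/* Server, step 1: create a listening socket bound to `path`. */
int toy_wait_conn(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    assert(path != NULL);
    if (toy_addr(&addr, path) < 0
        || (fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
        return -1;
    }
    unlink(path);                   /* remove a stale socket file, if any */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0
        || listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Client: connect to the server's path and return the connected fd. */
int toy_connect(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    assert(path != NULL);
    if (toy_addr(&addr, path) < 0
        || (fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
        return -1;
    }
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Both roles in one process: returns the number of bytes the server read. */
int toy_roundtrip(const char *path, const char *text, char *out, size_t outsz)
{
    int lfd = toy_wait_conn(path);
    int cfd = (lfd >= 0) ? toy_connect(path) : -1;
    int sfd = (cfd >= 0) ? accept(lfd, NULL, NULL) : -1;  /* server, step 2 */
    ssize_t n = -1;

    if (sfd >= 0 && write(cfd, text, strlen(text) + 1) > 0) {
        n = read(sfd, out, outsz);
    }
    if (sfd >= 0) close(sfd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) { close(lfd); unlink(path); }
    return (int)n;
}
```

In a real server the `accept()` call would sit behind poll/select on the listening descriptor, as step 2 describes; here the client connects first so that a single process can play both roles.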
Unix domain socket implementation of the IPC abstraction layer
static struct IPC_OPS socket_ops = {
    destroy:                socket_destroy_channel,         /* destroy a channel */
    initiate_connection:    socket_initiate_connection,     /* establish a connection from the client side */
    verify_auth:            socket_verify_auth,             /* verify the client's authentication information */
    assert_auth:            socket_assert_auth,             /* assert authentication (unused) */
    send:                   socket_send,                    /* send a message on the channel */
    recv:                   socket_recv,                    /* receive a message from the channel */
    waitin:                 socket_waitin,                  /* wait for input (a message can then be read) */
    waitout:                socket_waitout,                 /* wait until output has been sent */
    is_message_pending:     socket_is_message_pending,      /* a message is readable, or the channel hung up */
    is_sending_blocked:     socket_is_output_pending,       /* whether output is blocked */
    resume_io:              socket_resume_io,               /* resume all pending IPC operations */
    get_send_select_fd:     socket_get_send_fd,             /* get the file descriptor used for sending */
    get_recv_select_fd:     socket_get_recv_fd,             /* get the file descriptor used for receiving */
    set_send_qlen:          socket_set_send_qlen,           /* set the maximum send queue length */
    set_recv_qlen:          socket_set_recv_qlen,           /* set the maximum receive queue length */
    set_high_flow_callback: socket_set_high_flow_callback,  /* set the high-flow callback */
    set_low_flow_callback:  socket_set_low_flow_callback,   /* set the low-flow callback */
    new_ipcmsg:             socket_new_ipcmsg,              /* return a newly created IPC message */
    get_chan_status:        socket_get_chan_status,         /* return the channel status */
    is_sendq_full:          socket_is_sendq_full,           /* whether the send queue is full */
    is_recvq_full:          socket_is_recvq_full,           /* whether the receive queue is full */
    get_conntype:           socket_get_conntype,            /* return the connection type */
    /* one of IPC_SERVER, IPC_CLIENT, IPC_PEER */
};
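The send and recv entries move whole messages over a byte stream, which is presumably why the channel reserves prefix bytes (msgpad) and the layer defines a message header structure. The sketch below shows one common way to frame messages with a fixed header. It is not the layout of SOCKET_MSG_HEAD: `toy_msg_head`, its fields, and `TOY_MAGIC` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAGIC 0xabcdU           /* arbitrary value for this sketch */

struct toy_msg_head {
    uint32_t magic;                 /* sanity check on the receiving side */
    uint32_t msg_len;               /* number of body bytes that follow */
};

/* Write header + body into `frame`; returns total bytes, -1 if too small. */
int toy_frame(const void *body, uint32_t len,
              unsigned char *frame, size_t framesz)
{
    struct toy_msg_head head;

    assert(body != NULL && frame != NULL);
    if (framesz < sizeof(head) + len) {
        return -1;
    }
    head.magic = TOY_MAGIC;
    head.msg_len = len;
    memcpy(frame, &head, sizeof(head));
    memcpy(frame + sizeof(head), body, len);
    return (int)(sizeof(head) + len);
}

/*
 * Parse one frame from `avail` received bytes. Returns a pointer to the
 * body and sets *len, or NULL if the frame is incomplete or corrupt.
 */
const unsigned char *toy_unframe(const unsigned char *frame, size_t avail,
                                 uint32_t *len)
{
    struct toy_msg_head head;

    if (avail < sizeof(head)) {
        return NULL;                /* header not complete yet */
    }
    memcpy(&head, frame, sizeof(head));
    if (head.magic != TOY_MAGIC) {
        return NULL;                /* not a valid frame */
    }
    if (avail - sizeof(head) < head.msg_len) {
        return NULL;                /* body not complete yet */
    }
    *len = head.msg_len;
    return frame + sizeof(head);
}
```

A receiver built this way can keep appending bytes to its buffer and retry `toy_unframe` until a whole message is available, which fits a non-blocking channel.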
Intersection of Inter-node Plugins and Intra-node Communication
The main implementation code is in heartbeat.c.
Heartbeat communication media structure
struct hb_media {
    void *pd;                           /* private data of the plugin */
    const char *name;                   /* media name */
    char *type;                         /* media type */
    char *description;                  /* media description */
    const struct hb_media_fns *vf;      /* HBcomm media function set */
    IPC_Channel *wchan[2];              /* Unix domain channel pair for the write child process */
    IPC_Channel *rchan[2];              /* Unix domain channel pair for the read child process */
};
/* Heartbeat sends a message to the cluster */
/* 1. The master process sends the message to the write_child child processes */
send_cluster_msg() {                        /* send a message to the cluster */
    ...
    process_outbound_packet() {             /* packet retransmission control */
        send_to_all_media() {               /* send to all media */
            for (j = 0; j < nummedia; ++j) {
                IPC_Channel *wch = sysmedia[j]->wchan[P_WRITEFD];
                ...
                /* send to the write child process of this medium */
                wrc = wch->ops->send(wch, outmsg);
            }
        }
    }
}
/* 2. The write_child child process sends the message to the cluster */
write_child() {
    IPC_Channel *ourchan = mp->wchan[P_READFD];
    for (;;) {
        /* write_child receives the message from heartbeat over the Unix domain socket */
        IPC_Message *ipcmsg = ipcmsgfromIPC(ourchan);   /* calls ops->recv() */
        ...
        /* send to the other nodes in the cluster */
        if (mp->vf->write(mp, ipcmsg->msg_body, ipcmsg->msg_len) != HA_OK) {
            ......
        }
    }
}
/* Heartbeat receives a message from the cluster */
/* 1. The read_child child process receives the message from the cluster */
read_child() {
    IPC_Channel *ourchan = mp->rchan[P_READFD];
    for (;;) {
        /* receive from the HBcomm plugin */
        if ((pkt = mp->vf->read(mp, &pktlen)) == NULL) {
            ......
        }
        if (NULL != imsg) {
            /* read_child sends the message to heartbeat over the Unix domain socket */
            rc = ourchan->ops->send(ourchan, imsg);
            rc2 = ourchan->ops->waitout(ourchan);
            ...
        }
    }
}
/* 2. Heartbeat receives and processes the message from the read_child child process */
s = G_main_add_IPC_Channel(PRI_READPKT
,   sysmedia[j]->rchan[P_WRITEFD], FALSE
,   read_child_dispatch, sysmedia+j, NULL);
read_child_dispatch() {
    ...
    msg = msgfromIPC(source, MSG_NEEDAUTH);     /* calls ops->recv() to read from read_child */
    process_clustermsg(msg, lnk);               /* process the message that was read */
}
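The receive path above is a relay: a child reads a packet from the medium and pushes it through a channel pair to the master. A minimal sketch of one relay step follows, using `socketpair()` in place of the heartbeat channel pair. All names (`toy_media_read`, `toy_forward_one`, `toy_canned_read`, `toy_demo`) are hypothetical, and both ends run in one process purely to keep the example self-contained.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical medium read hook: returns a packet or NULL. */
typedef const void *(*toy_media_read)(int *len);

/* One pass of a read_child-style loop: medium -> channel to the master. */
int toy_forward_one(toy_media_read rd, int chan_fd)
{
    int len = 0;
    const void *pkt;

    assert(rd != NULL);
    pkt = rd(&len);
    if (pkt == NULL || len <= 0) {
        return 0;                           /* nothing arrived */
    }
    return (write(chan_fd, pkt, (size_t)len) == (ssize_t)len) ? len : -1;
}

/* Canned medium for the demonstration: always returns the same packet. */
const void *toy_canned_read(int *len)
{
    static const char pkt[] = "status=active";

    *len = (int)sizeof(pkt);                /* includes the trailing NUL */
    return pkt;
}

/* Child end forwards one packet; master end reads it. Returns bytes read. */
int toy_demo(char *out, size_t outsz)
{
    int chan[2];
    int n = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, chan) < 0) {
        return -1;
    }
    if (toy_forward_one(toy_canned_read, chan[0]) > 0) {
        n = (int)read(chan[1], out, outsz); /* master side of the pair */
    }
    close(chan[0]);
    close(chan[1]);
    return n;
}
```

In heartbeat the two ends live in different processes, so a slow or blocked medium stalls only its own child and not the master's event loop.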
Heartbeat API Server
struct api_query_handler query_handler_list[] = {
    { API_SIGNOFF, api_signoff },               /* client sign-off */
    { API_SETFILTER, api_setfilter },           /* set the message filter */
    { API_SETSIGNAL, api_setsignal },           /* set the signal sent when a message arrives */
    { API_NODELIST, api_nodelist },             /* get the node list */
    { API_NODESTATUS, api_nodestatus },         /* query node status */
    { API_NODETYPE, api_nodetype },             /* query node type */
    { API_IFSTATUS, api_ifstatus },             /* query heartbeat interface status */
    { API_IFLIST, api_iflist },                 /* query the heartbeat interface list */
    { API_CLIENTSTATUS, api_clientstatus },     /* query the status of a client module */
    { API_NUMNODES, api_num_nodes },            /* return the number of normal nodes in the cluster */
    { API_GETPARM, api_get_parameter },         /* return the value of a given parameter */
    { API_GETRESOURCES, api_get_resources },    /* return the resource status (for compatibility with 1.2.x and earlier) */
    { API_GETUUID, api_get_uuid },              /* get a node's UUID */
    { API_GETNAME, api_get_nodename },          /* get a node's name */
    { API_SET_SENDQLEN, api_set_sendqlen }      /* set the send queue length */
};
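A table of name and handler pairs like this one is typically searched by request name and the matching handler is called. The sketch below shows that lookup in isolation. It assumes string request names and a simplified handler signature; `toy_query_handler`, `toy_dispatch`, and the two handlers are hypothetical and do not reproduce hb_api.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*toy_handler)(const char *arg);

struct toy_query_handler {
    const char *queryname;          /* request name sent by the client */
    toy_handler handler;            /* function that answers it */
};

/* Two stand-in handlers with made-up behaviour. */
static int toy_numnodes(const char *arg)
{
    (void)arg;
    return 3;                       /* pretend the cluster has 3 nodes */
}

static int toy_arglen(const char *arg)
{
    return (int)strlen(arg);
}

static const struct toy_query_handler toy_handlers[] = {
    { "numnodes", toy_numnodes },
    { "arglen",   toy_arglen },
};

#define TOY_NHANDLERS (sizeof(toy_handlers) / sizeof(toy_handlers[0]))

/* Returns the handler's result, or -1 if no handler matches. */
int toy_dispatch(const char *queryname, const char *arg)
{
    size_t i;

    assert(queryname != NULL);
    for (i = 0; i < TOY_NHANDLERS; ++i) {
        if (strcmp(queryname, toy_handlers[i].queryname) == 0) {
            return toy_handlers[i].handler(arg);
        }
    }
    return -1;
}
```

A linear search is adequate for a table of this size, and adding a request type only requires one more row.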
Heartbeat API Client
static struct llc_ops heartbeat_ops = {
    signon:                  hb_api_signon,             /* register a new heartbeat client */
    signoff:                 hb_api_signoff,            /* sign off a heartbeat client */
    delete:                  hb_api_delete,             /* destroy the client structure */
    set_msg_callback:        set_msg_callback,          /* set the callback for a given message type */
    set_nstatus_callback:    set_nstatus_callback,      /* set the node status callback */
    set_ifstatus_callback:   set_ifstatus_callback,     /* set the heartbeat interface status callback */
    set_cstatus_callback:    set_cstatus_callback,      /* set the client status callback */
    init_nodewalk:           init_nodewalk,             /* start a node traversal */
    nextnode:                nextnode,                  /* next node */
    end_nodewalk:            end_nodewalk,              /* end the node traversal */
    node_status:             get_nodestatus,            /* current node status */
    node_type:               get_nodetype,              /* node type */
    init_ifwalk:             init_ifwalk,               /* start a heartbeat interface traversal */
    nextif:                  nextif,                    /* next heartbeat interface */
    end_ifwalk:              end_ifwalk,                /* end the heartbeat interface traversal */
    if_status:               get_ifstatus,              /* current heartbeat interface status */
    client_status:           get_clientstatus,          /* current client status */
    get_uuid_by_name:        get_uuid_by_name,          /* get the UUID for a node name */
    get_name_by_uuid:        get_name_by_uuid,          /* get the node name for a UUID */
    sendclustermsg:          sendclustermsg,            /* send a message to all cluster members */
    sendnodemsg:             sendnodemsg,               /* send a message to a specific node */
    sendnodemsg_byuuid:      sendnodemsg_byuuid,        /* send a message to a specific node (by UUID) */
    send_ordered_clustermsg: send_ordered_clustermsg,   /* send an ordered message to the cluster */
    send_ordered_nodemsg:    send_ordered_nodemsg,      /* send an ordered message to a node */
    inputfd:                 get_inputfd,               /* return the fd used to detect message arrival */
    ipcchan:                 get_ipcchan,               /* return the IPC channel as an IPC_Channel */
    msgready:                msgready,                  /* return true if a message is readable */
    setmsgsignal:            hb_api_setsignal,          /* set the message arrival signal */
    rcvmsg:                  rcvmsg,                    /* receive a message and pass it to its callback */
    readmsg:                 read_msg_w_callbacks,      /* return a message that has no registered callback */
    setfmode:                setfmode,                  /* set the filter mode */
    get_parameter:           get_parameter,
    get_deadtime:            get_deadtime,
    get_keepalive:           get_keepalive,
    get_mynodeid:            get_mynodeid,              /* get the local node name */
    get_logfacility:         get_logfacility,           /* suggested logging facility */
    get_resources:           get_resources,             /* get the current distribution of resources */
    chan_is_connected:       chan_is_connected,
    set_sendq_len:           set_sendq_len,             /* set the send queue length */
    set_send_block_mode:     socket_set_send_block_mode,
    errmsg:                  APIError,
};
Note:
The client-side API function set is much larger than the server-side set of query handlers, because some client functions do not need to query the server.
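The three traversal entries in the client table (start, next, end) suggest a cursor-style iteration. The sketch below shows that pattern over a list held on the client side. It is an illustration only: `toy_cluster` and the `toy_*` functions are hypothetical, and how the real client obtains its node list is not covered here.

```c
#include <assert.h>
#include <stddef.h>

struct toy_cluster {
    const char *const *nodes;   /* node names held on the client side */
    size_t nnodes;
    size_t cursor;              /* next index to return */
    int walking;                /* 1 between init and end */
};

/* Start a traversal; fails if one is already in progress. */
int toy_init_nodewalk(struct toy_cluster *c)
{
    assert(c != NULL);
    if (c->walking) {
        return -1;
    }
    c->walking = 1;
    c->cursor = 0;
    return 0;
}

/* Return the next node name, or NULL at the end or outside a traversal. */
const char *toy_nextnode(struct toy_cluster *c)
{
    assert(c != NULL);
    if (!c->walking || c->cursor >= c->nnodes) {
        return NULL;
    }
    return c->nodes[c->cursor++];
}

/* Finish the traversal; fails if none is in progress. */
int toy_end_nodewalk(struct toy_cluster *c)
{
    assert(c != NULL);
    if (!c->walking) {
        return -1;
    }
    c->walking = 0;
    return 0;
}

/* Typical caller: walk every node once and count them. */
size_t toy_count_nodes(struct toy_cluster *c)
{
    size_t n = 0;

    if (toy_init_nodewalk(c) != 0) {
        return 0;
    }
    while (toy_nextnode(c) != NULL) {
        ++n;
    }
    toy_end_nodewalk(c);
    return n;
}
```

Once the list is in client memory, the next and end steps need no exchange with the server, which matches the note above.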