Netlink: a socket-based communication mechanism between the Linux kernel and user space

Source: Internet
Author: User
I needed to add a driver of my own alongside the Linux network card driver. It handles some packets entirely in kernel space (which allows zero-copy interception of network packets) and, for complex packets, copies only the necessary data to user space for processing, because handling them in the kernel would consume too much CPU and could keep the interrupt handler running for too long. This requires a communication mechanism between the kernel and user space. I tried several approaches, none of them ideal, and finally settled on Netlink combined with memory mapping, which solved the problem well. Netlink is a socket-based mechanism for communication between the Linux kernel and user-space applications. In my experience its biggest advantage is that it is bidirectional, which makes it the best choice when the kernel needs to send notifications to user space.


The kernel and user space can communicate in several ways:
Memory mapping. Kernel memory is mapped into the user process. This is the most direct approach and suits bulk data transfer. Its drawback is that it offers little control over the exchange: there is no reliable way to synchronize the kernel and user space, since primitives such as semaphores cannot be shared across the kernel/user boundary. Memory mapping therefore usually has to be paired with a message mechanism that controls when data is read, for example a channel of short messages that makes the reads reliable.
ioctl. A driver can define its own ioctl commands to report kernel state to user space. ioctl handles data synchronization well, so there is no need to worry about conflicting access between the kernel and user space, but it is not suited to transferring large amounts of data; combining it with memory mapping handles bulk exchange well. However, an ioctl call can only be initiated from user space, so it is awkward when the kernel needs to send a notification to user space: the user program may have to poll by calling ioctl repeatedly.
Other mechanisms. System calls must also be initiated from user space. The proc filesystem is neither reliable nor real-time, although it is well suited to exposing debugging output.
My project needed notifications that originate in the kernel, with a user-space program waiting for them in a blocking call. This model wastes the least CPU time on scheduling while still handling messages promptly, so I chose Netlink for the communication.
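The first approach, a shared mapping steered by short control messages, can be sketched as follows. This is only an illustration under stated assumptions: `SLOT_SIZE`, `SLOT_COUNT`, `struct range_notify` and `read_notified_slots` are hypothetical names, and a plain array stands in for the memory-mapped kernel buffer. The control message names a range of slots that are ready, and the reader copies only that range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE  64u   /* hypothetical size of one packet slot */
#define SLOT_COUNT 8u    /* hypothetical number of slots in the shared area */

/* Hypothetical control message: "slots start_id..end_id are ready". */
struct range_notify {
    uint32_t start_id;
    uint32_t end_id;     /* inclusive */
};

/*
 * Copy the slots named by the notification out of the shared area.
 * 'shared' stands in for the mmap()ed kernel buffer. Returns the number
 * of slots copied, or -1 if the range is invalid or 'out' is too small.
 */
int read_notified_slots(const uint8_t *shared,
                        const struct range_notify *msg,
                        uint8_t *out, size_t out_len)
{
    assert(shared != NULL && msg != NULL && out != NULL);

    if (msg->start_id > msg->end_id || msg->end_id >= SLOT_COUNT)
        return -1;

    uint32_t count = msg->end_id - msg->start_id + 1;
    if ((size_t)count * SLOT_SIZE > out_len)
        return -1;

    memcpy(out, shared + (size_t)msg->start_id * SLOT_SIZE,
           (size_t)count * SLOT_SIZE);
    return (int)count;
}
```

The reader never scans the shared area on its own; it touches only what a message has announced, which is what makes the combination reliable.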
The Netlink communication model closely resembles ordinary socket communication. The main points are:
1, Netlink uses its own address format, struct sockaddr_nl;
2, every message sent over Netlink must carry Netlink's own message header, struct nlmsghdr;
3, the kernel-side Netlink API is completely different from the user-space one. The user-space operations introduced below use the ordinary socket functions, so they are simple and convenient for anyone with a TCP/UDP socket programming background.

Netlink communication address and protocol

All socket communication needs an address structure, and Netlink is no exception. The most familiar one is the IPv4 address; the Netlink address structure is as follows:
    struct sockaddr_nl {
        sa_family_t    nl_family;  /* must be AF_NETLINK (or PF_NETLINK) */
        unsigned short nl_pad;     /* must be 0 */
        __u32          nl_pid;     /* communication port */
        __u32          nl_groups;  /* multicast group mask */
    };
Of these fields, the most important are nl_family (the counterpart of AF_INET in IP communication) and nl_pid.

nl_pid is the agreed communication port. A user-space socket uses a non-zero number; normally the process ID of the application is used directly, but any number that is not already in use on the system will do. The kernel's address always has nl_pid set to 0, so when a user-space program sends a Netlink message to the kernel with sendto, nl_pid in the peer address must be 0.


nl_groups is a multicast mask used to deliver one message to several receivers at once. This article does not cover multicast.
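The address rules above can be captured in a few small helpers. This is a minimal sketch: the function names `nl_addr_init`, `nl_addr_local` and `nl_addr_kernel` are made up for illustration and are not part of any Netlink API.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/netlink.h>

/* Fill a netlink address: pid is the port, groups the multicast mask. */
void nl_addr_init(struct sockaddr_nl *addr, unsigned int pid,
                  unsigned int groups)
{
    assert(addr != NULL);
    memset(addr, 0, sizeof(*addr));   /* also clears nl_pad */
    addr->nl_family = AF_NETLINK;
    addr->nl_pid    = pid;
    addr->nl_groups = groups;
}

/* Local (user-space) address: the process ID is the usual port choice. */
void nl_addr_local(struct sockaddr_nl *addr)
{
    nl_addr_init(addr, (unsigned int)getpid(), 0);
}

/* Peer address of the kernel: the port is always 0. */
void nl_addr_kernel(struct sockaddr_nl *addr)
{
    nl_addr_init(addr, 0, 0);
}
```

The local address would be passed to bind, and the kernel address to sendto as the peer address.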


In essence, nl_pid is the Netlink mailing address. Besides the address, Netlink uses a protocol number to identify the communicating entity, and that number must be specified when the socket is created. Each protocol number represents one kind of application; user-space programs can use the protocols the kernel already defines to obtain information the kernel provides. The protocols defined by the kernel are:
    #define NETLINK_ROUTE           0   /* Routing/device hook */
    #define NETLINK_UNUSED          1   /* Unused number */
    #define NETLINK_USERSOCK        2   /* Reserved for user mode socket protocols */
    #define NETLINK_FIREWALL        3   /* Firewalling hook */
    #define NETLINK_INET_DIAG       4   /* INET socket monitoring */
    #define NETLINK_NFLOG           5   /* netfilter/iptables ULOG */
    #define NETLINK_XFRM            6   /* ipsec */
    #define NETLINK_SELINUX         7   /* SELinux event notifications */
    #define NETLINK_ISCSI           8   /* Open-iSCSI */
    #define NETLINK_AUDIT           9   /* auditing */
    #define NETLINK_FIB_LOOKUP      10
    #define NETLINK_CONNECTOR       11
    #define NETLINK_NETFILTER       12  /* netfilter subsystem */
    #define NETLINK_IP6_FW          13
    #define NETLINK_DNRTMSG         14  /* DECnet routing messages */
    #define NETLINK_KOBJECT_UEVENT  15  /* Kernel messages to userspace */
    #define NETLINK_GENERIC         16
    /* leave room for NETLINK_DM (DM Events) */
    #define NETLINK_SCSITRANSPORT   18  /* SCSI Transports */
    #define NETLINK_ECRYPTFS        19

The purpose of the protocol number is easy to see: for example, a user-space application can simply open a socket with the NETLINK_ROUTE protocol to obtain the kernel's routing information. I needed a communication protocol of my own, so I defined a new protocol number. A new number must not conflict with one the kernel already defines and must be less than the MAX_LINKS macro (MAX_LINKS = 32), so I chose 30.
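One way to make these two constraints explicit is to check them at compile time. The sketch below assumes a hypothetical macro name `NETLINK_MYPROTO` for the value 30 chosen above, and `nl_protocol_in_range` is an illustrative helper. Newer kernel headers define a few more protocol numbers after 19, so the collision check is worth repeating against the linux/netlink.h of the target kernel.

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/netlink.h>

#ifndef MAX_LINKS
#define MAX_LINKS 32              /* value quoted in the text */
#endif

#define NETLINK_MYPROTO 30        /* hypothetical name for the custom protocol */

/* Reject a bad choice at compile time rather than at socket() time. */
static_assert(NETLINK_MYPROTO > NETLINK_ECRYPTFS,
              "collides with a protocol from the kernel's list");
static_assert(NETLINK_MYPROTO < MAX_LINKS,
              "protocol number out of range");

/* Runtime variant of the range check: valid numbers are 0..MAX_LINKS-1. */
int nl_protocol_in_range(int protocol)
{
    return protocol >= 0 && protocol < MAX_LINKS;
}
```

The same macro would then be passed as the third argument of socket in user space and used when the kernel side registers the protocol.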


Summary: Netlink builds its address system from the protocol number plus the communication port.
Operating a Netlink socket in user space

Creating a Netlink socket in user space uses exactly the same API as other sockets, with two differences:
1, Netlink has its own address format;
2, every message received over Netlink carries Netlink's own message header.


Creating and destroying a socket in user space:
1, Create it with the socket function: socket(PF_NETLINK, SOCK_DGRAM, NETLINK_XXX). The first parameter must be PF_NETLINK or AF_NETLINK; the second may be either SOCK_DGRAM or SOCK_RAW; the third is the Netlink protocol number.
2, Bind your own address with the bind function.
3, Close the socket with close.
Code sample for creating a socket:

    /* needs <sys/socket.h>, <linux/netlink.h>, <unistd.h>, <fcntl.h> */
    {
        struct sockaddr_nl addr;
        int flags;

        // create the netlink socket
        s_nlm_socket = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_XXX);
        if (s_nlm_socket < 0)
        {
            use_dbg_out("Create netlink socket error.\r\n");
            goto err_exit;
        }

        // bind our own address
        addr.nl_family = PF_NETLINK;
        addr.nl_pad    = 0;
        addr.nl_pid    = getpid();
        addr.nl_groups = 0;

        if (bind(s_nlm_socket, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        {
            use_dbg_out("bind socket error.\r\n");
            goto err_exit;
        }

        // set the socket to non-blocking mode
        flags = fcntl(s_nlm_socket, F_GETFL, 0);
        fcntl(s_nlm_socket, F_SETFL, flags | O_NONBLOCK);

        return 0;
    err_exit:
        return -1;
    }

User-space API for sending and receiving messages:
User space sends Netlink messages to the kernel with sendto and receives them with recvfrom. The one thing to note is that every message sent or received must be preceded by a Netlink message header. For example, define a message structure like this:
    struct tag_rcv_buf
    {
        struct nlmsghdr  hdr;     // netlink message header
        netlink_notify_s my_msg;  // application message body
    } st_snd_buf;
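Filling in such a structure can be sketched as below. The article does not show the definition of netlink_notify_s, so `struct my_notify` is a stand-in whose field types are assumed, and `struct my_nl_msg` and `my_nl_msg_init` are illustrative names. The `NLMSG_LENGTH` macro from linux/netlink.h computes the header length plus the payload length, which is the value nlmsg_len expects.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Stand-in for the article's netlink_notify_s; field types are assumed. */
struct my_notify {
    uint32_t start_pack_id;
    uint32_t end_pack_id;
};

struct my_nl_msg {
    struct nlmsghdr  hdr;   /* netlink message header */
    struct my_notify body;  /* application message body */
};

/* Prepare a message for sending; 'port' is the sender's nl_pid. */
void my_nl_msg_init(struct my_nl_msg *msg, uint16_t type, uint32_t port,
                    uint32_t start_id, uint32_t end_id)
{
    assert(msg != NULL);
    memset(msg, 0, sizeof(*msg));
    msg->hdr.nlmsg_len   = NLMSG_LENGTH(sizeof(msg->body)); /* header + payload */
    msg->hdr.nlmsg_type  = type;   /* application-defined message type */
    msg->hdr.nlmsg_flags = 0;      /* no additional flags */
    msg->hdr.nlmsg_pid   = port;   /* sender's port */
    msg->body.start_pack_id = start_id;
    msg->body.end_pack_id   = end_id;
}
```

With this layout the payload sits directly after the aligned header, so `NLMSG_LENGTH(sizeof(body))` and `sizeof(struct my_nl_msg)` agree. For a payload whose size or alignment introduces padding the two can differ, which is a reason to prefer the macro.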
Example of sending code:

    void my_send_msg(int s_id, int e_id)
    {
        struct tag_rcv_buf
        {
            struct nlmsghdr  hdr;     // netlink message header
            netlink_notify_s my_msg;  // application message body
        } st_snd_buf;
        fd_set st_write_set;                      // fd set for select, avoids hanging the thread
        struct timeval write_time_out = {10, 0};  // 10-second timeout
        int ret;

        // set up select
        FD_ZERO(&st_write_set);
        FD_SET(s_nlm_socket, &st_write_set);

        /* fill in the data to send */
        st_snd_buf.hdr.nlmsg_len   = sizeof(st_snd_buf);  // or NLMSG_LENGTH(sizeof(netlink_notify_s)); the macro adds the header length
        st_snd_buf.hdr.nlmsg_flags = 0;         /* additional message flags, unused */
        st_snd_buf.hdr.nlmsg_type  = 0;         /* custom message type */
        st_snd_buf.hdr.nlmsg_pid   = getpid();  /* sender's pid */

        st_snd_buf.my_msg.start_pack_id = s_id;
        st_snd_buf.my_msg.end_pack_id   = e_id;

        ret = select(s_nlm_socket + 1, NULL, &st_write_set, NULL, &write_time_out);
        if (ret == -1)
        {
            // select failed
            use_dbg_out("send has some error %d.\n", errno);
            goto out;
        }
        else if (ret == 0)
        {
            // timed out
            tmp_dbg_out("send timeout.\n");
            goto out;
        }
        else
        {
            // socket is writable: send the message to the kernel
            ret = sendto(s_nlm_socket, &st_snd_buf, sizeof(st_snd_buf), 0,
                         (struct sockaddr *)&s_peer_addr, sizeof(s_peer_addr));
            if (ret < 0)
            {
                use_dbg_out("send to kernel by nl error %d\r\n", errno);
            }
            else
            {
                tmp_dbg_out("send to kernel ok s_id is %d, e_id is %d.\r\n", s_id, e_id);
            }
        }

    out:
        return;
    }
Code example for receiving data:

    {
        struct tag_rcv_buf
        {
            struct nlmsghdr  hdr;     // netlink message header
            netlink_notify_s my_msg;  // application message body
        } st_rcv_buf;
        int ret, io_ret;
        socklen_t addr_len;                      // recvfrom expects a socklen_t *
        struct sockaddr_nl st_peer_addr;
        fd_set st_read_set;                      // fd set for select, avoids hanging the thread
        struct timeval read_time_out = {10, 0};  // 10-second timeout
        int rcv_buf;

        // set the communication address of the kernel
        st_peer_addr.nl_family = AF_NETLINK;
        st_peer_addr.nl_pad    = 0;   /* always set to zero */
        st_peer_addr.nl_pid    = 0;   /* kernel's pid is zero */
        st_peer_addr.nl_groups = 0;   /* multicast groups mask, zero for unicast */
        addr_len = sizeof(st_peer_addr);

        // set up select
        FD_ZERO(&st_read_set);
        FD_SET(s_nlm_socket, &st_read_set);

        ret = select(s_nlm_socket + 1, &st_read_set, NULL, NULL, &read_time_out);
        if (ret == -1)
        {
            // select failed
            use_dbg_out("select rcv some error %d", errno);
            goto err;
        }
        else if (ret == 0)
        {
            // timed out
            tmp_dbg_out("rcv timeout.\n");
            *p_size = 0;
            goto out;
        }
        else
        {
            // socket is readable: receive the message
            ret = recvfrom(s_nlm_socket, &st_rcv_buf, sizeof(st_rcv_buf), 0,
                           (struct sockaddr *)&st_peer_addr, &addr_len);
        }

        if (ret == sizeof(st_rcv_buf))
        {
            // received the message ...
        }
        else
        {
            use_dbg_out("Rcv msg have some err. ret is %d, errno is ...
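The receive code compares the return value of recvfrom with the size of the whole structure. A slightly more defensive variant, sketched here as a suggestion rather than as the author's method, also checks the header with the `NLMSG_OK` macro and extracts the payload with `NLMSG_DATA`, both from linux/netlink.h. `struct my_notify` again stands in for netlink_notify_s with assumed field types, and `my_notify_parse` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Stand-in for the article's netlink_notify_s; field types are assumed. */
struct my_notify {
    uint32_t start_pack_id;
    uint32_t end_pack_id;
};

/*
 * Validate a buffer returned by recvfrom() and copy out the payload.
 * 'len' is the recvfrom() return value. Returns 0 on success, -1 if the
 * buffer does not hold one complete message with a my_notify payload.
 */
int my_notify_parse(void *buf, int len, struct my_notify *out)
{
    struct nlmsghdr *hdr = buf;

    assert(buf != NULL && out != NULL);

    if (len <= 0 || !NLMSG_OK(hdr, (unsigned int)len))
        return -1;   /* truncated datagram or corrupt header */
    if (hdr->nlmsg_len < NLMSG_LENGTH(sizeof(*out)))
        return -1;   /* payload shorter than expected */

    memcpy(out, NLMSG_DATA(hdr), sizeof(*out));
    return 0;
}
```

The caller would pass the receive buffer and the recvfrom return value, and treat -1 the same way as the error branch of the receive code.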
