Developing and maintaining the kernel is a complex task, so only code that is essential or closely tied to system performance is placed in the kernel. Other code, such as the GUI, administration and control logic, is generally implemented as user-space programs. On Linux it is common to split a feature into a part implemented in the kernel and a part implemented in user space (for example, the Linux firewall is divided into netfilter in the kernel and iptables in user space). How, then, do kernel code and user-space programs communicate? The answer is a variety of IPC (interprocess communication) mechanisms between user space and the kernel, such as system calls, ioctl, the proc file system and netlink sockets. This article discusses netlink sockets and shows the advantages of an IPC mechanism built on a network-style communication interface.
Introduction:
A netlink socket is a special IPC mechanism for transferring data between the kernel and user-space processes. It provides a full-duplex communication link by offering a special set of APIs to kernel modules and the standard socket interface to user programs. Just as TCP/IP sockets use the address family AF_INET, netlink sockets use the address family AF_NETLINK. Each netlink socket feature defines its own protocol type in the kernel header file include/linux/netlink.h.
The following is a subset of the features that currently use netlink sockets, with the protocol types they use:
NETLINK_ROUTE: A communication channel between user-space routing daemons, such as BGP, OSPF and RIP, and the kernel packet forwarding module. User-space routing daemons update the kernel routing table through this protocol type.
NETLINK_FIREWALL: Receives packets sent by the IPv4 firewall code.
NETLINK_NFLOG: A communication channel between the user-space iptables management tool and the netfilter module in the kernel.
NETLINK_ARPD: Used to manage the ARP table in the kernel from user space.
Why do the features above use netlink rather than system calls, ioctls or the proc file system to communicate between user programs and the kernel? Because adding a system call, an ioctl or a proc file for a new feature is not trivial: we risk polluting the kernel code and damaging the stability of the system. A netlink socket, by contrast, is simple: you only need to add a constant to netlink.h to identify your protocol type, and then the kernel module and the user program can communicate right away through the socket-style API.
Netlink is asynchronous. Like other socket APIs, it provides a socket queue to buffer and smooth out bursts of messages. The system call that sends a netlink message adds the message to the receiver's queue and then invokes the receiver's receive handler. Within the handler, the receiver can decide whether to process the message immediately or leave it in the queue and process it later in another context (because we want the receive handler to return as quickly as possible). A system call, in contrast, requires synchronous processing, so if we use a system call to pass a message from user space to the kernel and the message takes a long time to process, the granularity of kernel scheduling may suffer.
The code that implements a system call is statically linked into the kernel at compile time, so it is not appropriate to implement a system call in a loadable module, which is how most device drivers are built. With netlink sockets, there is no compile-time dependency between the netlink core of the Linux kernel and the netlink code in a loadable module.
Another advantage of netlink over system calls, ioctls and the proc file system is that it supports multicast. A process can send a message to a netlink group address, and any number of other processes can listen on that group address and receive the message. This provides a near-perfect mechanism for distributing events from the kernel to user space.
System calls and ioctls are simplex IPC mechanisms: a session can only be initiated by a user-space program. But what if the kernel has an urgent message for a user-space program? There is no way to deliver it directly with these mechanisms. Typically, applications poll the kernel periodically for state changes, but high-frequency polling inevitably increases the load on the system. Netlink solves this problem by allowing the kernel to initiate a session as well. We call this the duplex characteristic of netlink sockets.
Finally, netlink sockets offer a BSD-socket-style API that developers already know well, so the training cost is lower than for the more arcane system call and ioctl APIs.
The relationship with BSD routing sockets
The BSD TCP/IP stack implementation has a special socket called the routing socket. Its address family is AF_ROUTE, its protocol family is PF_ROUTE and its socket type is SOCK_RAW. User-space processes use the routing socket to add or remove routes in the kernel routing table. On Linux, the netlink socket protocol type NETLINK_ROUTE provides the same functionality as the routing socket. In fact, netlink sockets provide a superset of what the BSD routing socket offers.
API for NetLink Socket
The standard socket API functions socket(), sendmsg(), recvmsg() and close() can be called directly by user-space programs to access netlink sockets. Consult the man pages for the detailed definitions of these functions. This article only discusses how to choose their parameters in the context of netlink sockets. The API should be familiar to anyone who has written a simple network program using TCP/IP sockets.
To create a socket, call socket():
int socket(int domain, int type, int protocol);
The socket domain (address family) is AF_NETLINK, and the socket type is either SOCK_RAW or SOCK_DGRAM, because netlink is a datagram-oriented service.
The protocol argument selects the netlink feature the socket is used for. Some of the predefined netlink protocol types are NETLINK_ROUTE, NETLINK_FIREWALL, NETLINK_ARPD, NETLINK_ROUTE6 and NETLINK_IP6_FW.
You can also easily add your own protocol type to netlink.h.
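As a minimal sketch of the parameter choices described above, the following user-space helper opens a netlink socket. The name open_netlink_socket is hypothetical and chosen only for illustration; the call itself is the standard socket() function.

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: open a netlink socket for one protocol type.
 * Returns the descriptor, or -1 with errno set (for example when the
 * running kernel does not support the requested protocol). */
int open_netlink_socket(int protocol)
{
    assert(protocol >= 0);
    /* For AF_NETLINK, both SOCK_RAW and SOCK_DGRAM are accepted. */
    return socket(AF_NETLINK, SOCK_RAW, protocol);
}
```

A caller would write, for example, fd = open_netlink_socket(NETLINK_ROUTE) and check the result for -1 before using the descriptor.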
Each netlink protocol type can define up to 32 multicast groups. Each group is represented by one bit, 1 << i, where 0 <= i <= 31.
This is useful when a group of user-space processes and a kernel component cooperate to implement a single feature: sending multicast netlink messages reduces the number of system calls and spares applications the overhead of maintaining multicast group membership themselves.
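The bit-per-group scheme can be sketched as follows. The helper names nl_group_bit and nl_group_mask are assumptions made for this illustration and are not part of the netlink API.

```c
#include <assert.h>
#include <stdint.h>

/* Bit that represents multicast group i (0 <= i <= 31). */
uint32_t nl_group_bit(unsigned int i)
{
    assert(i <= 31);
    return UINT32_C(1) << i;
}

/* Combine several group indexes into one mask with bitwise OR. */
uint32_t nl_group_mask(const unsigned int *indexes, unsigned int count)
{
    uint32_t mask = 0;
    for (unsigned int n = 0; n < count; n++)
        mask |= nl_group_bit(indexes[n]);
    return mask;
}
```

For groups 0 and 3, the mask is (1 << 0) | (1 << 3), that is 0x9.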
The bind() function
As with a TCP/IP socket, the netlink bind() function associates a local (source) socket address with an open socket. The netlink address structure is as follows:
struct sockaddr_nl
{
    sa_family_t    nl_family;  /* AF_NETLINK */
    unsigned short nl_pad;     /* zero */
    __u32          nl_pid;     /* process pid */
    __u32          nl_groups;  /* multicast groups mask */
} nladdr;
When this structure is passed to bind(), the nl_pid field of sockaddr_nl can be set to the PID of the calling process. nl_pid serves as the local address of the netlink socket, and the application is responsible for picking a unique 32-bit integer for it.
nl_pid formula 1: nl_pid = getpid();
Formula 1 uses the process ID as nl_pid. This is a natural choice if the process needs only one netlink socket of the given protocol type.
If several threads of a process each want their own netlink socket of the same protocol type, formula 2 can be used to generate an nl_pid value for each thread's socket.
nl_pid formula 2: nl_pid = pthread_self() << 16 | getpid();
In this way, different threads of the same process can each have their own netlink socket of the same protocol type. In fact, even a single thread may want to create several netlink sockets of the same protocol type. Developers then need to be more creative to generate distinct nl_pid values, which this article does not discuss further.
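The two formulas can be sketched as small helpers. The function names are hypothetical, and thread_tag is an assumed stand-in for pthread_self(): pthread_t is an opaque type, so this sketch takes a plain integer that the caller derives per thread.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Formula 1: one netlink socket of a given protocol type per process. */
uint32_t nl_pid_formula1(void)
{
    return (uint32_t)getpid();
}

/* Formula 2: a per-thread tag in the upper bits, the PID in the lower bits. */
uint32_t nl_pid_formula2(uint32_t thread_tag, pid_t pid)
{
    assert(pid > 0);
    return (thread_tag << 16) | (uint32_t)pid;
}
```

Note that the two fields overlap if the PID needs more than 16 bits, so formula 2 only yields distinct values as long as the chosen tags and the PID do not collide after the shift.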
If the application wants to receive netlink messages of the protocol type that are addressed to specific multicast groups, the bits of all the groups of interest should be ORed together to form the nl_groups field of sockaddr_nl. Otherwise, nl_groups should be set to zero so that the application receives only unicast netlink messages addressed to it. After filling in nladdr, do the bind as follows:
bind(fd, (struct sockaddr *)&nladdr, sizeof(nladdr));
Sending a netlink message
To send a netlink message to the kernel or to other user-space processes, another struct sockaddr_nl nladdr must be supplied as the destination address, just as when sending a UDP packet with sendmsg(). If the message is destined for the kernel, both nl_pid and nl_groups should be set to 0. If the message is a unicast message destined for another process, nl_pid should be the PID of the receiving process (assuming formula 1 is used in the system) and nl_groups should be 0. If the message is a multicast message destined for one or more multicast groups, the bits of all destination groups should be ORed together to form nl_groups. We can then supply the netlink address to the struct msghdr msg used by sendmsg():
struct msghdr msg;
msg.msg_name = (void *)&nladdr;
msg.msg_namelen = sizeof(nladdr);
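The three destination cases described above (kernel, unicast, multicast) differ only in the values of nl_pid and nl_groups. A minimal sketch, with the hypothetical helper names nl_fill_dest and nl_attach_dest:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Fill a destination address.
 *   kernel:    pid = 0, groups = 0
 *   unicast:   pid = receiver's nl_pid, groups = 0
 *   multicast: groups = bitwise OR of the destination group bits */
void nl_fill_dest(struct sockaddr_nl *dst, uint32_t pid, uint32_t groups)
{
    assert(dst != NULL);
    memset(dst, 0, sizeof(*dst));   /* also zeroes nl_pad */
    dst->nl_family = AF_NETLINK;
    dst->nl_pid = pid;
    dst->nl_groups = groups;
}

/* Attach the destination address to a msghdr for sendmsg(). */
void nl_attach_dest(struct msghdr *msg, struct sockaddr_nl *dst)
{
    memset(msg, 0, sizeof(*msg));
    msg->msg_name = dst;
    msg->msg_namelen = sizeof(*dst);
}
```

Zeroing the whole structure first avoids passing uninitialized padding to the kernel.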
A netlink message also needs its own message header, which provides a common ground for netlink messages of all protocol types. Because the netlink core in the Linux kernel assumes that the following header is present in every netlink message, the application must supply it in every message it sends:
struct nlmsghdr
{
    __u32 nlmsg_len;   /* length of message */
    __u16 nlmsg_type;  /* message type */
    __u16 nlmsg_flags; /* additional flags */
    __u32 nlmsg_seq;   /* sequence number */
    __u32 nlmsg_pid;   /* sending process pid */
};
nlmsg_len must be filled in with the total length of the netlink message, including the header; it is required by the netlink core. nlmsg_type is for the application's use and is opaque to the netlink core. nlmsg_flags gives additional control over the message; it is read and updated by the netlink core. nlmsg_seq and nlmsg_pid are also opaque to the netlink core; applications use them to track messages. A netlink message thus consists of an nlmsghdr followed by the message payload. Once the message has been filled in, it sits in a buffer pointed to by nlh. We can then attach the message to the struct msghdr msg as well:
struct iovec iov;
iov.iov_base = (void *)nlh;
iov.iov_len = nlh->nlmsg_len;
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
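How the buffer behind nlh might be built is sketched below, using the NLMSG_SPACE(), NLMSG_LENGTH() and NLMSG_DATA() macros from netlink.h. The helper names nl_build_message and nl_attach_message are hypothetical, and filling nlmsg_pid with getpid() assumes formula 1.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>
#include <linux/netlink.h>

/* Allocate and fill one netlink message; the caller frees it with free(). */
struct nlmsghdr *nl_build_message(const void *payload, size_t payload_len,
                                  uint16_t type, uint32_t seq)
{
    assert(payload != NULL || payload_len == 0);

    /* NLMSG_SPACE() is the aligned size of header plus payload. */
    struct nlmsghdr *nlh = calloc(1, NLMSG_SPACE(payload_len));
    if (nlh == NULL)
        return NULL;

    nlh->nlmsg_len = (uint32_t)NLMSG_LENGTH(payload_len); /* header + payload */
    nlh->nlmsg_type = type;
    nlh->nlmsg_flags = 0;
    nlh->nlmsg_seq = seq;
    nlh->nlmsg_pid = (uint32_t)getpid();   /* assumes formula 1 */
    if (payload_len > 0)
        memcpy(NLMSG_DATA(nlh), payload, payload_len);
    return nlh;
}

/* Point the msghdr at the message through a single iovec. */
void nl_attach_message(struct msghdr *msg, struct iovec *iov,
                       struct nlmsghdr *nlh)
{
    iov->iov_base = nlh;
    iov->iov_len = nlh->nlmsg_len;
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
}
```

The buffer is allocated with NLMSG_SPACE() so that the payload is padded to the netlink alignment, while nlmsg_len records the unpadded length given by NLMSG_LENGTH().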
After these steps, a single call to sendmsg() sends the netlink message:
sendmsg(fd, &msg, 0);
Receiving a netlink message
The receiving application needs to allocate a buffer large enough to hold the netlink message header and the message payload. It then fills in the struct msghdr msg as shown below and uses the standard recvmsg() to receive the netlink message, assuming the buffer is pointed to by nlh:
struct sockaddr_nl nladdr;
struct msghdr msg;
struct iovec iov;

iov.iov_base = (void *)nlh;
iov.iov_len = MAX_NL_MSG_LEN;
msg.msg_name = (void *)&nladdr;
msg.msg_namelen = sizeof(nladdr);

msg.msg_iov = &iov;
msg.msg_iovlen = 1;
recvmsg(fd, &msg, 0);
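Once recvmsg() has returned, the buffer can be inspected with the NLMSG_OK(), NLMSG_NEXT() and NLMSG_DATA() macros from netlink.h. The following sketch walks a received buffer; the helper name nl_count_messages is hypothetical, and len stands for the byte count that recvmsg() returned.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Count the well-formed netlink messages in buf[0..len) and, if
 * first_payload is not NULL, store the payload pointer of the first one. */
int nl_count_messages(void *buf, size_t len, void **first_payload)
{
    assert(buf != NULL);

    struct nlmsghdr *nlh = buf;
    int remaining = (int)len;   /* NLMSG_NEXT() may drive this below zero */
    int count = 0;

    for (; NLMSG_OK(nlh, remaining); nlh = NLMSG_NEXT(nlh, remaining)) {
        if (count == 0 && first_payload != NULL)
            *first_payload = NLMSG_DATA(nlh);
        count++;
    }
    return count;
}
```

NLMSG_OK() rejects a header whose nlmsg_len is shorter than the header itself or longer than the remaining bytes, so a truncated buffer ends the loop instead of being read past its end.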
After the message has been received correctly, nlh points to the header of the netlink message just received, and nladdr holds the address information carried with the received message, which consists of a PID and the multicast groups the message was sent to. The macro NLMSG_DATA(nlh), defined in netlink.h, returns a pointer to the payload of the netlink message. A call to close(fd) closes the netlink socket identified by the file descriptor fd.
Kernel-space netlink API
The kernel-space netlink API is supported by the netlink core in the kernel and implemented in net/core/af_netlink.c. From the kernel side, the API differs from the user-space API. Kernel modules use it to access netlink sockets and to communicate with user-space programs. Unless you use one of the predefined netlink protocol types, you need to add a constant for your own protocol type to netlink.h. For example, we can add a test protocol type by inserting the following line into netlink.h:
#define NETLINK_TEST 17
You can then refer to this protocol type anywhere in the Linux kernel.
In user space, we call socket() to create a netlink socket, but in kernel space we call the following API:
struct sock *netlink_kernel_create(int unit, void (*input)(struct sock *sk, int len));
The parameter unit is the netlink protocol type, such as NETLINK_TEST. The function pointer input is a callback that is invoked when a message arrives at the netlink socket. After the kernel has created a netlink socket of type NETLINK_TEST, whenever a user program sends a NETLINK_TEST netlink message to the kernel, the callback input() registered through netlink_kernel_create() is invoked. The following is an example implementation of the callback function input():
void input(struct sock *sk, int len)
{
    struct sk_buff *skb;
    struct nlmsghdr *nlh = NULL;
    u8 *payload = NULL;

    while ((skb = skb_dequeue(&sk->receive_queue)) != NULL) {
        /* process the netlink message pointed to by skb->data */
        nlh = (struct nlmsghdr *)skb->data;
        payload = NLMSG_DATA(nlh);
        /* process the netlink message with the header pointed to by
         * nlh and the payload pointed to by payload */
    }
}
The callback function input() is called in the context of the sendmsg() system call invoked by the sending process. If input() processes the message quickly, everything is fine. But if processing a netlink message takes a long time, we want to keep that processing out of input(), because lengthy processing there may block other system calls from entering the kernel. Instead, we can dedicate a kernel thread that performs the following steps indefinitely. Use skb = skb_recv_datagram(nl_sk) to receive a message, where nl_sk is the netlink socket returned by netlink_kernel_create(). Then process the netlink message pointed to by skb->data. The kernel thread sleeps when there is no netlink message in nl_sk, so all the callback function input() has to do is wake up the sleeping kernel thread, like this:
void input(struct sock *sk, int len)
{
    wake_up_interruptible(sk->sleep);
}
This is a more scalable model of communication between the kernel and user space, and it improves the granularity of context switches.
Sending netlink messages from the kernel
Just as in user space, the source and destination netlink addresses need to be set when sending a netlink message from the kernel. Assuming that struct sk_buff *skb points to the buffer holding the netlink message to be sent, the source address can be set like this:
NETLINK_CB(skb).groups = local_groups;
NETLINK_CB(skb).pid = 0;   /* from kernel */
The destination address can be set like this:
NETLINK_CB(skb).dst_groups = dst_groups;
NETLINK_CB(skb).dst_pid = dst_pid;
This information is not stored in skb->data. Instead, it is stored in the netlink control block of the socket buffer skb. To send a unicast message, use:
int netlink_unicast(struct sock *ssk, struct sk_buff *skb, u32 pid, int nonblock);
ssk is the netlink socket returned by netlink_kernel_create(), skb->data points to the netlink message to be sent, and pid is the PID of the receiving application, assuming formula 1 is used. nonblock indicates whether the API should block when the receive buffer is unavailable or return a failure immediately. You can also send a multicast message from the kernel. The following function delivers a netlink message both to the process specified by pid and to the multicast groups specified by group:
void netlink_broadcast(struct sock *ssk, struct sk_buff *skb, u32 pid, u32 group, int allocation);
group is the bitwise OR of all the multicast groups that should receive the message. allocation is the kernel memory allocation type. Typically, GFP_ATOMIC is used in interrupt context and GFP_KERNEL otherwise. This is needed because the API may have to allocate one or more socket buffers to clone the multicast message.
Closing a netlink socket from kernel space
Given the struct sock *nl_sk returned by netlink_kernel_create(), we can close the netlink socket from kernel space by calling the following API:
sock_release(nl_sk->socket);
So far, we have shown only the minimal code framework needed to illustrate the concepts of netlink programming. We now use the NETLINK_TEST protocol type and assume that it has already been added to the kernel header file. The kernel module code listed here contains only the netlink-relevant part, so it should be inserted into a complete kernel module skeleton, which can be found in many other references. Example:
net_link.c
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/types.h>
#include <linux/sched.h>
#include <net/sock.h>
#include <net/netlink.h>

#define NETLINK_TEST 21

struct sock *nl_sk = NULL;
EXPORT_SYMBOL_GPL(nl_sk);

void nl_data_ready(struct sk_buff *__skb)
{
    struct sk_buff *skb;
    struct nlmsghdr *nlh;
    u32 pid;
    int rc;
    int len = NLMSG_SPACE(1200);
    char str[100];

    printk("net_link: data is ready to read.\n");
    skb = skb_get(__skb);

    if (skb->len >= NLMSG_SPACE(0)) {
        nlh = nlmsg_hdr(skb);
        printk("net_link: recv %s.\n", (char *)NLMSG_DATA(nlh));
        memcpy(str, NLMSG_DATA(nlh), sizeof(str));
        pid = nlh->nlmsg_pid; /* pid of sending process */
        printk("net_link: pid is %d\n", pid);
        kfree_skb(skb);

        skb = alloc_skb(len, GFP_ATOMIC);
        if (!skb) {
            printk(KERN_ERR "net_link: allocate failed.\n");
            return;
        }
        nlh = nlmsg_put(skb, 0, 0, 0, 1200, 0);
        NETLINK_CB(skb).pid = 0; /* from kernel */

        memcpy(NLMSG_DATA(nlh), str, sizeof(str));
        printk("net_link: going to send.\n");
        rc = netlink_unicast(nl_sk, skb, pid, MSG_DONTWAIT);
        if (rc < 0) {
            printk(KERN_ERR "net_link: can not unicast skb (%d)\n", rc);
        }
        printk("net_link: send is ok.\n");
    }
    return;
}

static int test_netlink(void)
{
    nl_sk = netlink_kernel_create(&init_net, NETLINK_TEST, 0, nl_data_ready, NULL, THIS_MODULE);

    if (!nl_sk) {
        printk(KERN_ERR "net_link: cannot create netlink socket.\n")