"NetLink" on user space and kernel space communication


Original article: User space and kernel space communication "Netlink", wjlkoorey258

Introduction

Netlink was first introduced by Alan Cox during the 1.3 kernel development cycle. At that time it was a character-device interface that provided two-way data communication between the kernel and user space. Later, during 2.1 kernel development, Alexey Kuznetsov reworked netlink into a more flexible, easily extensible, message-based communication interface and used it as the foundation of the advanced routing subsystem. Since then, netlink has become one of the primary means of communication between Linux kernel subsystems and user-space configuration applications.

In 2001, the IETF ForCES working group began formal work on standardizing netlink. Jamal Hadi Salim proposed defining netlink as a protocol for communication between the routing engine components of a network device and its control and management components. His proposal was ultimately not adopted, however, and was replaced by the model we see today: netlink is designed as a new protocol domain.

Linus Torvalds, the father of Linux, once said: "Linux is evolution, not intelligent design." What does that mean here? Netlink follows the same philosophy as the rest of Linux: there is no complete specification and no design document. What is there, then? You guessed it: "Read the f**king source code".

Of course, this article does not analyze how netlink is implemented in Linux. Instead it shares some notes on "what netlink is" and "how to use netlink well"; only when we run into a problem do we need to read the kernel source to understand why.

What is Netlink

There are a few key points to grasp in order to understand netlink:

1. It is a connectionless, datagram-oriented messaging subsystem.

2. It is built on the common BSD socket architecture and implementation.

Point 1 immediately brings UDP to mind, and that is a good instinct. Understanding netlink by analogy with UDP is perfectly reasonable: learning by analogy, summarizing, making associations, and finally transferring knowledge from one area to another is the essence of learning. Netlink provides bidirectional, asynchronous data communication between the kernel and user space. It also supports communication between two user processes, and even between two kernel subsystems. This article ignores the latter two cases and focuses on user <-> kernel communication.

Does point 2 make a familiar picture flash through your mind? If it does, you have a real knack for this. If not, no matter, a knack can be grown slowly, hehe.

In the netlink socket programming later on, we will mainly use the socket(), bind(), sendmsg() and recvmsg() system calls, and of course the polling mechanisms that sockets provide.
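As a rough illustration of the first two of those calls, the following sketch opens a netlink socket and binds it. The helper names open_netlink() and bound_address() are made up for this example, and leaving nl_pid at 0 so that the kernel picks the port ID is just one possible choice.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: open a netlink socket for `protocol` and bind it.
 * Returns the descriptor, or -1 on failure (errno is set). */
int open_netlink(int protocol, unsigned int groups)
{
    struct sockaddr_nl local;
    int fd = socket(AF_NETLINK, SOCK_RAW, protocol);

    if (fd < 0)
        return -1;

    memset(&local, 0, sizeof(local));
    local.nl_family = AF_NETLINK;
    local.nl_pid = 0;          /* 0: let the kernel assign the port ID */
    local.nl_groups = groups;  /* multicast group mask, 0 for unicast only */

    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: read back the address the socket is bound to. */
int bound_address(int fd, struct sockaddr_nl *out)
{
    socklen_t len = sizeof(*out);

    assert(out != NULL);
    memset(out, 0, sizeof(*out));
    return getsockname(fd, (struct sockaddr *)out, &len);
}
```

A caller might use open_netlink(NETLINK_ROUTE, 0) to talk to the routing subsystem; NETLINK_ROUTE is only an example protocol here, and any other netlink protocol number would be passed the same way.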

Netlink communication types

Netlink supports two communication modes: unicast and multicast.

Unicast: typically used for 1:1 data communication between a user process and a kernel subsystem. User space sends a command to the kernel and then receives the result of that command from the kernel.

Multicast: typically used for 1:N data communication between a kernel subsystem and multiple user processes. The kernel is the initiator of the session and the user-space applications are the receivers. To achieve this, the kernel-side code creates a multicast group, and every user-space process interested in the messages sent by that kernel subsystem joins the group to receive them.

For example, process A and subsystem 1 communicate by unicast, while processes B and C and subsystem 2 communicate by multicast.

One more thing is worth noting. Data passed from user space to the kernel is not queued, that is, the operation is synchronous. Data passed from the kernel to user space is queued and is asynchronous. Keeping this in mind saves a lot of detours when developing netlink-based modules. Suppose you send the kernel a message asking for some information, such as the routing table. If the routing table is large, the kernel returns the data through netlink in many messages, and you have to think about how to receive them all. After all, now that you know about the output queue, you cannot turn a blind eye to it.
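The unicast direction from user space to the kernel can be sketched roughly as follows. The helper names prepare_kernel_msg() and send_to_kernel() are invented for this example. The sketch assumes the caller has already built a complete message starting at nlh.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/netlink.h>

/* Hypothetical helper: fill in what sendmsg() needs to deliver one
 * netlink message to the kernel (destination address, iovec, msghdr). */
void prepare_kernel_msg(struct nlmsghdr *nlh, struct sockaddr_nl *dst,
                        struct iovec *iov, struct msghdr *msg)
{
    assert(nlh && dst && iov && msg);

    memset(dst, 0, sizeof(*dst));
    dst->nl_family = AF_NETLINK;
    dst->nl_pid = 0;       /* port ID 0 addresses the kernel */
    dst->nl_groups = 0;    /* unicast, no multicast groups */

    iov->iov_base = nlh;
    iov->iov_len = nlh->nlmsg_len;

    memset(msg, 0, sizeof(*msg));
    msg->msg_name = dst;
    msg->msg_namelen = sizeof(*dst);
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
}

/* Hypothetical helper: unicast one message to the kernel.
 * Returns the number of bytes sent, or -1 on failure (errno is set). */
ssize_t send_to_kernel(int fd, struct nlmsghdr *nlh)
{
    struct sockaddr_nl dst;
    struct iovec iov;
    struct msghdr msg;

    prepare_kernel_msg(nlh, &dst, &iov, &msg);
    return sendmsg(fd, &msg, 0);
}
```

Because replies are queued on the way back, a real program would follow send_to_kernel() with a recvmsg() loop rather than expecting the whole answer in a single read.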

Netlink message format

A netlink message consists of two parts, a message header and a payload. The whole message is 4-byte aligned and is normally transmitted in host byte order. The header is a fixed 16 bytes and the length of the message body is variable.
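The alignment arithmetic can be tried out with the NLMSG_* macros from <linux/netlink.h>. The two wrapper functions below are hypothetical names used only for this sketch; the macros themselves come from the header.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Bytes taken by the 16-byte header plus `payload` bytes, before padding.
 * This is the value that belongs in nlmsg_len. */
size_t nl_message_length(size_t payload)
{
    return NLMSG_LENGTH(payload);
}

/* Bytes the message occupies in a buffer once padded to a 4-byte
 * boundary. The next message in the same buffer starts at this offset. */
size_t nl_message_space(size_t payload)
{
    return NLMSG_SPACE(payload);
}
```

For a 5-byte payload, nl_message_length(5) gives 21 (16 + 5) and nl_message_space(5) gives 24, the next multiple of 4.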

The Netlink message header

The message header is defined in <include/linux/netlink.h> and is represented by struct nlmsghdr:


    struct nlmsghdr
    {
        __u32 nlmsg_len;   /* Length of message including header */
        __u16 nlmsg_type;  /* Message content */
        __u16 nlmsg_flags; /* Additional flags */
        __u32 nlmsg_seq;   /* Sequence number */
        __u32 nlmsg_pid;   /* Sending process PID */
    };

The fields of the message header are as follows:

nlmsg_len: the length of the entire message in bytes, including the netlink message header itself.

nlmsg_type: the type of the message, that is, whether it is a data message or a control message. Currently (kernel version 2.6.21) netlink defines only four control message types:

NLMSG_NOOP - an empty message; do nothing.

NLMSG_ERROR - the message contains an error.

NLMSG_DONE - if the kernel returns several messages through the netlink queue, the last message in the queue has type NLMSG_DONE, and all the messages before it have the NLM_F_MULTI bit set in nlmsg_flags.

NLMSG_OVERRUN - not used for now.

nlmsg_flags: additional information attached to the message, such as the NLM_F_MULTI flag mentioned above. Some of the flags are:

Flag: role and description

NLM_F_REQUEST: marks the message as a request. Every message sent from user space to the kernel should set this flag; otherwise the kernel returns an EINVAL (invalid argument) error to the user.

NLM_F_MULTI: messages from user space to the kernel are delivered synchronously, while messages from the kernel to user space have to be queued. If the kernel has received a message from the user with the NLM_F_DUMP flag set, it sends a list of several netlink messages back to user space. This bit is set in every one of those messages except the last.

NLM_F_ACK: set by the sender to ask for an acknowledgement. The kernel answers such a request with an NLMSG_ERROR message whose error field is 0.

NLM_F_ECHO: if this flag is set in a message sent from user space to the kernel, the user process is asking the kernel to send back to it, by unicast, each message that the process sends. This is similar to what we usually call an "echo" function.

...

It is enough to know that nlmsg_flags can take many values. The role and meaning of each one can be found through Google and the source code, so we will not go into them here. All of the values defined as of the 2.6.21 kernel are listed in <include/linux/netlink.h>.
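A multi-part reply (NLM_F_MULTI messages ending in NLMSG_DONE) is usually walked with the NLMSG_OK and NLMSG_NEXT macros from <linux/netlink.h>. The sketch below assumes the buffer was filled by one recvmsg() call. The function name count_parts() is invented for this example, and a real program would process each message instead of just counting it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: walk every netlink message in buf[0..len).
 * Returns the number of data messages seen and sets *done to 1
 * if an NLMSG_DONE terminator was found. */
int count_parts(void *buf, int len, int *done)
{
    struct nlmsghdr *nlh;
    int parts = 0;

    assert(buf != NULL && done != NULL);
    *done = 0;

    for (nlh = (struct nlmsghdr *)buf; NLMSG_OK(nlh, len);
         nlh = NLMSG_NEXT(nlh, len)) {
        if (nlh->nlmsg_type == NLMSG_DONE) {
            *done = 1;      /* end of the multi-part reply */
            break;
        }
        if (nlh->nlmsg_type == NLMSG_NOOP)
            continue;       /* empty message, nothing to process */
        parts++;            /* a real program would handle the payload here */
    }
    return parts;
}
```

If *done is still 0 after the buffer has been consumed, the reply is not complete yet and the caller should call recvmsg() again.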

nlmsg_seq: the message sequence number. Because netlink is datagram-oriented, messages can be lost, but netlink provides a mechanism that lets developers guard against loss according to their needs. The sequence number is generally used together with NLM_F_ACK. If a user application needs to be sure that every message it sends is received by the kernel, it sets a sequence number in each message, and the kernel places the same sequence number in the response it sends back. This is somewhat similar to TCP's acknowledgement mechanism.

Note: when the kernel sends a multicast message to user space on its own initiative, this field is always 0.

nlmsg_pid: when a user-space process and a kernel subsystem establish a data channel through netlink, netlink assigns a unique numeric ID to each such channel. Its main purpose is to associate request messages from user space with their responses. Put bluntly, when several user processes and several kernel subsystems are communicating at once, netlink must provide a way to keep the exchanges of each "user-kernel" pair from getting mixed up.

That is, when processes A and B both request information from subsystem 1 through netlink, subsystem 1 must make sure that the response for process A is not delivered to process B. This field mainly matters when a user-space process fetches data from the kernel. Typically, if a user-space process wants a response from the kernel, it assigns the value of getpid() to this field before sending. For messages that the kernel sends to user space on its own initiative, the field is set to 0.
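Putting the five header fields together, a request might be assembled roughly like this. build_request() is a hypothetical helper, not something from the kernel headers, and the choice of NLM_F_REQUEST | NLM_F_ACK is an assumption made for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: write a request header plus payload into buf,
 * which must be 4-byte aligned. Returns the message, or NULL if buf
 * is too small. */
struct nlmsghdr *build_request(void *buf, size_t cap, uint16_t type,
                               uint32_t seq, const void *payload,
                               size_t payload_len)
{
    struct nlmsghdr *nlh = (struct nlmsghdr *)buf;

    assert(buf != NULL);
    if (NLMSG_SPACE(payload_len) > cap)
        return NULL;

    memset(buf, 0, NLMSG_SPACE(payload_len));      /* zero padding as well */
    nlh->nlmsg_len = NLMSG_LENGTH(payload_len);    /* header + payload */
    nlh->nlmsg_type = type;
    nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;  /* request, ask for an ACK */
    nlh->nlmsg_seq = seq;                          /* echoed back in the reply */
    nlh->nlmsg_pid = (uint32_t)getpid();           /* identifies the sender */

    if (payload_len > 0)
        memcpy(NLMSG_DATA(nlh), payload, payload_len);
    return nlh;
}
```

The caller keeps the seq value it passed in and compares it with nlmsg_seq in the reply to match responses to requests.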

The Netlink message body

The netlink message body is in TLV (type-length-value) format. Each netlink attribute is represented by struct nlattr{} in <include/linux/netlink.h>.
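As a small sketch of the TLV layout, the function below appends one attribute to a buffer using struct nlattr and the NLA_ALIGN and NLA_HDRLEN macros from <linux/netlink.h>. put_attr() and its offset-based interface are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: append one attribute at offset `off` of buf
 * (capacity `cap`, 4-byte aligned). Returns the offset where the next
 * attribute starts, or 0 if the attribute does not fit. */
size_t put_attr(void *buf, size_t off, size_t cap, unsigned short type,
                const void *data, unsigned short len)
{
    struct nlattr *nla;
    size_t total = NLA_ALIGN(NLA_HDRLEN + len);

    assert(buf != NULL);
    if (len > 0xFFFF - NLA_HDRLEN || off + total > cap)
        return 0;

    nla = (struct nlattr *)((char *)buf + off);
    memset(nla, 0, total);                              /* zero the padding too */
    nla->nla_type = type;                               /* T */
    nla->nla_len = (unsigned short)(NLA_HDRLEN + len);  /* L: header + value, unpadded */
    if (len > 0)
        memcpy((char *)nla + NLA_HDRLEN, data, len);    /* V */
    return off + total;
}
```

A 5-byte value therefore records nla_len = 9 but occupies 12 bytes, so the following attribute starts on a 4-byte boundary.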

Netlink error messages

When an error occurs in netlink communication between a user-space application and the kernel, netlink must report the error to user space. Netlink wraps error messages in a structure of their own, defined in <include/linux/netlink.h>:


    struct nlmsgerr
    {
        /* Negative errno value (the codes are defined in errno.h), or 0
           for an acknowledgement. Decode it with strerror(-error). */
        int error;
        /* Header of the message that triggered the error */
        struct nlmsghdr msg;
    };
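A receiver might inspect an NLMSG_ERROR message along these lines. ack_status() and ack_strerror() are names invented for this sketch, and the three-way return convention is only one possible design.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: classify a received message.
 *   1  -> not an NLMSG_ERROR message, process it as data
 *   0  -> acknowledgement (NLMSG_ERROR with error == 0)
 *  <0  -> negative errno value reported by the kernel */
int ack_status(struct nlmsghdr *nlh)
{
    struct nlmsgerr *err;

    assert(nlh != NULL);
    if (nlh->nlmsg_type != NLMSG_ERROR)
        return 1;
    if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(struct nlmsgerr)))
        return -EBADMSG;    /* truncated error message */

    err = (struct nlmsgerr *)NLMSG_DATA(nlh);
    return err->error;      /* 0 for an ACK, -errno otherwise */
}

/* Hypothetical helper: text for a negative status from ack_status(). */
const char *ack_strerror(int status)
{
    return strerror(-status);
}
```

The msg member of struct nlmsgerr carries the header of the offending request, so its nlmsg_seq can be used to find out which request failed.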

Points to watch in Netlink programming

In netlink-based user-kernel communication, two situations can cause messages to be lost:

1. Memory exhaustion.

2. Overflow of the receive buffer of the user-space process. The main causes of an overflow are that the user-space process runs too slowly or that its receive queue is too short.

If netlink cannot deliver a message to the receiving user-space process, that process gets an ENOBUFS (no buffer space available) error from its recvmsg() system call, and this needs attention. In other words, buffer overflow does not occur in the sendmsg() call from user space to the kernel. We have already explained why; please think it through for yourself.

Of course, if a blocking socket is used, there is no risk of memory exhaustion. Why is that? Go and Google what a blocking socket is. Learning without thinking leaves you confused; thinking without learning is dangerous.
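One way to reduce the overflow risk and to react to it is sketched below. It assumes the standard SO_RCVBUF socket option is used to lengthen the receive queue. The function names and the recv_action enum are hypothetical, and how a program resynchronizes after ENOBUFS depends on the application.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical helper: ask for a larger receive queue on this socket.
 * Returns 0 on success, -1 on failure (errno is set). */
int grow_receive_buffer(int fd, int bytes)
{
    return setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes));
}

enum recv_action { RECV_RETRY, RECV_RESYNC, RECV_FAIL };

/* Hypothetical helper: decide what to do after recvmsg() returned -1
 * with errno == err. */
enum recv_action after_recv_error(int err)
{
    switch (err) {
    case EINTR:
    case EAGAIN:
        return RECV_RETRY;   /* nothing was lost, just try again */
    case ENOBUFS:
        return RECV_RESYNC;  /* queue overflowed: messages were dropped */
    default:
        return RECV_FAIL;
    }
}
```

On RECV_RESYNC the socket itself is still usable, but whatever state was built from earlier messages may be stale and should be fetched again from the kernel.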

The Netlink address structure

In the TCP blog post we covered the generic address structure and the standard address structure used in Internet programming. Netlink has an address structure of its own that corresponds to them.

The detailed definition and description of struct sockaddr_nl{} are as follows:


    struct sockaddr_nl
    {
        sa_family_t nl_family;  /* This field is always AF_NETLINK */
        unsigned short nl_pad;  /* Not currently used, set to 0 */
        __u32 nl_pid;           /* Process PID */
        __u32 nl_groups;        /* Multicast groups mask */
    };

nl_pid: the ID of the process that sends or receives the message. As mentioned earlier, netlink supports not only user-kernel communication but also communication between two user-space processes or between two kernel subsystems. This field is set to 0 in the following two scenarios:

First, when the destination is the kernel, that is, when sending from user space to kernel space, the nl_pid in the netlink address structure we build is set to 0. One point needs explaining here: in the netlink specification, PID stands for port ID (32 bits), and its main job is to uniquely identify a netlink socket channel. For the sender's own address, nl_pid is normally set to the process ID of the current process. When several threads of one process use netlink sockets at the same time, however, nl_pid is generally set like this:


    pthread_self() << 16 | getpid();

Second, when the kernel sends a multicast message to user space and the user-space process belongs to that multicast group, the nl_pid in the address structure is also 0. It is used together with the other field described below.

nl_groups: if a user-space process wants to join a multicast group, it must call bind(). This field is the bit mask of the multicast groups the caller wants to join (note that it is not the group number; we will explain this in detail later). If the field is 0, the caller does not want to join any multicast group. Each protocol in the netlink protocol domain can support up to 32 multicast groups (because nl_groups is 32 bits long), with each group represented by one bit.
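The difference between a group number and the nl_groups mask, and the two uses of nl_pid, can be sketched as follows. All three function names are hypothetical. The sketch assumes group N corresponds to bit N-1 of the mask, which is how the kernel maps group numbers to bits.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: convert a multicast group number (1..32) into
 * its nl_groups bit. Group 0 means "no group" and gives an empty mask. */
unsigned int group_mask(unsigned int group)
{
    assert(group <= 32);
    return group ? 1U << (group - 1) : 0;
}

/* Hypothetical helper: local address for bind(), holding our own
 * port ID and the mask of the groups we want to join. */
void make_local_addr(struct sockaddr_nl *addr, unsigned int port_id,
                     unsigned int groups)
{
    assert(addr != NULL);
    memset(addr, 0, sizeof(*addr));
    addr->nl_family = AF_NETLINK;
    addr->nl_pid = port_id;
    addr->nl_groups = groups;
}

/* Hypothetical helper: destination address for messages to the kernel. */
void make_kernel_addr(struct sockaddr_nl *addr)
{
    assert(addr != NULL);
    memset(addr, 0, sizeof(*addr));
    addr->nl_family = AF_NETLINK;
    addr->nl_pid = 0;       /* 0 addresses the kernel */
    addr->nl_groups = 0;    /* unicast */
}
```

Joining groups 1 and 3, for example, means binding with group_mask(1) | group_mask(3), which is the mask 0x5 and not the numbers 1 and 3 themselves.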

The remaining netlink topics will be discussed in the hands-on part later, where they are actually needed.

To be continued ...
