Insight into Linux netfilter & iptables: What Is Netfilter?

Source: Internet
Author: User

I have been studying the Linux firewall system for some time, and because my recent work involves it more and more, I have become fairly familiar with it. I am taking the time to summarize what I know in this area. One goal is to consolidate what I have learned; the other is to invite more experienced readers to offer advice, so that we can learn and improve together.

The people who frequent CU are all highly skilled, so let me state the scope up front. This series of posts focuses on the implementation mechanism, principles, and design ideas of Netfilter, and on how the user-space iptables tool interacts and communicates with Netfilter in the kernel. Introductory iptables usage is covered all over the web, so I will not spend time on it here.

Many people ask the same questions after first using iptables: How does each rule I add with the iptables command take effect? How does the kernel match packets against these rules? If iptables does not meet my needs, can I extend it? These questions are the topics I will cover in the following posts. One note: Netfilter is tightly integrated with the IP stack, so if you already know the protocol stack, this article will feel familiar. If not, it does not matter, because I will give a brief introduction to the relevant stack concepts at the key points. It will only be a brief introduction, not a detailed one, because too much is involved and I am still studying it myself. Enough preamble; let us get to the point.

Note: The kernel version I studied is 2.6.21, and the iptables version is 1.4.0.

What is Netfilter?

To answer this question, first look at a basic model of network communication:

When data is sent, it travels down the stack and each layer adds its own header ("adding a head"). When data is received, the reverse happens ("stripping the head"): after the network card receives the packet, each layer of the protocol stack strips its own header in turn as the packet moves up, and the bare data finally reaches the user.

The underlying mechanism of this "stack" model is basically as follows:

Each received packet enters at point "A" and goes through a routing decision. If it is destined for the local host, it passes through point "B" and continues up the protocol stack. Otherwise, if its destination is not the local host, it passes through point "C" and is then forwarded out through point "E".

Each packet sent by the local host also goes through a routing decision, which determines the interface it will leave from. It then passes through point "D" and is finally sent out through point "E".

These five key points of the protocol stack, A, B, C, D, and E, are where Netfilter comes into play.
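The three paths just described can be modeled in a few lines of user-space C. This is only a sketch for illustration: the names `key_point`, `packet_path`, and `path_for` are hypothetical and do not exist in the kernel.

```c
#include <stddef.h>

/* Hypothetical labels for the five key points A..E described in the text. */
enum key_point { POINT_A, POINT_B, POINT_C, POINT_D, POINT_E };

/* The three ways a packet can travel through the stack. */
enum packet_path { PATH_LOCAL_DELIVERY, PATH_FORWARD, PATH_LOCAL_SEND };

/*
 * Fill `out` with the key points a packet crosses, in order, and return
 * how many entries were written (at most 3).
 */
static size_t path_for(enum packet_path path, enum key_point out[3])
{
    switch (path) {
    case PATH_LOCAL_DELIVERY:   /* received, destined for this host */
        out[0] = POINT_A;
        out[1] = POINT_B;
        return 2;
    case PATH_FORWARD:          /* received, destined for another host */
        out[0] = POINT_A;
        out[1] = POINT_C;
        out[2] = POINT_E;
        return 3;
    case PATH_LOCAL_SEND:       /* generated by this host */
        out[0] = POINT_D;
        out[1] = POINT_E;
        return 2;
    }
    return 0;
}
```

Note that a forwarded packet never touches B or D, and a locally delivered packet never reaches E.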

Netfilter is a subsystem introduced in Linux 2.4.x. It serves as a generic, abstract framework that provides a complete set of management mechanisms for hook functions, making it possible to implement features such as packet filtering, network address translation (NAT), and protocol-based connection tracking. Netfilter's position in the kernel is shown in the following diagram:

This diagram shows quite intuitively the relationship between user-space iptables and the Netfilter-based ip_tables module in kernel space, how the two communicate, and the role Netfilter plays in between.

Now go back to the five key points of the protocol stack discussed earlier, "ABCDE". Netfilter gives these five points new names in netfilter_ipv4.h, as shown in the figure. The names are self-explanatory, so I will not explain them further:
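For reference, here is a sketch of those five names, reconstructed from the hook indexes that appear later in this article (nf_hooks[2][0] through nf_hooks[2][4]) and from the A to E paths described above. The helper `hook_name` is hypothetical and exists only to make the mapping easy to check; please verify the constants against netfilter_ipv4.h in your own kernel tree.

```c
#include <stddef.h>

#define NF_IP_PRE_ROUTING   0   /* point A: just received, before routing */
#define NF_IP_LOCAL_IN      1   /* point B: destined for this host */
#define NF_IP_FORWARD       2   /* point C: destined for another host */
#define NF_IP_LOCAL_OUT     3   /* point D: generated by this host */
#define NF_IP_POST_ROUTING  4   /* point E: about to leave the host */

/* Hypothetical helper: map a hook number to its name, or NULL if unknown. */
static const char *hook_name(unsigned int hook)
{
    static const char *const names[] = {
        "NF_IP_PRE_ROUTING",
        "NF_IP_LOCAL_IN",
        "NF_IP_FORWARD",
        "NF_IP_LOCAL_OUT",
        "NF_IP_POST_ROUTING",
    };

    if (hook >= sizeof(names) / sizeof(names[0]))
        return NULL;
    return names[hook];
}
```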

At each key point, many callback functions (some people like to call them "hook functions", which means the same thing) have been registered in advance in priority order. They lie in wait at these key points and form a chain. Each incoming packet is inspected by these callback functions in turn and then, depending on the outcome, released, dropped, or otherwise disposed of. Whatever they do, the callback functions must finally report the packet's fate back to Netfilter, because Netfilter only borrowed the packet from the protocol stack on their behalf and has to account for it one way or another. Each hook function must therefore return one of the following values to the Netfilter framework:

• NF_ACCEPT: continue transmitting the datagram normally. This return value tells Netfilter that the packet has been accepted so far and should be passed on to the next stage of the network protocol stack.

• NF_DROP: discard the datagram and stop transmitting it.

• NF_STOLEN: the module takes over the datagram and tells Netfilter to "forget" it. From this point on the callback function handles the packet, and Netfilter must abandon any further processing of it. This does not mean that the packet's resources have been freed: the packet and its sk_buff structure are still valid, but ownership has passed from Netfilter to the callback function.

• NF_QUEUE: queue the datagram (typically so that it can be handed to user space for processing).

• NF_REPEAT: call the callback function again. Use this value sparingly to avoid an infinite loop.
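As a compact summary of the list above, the sketch below maps each verdict to what happens to the packet next. The numeric values are the ones I recall from include/linux/netfilter.h in 2.6-era kernels, so treat them as an assumption and check your own tree; the `nf_action` enum and `action_for` function are hypothetical names invented for this illustration.

```c
#define NF_DROP    0
#define NF_ACCEPT  1
#define NF_STOLEN  2
#define NF_QUEUE   3
#define NF_REPEAT  4

/* Hypothetical summary of what follows each verdict. */
enum nf_action {
    ACT_DISCARD,      /* NF_DROP: the packet is thrown away */
    ACT_NEXT_STAGE,   /* NF_ACCEPT: continue through the stack */
    ACT_HANDS_OFF,    /* NF_STOLEN: the hook function now owns the packet */
    ACT_ENQUEUE,      /* NF_QUEUE: hand the packet to a queue */
    ACT_CALL_AGAIN,   /* NF_REPEAT: run the same hook function again */
    ACT_INVALID       /* anything else */
};

static enum nf_action action_for(unsigned int verdict)
{
    switch (verdict) {
    case NF_DROP:   return ACT_DISCARD;
    case NF_ACCEPT: return ACT_NEXT_STAGE;
    case NF_STOLEN: return ACT_HANDS_OFF;
    case NF_QUEUE:  return ACT_ENQUEUE;
    case NF_REPEAT: return ACT_CALL_AGAIN;
    default:        return ACT_INVALID;
    }
}
```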

To sound more professional, let us agree on terminology: from now on the five key points above are called hook points, and the callback functions registered at each hook point are called hook functions.

Netfilter in the Linux 2.6 kernel currently supports the IPv4, IPv6, and DECnet protocol stacks; here we mainly study IPv4. The following diagram shows in detail how protocol type, hook point, hook function, and priority relate to one another:

For each protocol type, a packet passes through the hook points in order, and at each hook point a number of hook functions are registered with Netfilter in priority order. These hook functions are what process the packet.

Netfilter uses the NF_HOOK macro (include/linux/netfilter.h) to cut into the Netfilter framework from inside the protocol stack. Compared with version 2.4, the 2.6 kernel defines this macro more flexibly:

    #define NF_HOOK(pf, hook, skb, indev, outdev, okfn) \
            NF_HOOK_THRESH(pf, hook, skb, indev, outdev, okfn, INT_MIN)

The parameters of the NF_HOOK macro are:

1) pf: the protocol family. The Netfilter framework can also be used outside the IPv4 layer, so this parameter can also take values such as PF_INET6 and PF_DECnet.

2) hook: the hook point. For the IP layer, it takes one of the five values above.

3) skb: the packet being processed (a struct sk_buff); it needs no further explanation.

4) indev: the device the packet came in on, represented by a struct net_device.

5) outdev: the device the packet goes out on, represented by a struct net_device.

6) okfn: a function pointer. Once all the hook functions registered at this hook point have been called, processing continues with this function.

(As you will see later, hook, skb, indev, outdev, and okfn are passed on to the handlers registered with nf_register_hook.)
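To make the parameter list concrete, here is a user-space sketch of a handler that receives the five arguments a registered hook function is given (hook number, packet, input device, output device, okfn). The stripped-down `struct sk_buff` and `struct net_device`, the handler `drop_from_eth1`, and the device name "eth1" are all hypothetical stand-ins; real kernel code would use the definitions from the kernel headers.

```c
#include <string.h>

#define NF_DROP    0
#define NF_ACCEPT  1

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct net_device { char name[16]; };
struct sk_buff    { unsigned int len; };

/* Handler type with the argument order described in the text. */
typedef unsigned int nf_hookfn(unsigned int hooknum,
                               struct sk_buff **pskb,
                               const struct net_device *in,
                               const struct net_device *out,
                               int (*okfn)(struct sk_buff *));

/* Example handler: drop every packet that arrived on "eth1". */
static unsigned int drop_from_eth1(unsigned int hooknum,
                                   struct sk_buff **pskb,
                                   const struct net_device *in,
                                   const struct net_device *out,
                                   int (*okfn)(struct sk_buff *))
{
    /* This simple policy only looks at the input device. */
    (void)hooknum; (void)pskb; (void)out; (void)okfn;

    if (in != NULL && strcmp(in->name, "eth1") == 0)
        return NF_DROP;
    return NF_ACCEPT;
}

/* The handler is compatible with the nf_hookfn type. */
static nf_hookfn *const example_hook = drop_from_eth1;
```

A handler like this never calls okfn itself; it only returns a verdict and leaves the continuation to the framework.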

NF_HOOK_THRESH is itself a macro:

    #define NF_HOOK_THRESH(pf, hook, skb, indev, outdev, okfn, thresh) \
    ({int __ret; \
    if ((__ret=nf_hook_thresh(pf, hook, &(skb), indev, outdev, okfn, thresh, 1)) == 1) \
        __ret = (okfn)(skb); \
    __ret;})


As we can see, the NF_HOOK_THRESH macro adds only one parameter, thresh, which specifies the priority threshold used when traversing the hook functions. The macro in turn calls the nf_hook_thresh function:

    static inline int nf_hook_thresh(int pf, unsigned int hook,
                                     struct sk_buff **pskb,
                                     struct net_device *indev,
                                     struct net_device *outdev,
                                     int (*okfn)(struct sk_buff *), int thresh,
                                     int cond)
    {
        if (!cond)
            return 1;
    #ifndef CONFIG_NETFILTER_DEBUG
        if (list_empty(&nf_hooks[pf][hook]))
            return 1;
    #endif
        return nf_hook_slow(pf, hook, pskb, indev, outdev, okfn, thresh);
    }


This function adds only one more parameter, cond. If cond is 0, the traversal is skipped and the function returns 1 without calling okfn itself; the calling macro then goes straight on to okfn. If cond is 1, nf_hook_slow is called to traverse the hook functions registered at the hook point in order (priority is processed from the smallest value to the largest).
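The control flow of the macro and the inline function together can be tried out in user space with a small mock. Everything here is a hypothetical stand-in: `mock_hook_slow` replaces nf_hook_slow, the flag `hooks_registered` replaces the list_empty() test, and `mock_nf_hook` is a function version of the macro, written only to show when okfn runs.

```c
#include <limits.h>

/* Hypothetical stand-in for the packet; it just counts okfn calls. */
struct sk_buff { int okfn_calls; };

static int hooks_registered;   /* models !list_empty(&nf_hooks[pf][hook]) */
static int slow_path_calls;    /* how often the slow path was entered */
static int slow_path_result;   /* value the mock slow path returns */

static int mock_hook_slow(struct sk_buff **pskb, int thresh)
{
    (void)pskb; (void)thresh;
    slow_path_calls++;
    return slow_path_result;
}

static int mock_hook_thresh(struct sk_buff **pskb, int thresh, int cond)
{
    if (!cond)
        return 1;              /* caller asked to bypass the hooks */
    if (!hooks_registered)
        return 1;              /* nothing registered at this hook point */
    return mock_hook_slow(pskb, thresh);
}

/* Sample continuation function: records that it was reached. */
static int count_okfn(struct sk_buff *skb)
{
    skb->okfn_calls++;
    return 0;
}

/* Function form of the macro: okfn runs only when the result is 1. */
static int mock_nf_hook(struct sk_buff *skb,
                        int (*okfn)(struct sk_buff *), int cond)
{
    int ret = mock_hook_thresh(&skb, INT_MIN, cond);

    if (ret == 1)
        ret = okfn(skb);
    return ret;
}
```

With cond set to 0, or with no hooks registered, okfn runs without the slow path ever being entered; when the slow path returns anything other than 1, okfn is not called by the macro.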

A two-dimensional array is defined in net/netfilter/core.c to store the hook functions registered for the hook points of the different protocol stacks:

    struct list_head nf_hooks[NPROTO][NF_MAX_HOOKS];

The number of rows, NPROTO, is 32, which is the maximum number of protocol families the current kernel supports. The number of columns, NF_MAX_HOOKS, is the number of hook points; in the current 2.6 kernel its value is 8. The final structure of the nf_hooks array is shown in the figure.

In include/linux/socket.h, the IP protocol family AF_INET (PF_INET) has the value 2, so the hook functions of the TCP/IP protocol family are mounted in row 2 of the array, at nf_hooks[2][0] through nf_hooks[2][4].
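The indexing can be sketched in user space as follows. The `list_head` structure and the `list_empty` and `list_add_tail` helpers are simplified re-implementations in the style of the kernel's list API, and `init_hooks` is a hypothetical helper for this example; only the array dimensions and the value of PF_INET come from the text above, and the hook numbers follow the nf_hooks[2][n] indexes used in this article.

```c
#define NPROTO        32   /* rows: protocol families */
#define NF_MAX_HOOKS   8   /* columns: hook points per family */
#define PF_INET        2   /* IPv4 */

#define NF_IP_PRE_ROUTING   0
#define NF_IP_LOCAL_IN      1
#define NF_IP_FORWARD       2
#define NF_IP_LOCAL_OUT     3
#define NF_IP_POST_ROUTING  4

/* Minimal doubly linked list head, modeled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static struct list_head nf_hooks[NPROTO][NF_MAX_HOOKS];

/* Make every chain an empty circular list (each head points to itself). */
static void init_hooks(void)
{
    int p, h;

    for (p = 0; p < NPROTO; p++)
        for (h = 0; h < NF_MAX_HOOKS; h++)
            nf_hooks[p][h].next = nf_hooks[p][h].prev = &nf_hooks[p][h];
}

static int list_empty(const struct list_head *head)
{
    return head->next == head;
}

/* Append `entry` at the end of the chain that starts at `head`. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}
```

After init_hooks(), registering something on the IPv4 forward chain amounts to linking an entry into nf_hooks[PF_INET][NF_IP_FORWARD], that is, nf_hooks[2][2]; the other chains stay empty.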






In the IP stack of the 2.6 kernel, the places where normal protocol stack processing cuts into the Netfilter framework, and where all the hook functions at each hook point are then called in turn, are as follows:

1) The ip_rcv function in net/ipv4/ip_input.c. This function is the entry point for IP packets at the network layer, and it cuts into the Netfilter framework as follows:

    NF_HOOK(PF_INET, NF_IP_PRE_ROUTING, skb, dev, NULL, ip_rcv_finish);

Given what we have covered so far, the meaning of this code is clear. If the protocol stack receives an IP packet (PF_INET), the packet is handed to Netfilter's NF_IP_PRE_ROUTING hook point to check [R] whether anyone has registered hook functions for processing packets at that point (nf_hooks[2][0]). If so, the list nf_hooks[2][0] is traversed to find the matching match and the corresponding target, and the value returned to the Netfilter framework determines what happens to the packet next (it is either handled by the hook module or passed on to the ip_rcv_finish function).

[R]: A word about this so-called "check". Its core is the nf_hook_slow() function, which essentially does something very simple: it walks the doubly linked list nf_hooks[][] in priority order and calls the registered callback functions to process the packet:

    struct list_head **i;

    list_for_each_continue_rcu(*i, head) {
        struct nf_hook_ops *elem = (struct nf_hook_ops *)*i;

        /* Skip hook functions whose priority is below the threshold. */
        if (hook_thresh > elem->priority)
            continue;

        verdict = elem->hook(hook, skb, indev, outdev, okfn);
        if (verdict != NF_ACCEPT) { ... }
    }
    return NF_ACCEPT;


The code above is the core of the nf_iterate() function in net/netfilter/core.c. It is called by nf_hook_slow(), which then continues processing based on its return value.

2) The ip_forward function in net/ipv4/ip_forward.c. Its entry point is:

    NF_HOOK(PF_INET, NF_IP_FORWARD, skb, skb->dev, rt->, ip_forward_finish);

After routing, every packet that the local host has to forward is processed by ip_forward. Here the function cuts into the Netfilter framework at the NF_IP_FORWARD hook point, and matching is performed at the nf_hooks[2][2] hook point. Whether ip_forward_finish is executed is then decided by the return value.

3) The ip_output function in net/ipv4/ip_output.c. It cuts into the Netfilter framework as follows:

    NF_HOOK_COND(PF_INET, NF_IP_POST_ROUTING, skb, NULL, dev, ip_finish_output,
                 !(IPCB(skb)->flags & IPSKB_REROUTED));

Here the entry point changes from the unconditional macro NF_HOOK to the conditional macro NF_HOOK_COND. The condition is this: if the skb currently being processed does not carry the rerouted flag, the packet enters the Netfilter framework; otherwise ip_finish_output is called directly and the protocol stack carries on. Apart from this, the conditional and unconditional macros do not differ.
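The condition itself is a plain flag test, which the following sketch isolates. The `ip_cb` structure is a hypothetical stand-in for the control block that IPCB(skb) returns, and the bit value used for IPSKB_REROUTED is an assumption made for this illustration; take the real definition from your kernel headers.

```c
/* Assumed bit value, for illustration only. */
#define IPSKB_REROUTED 16

/* Hypothetical stand-in for the per-packet IP control block. */
struct ip_cb { unsigned int flags; };

/*
 * Returns 1 when the packet should enter the POST_ROUTING hooks,
 * 0 when it was already rerouted and should go straight to okfn.
 */
static int post_routing_cond(const struct ip_cb *cb)
{
    return !(cb->flags & IPSKB_REROUTED);
}
```

Other flag bits do not affect the result; only the rerouted bit decides whether the hooks are bypassed.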

If the packet does enter the Netfilter framework, it is matched at the nf_hooks[2][4] hook point.

4) The ip_local_deliver function, also in net/ipv4/ip_input.c. This function handles all packets whose destination address is the local host, and it cuts into the framework as follows:

    NF_HOOK(PF_INET, NF_IP_LOCAL_IN, skb, skb->dev, NULL, ip_local_deliver_finish);

Packets sent to the local host first go to the nf_hooks[2][1] hook point to check whether any callback functions are registered for them. If so, matching and the corresponding actions are carried out, and whether ip_local_deliver_finish is executed is then decided by the return value.

5) The ip_push_pending_frames function in net/ipv4/ip_output.c. This function combines the pending IP fragments into one complete IP packet and sends it out. Its entry point into the Netfilter framework is:

    NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, skb->dst->dev, dst_output);

Every packet sent by the local host first goes to Netfilter's nf_hooks[2][3] hook point for filtering. In general, whether on a router or a PC, few people restrict the packets their own machine sends, because the risk is obvious: an inappropriate setting can easily break some service. Intercepting packets at this hook point is therefore quite rare, although special requirements cannot be ruled out.

Summary: the hook mechanism of the Netfilter framework in the Linux kernel can be summarized as follows:

As a packet flows through the kernel protocol stack, at each of the predefined key points PRE_ROUTING, LOCAL_IN, FORWARD, LOCAL_OUT, and POST_ROUTING the stack checks, based on the packet's protocol family (PF_INET), whether any hook functions are registered at that point. If not, it goes straight to the function that the okfn pointer refers to and continues normal protocol stack processing. If so, it calls nf_hook_slow to enter the Netfilter framework and invoke the hook functions registered at that hook point, and then decides, based on the return value, whether to go on and execute the function that okfn points to.

Not finished, to be continued ...
