Improving the NAT implementation mechanism in Linux

Source: Internet
Author: User

A little grumbling and a little hope

I have long been dissatisfied with Linux NAT. I patched it in the article series "How the Linux system smooth effective NAT", and I also wrote some patches that imitate the Cisco implementation, but the results were not good. It was a night of torrential rain, the second-to-last night of the long holiday. The rainfall was not as heavy as on the night of October 7, but from the night of October 6 to the early morning of the 7th, the rain over in Jiading, Shanghai could already be called a rainstorm. I had finally finished season three of Spartacus, which I had long wanted to watch but never had time for. The heavier the rain, the more excited I got, but I had also finished Barabási's "Linked" and the last volume of "The Story of the Romans", and "The Black Swan" had not arrived yet, so the only thing left to do was write some code... Half a bottle of Zhuyeqing liquor kept me company until dawn. I modified several kernel source files, debugged for a few hours, napped for two hours, then got up to buy fresh meat, vegetables and seafood, because on the last day of the holiday we were having hotpot together at home. The hotpot was great, with the rain pouring outside and the room full of steam... And so it rained all the way through to the next morning.

October 7 was the first day back at work. I left home at the usual time but did not reach the company until after 12. Coming out of the subway I had to wade through knee-deep water. Rounding a corner, I suddenly noticed dark yellow things floating by, and a stench hit me. For safety the pedestrians, all strangers, bunched together, and I was in the front row... I heard that something had gone wrong with a nearby toilet and that sewage had welled up from underground... Go on, stop, or turn back? Had I been alone I would certainly have turned back. But behind me were two young women, quite fashionable and pretty, who said they would cross first and wash afterwards, and a man in a suit, whose trousers were too tight to roll up, was also determined to cross... How could I be the one to back down? Just imagine... I really did not want to wade through that muddy water!! ...

I have rambled on too long about trifles that have nothing to do with work. On to the point!

The role of alloc_null_binding

The Linux NAT implementation is built on ip_conntrack; I have said this more times than I can count. Everything is implemented in Netfilter hook functions. The logic is not complicated, but there is one small point of interest: even a flow that matches no NAT rule at all still gets a null_binding. A null binding constructs a range from the flow's original source and destination IP addresses and then performs the translation against that range. It looks useless, but it really does matter.

Where is it useful? Note that a null binding only guarantees that the IP addresses do not change; the port may still be changed. Why would you change the port of a flow that has nothing to do with NAT? Because a NATed flow, once translated, may end up occupying a five-tuple that a non-NAT flow would otherwise use. ip_conntrack does not keep NATed and non-NATed flows apart, NAT works by changing five-tuples, and five-tuples must be unique across the whole conntrack table. So even if only one flow is NATed, it may take a tuple element needed by some other flow and trigger a ripple effect. Every flow therefore has to go through the uniqueness check and, if necessary, be updated. That is the job of alloc_null_binding.
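To make the uniqueness argument concrete, here is a minimal user-space sketch in C. It is not the kernel's code: struct tuple5, tuple_in_use and null_binding are hypothetical names invented for illustration, the "conntrack table" is a plain array, and the port search is a naive linear scan (the real code works on struct nf_conntrack_tuple and checks the reply direction). It only shows the idea that a null binding keeps both addresses and touches the source port solely when the tuple is already taken.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified five-tuple (assumption, not a kernel type). */
struct tuple5 {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

static bool tuple_equal(const struct tuple5 *a, const struct tuple5 *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->proto == b->proto;
}

/* Linear search standing in for the conntrack hash lookup. */
static bool tuple_in_use(const struct tuple5 *table, size_t n,
                         const struct tuple5 *t)
{
    for (size_t i = 0; i < n; i++)
        if (tuple_equal(&table[i], t))
            return true;
    return false;
}

/*
 * Toy "null binding": both IP addresses are kept; the source port is
 * changed only if the tuple collides with an existing entry.
 * Returns false if no free port exists in [t->src_port, 65535].
 */
static bool null_binding(const struct tuple5 *table, size_t n,
                         struct tuple5 *t)
{
    for (uint32_t port = t->src_port; port <= 65535; port++) {
        struct tuple5 candidate = *t;

        candidate.src_port = (uint16_t)port;
        if (!tuple_in_use(table, n, &candidate)) {
            *t = candidate;
            return true;
        }
    }
    return false;
}
```

A flow whose tuple is free passes through unchanged; a flow whose tuple was already claimed (for example by a NATed flow) leaves with the same addresses but a different source port.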

Abolishing flow-head (first-packet) NAT matching altogether

If you only configure Linux NAT and never dig into it, you may not realize that NAT rules are consulted only for the first packet of a flow; more precisely, only while the ip_conntrack structure for that flow has just been created and has not yet been confirmed. (The qualification matters because an ip_conntrack entry can time out, after which the next packet creates a fresh, unconfirmed entry.) Once such a packet leaves the protocol stack, the flow is confirmed, and every later packet of the same flow directly reuses the NAT result stored in the flow's ip_conntrack structure.
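The behaviour just described can be mimicked with a small user-space sketch in C. Everything in it is an assumption made for illustration: struct flow stands in for the conntrack entry, snat_rule_ip for the whole NAT rule table, and rule_lookups is a counter added only so the effect can be observed. It is not how the kernel stores any of this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-flow conntrack entry. */
struct flow {
    bool     confirmed;   /* set once the first packet has left the stack */
    bool     nat_done;    /* a NAT decision is cached in this entry */
    uint32_t nat_src_ip;  /* cached translated source address, 0 = unchanged */
};

/* Hypothetical rule table reduced to one SNAT address; 0 means "no rule". */
static uint32_t snat_rule_ip;

/* Counts how often the rule table is consulted. */
static unsigned int rule_lookups;

/* Returns the source address the packet leaves the stack with. */
static uint32_t process_packet(struct flow *f, uint32_t orig_src_ip)
{
    if (!f->confirmed && !f->nat_done) {
        /* First packet of the flow: the only time the rules are consulted. */
        rule_lookups++;
        f->nat_src_ip = snat_rule_ip;
        f->nat_done = true;
    }
    f->confirmed = true;  /* the packet leaves the stack */

    /* Every later packet just reuses the cached result. */
    return f->nat_src_ip ? f->nat_src_ip : orig_src_ip;
}
```

If snat_rule_ip is set only after a flow's first packet has been processed, that flow keeps leaving with its original address, while a flow created afterwards picks up the new rule.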

Because of this, a NAT rule added midway cannot take effect immediately for existing flows, and existing NAT results cannot be modified. This statefulness causes a lot of problems. In the earlier article series "How the Linux system smooth effective NAT" I made some corrective patches. The problem with those patches is that they are still minor fixes built on the principle that NAT rules are matched at the head of the flow. We know that piecemeal fixes like that end up unmaintainable, so why not overturn the approach: stop treating flow-head matching as the principle, and instead think about when matching should happen and on what basis. This is really a generalization, in which flow-head matching becomes one special case of the new matching principle.

Having abolished the flow-head matching principle, I decided to leave the decision of when to perform NAT to the application. To that end I register a sysctl variable: when it is non-zero, NAT rule matching is performed whether or not the flow has been confirmed.

When NAT rules need to be matched

Since the flow-head matching principle is flawed and causes problems (for example, a confirmed connection stuck forever without NAT), I have to say when NAT matching is actually necessary; tearing down the old principle means setting up a new one. NAT needs to be performed in the following situations:

1. The flow is trying to connect but, because NAT has not been applied, has failed to connect for a long time. The CT state of the flow is still NEW;

2. The flow has connected successfully, but its source address needs to change (changing the destination address would mean reconnecting to a new service). The CT state of the flow is ESTABLISHED;

3. The flow has already been NATed and connected, but the NAT rules have changed. The CT state of the flow is ESTABLISHED.

Which situations must not perform NAT

Not every situation in which NAT matching seems warranted is actually suitable for performing NAT. We need to consider not only ip_conntrack itself but also the semantics of the protocol. Take TCP: because TCP identifies an existing connection strictly by its five-tuple, modifying any element means the connection no longer exists. So:

1. For connection-oriented layer-4 protocols such as TCP, only a flow in the NEW state may have NAT performed on it. A non-NEW state means that a reply from the peer has already been received, so performing NAT is meaningless;

2. If a packet of the flow has already been NATed and the NAT rules have not changed, the reply-direction five-tuple has already been rewritten, so there is no need to consult the NAT rule table for every packet.

What can be done and what must not be done

What can be done is usually optional: you may do it or not. What must not be done, however, is essentially strictly forbidden: doing it either has serious consequences or, at best, is completely wasted work. The world is asymmetrical in that way. So, of the two sections above, I hand control over the cases where NAT rules need to be matched to the application, which is why I export a sysctl interface, while the cases in which NAT must not be performed are enforced by the kernel.
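The split of responsibility can be summarized in a few lines of user-space C. This is only a sketch of the decision under stated assumptions: enum ct_state, renat_veto_fn, tcp_renat_veto and should_match_nat_rules are hypothetical names, force_nat plays the role of the application-controlled switch, and the veto callback plays the role of the kernel-side protocol check.

```c
#include <stdbool.h>
#include <stddef.h>

enum ct_state { CT_NEW, CT_ESTABLISHED };

/* Hypothetical per-protocol hook: returns true when re-running NAT
 * must be refused for a flow in the given state. */
typedef bool (*renat_veto_fn)(enum ct_state state);

/* TCP is tied to its five-tuple once a reply has been seen. */
static bool tcp_renat_veto(enum ct_state state)
{
    return state != CT_NEW;
}

/*
 * Decide whether the NAT rule table should be consulted for a packet.
 *  - NEW flows: match if NAT has not been set up yet, or if the
 *    application forces it.
 *  - Other flows: match only if the application forces it AND the
 *    protocol (if it registered a hook) does not veto.
 */
static bool should_match_nat_rules(enum ct_state state, bool nat_initialized,
                                   int force_nat, renat_veto_fn veto)
{
    if (state == CT_NEW)
        return !nat_initialized || force_nat != 0;
    if (force_nat == 0)
        return false;                 /* the application did not ask */
    return veto == NULL || !veto(state);
}
```

With force_nat set, an established TCP flow is still left alone because of the veto, whereas a protocol without a hook (veto == NULL) gets its rules re-matched.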

Code implementation

Everything described above is implemented in the code below. I have not posted a standard patch in this article, because a patch is meant to be read by the patch program; for a human reader a pile of +++ and --- lines is distracting. So I use a different convention: a code segment surrounded by ////////////////////////// is code I added, and a code segment surrounded by /////////////######## is code I modified. The structure of this section is:

{{filename \n code snippet \n description}, ...}:

include/net/netfilter/nf_nat.h

/* Chosen to stay clear of the existing ip_conntrack_status enum members, though 13 is a rather weighty number. */
#define NF_FORCE_NAT_BIT 13

Description: Adds a new CT status bit that indicates whether NAT matching should be performed.

include/net/netfilter/nf_conntrack_l4proto.h

struct nf_conntrack_l4proto
{
    ...
    int (*can_force_nat)(struct nf_conn *ct, struct sk_buff *skb);
    ...
};

Description: The nf_conntrack_l4proto structure gains a can_force_nat callback, which lets the layer-4 protocol itself decide whether NAT may be performed again, instead of making that decision in the ip_conntrack and NAT logic.

net/netfilter/nf_conntrack_proto_tcp.c

//////////////////////////
static int nf_ct_can_force_nat(struct nf_conn *ct, struct sk_buff *skb)
{
    /* Nothing to decide for TCP: a non-zero return makes nf_nat_fn skip the re-NAT. */
    return 1;
}
//////////////////////////
...
struct nf_conntrack_l4proto nf_conntrack_l4proto_tcp4 __read_mostly =
{
    ...
//////////////////////////
    .can_force_nat = nf_ct_can_force_nat,
//////////////////////////
    ...
};

Description: Adds the nf_ct_can_force_nat callback, which indicates that NAT cannot be re-run for TCP in the ESTABLISHED state.

net/ipv4/netfilter/nf_nat_standalone.c

//////////////////////////
#ifdef CONFIG_SYSCTL
/* New user-space sysctl interface, located at /proc/sys/net/ipv4/netfilter/nf_force_nat */
static struct ctl_table_header *nat_sysctl_header;
static unsigned int nf_force_nat __read_mostly = 0;
static struct ctl_table nf_nat_sysctl_table[] = {
    {
        .procname     = "nf_force_nat",
        .data         = &nf_force_nat,
        .maxlen       = sizeof(unsigned int),
        .mode         = 0644,
        /* The value is a plain flag, not a time interval, so use the plain integer handler. */
        .proc_handler = proc_dointvec,
    },
    {
        .ctl_name = 0
    }
};
#endif
//////////////////////////

...

static unsigned int
nf_nat_fn(unsigned int hooknum,
          struct sk_buff *skb,
          const struct net_device *in,
          const struct net_device *out,
          int (*okfn)(struct sk_buff *))
{
    ...
//////////////////////////
#ifdef CONFIG_SYSCTL
    /* Mirror the sysctl switch into the per-flow status bit. */
    if (nf_force_nat != 0) {
        set_bit(NF_FORCE_NAT_BIT, &ct->status);
    } else {
        clear_bit(NF_FORCE_NAT_BIT, &ct->status);
    }
#else
    clear_bit(NF_FORCE_NAT_BIT, &ct->status);
#endif
//////////////////////////

    switch (ctinfo) {
    case IP_CT_RELATED:
    ...
    case IP_CT_NEW:
/////////////########
    /* Added a label so the ESTABLISHED branch can jump back here */
    renat:
        /* Seen it before?  This can happen for loopback, retrans,
           or local packets.. */
        /* Added a second condition under which NAT is allowed */
        if (!nf_nat_initialized(ct, maniptype) || test_bit(NF_FORCE_NAT_BIT, &ct->status))
/////////////########
        {
            unsigned int ret;
            ...
    default:
        /* ESTABLISHED */
        NF_CT_ASSERT(ctinfo == IP_CT_ESTABLISHED ||
                     ctinfo == (IP_CT_ESTABLISHED+IP_CT_IS_REPLY));
//////////////////////////
        if (test_bit(NF_FORCE_NAT_BIT, &ct->status)) {
            struct nf_conntrack_l3proto *l3proto;
            struct nf_conntrack_l4proto *l4proto;
            unsigned int dataoff;
            u_int8_t protonum;
            int ret;

            l3proto = __nf_ct_l3proto_find(NFPROTO_IPV4);
            ret = l3proto->get_l4proto(skb, skb_network_offset(skb),
                                       &dataoff, &protonum);
            l4proto = __nf_ct_l4proto_find(NFPROTO_IPV4, protonum);
            /*
             * Strictly speaking, the layer-4 protocol itself should decide
             * whether NAT can be forced, but that would mean changing the
             * conn-level callbacks.
             */
            if (l4proto->can_force_nat == NULL ||
                !l4proto->can_force_nat(ct, skb)) {
                goto renat;
            }
        }
//////////////////////////
    }
    ...
}

...

static int __init nf_nat_standalone_init(void)
{
    ...
//////////////////////////
#ifdef CONFIG_SYSCTL
    nat_sysctl_header = register_sysctl_paths(nf_net_ipv4_netfilter_sysctl_path, nf_nat_sysctl_table);
    if (nat_sysctl_header == NULL) {
        printk("nf_nat_init: can't register nat_sysctl\n");
        goto cleanup_rule_init;
    }
#endif
//////////////////////////
    return ret;
    ...
}
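On a kernel that carries this patch, the switch could be flipped from user space by writing to the /proc file named in the sysctl comment above. The following is a minimal sketch in plain C, assuming that path exists; write_sysctl_int and read_sysctl_int are hypothetical helper names, and nothing here is specific to netfilter beyond the path itself.

```c
#include <stdio.h>

/* Path taken from the patch's comment; it exists only on a patched kernel. */
#define NF_FORCE_NAT_PATH "/proc/sys/net/ipv4/netfilter/nf_force_nat"

/* Writes an integer to a sysctl-style file. Returns 0 on success, -1 on error. */
static int write_sysctl_int(const char *path, int value)
{
    FILE *fp = fopen(path, "w");
    int ok;

    if (fp == NULL)
        return -1;
    ok = fprintf(fp, "%d\n", value) > 0;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Reads an integer back from a sysctl-style file. Returns 0 on success, -1 on error. */
static int read_sysctl_int(const char *path, int *value)
{
    FILE *fp = fopen(path, "r");
    int ok;

    if (fp == NULL)
        return -1;
    ok = fscanf(fp, "%d", value) == 1;
    fclose(fp);
    return ok ? 0 : -1;
}
```

Calling write_sysctl_int(NF_FORCE_NAT_PATH, 1) as root would turn forced matching on, and writing 0 would restore the usual flow-head behaviour; since the flag applies to every flow while it is set, it is presumably meant to be switched on only briefly, right after the rules change.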
