Implementing a usable stateless, bidirectional, static NAT module on Linux

Source: Internet
Author: User

There is an overwhelming amount of material on how to configure NAT on Linux. This article has nothing to do with any of it; it describes an approach that lives outside iptables.
Why not iptables? Because NAT configured through iptables is stateful: its implementation relies on a module called conntrack. What is conntrack? That happens to be my specialty, but I do not want to go into it here; people who know me know I can talk about it for 12 hours and still not be finished. You may not know what stateful NAT is, but if you are a conscientious or experienced Linux network administrator or enthusiast, you have surely run into questions like these when configuring NAT: "Why does a NAT rule not take effect right away for a connection that is already established?" and "Why, after configuring NAT in iptables, can traffic only be initiated from one direction?" That is the state making mischief. IP itself is stateless, but this NAT is bound to connection state that includes layer-4 information, which makes it stateful NAT, and that is what iptables -t nat gives you. It is inherent in how that NAT works, and no configuration will change it. At least, the NAT in the latest version of iptables I have seen is still stateful.
But sometimes you may need to, or rather you must ...
You must configure a NAT that is stateless, bidirectional and static. Ah, this problem.
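To make "stateless, bidirectional, static" concrete, here is a minimal user-space sketch in C. It is not kernel code and not the module itself; the names static_map, pkt, nat_in and nat_out and the example addresses are assumptions chosen for illustration. The point is that every packet is translated by looking only at a fixed table, so either side may send first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build a host-order IPv4 address from its four octets. */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* One static rule (hypothetical layout): 'outside' is the address
 * peers see, 'inside' is the address of the real host. */
struct static_map {
	uint32_t outside;
	uint32_t inside;
};

/* Only the two addresses of a packet matter for this sketch. */
struct pkt {
	uint32_t saddr;
	uint32_t daddr;
};

static const struct static_map rules[] = {
	{ IP4(1, 2, 1, 2), IP4(192, 168, 1, 8) },
};
#define NRULES (sizeof(rules) / sizeof(rules[0]))

/* Inbound: rewrite the destination, outside -> inside.
 * Returns 1 if the packet was translated, 0 otherwise. */
int nat_in(struct pkt *p)
{
	size_t i;

	for (i = 0; i < NRULES; i++) {
		if (p->daddr == rules[i].outside) {
			p->daddr = rules[i].inside;
			return 1;
		}
	}
	return 0;
}

/* Outbound: rewrite the source, inside -> outside. */
int nat_out(struct pkt *p)
{
	size_t i;

	for (i = 0; i < NRULES; i++) {
		if (p->saddr == rules[i].inside) {
			p->saddr = rules[i].outside;
			return 1;
		}
	}
	return 0;
}

/* Each call depends only on the table and the packet itself. */
void demo_stateless(void)
{
	struct pkt from_inside = { IP4(192, 168, 1, 8), IP4(8, 8, 8, 8) };
	struct pkt from_outside = { IP4(8, 8, 8, 8), IP4(1, 2, 1, 2) };

	/* The inside host sends first; no earlier inbound packet is needed. */
	assert(nat_out(&from_inside) == 1);
	assert(from_inside.saddr == IP4(1, 2, 1, 2));

	/* The outside peer can just as well be the one to start. */
	assert(nat_in(&from_outside) == 1);
	assert(from_outside.daddr == IP4(192, 168, 1, 8));
}
```

A conntrack-based NAT would instead create a per-connection record on the first packet and translate later packets according to that record, which is where the two symptoms quoted above come from.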
This problem tormented me for half a year. The first nine months of 2013 were three quarters of joy and worry in which almost all my energy went into one thing, from winter through summer heat above 40 degrees Celsius, going to work at 6:30 a.m. and still sitting in the machine room after 10 at night ... If I had not found, while tidying my bookshelf the day before yesterday, 120 yuan of taxi receipts from routine overtime that I had never claimed, I would not have thought of writing this module. 120 yuan is nothing, but taking the chance to recall the past and fill in what was left unfinished is a reimbursement to myself, and it is worth far more than 120 yuan. I have to admit that stateless NAT was not the most important thing in those three quarters. I bring it up a year later because I overcame every other problem back then, however long it took: there was the conntrack confirm problem that took 72 hours and nearly had me in tears, and there was the time I, frazzled and impatient, went out for barbecue with a woman I did not know and my wife tricked the truth out of me ... But stateless NAT alone stayed unresolved. Why?
I was building a product, not running a personal experiment; I was on a company team, not doing private work. Every technology used had to be pre-researched to ensure feasibility and, more importantly, to keep everyone in the same rhythm. The parts may run in parallel, but the main melody is always one, like playing a canon. I could not add personal touches such as a whim of my own (I confess I did not always manage this!). If you want to add something, you have to go through the following process:
Bring the leaders and team owners into meetings and training, and make sure every developer and technical person understands the technology being used, with no blind spots.
But where would the time come from?! People are fragile, however tough life makes them. Anyone can suddenly be gone: someone dies, someone moves to another city or country for reasons beyond their control, a colleague you barely know quits after a quarrel over fetching water ... If what you built vanishes with you, then for the company it is as if you had never been there. ... But I digress.
So even though I could have implemented stateless NAT at the time (I used the ROUTE target approach, which I may have written about before), I could not use it. I could only wait for a leisurely afternoon with no urgent tasks and get everyone involved to understand the technique; whether to use it after that is a question of requirements. The best way to explain is still code and more code. For programmers, a gigabyte of words is nonsense, and anything without running code is empty talk. Even bad code is fine. That realism is why I like this profession: no tough talk, no sweeping theory, no fighting, no agitating. What counts is that the code runs, and nothing else.
By the way, stateless NAT had a good implementation in Linux 2.4, done with tc and policy routing, but it is hard to do on a 2.6 kernel. Perhaps, for the sake of decoupling, the kernel authors left this module to whoever really wants to implement it; perhaps that is me. I do many things that others find pointless because I want to express an idea, the idea of "development for operations" and "operations for development"; people who can do both will surely be the most sought after in the future. When I faced heavy criticism from Cisco certified engineers, my first thought was not to curse them or fight them (few people have both the vocabulary and [note: "and", not "or"] the strength for that, and those are not the only obstacles ...). My first thought was: can Linux really not do the same thing? The aim was not to humiliate them but to let them know that I could. Fortunately I did it. When I got home I would write a few modules and test them, just like the company's technical pre-research. Then I gave them to friends, and anyone interested could test them. There is no money in this kind of thing and no recognition either; it is not pushed to git and it is not some GPL open source project, just something shared within a particular circle of friends. I dislike classifying things; I much prefer being casual. And so, as the saying goes, I coded. I wrote a pile of rotten code: a bidirectional, static, stateless NAT for Linux built on netfilter. The code is not complex, only a few hundred lines, but ...
However, there are two caveats:
1. The module is rudimentary but usable; this is the self-affirming side.
2. The module has plenty of questionable spots and room for improvement; this is the self-critical side.

The code is as follows:
/*
 * Usage:
 * Translate the destination address of packets addressed to 1.2.1.2
 * to 192.168.1.8:
 *   echo +1.2.1.2 192.168.1.8 dst >/proc/net/static_nat
 * The command above also adds the reverse SNAT mapping.
 *
 * Exercise, explain what this does:
 *   echo +192.168.184.250 192.168.184.154 src >/proc/net/static_nat
 *
 * Note: the code uses older kernel interfaces (for example
 * nf_conntrack_untracked, skb->nfct and nf_register_hooks), so it
 * builds only on kernels that still provide them.
 */
#include <linux/module.h>
#include <linux/skbuff.h>
#include <linux/slab.h>
#include <linux/jhash.h>
#include <linux/inet.h>
#include <linux/proc_fs.h>
#include <linux/uaccess.h>
#include <linux/tcp.h>
#include <linux/udp.h>
#include <linux/netfilter_ipv4.h>
#include <net/ip.h>
#include <net/netfilter/nf_conntrack.h>

#define DIRMASK			0x11
#define BUCKETS			1024
#define NAT_OPT_DEL		0x01
#define NAT_OPT_FIND		0x04
#define NAT_OPT_ACCT_BIT	0x02

enum nat_dir {
	DIR_SNAT,
	DIR_DNAT,
	DIR_NUM
};

/*
 * Statistics
 */
struct nat_account {
	u32 nat_packets;
	u32 nat_bytes;
};

/* One entry is linked into both tables: node[DIR_SNAT] and node[DIR_DNAT]. */
struct static_nat_entry {
	__be32 addr[DIR_NUM];
	enum nat_dir type;
	struct nat_account acct[DIR_NUM];
	struct hlist_node node[DIR_NUM];
};

static DEFINE_SPINLOCK(nat_lock);
/* SNAT mappings */
struct hlist_head *src_list;
/* DNAT mappings */
struct hlist_head *dst_list;

/*
 * Look up a value using an IP address as the key
 * (daddr on PREROUTING, saddr on POSTROUTING).
 */
static __be32 get_address_from_map(struct sk_buff *skb, unsigned int dir,
				   __be32 addr_key, unsigned int opt)
{
	__be32 ret = 0, cmp_key, ret_value;
	u32 hash;
	struct hlist_head *list;
	struct hlist_node *iter, *tmp;
	struct static_nat_entry *ent;

	hash = jhash_1word(addr_key, 1);
	hash = hash % BUCKETS;
	spin_lock(&nat_lock);
	if (dir == DIR_DNAT) {
		list = &dst_list[hash];
	} else if (dir == DIR_SNAT) {
		list = &src_list[hash];
	} else {
		spin_unlock(&nat_lock);
		goto out;
	}
	hlist_for_each_safe(iter, tmp, list) {
		ent = hlist_entry(iter, struct static_nat_entry, node[dir]);
		/* Note the reversal for the automatically generated direction */
		cmp_key = (ent->type == dir) ? ent->addr[0] : ent->addr[1];
		ret_value = (ent->type == dir) ? ent->addr[1] : ent->addr[0];
		if (addr_key == cmp_key) {
			ret = ret_value;
			if (opt == NAT_OPT_DEL) {
				if (dir == ent->type) {
					hlist_del(&ent->node[0]);
					hlist_del(&ent->node[1]);
					kfree(ent);
				} else {
					ret = 0;
				}
			}
			if (opt & NAT_OPT_ACCT_BIT) {
				ent->acct[dir].nat_packets++;
				ent->acct[dir].nat_bytes += skb == NULL ? 1 : skb->len;
			}
			break;
		}
	}
	spin_unlock(&nat_lock);
out:
	return ret;
}

/*
 * Update the layer-4 checksum
 */
static void nat4_update_l4(struct sk_buff *skb, __be32 oldip, __be32 newip)
{
	struct iphdr *iph = ip_hdr(skb);
	void *transport_hdr = (void *)iph + ip_hdrlen(skb);
	struct tcphdr *tcph;
	struct udphdr *udph;
	bool cond;

	switch (iph->protocol) {
	case IPPROTO_TCP:
		tcph = transport_hdr;
		inet_proto_csum_replace4(&tcph->check, skb, oldip, newip, true);
		break;
	case IPPROTO_UDP:
	case IPPROTO_UDPLITE:
		udph = transport_hdr;
		cond = udph->check != 0;
		cond |= skb->ip_summed == CHECKSUM_PARTIAL;
		if (cond) {
			inet_proto_csum_replace4(&udph->check, skb, oldip, newip, true);
			if (udph->check == 0) {
				udph->check = CSUM_MANGLED_0;
			}
		}
		break;
	}
}

/*
 * Source address translation on POSTROUTING:
 * 1. forward source address translation;
 * 2. the reverse source translation of a destination translation.
 */
static unsigned int ipv4_nat_out(unsigned int hooknum,
				 struct sk_buff *skb,
				 const struct net_device *in,
				 const struct net_device *out,
				 int (*okfn)(struct sk_buff *))
{
	unsigned int ret = NF_ACCEPT;
	__be32 to_trans = 0;
	struct iphdr *hdr = ip_hdr(skb);

	to_trans = get_address_from_map(skb, DIR_SNAT, hdr->saddr,
					NAT_OPT_FIND | NAT_OPT_ACCT_BIT);
	if (!to_trans) {
		goto out;
	}
	if (hdr->saddr == to_trans) {
		goto out;
	}
	/* Do the SNAT */
	csum_replace4(&hdr->check, hdr->saddr, to_trans);
	nat4_update_l4(skb, hdr->saddr, to_trans);
	hdr->saddr = to_trans;
out:
	return ret;
}

/*
 * Destination address translation on PREROUTING:
 * 1. forward destination address translation;
 * 2. the reverse destination translation of a source translation.
 */
static unsigned int ipv4_nat_in(unsigned int hooknum,
				struct sk_buff *skb,
				const struct net_device *in,
				const struct net_device *out,
				int (*okfn)(struct sk_buff *))
{
	unsigned int ret = NF_ACCEPT;
	__be32 to_trans = 0;
	struct iphdr *hdr = ip_hdr(skb);

	if (skb->nfct && skb->nfct != &nf_conntrack_untracked.ct_general) {
		goto out;
	}
	to_trans = get_address_from_map(skb, DIR_DNAT, hdr->daddr,
					NAT_OPT_FIND | NAT_OPT_ACCT_BIT);
	if (!to_trans) {
		goto out;
	}
	if (hdr->daddr == to_trans) {
		goto out;
	}
	/* Do the DNAT */
	csum_replace4(&hdr->check, hdr->daddr, to_trans);
	nat4_update_l4(skb, hdr->daddr, to_trans);
	hdr->daddr = to_trans;
	/*
	 * Mark the packet as untracked so that conntrack neither tracks
	 * nor NATs it. This is the right thing to do: this is static,
	 * stateless NAT and we do not want state to get involved.
	 */
	/*
	 * In fact the main purpose is not to avoid conntrack-based NAT:
	 * conntrack itself does not let you modify the tuples of the two
	 * directions arbitrarily.
	 */
	if (!skb->nfct) {
		skb->nfct = &nf_conntrack_untracked.ct_general;
		skb->nfctinfo = IP_CT_NEW;
		nf_conntrack_get(skb->nfct);
	}
out:
	return ret;
}

static struct nf_hook_ops ipv4_nat_ops[] __read_mostly = {
	{
		.hook		= ipv4_nat_in,
		.owner		= THIS_MODULE,
		.pf		= NFPROTO_IPV4,
		.hooknum	= NF_INET_PRE_ROUTING,
		.priority	= NF_IP_PRI_CONNTRACK - 1,
	},
	{
		.hook		= ipv4_nat_out,
		.owner		= THIS_MODULE,
		.pf		= NFPROTO_IPV4,
		.hooknum	= NF_INET_POST_ROUTING,
		.priority	= NF_IP_PRI_CONNTRACK + 1,
	},
};

/* Parse "from to rest"; returns a pointer to the text after the second space. */
static char *parse_addr(const char *input, __be32 *from, __be32 *to)
{
	char *p1, *p2;
	size_t length = strlen(input);

	if (!(p1 = memchr(input, ' ', length))) {
		return NULL;
	}
	if (!(p2 = memchr(p1 + 1, ' ', length - (p1 + 1 - input)))) {
		return NULL;
	}
	if (!(in4_pton(input, p1 - input, (u8 *)from, ' ', NULL))
	    || !(in4_pton(p1 + 1, p2 - p1 - 1, (u8 *)to, ' ', NULL))) {
		return NULL;
	}
	return ++p2;
}

static ssize_t static_nat_config_write(struct file *file, const char __user *buffer,
				       size_t count, loff_t *unused)
{
	ssize_t ret = 0;
	size_t length = count;
	__be32 from, to;
	u32 normal, reverse;
	char *buf = NULL;
	char *p;
	struct static_nat_entry *ent;

	if (!length) {
		goto out;
	}
	buf = kzalloc(length + 1, GFP_KERNEL);
	if (!buf) {
		ret = -ENOMEM;
		goto out;
	}
	/* The original read the user pointer directly; copy it into the kernel first. */
	if (copy_from_user(buf, buffer, length)) {
		ret = -EFAULT;
		goto out;
	}
	/* Strip trailing non-printable characters such as the newline from echo. */
	while (length && (buf[length - 1] < 32 || buf[length - 1] > 126)) {
		buf[--length] = '\0';
	}
	if (!length) {
		ret = -EINVAL;
		goto out;
	}
	if (!(p = parse_addr(buf + 1, &from, &to))) {
		ret = -EINVAL;
		goto out;
	}
	if ('+' == *buf) {
		ent = kzalloc(sizeof(struct static_nat_entry), GFP_KERNEL);
		if (!ent) {
			ret = -ENOMEM;	/* the original returned -EFAULT here */
			goto out;
		}
		/* Hash bucket of the entry as given */
		normal = jhash_1word(from, 1);
		normal = normal % BUCKETS;
		/* Hash bucket of the reversed entry */
		reverse = jhash_1word(to, 1);
		reverse = reverse % BUCKETS;
		/*
		 * Set the key/value pair.
		 * Note that for the node of the reverse type the key and
		 * value are swapped as well.
		 */
		ent->addr[0] = from;
		ent->addr[1] = to;
		/* Initialize the list nodes */
		INIT_HLIST_NODE(&ent->node[DIR_SNAT]);
		INIT_HLIST_NODE(&ent->node[DIR_DNAT]);
		if (strstr(p, "src")) {
			/* Add an SNAT entry; the DNAT entry is generated automatically */
			/* First check whether it already exists */
			if (get_address_from_map(NULL, DIR_SNAT, from, NAT_OPT_FIND) ||
			    get_address_from_map(NULL, DIR_SNAT, to, NAT_OPT_FIND)) {
				ret = -EEXIST;
				kfree(ent);
				goto out;
			}
			/* The entry type distinguishes the configured item from the generated one */
			ent->type = DIR_SNAT;
			/* Link it into both tables */
			spin_lock(&nat_lock);
			hlist_add_head(&ent->node[DIR_SNAT], &src_list[normal]);
			hlist_add_head(&ent->node[DIR_DNAT], &dst_list[reverse]);
			spin_unlock(&nat_lock);
		} else if (strstr(p, "dst")) {
			/* Add a DNAT entry; the SNAT entry is generated automatically */
			/* First check whether it already exists */
			if (get_address_from_map(NULL, DIR_DNAT, from, NAT_OPT_FIND) ||
			    get_address_from_map(NULL, DIR_DNAT, to, NAT_OPT_FIND)) {
				ret = -EEXIST;
				kfree(ent);
				goto out;
			}
			/* The entry type distinguishes the configured item from the generated one */
			ent->type = DIR_DNAT;
			/* Link it into both tables */
			spin_lock(&nat_lock);
			hlist_add_head(&ent->node[DIR_DNAT], &dst_list[normal]);
			hlist_add_head(&ent->node[DIR_SNAT], &src_list[reverse]);
			spin_unlock(&nat_lock);
		} else {
			ret = -EINVAL;	/* unknown direction; the original returned -EFAULT */
			kfree(ent);
			goto out;
		}
	} else if ('-' == *buf) {
		u32 r1;

		if (strstr(p, "src")) {
			r1 = get_address_from_map(NULL, DIR_SNAT, from, NAT_OPT_DEL);
			if (r1 == 0) {
				ret = -ENOENT;
				goto out;
			}
		} else if (strstr(p, "dst")) {
			r1 = get_address_from_map(NULL, DIR_DNAT, from, NAT_OPT_DEL);
			if (r1 == 0) {
				ret = -ENOENT;
				goto out;
			}
		} else {
			/* The original silently ignored an unknown direction. */
			ret = -EINVAL;
			goto out;
		}
	} else {
		ret = -EINVAL;
		goto out;
	}
	ret = count;
out:
	kfree(buf);
	return ret;
}

static ssize_t static_nat_config_read(struct file *file, char __user *buf,
				      size_t count, loff_t *ppos)
{
	int len = 0;
	static int done = 0;
	int i;
	/* 16 bytes: "255.255.255.255" plus the terminating NUL (the original used 15) */
	char from[16], to[16];
	char *kbuf_to_avoid_user_space_memory_page_fault = NULL;
/* Maximum length of one output line */
#define MAX_LINE_CHARS	128

	if (done) {
		done = 0;
		goto out;
	}
	/* Added guard: the checks below assume room for at least one line. */
	if (count < MAX_LINE_CHARS) {
		len = -EINVAL;
		goto out;
	}
	/*
	 * Allocate kernel memory so that we never touch user memory
	 * directly while formatting: a page fault on user memory may
	 * sleep, and the table is walked under a spinlock, so we must
	 * not sleep there.
	 */
	/*
	 * Problem:
	 * only count bytes are allocated here, because this version does
	 * not support multiple reads; everything must be read in one go.
	 * The seq_file interface would be the better approach.
	 */
	kbuf_to_avoid_user_space_memory_page_fault = kzalloc(count, GFP_KERNEL);
	if (!kbuf_to_avoid_user_space_memory_page_fault) {
		len = -ENOMEM;
		done = 1;
		goto out;
	}
	spin_lock(&nat_lock);
	len += sprintf(kbuf_to_avoid_user_space_memory_page_fault + len,
		       "Source trans table:\n");
	if (len + MAX_LINE_CHARS > count) {
		goto copy_now;
	}
	for (i = 0; i < BUCKETS; i++) {
		struct hlist_node *iter, *tmp;
		struct static_nat_entry *ent;

		hlist_for_each_safe(iter, tmp, &src_list[i]) {
			ent = hlist_entry(iter, struct static_nat_entry, node[DIR_SNAT]);
			sprintf(from, "%pI4", (ent->type == DIR_SNAT) ? &ent->addr[0] : &ent->addr[1]);
			sprintf(to, "%pI4", (ent->type == DIR_SNAT) ? &ent->addr[1] : &ent->addr[0]);
			len += sprintf(kbuf_to_avoid_user_space_memory_page_fault + len,
				       "From:%-15s To:%-15s [%s] [Bytes:%u Packet:%u]\n",
				       from, to,
				       (ent->type == DIR_SNAT) ? "STATIC" : "AUTO",
				       ent->acct[DIR_SNAT].nat_bytes,
				       ent->acct[DIR_SNAT].nat_packets);
			if (len + MAX_LINE_CHARS > count) {
				goto copy_now;
			}
		}
	}
	len += sprintf(kbuf_to_avoid_user_space_memory_page_fault + len,
		       "\nDestination trans table:\n");
	if (len + MAX_LINE_CHARS > count) {
		goto copy_now;
	}
	for (i = 0; i < BUCKETS; i++) {
		struct hlist_node *iter, *tmp;
		struct static_nat_entry *ent;

		hlist_for_each_safe(iter, tmp, &dst_list[i]) {
			ent = hlist_entry(iter, struct static_nat_entry, node[DIR_DNAT]);
			sprintf(from, "%pI4", (ent->type == DIR_DNAT) ? &ent->addr[0] : &ent->addr[1]);
			sprintf(to, "%pI4", (ent->type == DIR_DNAT) ? &ent->addr[1] : &ent->addr[0]);
			len += sprintf(kbuf_to_avoid_user_space_memory_page_fault + len,
				       "From:%-15s To:%-15s [%s] [Bytes:%u Packet:%u]\n",
				       from, to,
				       (ent->type == DIR_DNAT) ? "STATIC" : "AUTO",
				       ent->acct[DIR_DNAT].nat_bytes,
				       ent->acct[DIR_DNAT].nat_packets);
			if (len + MAX_LINE_CHARS > count) {
				goto copy_now;
			}
		}
	}
copy_now:
	spin_unlock(&nat_lock);
	done = 1;
	/* The spinlock has been released here, so touching user memory is safe */
	if (copy_to_user(buf, kbuf_to_avoid_user_space_memory_page_fault, len)) {
		len = -EFAULT;	/* the original assigned a positive EFAULT */
		goto out;
	}
out:
	kfree(kbuf_to_avoid_user_space_memory_page_fault);
	return len;
}

static const struct file_operations static_nat_file_ops = {
	.owner	= THIS_MODULE,
	.read	= static_nat_config_read,
	.write	= static_nat_config_write,
};

static int __init nf_static_nat_init(void)
{
	int ret = 0;
	int i;

	src_list = kzalloc(sizeof(struct hlist_head) * BUCKETS, GFP_KERNEL);
	if (!src_list) {
		ret = -ENOMEM;
		goto out;
	}
	dst_list = kzalloc(sizeof(struct hlist_head) * BUCKETS, GFP_KERNEL);
	if (!dst_list) {
		ret = -ENOMEM;
		goto out;
	}
	/* Initialize the buckets before the hooks can see any packet. */
	for (i = 0; i < BUCKETS; i++) {
		INIT_HLIST_HEAD(&src_list[i]);
		INIT_HLIST_HEAD(&dst_list[i]);
	}
	ret = nf_register_hooks(ipv4_nat_ops, ARRAY_SIZE(ipv4_nat_ops));
	if (ret < 0) {
		printk("nf_nat_ipv4: can't register hooks.\n");
		goto out;
	}
	if (!proc_create("static_nat", 0644, init_net.proc_net, &static_nat_file_ops)) {
		ret = -ENOMEM;
		/* The original left the hooks registered on this error path. */
		nf_unregister_hooks(ipv4_nat_ops, ARRAY_SIZE(ipv4_nat_ops));
		goto out;
	}
	return ret;
out:
	kfree(src_list);
	kfree(dst_list);
	return ret;
}

static void __exit nf_static_nat_fini(void)
{
	int i;

	remove_proc_entry("static_nat", init_net.proc_net);
	nf_unregister_hooks(ipv4_nat_ops, ARRAY_SIZE(ipv4_nat_ops));
	spin_lock(&nat_lock);
	/* Every entry is linked into src_list, so one pass frees them all. */
	for (i = 0; i < BUCKETS; i++) {
		struct hlist_node *iter, *tmp;
		struct static_nat_entry *ent;

		hlist_for_each_safe(iter, tmp, &src_list[i]) {
			ent = hlist_entry(iter, struct static_nat_entry, node[0]);
			hlist_del(&ent->node[DIR_SNAT]);
			hlist_del(&ent->node[DIR_DNAT]);
			kfree(ent);
		}
	}
	spin_unlock(&nat_lock);
	kfree(src_list);
	kfree(dst_list);
}

module_init(nf_static_nat_init);
module_exit(nf_static_nat_fini);
MODULE_DESCRIPTION("STATIC two-way NAT");
MODULE_AUTHOR("[email protected]");
MODULE_LICENSE("GPL");
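The module rewrites an address and then patches the IP and layer-4 checksums with csum_replace4 and inet_proto_csum_replace4 instead of recomputing them. The arithmetic behind such an incremental update is described in RFC 1624. The following user-space sketch illustrates it; the helper names csum_full, csum_update16 and csum_update32 and the sample header are made up for this example and are not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Full Internet checksum over n 16-bit words. */
uint16_t csum_full(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += words[i];
	while (sum >> 16)		/* fold the carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* RFC 1624, equation 3: HC' = ~(~HC + ~m + m') for one 16-bit word. */
uint16_t csum_update16(uint16_t check, uint16_t old_val, uint16_t new_val)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~old_val;
	sum += new_val;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* A 32-bit address is simply two 16-bit words updated in turn. */
uint16_t csum_update32(uint16_t check, uint32_t old_ip, uint32_t new_ip)
{
	check = csum_update16(check, (uint16_t)(old_ip >> 16),
			      (uint16_t)(new_ip >> 16));
	return csum_update16(check, (uint16_t)(old_ip & 0xffff),
			     (uint16_t)(new_ip & 0xffff));
}

void demo_csum(void)
{
	/* A made-up 20-byte IPv4 header as ten 16-bit words: word 5 is
	 * the checksum, words 8 and 9 hold the destination 1.2.1.2. */
	uint16_t hdr[10] = {
		0x4500, 0x0054, 0x1c46, 0x4000, 0x4006,
		0x0000, 0x0a00, 0x0001, 0x0102, 0x0102,
	};
	uint32_t old_dst = 0x01020102;	/* 1.2.1.2 */
	uint32_t new_dst = 0xc0a80108;	/* 192.168.1.8 */
	uint16_t incremental;

	hdr[5] = csum_full(hdr, 10);
	incremental = csum_update32(hdr[5], old_dst, new_dst);

	/* Rewrite the address and recompute from scratch to compare. */
	hdr[8] = (uint16_t)(new_dst >> 16);
	hdr[9] = (uint16_t)(new_dst & 0xffff);
	hdr[5] = 0;
	assert(csum_full(hdr, 10) == incremental);
}
```

The same update applies to TCP and UDP because their checksums cover a pseudo-header containing the IP addresses, which is why the module has to touch the layer-4 checksum even though it only rewrites the IP header.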

Makefile:

obj-m += nf_rawnat.o
all:
	make -C /lib/modules/`uname -r`/build SUBDIRS=`pwd` modules
clean:
	rm -rf *.ko *.o .tmp_versions .*.mod.o .*.o.cmd *.mod.c .*.ko.cmd Module.symvers modules.order
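Once the module is built and loaded, rules are added or removed by writing one line to /proc/net/static_nat, in the form shown in the usage comment of the code ("+1.2.1.2 192.168.1.8 dst"). If the rules come from a program rather than from echo, a small helper along these lines may be enough. It is only a sketch: format_rule and write_rule are hypothetical names, not part of the module.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format one rule line. op is '+' (add) or '-' (delete) and dir is
 * "src" or "dst". Returns the line length, or -1 on bad input or if
 * the buffer is too small. */
int format_rule(char *buf, size_t len, char op,
		const char *from, const char *to, const char *dir)
{
	int n;

	if (op != '+' && op != '-')
		return -1;
	if (strcmp(dir, "src") != 0 && strcmp(dir, "dst") != 0)
		return -1;
	n = snprintf(buf, len, "%c%s %s %s\n", op, from, to, dir);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}

/* Write one rule line to the given file, normally /proc/net/static_nat.
 * Returns 0 on success and -1 on failure. */
int write_rule(const char *path, const char *line)
{
	FILE *f = fopen(path, "w");
	int rc = 0;

	if (!f)
		return -1;
	if (fputs(line, f) == EOF)
		rc = -1;
	if (fclose(f) == EOF)	/* the write reaches the kernel here at the latest */
		rc = -1;
	return rc;
}

void demo_format(void)
{
	char line[64];

	assert(format_rule(line, sizeof(line), '+',
			   "1.2.1.2", "192.168.1.8", "dst") == 25);
	assert(strcmp(line, "+1.2.1.2 192.168.1.8 dst\n") == 0);
	/* Anything other than "src" or "dst" is rejected before writing. */
	assert(format_rule(line, sizeof(line), '+',
			   "1.2.1.2", "192.168.1.8", "both") == -1);
}
```

A caller would then pass the formatted line to write_rule("/proc/net/static_nat", line) as root and check the return value; the module reports duplicates and missing entries through the write's error code.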

That I do not like taking on private work does not mean I do not love the craft, nor that I cannot do it. Work is tiring enough already, so why keep wearing yourself out? But after work, what else is there to do? Stateless NAT ... So few people want to touch it. I sigh, I grieve: if not for people like these, whom would I have for company? ...
