Implementing a usable stateless, bidirectional, static NAT module on Linux
There are already plenty of documents on how to configure NAT on Linux. This article is not one of them: it presents a method that does not use iptables.
iptables? No. Why? Because the NAT configured through iptables is stateful, and its implementation depends on a module called conntrack. What is conntrack? Oh, no! It is my specialty, but I don't want to discuss it in this article; anyone who knows me knows I can talk about it for 12 hours. Maybe you don't know what stateful NAT is, but if you are a skilled Linux network administrator or enthusiast, you must have run into questions like these when configuring NAT: "Why does my NAT rule not apply to a connection that is already established?" and "Why is data in the other direction translated too after I configure NAT for one direction with iptables?" This is state at work. An IP address by itself has no state, but once NAT is tied to layer-4 logic, state appears. That is stateful NAT, that is, iptables -t nat ..., and you cannot change the nature of NAT configured this way. At least, the nat table in the latest version of iptables I have seen is still stateful. Sometimes...
Sometimes you may, you must...
You must configure a NAT that is stateless, bidirectional, and static. This problem, alas...
This problem plagued me for half a year. For the first nine months of 2013 I spent almost all my energy on a single thing, three quarters that made me both happy and worried, from the cold of winter to heat of more than 40 degrees Celsius, leaving for work at half past six in the morning and still in the data center after ten at night... I would not have thought of writing this module if I had not found, while tidying my shelves the day before yesterday, an unreimbursed 120-yuan receipt for routine overtime. 120 RMB is nothing, but I would like to take this opportunity to recall the past and, along the way, to reimburse myself by finishing the part that was left incomplete, which is worth far more than 120 RMB. I have to admit that stateless NAT was not the most important thing in those three quarters. The reason I bring it up a year later is that I overcame every other problem, however long it took: there was the 72-hour marathon to solve the conntrack confirm problem, and there was the time when, upset and impatient, I kept the truth from my wife and went to eat barbecue with strangers... But stateless NAT was never solved. Why?
I was working on a product, not a personal experiment, and as part of a company team, not as a lone individual. Every technology we used had to go through pre-development to prove it was feasible. More importantly, everyone had to keep the same pace. The work may run in parallel, but the main melody is always one. I could not add personal color such as my own whims (I confess, I did not do well at this!). To get something adopted, the following process was required:
Bring the leaders and team owners into meetings and train them, so that every R&D and technical person is familiar with the technology in use.
But where would the time for that come from?! Human nature is fragile, however tenacious life is. Anyone may suddenly be gone one day: force majeure takes them to another city or country, and colleagues you barely know have left the company over something as small as boiling water... If what you built falls apart because you left, it means you were never really there for the company... I have strayed a little too far.
So even though I was able to implement stateless NAT at the time (I used the routing-target method then, and may have written about it in an earlier blog post), I could not use it. I could only look for an opportunity, a free afternoon or one without urgent tasks, to make everyone involved understand the technology; whether to use it afterwards is a question of requirements. The best way to persuade is code, and more code. To a programmer, a gigabyte of words is nonsense, and so is code that does not run. Even if the code is ugly, it does not matter. I like this profession precisely because it is so down-to-earth: no empty talk, no quarrels, no fuss, as long as the code runs.
By the way, stateless NAT was implemented well in Linux 2.4 using tc/policy routing, but it is hard to do in the 2.6 kernel. For the sake of decoupling, the kernel authors left this feature to those who really want to implement it. Perhaps I count as one of them. I do so many things that seem meaningless to others because I want to promote the idea of people who combine development with O&M; such people will surely be the most sought-after in the future. When I faced numerous Cisco-certified engineers, my first thought was not to scold them or fight them (nobody can both out-talk and [note: I cannot write "or" here, and writing "and" is embarrassing enough] out-fight everyone else, and even if that were not a problem, would it even be legal...). My first thought was whether I could achieve the same functions on Linux, not to humiliate them, but to make them feel that I could outdo them. Fortunately, I have done it. So now I write a few modules at home. The testing, however, is just like technology pre-development at the company. Then I present them to friends, and anyone who is interested can test them. There is no money in this kind of thing and no recognition either. It does not use git and it is not some GPL open-source project; it is shared within a particular circle of friends. I hate classification, and I like classification very much. Then, keeping it as small as possible, I wrote a pile of bad code: a Netfilter-based, bidirectional, static, stateless NAT on Linux. The code is not complex, only a few hundred lines...
However, two things need to be said:
1. This module is basic but usable. This is my self-affirmation;
2. This module has a lot of room for improvement. This is my self-criticism.
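To make "stateless, bidirectional, static" concrete, here is a minimal user-space sketch of the idea. It is not part of the module: the names `static_rule`, `rules`, `translate_dst`, and `translate_src` are hypothetical, and the linear search stands in for whatever lookup structure a real implementation uses. The point it illustrates is that each translation is a pure function of one address, so no per-connection state is needed and one rule serves both directions.

```c
#include <stddef.h>
#include <stdint.h>

/* One static rule (hypothetical type). A packet addressed to `outside`
 * is rewritten to `inside`; a packet sent from `inside` is rewritten
 * back to `outside`. */
struct static_rule {
    uint32_t outside;
    uint32_t inside;
};

/* Illustrative table; addresses are in host byte order for readability. */
static const struct static_rule rules[] = {
    { 0x01020102u, 0xC0A80108u },   /* 1.2.1.2 <-> 192.168.1.8 */
};
#define RULE_COUNT (sizeof(rules) / sizeof(rules[0]))

/* Destination translation: the result depends only on the address. */
static uint32_t translate_dst(uint32_t daddr)
{
    for (size_t i = 0; i < RULE_COUNT; i++) {
        if (rules[i].outside == daddr)
            return rules[i].inside;
    }
    return daddr;   /* no rule: leave the address untouched */
}

/* Source translation: the automatic reverse of the same rule. */
static uint32_t translate_src(uint32_t saddr)
{
    for (size_t i = 0; i < RULE_COUNT; i++) {
        if (rules[i].inside == saddr)
            return rules[i].outside;
    }
    return saddr;
}
```

Because neither function remembers anything between calls, a rule takes effect for every packet immediately, including packets of flows that started before the rule was added.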
The code is as follows:
/**
 * Usage:
 * Translate the destination address of packets destined for 1.2.1.2
 * to 192.168.1.8:
 *   echo +1.2.1.2 192.168.1.8 dst >/proc/net/static_nat
 * The command above also adds the reverse SNAT mapping automatically.
 *
 * Work out for yourself what this one does:
 *   echo +192.168.184.250 192.168.184.154 src >/proc/net/static_nat
 */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/string.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/jhash.h>
#include <linux/inet.h>
#include <linux/proc_fs.h>
#include <linux/uaccess.h>
#include <linux/skbuff.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <linux/udp.h>
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>
#include <net/ip.h>
#include <net/net_namespace.h>
#include <net/netfilter/nf_conntrack.h>

#define DIRMASK			0x11
#define BUCKETS			1024
#define NAT_OPT_DEL		0x01
#define NAT_OPT_FIND		0x04
#define NAT_OPT_ACCT_BIT	0x02

enum nat_dir {
	DIR_SNAT,
	DIR_DNAT,
	DIR_NUM
};

/*
 * Statistics
 */
struct nat_account {
	u32 nat_packets;
	u32 nat_bytes;
};

struct static_nat_entry {
	__be32 addr[DIR_NUM];
	enum nat_dir type;
	struct nat_account acct[DIR_NUM];
	struct hlist_node node[DIR_NUM];
};

/*
 * The tables are touched both from the hooks (softirq context) and from
 * the /proc handlers (process context), so the _bh lock variants are used.
 */
static DEFINE_SPINLOCK(nat_lock);

/* holds the SNAT mappings */
struct hlist_head *src_list;
/* holds the DNAT mappings */
struct hlist_head *dst_list;

/*
 * Look up a value using an IP address as the key
 * (daddr on PREROUTING, saddr on POSTROUTING).
 */
static __be32 get_address_from_map(struct sk_buff *skb, unsigned int dir,
				   __be32 addr_key, unsigned int opt)
{
	__be32 ret = 0, cmp_key, ret_value;
	u32 hash;
	struct hlist_head *list;
	struct hlist_node *iter, *tmp;
	struct static_nat_entry *ent;

	hash = jhash_1word(addr_key, 1);
	hash = hash % BUCKETS;

	spin_lock_bh(&nat_lock);
	if (dir == DIR_DNAT) {
		list = &dst_list[hash];
	} else if (dir == DIR_SNAT) {
		list = &src_list[hash];
	} else {
		spin_unlock_bh(&nat_lock);
		goto out;
	}
	hlist_for_each_safe(iter, tmp, list) {
		ent = hlist_entry(iter, struct static_nat_entry, node[dir]);
		/* note the reversal for automatically generated entries */
		cmp_key = (ent->type == dir) ? ent->addr[0] : ent->addr[1];
		ret_value = (ent->type == dir) ? ent->addr[1] : ent->addr[0];
		if (addr_key == cmp_key) {
			ret = ret_value;
			if (opt == NAT_OPT_DEL) {
				if (dir == ent->type) {
					hlist_del(&ent->node[0]);
					hlist_del(&ent->node[1]);
					kfree(ent);
				} else {
					ret = 0;
				}
			}
			if (opt & NAT_OPT_ACCT_BIT) {
				ent->acct[dir].nat_packets++;
				ent->acct[dir].nat_bytes += skb == NULL ? 1 : skb->len;
			}
			break;
		}
	}
	spin_unlock_bh(&nat_lock);
out:
	return ret;
}

/*
 * Update the layer-4 checksum
 */
static void nat4_update_l4(struct sk_buff *skb, __be32 oldip, __be32 newip)
{
	struct iphdr *iph = ip_hdr(skb);
	void *transport_hdr = (void *)iph + ip_hdrlen(skb);
	struct tcphdr *tcph;
	struct udphdr *udph;
	bool cond;

	switch (iph->protocol) {
	case IPPROTO_TCP:
		tcph = transport_hdr;
		inet_proto_csum_replace4(&tcph->check, skb, oldip, newip, true);
		break;
	case IPPROTO_UDP:
	case IPPROTO_UDPLITE:
		udph = transport_hdr;
		cond = udph->check != 0;
		cond |= skb->ip_summed == CHECKSUM_PARTIAL;
		if (cond) {
			inet_proto_csum_replace4(&udph->check, skb, oldip, newip, true);
			if (udph->check == 0) {
				udph->check = CSUM_MANGLED_0;
			}
		}
		break;
	}
}

/*
 * Source address translation on POSTROUTING:
 * 1. forward source address translation;
 * 2. the reverse source address translation of a destination address translation
 */
static unsigned int ipv4_nat_out(unsigned int hooknum,
				 struct sk_buff *skb,
				 const struct net_device *in,
				 const struct net_device *out,
				 int (*okfn)(struct sk_buff *))
{
	unsigned int ret = NF_ACCEPT;
	__be32 to_trans = 0;
	struct iphdr *hdr = ip_hdr(skb);

	to_trans = get_address_from_map(skb, DIR_SNAT, hdr->saddr,
					NAT_OPT_FIND | NAT_OPT_ACCT_BIT);
	if (!to_trans) {
		goto out;
	}
	if (hdr->saddr == to_trans) {
		goto out;
	}
	/* execute SNAT */
	csum_replace4(&hdr->check, hdr->saddr, to_trans);
	nat4_update_l4(skb, hdr->saddr, to_trans);
	hdr->saddr = to_trans;
out:
	return ret;
}

/*
 * Destination address translation on PREROUTING:
 * 1. forward destination address translation;
 * 2. the reverse destination address translation of a source address translation
 */
static unsigned int ipv4_nat_in(unsigned int hooknum,
				struct sk_buff *skb,
				const struct net_device *in,
				const struct net_device *out,
				int (*okfn)(struct sk_buff *))
{
	unsigned int ret = NF_ACCEPT;
	__be32 to_trans = 0;
	struct iphdr *hdr = ip_hdr(skb);

	if (skb->nfct && skb->nfct != &nf_conntrack_untracked.ct_general) {
		goto out;
	}
	to_trans = get_address_from_map(skb, DIR_DNAT, hdr->daddr,
					NAT_OPT_FIND | NAT_OPT_ACCT_BIT);
	if (!to_trans) {
		goto out;
	}
	if (hdr->daddr == to_trans) {
		goto out;
	}
	/* execute DNAT */
	csum_replace4(&hdr->check, hdr->daddr, to_trans);
	nat4_update_l4(skb, hdr->daddr, to_trans);
	hdr->daddr = to_trans;
	/*
	 * Mark the packet notrack so that conntrack neither tracks nor NATs it.
	 * This is entirely appropriate: since this is static stateless NAT,
	 * we do not want it to be swayed by state.
	 */
	/*
	 * In fact, avoiding conntrack-based NAT is not strictly necessary,
	 * because conntrack itself does not let you modify the tuple in both
	 * directions.
	 */
	if (!skb->nfct) {
		skb->nfct = &nf_conntrack_untracked.ct_general;
		skb->nfctinfo = IP_CT_NEW;
		nf_conntrack_get(skb->nfct);
	}
out:
	return ret;
}

static struct nf_hook_ops ipv4_nat_ops[] __read_mostly = {
	{
		.hook		= ipv4_nat_in,
		.owner		= THIS_MODULE,
		.pf		= NFPROTO_IPV4,
		.hooknum	= NF_INET_PRE_ROUTING,
		.priority	= NF_IP_PRI_CONNTRACK - 1,
	},
	{
		.hook		= ipv4_nat_out,
		.owner		= THIS_MODULE,
		.pf		= NFPROTO_IPV4,
		.hooknum	= NF_INET_POST_ROUTING,
		.priority	= NF_IP_PRI_CONNTRACK + 1,
	},
};

static char *parse_addr(const char *input, __be32 *from, __be32 *to)
{
	char *p1, *p2;
	size_t length = strlen(input);

	if (!(p1 = memchr(input, ' ', length))) {
		return NULL;
	}
	if (!(p2 = memchr(p1 + 1, ' ', length - (p1 + 1 - input)))) {
		return NULL;
	}
	if (!(in4_pton(input, p1 - input, (u8 *)from, ' ', NULL)) ||
	    !(in4_pton(p1 + 1, p2 - p1 - 1, (u8 *)to, ' ', NULL))) {
		return NULL;
	}
	return ++p2;
}

static ssize_t static_nat_config_write(struct file *file, const char __user *buffer,
				       size_t count, loff_t *unused)
{
	int ret = 0;
	size_t length = count;
	__be32 from, to;
	u32 normal, reverse;
	char *buf = NULL;
	char *p;
	struct static_nat_entry *ent;

	if (!length) {
		goto out;
	}
	buf = kzalloc(length + 1, GFP_KERNEL);
	if (!buf) {
		ret = -ENOMEM;
		goto out;
	}
	/* buffer is user memory: copy it in before looking at it */
	if (copy_from_user(buf, buffer, length)) {
		ret = -EFAULT;
		goto out;
	}
	/* strip trailing non-printable characters such as '\n' */
	while (length && (buf[length - 1] < (char)32 || buf[length - 1] > (char)126)) {
		buf[--length] = '\0';
	}
	if (!length) {
		ret = -EINVAL;
		goto out;
	}
	if (!(p = parse_addr(buf + 1, &from, &to))) {
		ret = -EINVAL;
		goto out;
	}
	if ('+' == *buf) {
		ent = kzalloc(sizeof(struct static_nat_entry), GFP_KERNEL);
		if (!ent) {
			ret = -ENOMEM;
			goto out;
		}
		/* compute the hash bucket of the original entry */
		normal = jhash_1word(from, 1);
		normal = normal % BUCKETS;
		/* compute the hash bucket of the reverse entry */
		reverse = jhash_1word(to, 1);
		reverse = reverse % BUCKETS;
		/*
		 * Set the key/value pair.
		 * Note that for the hlist node of the reverse type, key and
		 * value must be swapped as well.
		 */
		ent->addr[0] = from;
		ent->addr[1] = to;
		/* initialize the list nodes */
		INIT_HLIST_NODE(&ent->node[DIR_SNAT]);
		INIT_HLIST_NODE(&ent->node[DIR_DNAT]);
		if (strstr(p, "src")) {
			/* adding an SNAT entry automatically generates a DNAT entry */
			/* first check whether one exists already */
			if (get_address_from_map(NULL, DIR_SNAT, from, NAT_OPT_FIND) ||
			    get_address_from_map(NULL, DIR_SNAT, to, NAT_OPT_FIND)) {
				ret = -EEXIST;
				kfree(ent);
				goto out;
			}
			/* the entry type tells the configured item from the generated one */
			ent->type = DIR_SNAT;
			/* link it into the lists */
			spin_lock_bh(&nat_lock);
			hlist_add_head(&ent->node[DIR_SNAT], &src_list[normal]);
			hlist_add_head(&ent->node[DIR_DNAT], &dst_list[reverse]);
			spin_unlock_bh(&nat_lock);
		} else if (strstr(p, "dst")) {
			/* adding a DNAT entry automatically generates an SNAT entry */
			/* first check whether one exists already */
			if (get_address_from_map(NULL, DIR_DNAT, from, NAT_OPT_FIND) ||
			    get_address_from_map(NULL, DIR_DNAT, to, NAT_OPT_FIND)) {
				ret = -EEXIST;
				kfree(ent);
				goto out;
			}
			/* the entry type tells the configured item from the generated one */
			ent->type = DIR_DNAT;
			/* link it into the lists */
			spin_lock_bh(&nat_lock);
			hlist_add_head(&ent->node[DIR_DNAT], &dst_list[normal]);
			hlist_add_head(&ent->node[DIR_SNAT], &src_list[reverse]);
			spin_unlock_bh(&nat_lock);
		} else {
			ret = -EINVAL;
			kfree(ent);
			goto out;
		}
	} else if ('-' == *buf) {
		u32 r1;

		if (strstr(p, "src")) {
			r1 = get_address_from_map(NULL, DIR_SNAT, from, NAT_OPT_DEL);
			if (r1 == 0) {
				ret = -ENOENT;
				goto out;
			}
		} else if (strstr(p, "dst")) {
			r1 = get_address_from_map(NULL, DIR_DNAT, from, NAT_OPT_DEL);
			if (r1 == 0) {
				ret = -ENOENT;
				goto out;
			}
		} else {
		}
	} else {
		ret = -EINVAL;
		goto out;
	}
	ret = count;
out:
	kfree(buf);
	return ret;
}

static ssize_t static_nat_config_read(struct file *file, char __user *buf,
				      size_t count, loff_t *ppos)
{
	int len = 0;
	static int done = 0;
	int i;
	/* "255.255.255.255" needs 15 characters plus the terminating NUL */
	char from[16], to[16];
	char *kbuf_to_avoid_user_space_memory_page_fault = NULL;

/* maximum length of one output line */
#define MAX_LINE_CHARS 128

	if (done) {
		done = 0;
		goto out;
	}
	/*
	 * Allocate kernel memory so that user memory is never touched directly:
	 * that may fault pages in, a page fault may sleep and reschedule, and
	 * the data being read is protected by a spinlock, so we must not sleep!
	 */
	/*
	 * Limitation:
	 * only count bytes are allocated because this version does not support
	 * reading in several passes; only a single read is supported. Maybe I
	 * should learn the seq read method.
	 */
	kbuf_to_avoid_user_space_memory_page_fault = kzalloc(count, GFP_KERNEL);
	if (!kbuf_to_avoid_user_space_memory_page_fault) {
		len = -ENOMEM;
		done = 1;
		goto out;
	}
	spin_lock_bh(&nat_lock);
	len += sprintf(kbuf_to_avoid_user_space_memory_page_fault + len,
		       "Source trans table:\n");
	if (len + MAX_LINE_CHARS > count) {
		goto copy_now;
	}
	for (i = 0; i < BUCKETS; i++) {
		struct hlist_node *iter, *tmp;
		struct static_nat_entry *ent;

		hlist_for_each_safe(iter, tmp, &src_list[i]) {
			ent = hlist_entry(iter, struct static_nat_entry, node[DIR_SNAT]);
			sprintf(from, "%pI4", (ent->type == DIR_SNAT) ? &ent->addr[0] : &ent->addr[1]);
			sprintf(to, "%pI4", (ent->type == DIR_SNAT) ? &ent->addr[1] : &ent->addr[0]);
			len += sprintf(kbuf_to_avoid_user_space_memory_page_fault + len,
				       "From:%-15s To:%-15s [%s] [Bytes:%u Packet:%u]\n",
				       from, to,
				       (ent->type == DIR_SNAT) ? "STATIC" : "AUTO",
				       ent->acct[DIR_SNAT].nat_bytes,
				       ent->acct[DIR_SNAT].nat_packets);
			if (len + MAX_LINE_CHARS > count) {
				goto copy_now;
			}
		}
	}
	len += sprintf(kbuf_to_avoid_user_space_memory_page_fault + len,
		       "\nDestination trans table:\n");
	if (len + MAX_LINE_CHARS > count) {
		goto copy_now;
	}
	for (i = 0; i < BUCKETS; i++) {
		struct hlist_node *iter, *tmp;
		struct static_nat_entry *ent;

		hlist_for_each_safe(iter, tmp, &dst_list[i]) {
			ent = hlist_entry(iter, struct static_nat_entry, node[DIR_DNAT]);
			sprintf(from, "%pI4", (ent->type == DIR_DNAT) ? &ent->addr[0] : &ent->addr[1]);
			sprintf(to, "%pI4", (ent->type == DIR_DNAT) ? &ent->addr[1] : &ent->addr[0]);
			len += sprintf(kbuf_to_avoid_user_space_memory_page_fault + len,
				       "From:%-15s To:%-15s [%s] [Bytes:%u Packet:%u]\n",
				       from, to,
				       (ent->type == DIR_DNAT) ? "STATIC" : "AUTO",
				       ent->acct[DIR_DNAT].nat_bytes,
				       ent->acct[DIR_DNAT].nat_packets);
			if (len + MAX_LINE_CHARS > count) {
				goto copy_now;
			}
		}
	}
copy_now:
	spin_unlock_bh(&nat_lock);
	done = 1;
	/* the spinlock has been released by this point */
	if (copy_to_user(buf, kbuf_to_avoid_user_space_memory_page_fault, len)) {
		len = -EFAULT;
		goto out;
	}
out:
	if (kbuf_to_avoid_user_space_memory_page_fault) {
		kfree(kbuf_to_avoid_user_space_memory_page_fault);
	}
	return len;
}

static const struct file_operations static_nat_file_ops = {
	.owner	= THIS_MODULE,
	.read	= static_nat_config_read,
	.write	= static_nat_config_write,
};

static int __init nf_static_nat_init(void)
{
	int ret = 0;
	int i;

	src_list = kzalloc(sizeof(struct hlist_head) * BUCKETS, GFP_KERNEL);
	if (!src_list) {
		ret = -ENOMEM;
		goto out;
	}
	dst_list = kzalloc(sizeof(struct hlist_head) * BUCKETS, GFP_KERNEL);
	if (!dst_list) {
		ret = -ENOMEM;
		goto out;
	}
	ret = nf_register_hooks(ipv4_nat_ops, ARRAY_SIZE(ipv4_nat_ops));
	if (ret < 0) {
		printk("nf_nat_ipv4: can't register hooks.\n");
		goto out;
	}
	if (!proc_create("static_nat", 0644, init_net.proc_net, &static_nat_file_ops)) {
		/* do not leave the hooks registered when loading fails */
		nf_unregister_hooks(ipv4_nat_ops, ARRAY_SIZE(ipv4_nat_ops));
		ret = -ENOMEM;
		goto out;
	}
	for (i = 0; i < BUCKETS; i++) {
		INIT_HLIST_HEAD(&src_list[i]);
		INIT_HLIST_HEAD(&dst_list[i]);
	}
	return ret;
out:
	if (src_list) {
		kfree(src_list);
	}
	if (dst_list) {
		kfree(dst_list);
	}
	return ret;
}

static void __exit nf_static_nat_fini(void)
{
	int i;

	remove_proc_entry("static_nat", init_net.proc_net);
	nf_unregister_hooks(ipv4_nat_ops, ARRAY_SIZE(ipv4_nat_ops));
	spin_lock_bh(&nat_lock);
	for (i = 0; i < BUCKETS; i++) {
		struct hlist_node *iter, *tmp;
		struct static_nat_entry *ent;

		hlist_for_each_safe(iter, tmp, &src_list[i]) {
			ent = hlist_entry(iter, struct static_nat_entry, node[0]);
			hlist_del(&ent->node[DIR_SNAT]);
			hlist_del(&ent->node[DIR_DNAT]);
			kfree(ent);
		}
	}
	spin_unlock_bh(&nat_lock);
	if (src_list) {
		kfree(src_list);
	}
	if (dst_list) {
		kfree(dst_list);
	}
}

module_init(nf_static_nat_init);
module_exit(nf_static_nat_fini);

MODULE_DESCRIPTION("STATIC two-way NAT");
MODULE_AUTHOR("marywangran@126.com");
MODULE_LICENSE("GPL");
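The hooks above do not recompute checksums from scratch after rewriting an address; they patch them with `csum_replace4` and `inet_proto_csum_replace4`. The layer-4 checksum has to be patched as well because TCP and UDP checksums cover a pseudo-header that includes the IP addresses. The user-space sketch below shows the arithmetic behind such an incremental update, following the identity HC' = ~(~HC + ~m + m') from RFC 1624. The helper names (`csum_full`, `csum_update16`, `csum_update32`) are made up for illustration and are not kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Full ones'-complement checksum over n 16-bit words. */
static uint16_t csum_full(const uint16_t *words, size_t n)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += words[i];
    return (uint16_t)~fold16(sum);
}

/* Incremental update when one 16-bit word changes from old_word to
 * new_word: HC' = ~(~HC + ~m + m'). */
static uint16_t csum_update16(uint16_t check, uint16_t old_word, uint16_t new_word)
{
    uint32_t sum = (uint16_t)~check;

    sum += (uint16_t)~old_word;
    sum += new_word;
    return (uint16_t)~fold16(sum);
}

/* A 32-bit address is two 16-bit words, so apply the update twice. */
static uint16_t csum_update32(uint16_t check, uint32_t old_ip, uint32_t new_ip)
{
    check = csum_update16(check, (uint16_t)(old_ip >> 16), (uint16_t)(new_ip >> 16));
    return csum_update16(check, (uint16_t)(old_ip & 0xffffu), (uint16_t)(new_ip & 0xffffu));
}
```

The update touches only the old checksum and the changed words, which is why the module can translate an address without walking the payload.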
Makefile:
obj-m += nf_rawnat.o

all:
	make -C /lib/modules/`uname -r`/build SUBDIRS=`pwd` modules

clean:
	rm -rf *.ko *.o .tmp_versions .*.mod.o .*.o.cmd *.mod.c .*.ko.cmd Module.symvers modules.order
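Once the module is built and loaded, rules are added by writing one line to /proc/net/static_nat, as in the usage comment at the top of the source. If you would rather drive that from a program than from echo, a minimal sketch could look like the following. The helpers `format_rule` and `write_rule` are hypothetical names introduced here, and the sketch assumes the rule format "<op><from> <to> <dir>" with op being '+' or '-' and dir being "src" or "dst", as described in that usage comment.

```c
#include <stdio.h>

/* Build one rule line such as "+1.2.1.2 192.168.1.8 dst\n".
 * Returns the line length, or -1 if it does not fit in `out`. */
static int format_rule(char *out, size_t size, char op,
                       const char *from, const char *to, const char *dir)
{
    int n = snprintf(out, size, "%c%s %s %s\n", op, from, to, dir);

    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}

/* Write a rule line to the given path (normally /proc/net/static_nat).
 * Returns 0 on success and -1 on any error. */
static int write_rule(const char *path, const char *rule)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fputs(rule, f) >= 0;
    if (fclose(f) != 0)     /* the write is flushed, and may fail, here */
        ok = 0;
    return ok ? 0 : -1;
}
```

A caller would format the rule with `format_rule(buf, sizeof(buf), '+', "1.2.1.2", "192.168.1.8", "dst")` and pass the result to `write_rule("/proc/net/static_nat", buf)`; a failure from `write_rule` corresponds to the error codes returned by the module's write handler, for example when the mapping already exists.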
I am not keen on doing application-related work in my private time. That does not mean I dislike it, and it does not mean I cannot do it. After all, I am tired enough from work; why should I keep tiring myself? In your spare time you have to do something else. What else? Stateless NAT... something that many people never get involved in. Alas! Were it not for people like these, whom would I have for company?...