Linux Route-Based Access Control

Source: Internet
Author: User

People have already discussed many access control schemes, but do not assume that any one of them applies everywhere. RBAC is not a good fit for operating systems with a monolithic in-kernel protocol stack (UNIX, Linux, and so on; I may have gotten this backwards in an earlier article), and not everyone knows this, myself included.
In short, packet processing in the kernel protocol stack has to be quick and clean, and must not make later packets queue up, because different packets may belong to different business logic. You cannot assume that the business logic behind the following packets will tolerate a delay caused by the earlier ones; nobody signed an agreement to that effect. To a network device every packet is equal, and to a more advanced device every flow is equal. Remember that the device exists to forward data, and the data's destination is not the device itself.
All intermediate devices are passing clouds. A so-called web acceleration server that does not use a cache is a scam. What can be faster than wire-speed forwarding? Nothing. Any device placed between the client and the server adds latency; do not expect it to reduce latency. So the goal when designing an intermediate device is never to reduce latency by magic but to be as fast as possible within what you control, which is why application-style processing does not fit. Do not expect a library to make things efficient either. Remember, ultimate control is always in the hands of the operator, often network administrators who do not program. If someone simply pulls the network cable, nothing in TCP guarantees that you get to send a graceful FIN. Whoever controls the network decides everything, and network efficiency always matters more than the efficiency of end-to-end programs.
The purpose of this article is to get local processing done in one particular way, without considering anything else. In the heavy rain one Saturday, I got excited! I am not afraid of programming when it comes to network management. As for network administrators versus programmers, perhaps Otto von Bismarck could only call the situation unfortunate and infuriating.
Although I do not understand OpenSSL, I am a programmer!
In the previous article, "Application of Linux route table abstraction extension to nf_conntrack", I pointed out that the Linux IP routing mechanism can be used to implement an access control list (ACL). That article only argued that this was feasible; in the end I used the routing mechanism just to store an info string for each nf_conntrack entry. This article describes how to implement access control itself.
Isn't there iptables already? Why implement a new mechanism? Because:
1. iptables is built on Netfilter, so it can only filter serially, matching rules one by one;
2. The matching speed depends on the order in which the iptables rules were configured and cannot be optimized uniformly inside the kernel;
3. I don't like iptables. It feels outdated in the multi-core era. A parallel version could be developed on top of Netfilter, but that is too difficult.
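
To make the first two points concrete, here is a toy model of first-match filtering over an ordered rule list. It is not iptables code; `demo_rule` and `demo_linear_match` are names made up for this sketch, and addresses are plain host-order integers. The point is only that the number of rules examined depends on where the matching rule sits in the configured order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule: matches when (addr & mask) == net on both ends. */
struct demo_rule {
    uint32_t src_net, src_mask;
    uint32_t dst_net, dst_mask;
    int verdict;
};

/*
 * Walk the rules in configured order and return the first matching verdict.
 * *examined (if non-NULL) reports how many rules had to be looked at.
 */
static int demo_linear_match(const struct demo_rule *rules, size_t n,
                             uint32_t src, uint32_t dst,
                             int default_verdict, size_t *examined)
{
    assert(rules != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (examined)
            *examined = i + 1;
        if ((src & rules[i].src_mask) == rules[i].src_net &&
            (dst & rules[i].dst_mask) == rules[i].dst_net)
            return rules[i].verdict;
    }
    if (examined)
        *examined = n;
    return default_verdict;
}
```

A packet that matches the last rule, or no rule at all, pays for every rule in front of it, which is the cost the route-based design tries to avoid.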

Therefore, to solve the problems above, I decided to implement another access control mechanism that does not rely on Netfilter. One of my goals, obviously, is that several cores can take part in the matching process.
Implementing access control for IP datagrams is not simple, but the basic framework is: decide what a packet may do based on its source IP address and its destination IP address. Of course, any field in the IP packet, and even fields in the TCP/UDP header, could take part in the match. This article does not cover those more complex cases, which extend easily from the ideas here, and uses only IP addresses as match elements.
As described in the article on the route table extension for nf_conntrack, an Action can be stored in a single route entry. Once a second match element, the destination IP address, is introduced, the Action can no longer live in one route entry. Instead it acts as a link between the route entry matched by the source IP address and the route entry matched by the destination IP address, meaning "every IP address covered by the source route entry may perform this action on every IP address covered by the destination route entry". An Action can be anything: pass, reject, address translation, and so on. If no Action links two route entries, the default Action is applied. An Action therefore plays two roles: it is added to the Action list of the source route entry and also to the Action list of the destination route entry, tying the two together. Borrowing RBAC terminology, the source IP address of a packet can be seen as a role, and the destination IP address it accesses as a resource. I am only borrowing the terms; real RBAC is far more complex than this. My model is about the method and algorithm for implementing an ACL in the kernel, not about the ACL model itself.
A data structure says this more directly than a diagram. Because each Action sits on two linked lists at the same time, you could call an Action an Xnode:

    struct nf_action_node {
        // struct list_head list;       // related to the algorithm; see Note 1
        struct hlist_node nf_dst;
        struct hlist_node nf_src;
        struct nf_fib_node *dst_node;   // back pointer, a lookup optimization; see the search algorithm section
        struct nf_fib_node *src_node;   // back pointer, a lookup optimization; see the search algorithm section
        int extra;
        // int extra_len;
        // char extra[0];
    };
nf_dst links the action into a route entry of the destination route table, and nf_src links it into a route entry of the source route table. I use a hash list rather than a plain list_head for efficiency: I do not want to walk an entire list inside the kernel. Besides a hash list you could use any more efficient structure, such as some kind of tree. The route table itself does not need to be a hash table either; this article just happens to use one, and a trie would also work. Note that the last two lines of the action structure are commented out. My intent was to allow any type as the action, that is, to let extra be an arbitrary data structure, but this is only a blog post and nobody is forcing a perfect implementation, so I stop at the sketch. The src_node and dst_node fields locate the two route entries that the nf_action_node is linked to. They increase the coupling between the two structures, but they improve efficiency; we are programming in the kernel, not doing OOD.
We already know that an nf_action_node is linked into two hlists at the same time, and that the two hlists belong to two route entries. Next we define the route entry:
    struct nf_fib_node {
        struct hlist_node       nf_fn_hash;
        struct hlist_head       *act_list;
        __be32                  fn_key;
        int                     len;
        unsigned int            hash_value;
        struct nf_action_node   *def;
    };
The act_list field is the hlist mentioned above (an array of bucket heads), while nf_fn_hash is the node that links the route entry into the route table. Note that hash_value directly affects search efficiency. When inserting an action, I take the nf_fib_node from the dst route table and the nf_fib_node from the src route table that the action refers to. Inserting the nf_action_node into the src entry's hlist uses the hash_value of its peer (the dst entry); inserting it into the dst entry's hlist likewise uses the hash_value of its peer (the src entry). This way, when walking act_list from either the src or the dst entry, you do not traverse one complete list; you only traverse the chain whose bucket equals the peer's hash_value. This is why hlist is used instead of list.
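
The cross-linking rule can be sketched in a few lines. This is a simplified stand-in, not the article's structures: plain singly linked chains replace hlist, and `demo_entry`, `demo_action`, `demo_link`, and `demo_find` are hypothetical names. What it keeps is the key idea that each side files the action under the other side's hash_value.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BUCKETS 16

struct demo_entry;

struct demo_action {
    struct demo_action *next_in_src;   /* chain inside the source entry's bucket */
    struct demo_action *next_in_dst;   /* chain inside the destination entry's bucket */
    struct demo_entry *src, *dst;      /* back pointers to both route entries */
    int verdict;
};

struct demo_entry {
    unsigned int hash_value;                    /* must be < DEMO_BUCKETS */
    struct demo_action *buckets[DEMO_BUCKETS];  /* plays the role of act_list */
};

/* File the action under the peer's hash_value on both sides. */
static void demo_link(struct demo_entry *src, struct demo_entry *dst,
                      struct demo_action *act)
{
    assert(src->hash_value < DEMO_BUCKETS && dst->hash_value < DEMO_BUCKETS);
    act->src = src;
    act->dst = dst;
    act->next_in_src = src->buckets[dst->hash_value];
    src->buckets[dst->hash_value] = act;
    act->next_in_dst = dst->buckets[src->hash_value];
    dst->buckets[src->hash_value] = act;
}

/* Starting from the source entry, only one short chain is walked. */
static struct demo_action *demo_find(const struct demo_entry *src,
                                     const struct demo_entry *dst)
{
    struct demo_action *a;

    for (a = src->buckets[dst->hash_value]; a; a = a->next_in_src)
        if (a->dst == dst)
            return a;
    return NULL;
}
```

Two destination entries that happen to share a hash_value land in the same chain, which is why the back pointer comparison is still needed during the walk.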
Note 1:
There is another optimization for action lookup: link all actions together on one list and walk that list, looking for the action whose dst_node equals the dst route entry found and whose src_node equals the src route entry found. Usually there are few actions, so this is faster; once the length of the action list exceeds a threshold, lookup switches to the hlist query through the route entries. Why is it commented out? This is just a blog post, nobody asked me to finish by a deadline, so I have not implemented it for now, and rather than leave an unused variable I commented it out. The Linux kernel has many examples of the same idea. For example, a vm_area_struct sits in different structures at once: a list is used for traversal and a tree for search.
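
A minimal sketch of the optimization in Note 1, under the assumption that every action is also kept on one global list. The threshold value, `demo_act`, `demo_scan_all`, and `demo_should_scan` are invented for this illustration; the article leaves this part unimplemented.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SCAN_THRESHOLD 8   /* assumed cut-over point, not from the article */

struct demo_act {
    struct demo_act *next;              /* global list holding every action */
    const void *src_entry, *dst_entry;  /* back pointers to the two route entries */
    int verdict;
};

/* Flat scan: compare both back pointers of every action. */
static const struct demo_act *demo_scan_all(const struct demo_act *head,
                                            const void *src, const void *dst)
{
    const struct demo_act *a;

    assert(src != NULL && dst != NULL);
    for (a = head; a; a = a->next)
        if (a->src_entry == src && a->dst_entry == dst)
            return a;
    return NULL;
}

/* With few actions the flat scan wins; beyond that, use the per-entry buckets. */
static int demo_should_scan(size_t action_count)
{
    return action_count <= DEMO_SCAN_THRESHOLD;
}
```

The right threshold would have to be measured; the value here is only a placeholder.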
Without a diagram it is hard to show what I mean, and drawing one is hard too. Still, I have to try; drawing is a skill as well:
[Figure: SRC route entries and DST route entries connected through shared Act nodes]
Note: In this example, Act3 applies when a source IP address in the range of SRC route entry 1 accesses a destination IP address in the range of DST route entry 1. Similarly, Act5 applies between SRC2 and DST2. Between SRC1 and DST2 the default Act is used, because no Act node connects them. To set an Act between them, create a new one and add it to both the SRC1 list and the DST2 list.
With the principles and the basic data structures understood, here is the algorithm. It is very simple and consists of three steps:
1. Use the source IP address of the packet to match a route entry in the source route table; call it route entry 1;
2. Use the destination IP address of the packet to match a route entry in the destination route table; call it route entry 2;
3. Take an Act from the linked list of route entry 1 and check whether it is also on the linked list of route entry 2 (or the other way around); if it is, that Act applies...
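
The three steps above can be sketched as follows. This is a deliberately simplified model: the route lookup is a linear longest-prefix scan over host-order addresses rather than the hashed zones used later, and the action search is a flat array. All `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_route { uint32_t net, mask; };   /* host byte order */

struct demo_act {
    const struct demo_route *src, *dst;      /* the two entries this Act links */
    int verdict;
};

struct demo_acl {
    const struct demo_route *src_tbl; size_t src_n;
    const struct demo_route *dst_tbl; size_t dst_n;
    const struct demo_act *acts;      size_t act_n;
    int default_verdict;
};

/* Longest-prefix match; a longer contiguous mask is numerically larger. */
static const struct demo_route *demo_lpm(const struct demo_route *tbl,
                                         size_t n, uint32_t addr)
{
    const struct demo_route *best = NULL;

    for (size_t i = 0; i < n; i++)
        if ((addr & tbl[i].mask) == tbl[i].net &&
            (!best || tbl[i].mask > best->mask))
            best = &tbl[i];
    return best;
}

static int demo_check(const struct demo_acl *acl, uint32_t src, uint32_t dst)
{
    assert(acl != NULL);
    const struct demo_route *s = demo_lpm(acl->src_tbl, acl->src_n, src); /* step 1 */
    const struct demo_route *d = demo_lpm(acl->dst_tbl, acl->dst_n, dst); /* step 2 */

    if (!s || !d)
        return acl->default_verdict;
    for (size_t i = 0; i < acl->act_n; i++)                               /* step 3 */
        if (acl->acts[i].src == s && acl->acts[i].dst == d)
            return acl->acts[i].verdict;
    return acl->default_verdict;
}
```

Steps 1 and 2 do not depend on each other, which is what later makes it possible to run them on two cores.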

I will not say much about the first two steps; read the Linux kernel code (RTFSC). The key is step 3, which leaves a lot of room for optimization. The most direct way is the generic doubly linked list_head, but I use something slightly more complex, hlist_head. Since I did not have much spare time, I did not show off with anything fancier. A hash list improves efficiency because it splits one long list into several much shorter ones; every node on one of these short lists carries the same hash value, so each is a collision chain. The key to efficiency is how that hash value is chosen.
My method is to use the memory address of a route entry as the hash input and take the remainder modulo the number of hash buckets; call the result K. An Act associated with a peer is then added to the collision chain whose index is the peer's K, and the same holds in the other direction. This way I never traverse the whole action list of a route entry, only the short chain with hash value K. In the extreme case that chain holds exactly one node, and all you have to do is compute one hash value. That is easy: you already hold both route entries, so just take their addresses.
I have confidence in this algorithm because of a small trick: the layout of route entries in memory can be arranged. With the Linux kmem_cache mechanism in particular, route entries can be laid out in order in one compact region of memory, and the space needed is modest. To store 1000 route entries, the contiguous region is 1000 * sizeof(struct nf_fib_node) bytes. The number of hash buckets then follows directly: it is simply 1000. Trading space for time is common practice by now. We have long since learned to shrink the earth and shorten distances, but time remains as relentless as it was in ancient Rome.
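
A user-space sketch of that layout trick, assuming a static array as a stand-in for a kmem_cache slab (the kernel allocator itself is not used here). `demo_pool`, `demo_bucket_of`, and `demo_entry_alloc` are hypothetical names. Because entries are packed back to back, the slot index of an entry is a collision-free bucket number.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_POOL_SIZE 1000   /* matches the 1000-entry example in the text */

struct demo_entry {
    unsigned int key;
    unsigned int hash_value;
};

/* Contiguous storage standing in for a slab of route entries. */
static struct demo_entry demo_pool[DEMO_POOL_SIZE];
static size_t demo_next_free;

/* The bucket number is the entry's slot index inside the pool. */
static unsigned int demo_bucket_of(const struct demo_entry *e)
{
    assert(e >= demo_pool && e < demo_pool + DEMO_POOL_SIZE);
    return (unsigned int)(e - demo_pool);
}

/* Bump allocator: hands out slots in order and records each entry's bucket. */
static struct demo_entry *demo_entry_alloc(void)
{
    struct demo_entry *e;

    if (demo_next_free == DEMO_POOL_SIZE)
        return NULL;
    e = &demo_pool[demo_next_free++];
    e->hash_value = demo_bucket_of(e);
    return e;
}
```

By contrast, taking a raw heap address modulo a small power of two tends to give few distinct values, because allocators align the blocks they return; dividing by the entry size first, as the index computation does implicitly, avoids that.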
I reserved two back pointers in the nf_action_node struct that point to the related route entries. This coupling is also there for efficiency: while walking a collision chain, the peer route entry can be read directly from the Act structure. If you do not want these two pointers, you can use a "coloring" method instead. Two threads walk the collision chains starting from the two route entries at the same time. Visiting a node colors it, which really means marking it with peer information, for example the peer's address. When the other side reaches a node, it first checks whether the node carries a mark and, if so, whether the mark refers to itself. If it does, that is the one! The pseudocode is as follows:
    list_for_each node in the collision chain
        if node has a tag
            if the tag is the one you are looking for
                found it
            end
        else
            tag node with the peer's information
        end
    end_for
Of course, this coloring mechanism needs a lock, so that a later mark does not overwrite an earlier one. Which lock? For efficiency, do not use spin locks; read/write locks or RCU are acceptable.
Code

It's time to show the code. For easier debugging I did not write a kernel module directly but ported everything to user space. Not much needed porting, and it was simple: the hlist helpers and the route table structure. First the header file, route_extra.h:

    #ifndef ROUTE_EXTRA_H
    #define ROUTE_EXTRA_H

    #include <stdio.h>
    #include <stdlib.h>
    #include <arpa/inet.h>

    #define __be32 u_int32_t
    typedef unsigned int u32;

    struct hlist_node {
        struct hlist_node *next, **pprev;
    };

    struct hlist_head {
        struct hlist_node *first;
    };

    struct nf_fn_zone {
        struct nf_fn_zone *fz_next;
        struct hlist_head *fz_hash;
        int fz_divisor;
        u32 fz_hashmask;
    #define FZ_HASHMASK(fz) ((fz)->fz_hashmask)
        int fz_order;
        __be32 fz_mask;
    #define FZ_MASK(fz) ((fz)->fz_mask)
    };

    struct nf_fn_hash {
        struct nf_fn_zone *nf_fn_zones[33];
        struct nf_fn_zone *nf_fn_zone_list;
    };

    struct nf_fib_node;

    struct nf_action_node {
        // struct list_head list;   // see Note 1; not implemented
        struct hlist_node nf_dst;
        struct hlist_node nf_src;
        struct nf_fib_node *dst_node;
        struct nf_fib_node *src_node;
        int extra;
        // int extra_len;
        // char extra[0];
    };

    struct nf_fib_node {
        struct hlist_node nf_fn_hash;
        struct hlist_head *act_list;
        __be32 fn_key;
        int len;
        unsigned int hash_value;
        struct nf_action_node *def;
    };

    static __inline__ __be32 inet_make_mask(int logmask)
    {
        if (logmask)
            return htonl(~((1 << (32 - logmask)) - 1));
        return 0;
    }

    // find first zero bit (x86 only: uses the bsf instruction)
    static inline unsigned long ffz(unsigned long word)
    {
        asm("bsf %1,%0" : "=r" (word) : "r" (~word));
        return word;
    }

    static __inline__ int inet_mask_len(__be32 mask)
    {
        uint32_t hmask = ntohl(mask);
        if (!hmask)
            return 0;
        return 32 - ffz(~hmask);
    }

    extern struct nf_fn_hash *acl_route_table;   // no interface is exported yet

    #endif
Then the C file. It contains a main function for testing. My test is very simple. I wanted to write another lookup test using pthread, but as the rain died down I lost the motivation...

    #include "route_extra.h"

    #ifndef offsetof
    #define offsetof(TYPE, MEMBER) ((unsigned long)&((TYPE *)0)->MEMBER)
    #endif

    #define container_of(ptr, type, member) ({                          \
            const typeof(((type *)0)->member) *__mptr = (ptr);          \
            (type *)((char *)__mptr - offsetof(type, member)); })

    #define hlist_entry(ptr, type, member) container_of(ptr, type, member)

    #define hlist_for_each_entry_safe(tpos, pos, n, head, member)       \
            for (pos = (head)->first;                                   \
                 pos && ({ n = pos->next; 1; }) &&                      \
                 ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); \
                 pos = n)

    #define hlist_for_each_entry(tpos, pos, head, member)               \
            for (pos = (head)->first;                                   \
                 pos && ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); \
                 pos = pos->next)

    static inline void INIT_HLIST_NODE(struct hlist_node *h)
    {
        h->next = NULL;
        h->pprev = NULL;
    }

    static inline void __hlist_del(struct hlist_node *n)
    {
        struct hlist_node *next = n->next;
        struct hlist_node **pprev = n->pprev;
        *pprev = next;
        if (next)
            next->pprev = pprev;
    }

    static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
    {
        struct hlist_node *first = h->first;
        n->next = first;
        if (first)
            first->pprev = &n->next;
        h->first = n;
        n->pprev = &h->first;
    }

    static inline u_int32_t nf_fz_key(u_int32_t dst, struct nf_fn_zone *fz)
    {
        return dst & FZ_MASK(fz);
    }

    static inline u32 nf_fn_hash(u_int32_t key, struct nf_fn_zone *fz)
    {
        // shifting a 32-bit value by 32 is undefined, so handle the /0 zone separately
        u32 h = fz->fz_order ? key >> (32 - fz->fz_order) : 0;
        h ^= (h >> 20);
        h ^= (h >> 10);
        h ^= (h >> 5);
        h &= FZ_HASHMASK(fz);
        return h;
    }

    static struct hlist_head *fz_hash_alloc(int divisor)
    {
        unsigned long size = divisor * sizeof(struct hlist_head);
        return calloc(1, size);
    }

    static struct nf_fn_zone *fn_new_zone(struct nf_fn_hash *table, int z)
    {
        int i;
        struct nf_fn_zone *fz = calloc(1, sizeof(struct nf_fn_zone));
        if (!fz)
            return NULL;
        if (z) {
            fz->fz_divisor = 16;
        } else {
            fz->fz_divisor = 1;
        }
        fz->fz_hashmask = (fz->fz_divisor - 1);
        fz->fz_hash = fz_hash_alloc(fz->fz_divisor);
        if (!fz->fz_hash) {
            free(fz);
            return NULL;
        }
        fz->fz_order = z;
        fz->fz_mask = inet_make_mask(z);
        // keep the zone list sorted from the longest prefix to the shortest
        for (i = z + 1; i <= 32; i++)
            if (table->nf_fn_zones[i])
                break;
        if (i > 32) {
            fz->fz_next = table->nf_fn_zone_list;
            table->nf_fn_zone_list = fz;
        } else {
            fz->fz_next = table->nf_fn_zones[i]->fz_next;
            table->nf_fn_zones[i]->fz_next = fz;
        }
        table->nf_fn_zones[z] = fz;
        return fz;
    }

    // Route table operation interface: 1. search; 2. delete.
    // Too many parameters, similar to the Win32 API; poor style, but convenient.
    int nf_route_table_opt(struct nf_fn_hash *table, const u_int32_t dst,
                           const u_int32_t mask, int del_option,
                           struct nf_fib_node **resf)
    {
        int rv = 1;
        struct nf_fn_zone *fz;
        struct nf_fib_node *del_node = NULL;

        if (NULL == table) {
            printf("INFO: route_table uninitialized!\n");
            return 1;
        }
        for (fz = table->nf_fn_zone_list; fz; fz = fz->fz_next) {
            struct hlist_head *head;
            struct hlist_node *node;
            struct nf_fib_node *f;
            u_int32_t k = nf_fz_key(dst, fz);

            head = &fz->fz_hash[nf_fn_hash(k, fz)];
            hlist_for_each_entry(f, node, head, nf_fn_hash) {
                if (f->fn_key == k) {
                    if (1 == del_option && mask == FZ_MASK(fz)) {
                        del_node = f;
                    } else if (0 == del_option) {
                        *resf = f;
                    }
                    rv = 0;
                    goto out;
                }
            }
        }
        rv = 1;
    out:
        if (del_node) {
            __hlist_del(&del_node->nf_fn_hash);
            free(del_node);
        }
        return rv;
    }

    // The helper's original name was lost in the source; nf_fib_insert_node is assumed.
    static inline void nf_fib_insert_node(struct nf_fn_zone *fz, struct nf_fib_node *f)
    {
        struct hlist_head *head = &fz->fz_hash[nf_fn_hash(f->fn_key, fz)];
        hlist_add_head(&f->nf_fn_hash, head);
    }

    int nf_route_table_search(struct nf_fn_hash *table, u_int32_t dst,
                              struct nf_fib_node **resf)
    {
        return nf_route_table_opt(table, dst, 32, 0, resf);
    }

    int nf_route_table_delete(struct nf_fn_hash *table, u_int32_t network,
                              u_int32_t mask)
    {
        return nf_route_table_opt(table, network, mask, 1, NULL);
    }

    int nf_route_table_add(struct nf_fn_hash *table, u_int32_t network,
                           u_int32_t netmask, struct nf_fib_node **node)
    {
        struct nf_fib_node *new_f;
        struct nf_fn_zone *fz;

        if (NULL == table) {
            return -13;
        }
        new_f = calloc(1, sizeof(struct nf_fib_node));
        if (!new_f)
            return -1;
        new_f->len = inet_mask_len(netmask);
        new_f->act_list = fz_hash_alloc(16);
        new_f->fn_key = network;
        if (new_f->len > 32 || !new_f->act_list)
            goto fail;
        INIT_HLIST_NODE(&new_f->nf_fn_hash);
        fz = table->nf_fn_zones[new_f->len];
        if (!fz && !(fz = fn_new_zone(table, new_f->len)))
            goto fail;
        nf_fib_insert_node(fz, new_f);   // danger
        *node = new_f;
        return 0;
    fail:
        free(new_f->act_list);
        free(new_f);
        return -1;
    }

    void nf_route_table_clear(struct nf_fn_hash *table)
    {
        struct nf_fn_zone *fz, *old;

        if (NULL == table) {
            printf("INFO: route_table was NULL, no need to clear!\n");
            return;
        }
        for (fz = table->nf_fn_zone_list; fz;) {
            free(fz->fz_hash);
            fz->fz_hash = NULL;
            old = fz;
            fz = fz->fz_next;
            free(old);
        }
        free(table);
    }

    int nf_route_policy_search(struct nf_fn_hash *src_table,
                               struct nf_fn_hash *dst_table,
                               uint32_t src_addr, uint32_t dst_addr)
    {
        int rv = 0;
        int flag = 0;
        struct nf_fib_node *src_node;
        struct nf_fib_node *dst_node;
        struct nf_action_node *action;
        struct hlist_head *head;
        struct hlist_node *node;

        rv = nf_route_table_search(src_table, src_addr, &src_node);
        if (rv) {
            rv = -100;
            goto ret;
        }
        rv = nf_route_table_search(dst_table, dst_addr, &dst_node);
        if (rv) {
            rv = -200;
            goto ret;
        }
        // note: the rain was easing off, so I have not implemented the coloring
        head = &src_node->act_list[dst_node->hash_value];
        hlist_for_each_entry(action, node, head, nf_src) {
            if (action->dst_node == dst_node) {
                rv = action->extra;
                flag = 1;
                break;
            }
        }
        if (flag == 0) {
            rv = src_node->def->extra;
        }
    ret:
        return rv;
    }

    int nf_route_policy_add(struct nf_fn_hash *src_table,
                            struct nf_fn_hash *dst_table,
                            uint32_t src_network, uint32_t dst_network,
                            uint32_t src_netmask, uint32_t dst_netmask,
                            struct nf_action_node *action,
                            struct nf_action_node *def)
    {
        int rv = 0;
        struct nf_fib_node *src_node;
        struct nf_fib_node *dst_node;
        struct hlist_head *src_head;
        struct hlist_head *dst_head;

        rv = nf_route_table_add(src_table, src_network, src_netmask, &src_node);
        if (rv) {
            rv = 1;
            goto ret;
        }
        src_node->def = def;
        // note how the hash value is calculated: entry address modulo bucket count
        src_node->hash_value = (unsigned int)((unsigned long)src_node % 16);
        rv = nf_route_table_add(dst_table, dst_network, dst_netmask, &dst_node);
        if (rv) {
            rv = 1;
            goto ret;
        }
        dst_node->def = def;
        dst_node->hash_value = (unsigned int)((unsigned long)dst_node % 16);
        action->src_node = src_node;
        action->dst_node = dst_node;
        INIT_HLIST_NODE(&action->nf_src);
        INIT_HLIST_NODE(&action->nf_dst);
        // each side files the action under its peer's hash value
        src_head = &src_node->act_list[dst_node->hash_value];
        hlist_add_head(&action->nf_src, src_head);
        dst_head = &dst_node->act_list[src_node->hash_value];
        hlist_add_head(&action->nf_dst, dst_head);
    ret:
        return rv;
    }

    int main(int argc, char **argv)
    {
        int rv = 0;
        struct nf_action_node *def, *act1, *act2;
        struct nf_fn_hash *src_table;
        struct nf_fn_hash *dst_table;

        src_table = calloc(1, sizeof(struct nf_fn_hash));
        dst_table = calloc(1, sizeof(struct nf_fn_hash));
        /* Add default route entry */
        def = calloc(1, sizeof(struct nf_action_node));
        def->extra = 10000;
        act1 = calloc(1, sizeof(struct nf_action_node));
        act1->extra = 1;
        act2 = calloc(1, sizeof(struct nf_action_node));
        act2->extra = 2;
        rv = nf_route_policy_add(src_table, dst_table, 0x00000000, 0x00000000,
                                 0x00000000, 0x00000000, act1, def);
        // a second policy needs its own action node; reusing act1 would corrupt the lists
        rv = nf_route_policy_add(src_table, dst_table, 0x0000a8c0, 0x00001010,
                                 0x0000ffff, 0x0000ffff, act2, def);
        rv = nf_route_policy_search(src_table, dst_table, 0x0102a9c0, 0x01021010);
        printf("%d\n", rv);
        return 0;
    }

In the code above, the definition of extra is too casual. Should I extend it further? But the rain is getting lighter and lighter, and for me every day without heavy rain is suffering. Dora, JI Dora!
Now to the real point of parallel execution. Parallel execution is the biggest feature of my design. In a multi-core environment, although the matching process is described as three steps, the two route lookups can run at the same time, so the first two steps are effectively one. As you can see, the descriptions and the code above are almost entirely repetitive: the same logic is written twice, symmetrically. But that alone is not enough. Most of the work lies in how you hand these symmetric operations to two executors without introducing overhead that makes the game not worth the candle, as Schopenhauer would have understood. The largest overhead is synchronization.
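
As a rough sketch of the idea (and of the pthread test the article did not get to), the two lookups can be given to two threads. Everything here is an assumption made for illustration: `demo_route`, `demo_job`, and `demo_parallel_lookup` are hypothetical, the lookup is a linear longest-prefix scan over host-order addresses, and a thread is created per call only to keep the example short.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

struct demo_route { uint32_t net, mask; };   /* host byte order */

struct demo_job {
    const struct demo_route *table;
    size_t n;
    uint32_t addr;
    const struct demo_route *result;   /* written only by the thread running the job */
};

/* One route lookup; used both as a thread entry point and as a plain call. */
static void *demo_lookup_thread(void *arg)
{
    struct demo_job *job = arg;

    job->result = NULL;
    for (size_t i = 0; i < job->n; i++) {
        const struct demo_route *r = &job->table[i];
        if ((job->addr & r->mask) == r->net &&
            (!job->result || r->mask > job->result->mask))
            job->result = r;
    }
    return NULL;
}

/* Run the source lookup on a helper thread and the destination lookup here. */
static int demo_parallel_lookup(struct demo_job *src_job, struct demo_job *dst_job)
{
    pthread_t helper;

    assert(src_job != NULL && dst_job != NULL);
    if (pthread_create(&helper, NULL, demo_lookup_thread, src_job) != 0)
        return -1;
    demo_lookup_thread(dst_job);
    pthread_join(helper, NULL);   /* the only synchronization point */
    return 0;
}
```

Each job has its own result slot, so the two lookups share no writable state and need no lock; the join is the only synchronization. Creating a thread per packet would cost far more than the lookup itself, so a real design would more likely keep long-lived workers pinned to cores, which is exactly the overhead question raised above.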
The world is like this, and so is life. Along with excitement you introduce complexity. Sometimes what you get is not worth what you pay, but it is exactly this excitement that keeps you moving forward, until you can no longer afford it... Given that, I have to say something about the art of balance.
In fact, the whole world is made of balance. Is that the doctrine of the mean? It does not seem so; otherwise the world would not move forward. For civilization and culture, true balance lies between collapse and reconstruction. Balance in design is different. Practical technology and aesthetic art have a static, technical beauty: because the weights on both sides keep changing, you must keep moving the fulcrum to maintain the balance. From the early terminal era to the current post-terminal era, you can see this balance at work. For an art like painting or music, balance is found right in the collapse, and the contrast shows itself in the collapse. In the end everything converges toward a point, and the energy released along the way touches your nerves.
This reminds me of He Yong, who said, "I am the biggest dumb." Then he wiped his nose on the reporter's camera.
