TCP/IP routing cache mechanism and flow identification mechanism

Source: Internet
Author: User

0x01 reason

The previous chapter covered the network layer (IP layer), where the key step is finding the next-hop route. This chapter therefore focuses on routing.

0x02 Related structures

1. Routing cache mechanism:
    struct rtable {
        union {
            struct dst_entry dst;   /* destination entry */
        } u;

        /*
         * Cache lookup key. A cached route is found by hashing and then
         * comparing fields such as source IP, destination IP, and TOS.
         */
        struct flowi fl;

        /*
         * Points to the IP configuration block of the egress device.
         * Note: for input routes delivered to the local host, the egress
         * device is set to the loopback device.
         */
        struct in_device *idev;

        int rt_genid;

        /*
         * Route control flags; also usable by interfaces that expose the
         * route table to applications. Several routes can land in the same
         * hash bucket and therefore conflict. When the garbage collector
         * processes the cache and a low-value route conflicts with a
         * high-value route, the low-value route tends to be evicted. These
         * flags determine the value of a route.
         */
        unsigned rt_flags;

        /* Route type: unicast, multicast, local, and so on. */
        __u16 rt_type;

        __be32 rt_dst;      /* destination IP address of the path */
        __be32 rt_src;      /* source IP address of the path */
        int rt_iif;         /* index of the ingress (input) interface */

        __be32 rt_gateway;  /* IP address of the gateway or neighbor */

        /* Miscellaneous cached information */
        __be32 rt_spec_dst; /* RFC1122 specific destination */

        /*
         * Long-living IP peer information. Although ordinary IP packets are
         * stateless, the kernel records some per-peer information to improve
         * efficiency. It mainly records the packet ID of IP packets, both to
         * detect duplicate packets and to check the packet ID increment.
         */
        struct inet_peer *peer;
    };
1.1 Flow identification mechanism:

    struct flowi {
        /*
         * The following two fields identify the input and output interfaces.
         * iif is the index of the input interface, taken from ifindex in the
         * net_device structure of the device that received the packet.
         * oif is the index of the output interface. Usually only one of iif
         * and oif is set for a given route, and the other is 0.
         */
        int oif;
        int iif;
        __u32 mark;                     /* firewall mark */

        /*
         * The structure is generic, so a union is used to cover
         * IPv4, IPv6, and DECnet.
         */
        union {
            struct {
                __be32 daddr;           /* destination address */
                __be32 saddr;           /* source address */
                __u8 tos;               /* type of service */
                __u8 scope;             /* scope */
            } ip4_u;

            struct {
                struct in6_addr daddr;
                struct in6_addr saddr;
                __be32 flowlabel;
            } ip6_u;

            struct {
                __le16 daddr;
                __le16 saddr;
                __u8 scope;
            } dn_u;
        } nl_u;
    #define fld_dst         nl_u.dn_u.daddr
    #define fld_src         nl_u.dn_u.saddr
    #define fld_scope       nl_u.dn_u.scope
    #define fl6_dst         nl_u.ip6_u.daddr
    #define fl6_src         nl_u.ip6_u.saddr
    #define fl6_flowlabel   nl_u.ip6_u.flowlabel
    #define fl4_dst         nl_u.ip4_u.daddr
    #define fl4_src         nl_u.ip4_u.saddr
    #define fl4_tos         nl_u.ip4_u.tos
    #define fl4_scope       nl_u.ip4_u.scope

        __u8 proto;                     /* protocol type */
        __u8 flags;                     /* flags */
    #define FLOWI_FLAG_ANYSRC 0x01
        union {
            struct {
                __be16 sport;           /* source port */
                __be16 dport;           /* destination port */
            } ports;

            struct {
                __u8 type;
                __u8 code;
            } icmpt;

            struct {
                __le16 sport;
                __le16 dport;
            } dnports;

            __be32 spi;

            struct {
                __u8 type;
            } mht;
        } uli_u;
    #define fl_ip_sport     uli_u.ports.sport
    #define fl_ip_dport     uli_u.ports.dport
    #define fl_icmp_type    uli_u.icmpt.type
    #define fl_icmp_code    uli_u.icmpt.code
    #define fl_ipsec_spi    uli_u.spi
    #define fl_mh_type      uli_u.mht.type
        __u32 secid;                    /* used by xfrm; see secid.txt */
    } __attribute__((__aligned__(BITS_PER_LONG/8)));

The structure definition above shows that flowi describes a packet by its source and destination addresses, its ports, its protocol, a user-defined mark, and even its inbound and outbound interfaces. Together these identifiers uniquely determine a traffic flow, and with them the route for a given flow can be found. Put another way, a route identifies a traffic flow in the network, while flowi identifies a traffic flow inside the operating system. The kernel extracts the relevant fields from the TCP/IP packet headers and fills them into a flowi structure; the route lookup module then uses this information to find the route for that flow. flowi is therefore a search key.
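To make the idea of flowi as a search key concrete, here is a minimal user-space sketch. The names flow_key, flow_key_equal, and flow_key_hash are hypothetical and exist only for this illustration. The struct copies a few IPv4 fields of flowi, and the hash mixing is an arbitrary choice, not the kernel's rt_hash.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified IPv4 flow key modeled loosely on struct flowi. */
struct flow_key {
    int      oif;    /* output interface index */
    int      iif;    /* input interface index */
    uint32_t mark;   /* firewall mark */
    uint32_t daddr;  /* destination address */
    uint32_t saddr;  /* source address */
    uint8_t  tos;    /* type of service */
};

/* Two packets belong to the same flow when every key field matches. */
static bool flow_key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->oif == b->oif && a->iif == b->iif && a->mark == b->mark &&
           a->daddr == b->daddr && a->saddr == b->saddr && a->tos == b->tos;
}

/*
 * Illustrative bucket selection only (assumption: buckets is nonzero).
 * The kernel's rt_hash uses a different mixing function.
 */
static uint32_t flow_key_hash(const struct flow_key *k, uint32_t buckets)
{
    uint32_t h = k->daddr ^ (k->saddr * 2654435761u) ^ (uint32_t)k->iif;

    h ^= h >> 16;
    return h % buckets;
}
```

A lookup would hash the key to pick a bucket and then call flow_key_equal on each entry in that bucket, which is the same two-step pattern the route cache follows.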

2. Route Search Process

ip_route_input() looks up the route for an incoming packet skb. It first searches the route cache hash table; if no entry is found there, it calls ip_route_input_slow() to search the route table (the FIB).

1. The Linux kernel has no table that is literally named the route table. Do not be misled by rtable: it does not store the real routes, it is a cache. The FIB table is what deserves to be called the route table.

2. The FIB stores all route information. A queried route is put into the route cache only when a packet is sent (or possibly when one is received), so the cache contains no data before any communication takes place.
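The relationship between the cache and the FIB can be sketched in user-space C. This is a toy model under stated assumptions: fib_entry, cache_slot, fib_slow_lookup, and route_lookup are hypothetical names, the cache is a direct-mapped array keyed only on the destination address, and the FIB is a linear list searched by longest prefix match. The kernel's real data structures are considerably more involved.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_SIZE 8

/* Hypothetical FIB entry: prefix, prefix length, next-hop identifier. */
struct fib_entry {
    uint32_t prefix;
    unsigned len;       /* 0..32 */
    int      next_hop;
};

/* Hypothetical cache slot keyed on the full destination address. */
struct cache_slot {
    int      valid;
    uint32_t daddr;
    int      next_hop;
};

static const struct fib_entry fib[] = {
    { 0x0a000000u,  8, 1 },  /* 10.0.0.0/8    -> hop 1 */
    { 0x0a010000u, 16, 2 },  /* 10.1.0.0/16   -> hop 2 */
    { 0x00000000u,  0, 9 },  /* default route -> hop 9 */
};

static struct cache_slot cache[CACHE_SIZE];  /* empty until traffic arrives */
static unsigned slow_lookups;                /* counts slow-path runs */

/* Slow path: longest prefix match over the whole table. */
static int fib_slow_lookup(uint32_t daddr)
{
    int best = -1;
    int best_len = -1;

    for (size_t i = 0; i < sizeof(fib) / sizeof(fib[0]); i++) {
        uint32_t mask = fib[i].len ? ~0u << (32 - fib[i].len) : 0;

        if ((daddr & mask) == fib[i].prefix && (int)fib[i].len > best_len) {
            best = fib[i].next_hop;
            best_len = (int)fib[i].len;
        }
    }
    slow_lookups++;
    return best;
}

/* Fast path first; on a miss, consult the FIB and cache the result. */
static int route_lookup(uint32_t daddr)
{
    struct cache_slot *slot = &cache[daddr % CACHE_SIZE];

    if (slot->valid && slot->daddr == daddr)
        return slot->next_hop;

    int hop = fib_slow_lookup(daddr);

    slot->valid = 1;
    slot->daddr = daddr;
    slot->next_hop = hop;
    return hop;
}
```

Calling route_lookup twice with the same address runs the slow path only once; the second call is answered from the cache, which starts out empty just as note 2 describes.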

0x03 Source code tracking

1. The call process is as follows:

2. Simple function analysis

2.1 ip_route_input

    int ip_route_input(struct sk_buff *skb, __be32 daddr, __be32 saddr,
                       u8 tos, struct net_device *dev)
    {
        struct rtable *rth;
        unsigned hash;
        int iif = dev->ifindex;
        struct net *net;

        net = dev_net(dev);

        /* Skip the cache if the route cache has exceeded the system setting */
        if (!rt_caching(net))
            goto skip_cache;

        tos &= IPTOS_RT_MASK;
        /* Hash lookup */
        hash = rt_hash(daddr, saddr, iif, rt_genid(net));

        rcu_read_lock();
        for (rth = rcu_dereference(rt_hash_table[hash].chain); rth;
             rth = rcu_dereference(rth->u.dst.rt_next)) {
            if (((rth->fl.fl4_dst ^ daddr) |
                 (rth->fl.fl4_src ^ saddr) |
                 (rth->fl.iif ^ iif) |
                 rth->fl.oif |
                 (rth->fl.fl4_tos ^ tos)) == 0 &&
                rth->fl.mark == skb->mark &&
                net_eq(dev_net(rth->u.dst.dev), net) &&
                !rt_is_expired(rth)) {
                /* Cache hit: attach the cached route to the skb */
                dst_use(&rth->u.dst, jiffies);
                RT_CACHE_STAT_INC(in_hit);
                rcu_read_unlock();
                skb_dst_set(skb, &rth->u.dst);
                return 0;
            }
            RT_CACHE_STAT_INC(in_hlist_search);
        }
        rcu_read_unlock();

    skip_cache:
        /* Multicast processing */
        if (ipv4_is_multicast(daddr)) {
            struct in_device *in_dev;

            rcu_read_lock();
            if ((in_dev = __in_dev_get_rcu(dev)) != NULL) {
                int our = ip_check_mc(in_dev, daddr, saddr,
                                      ip_hdr(skb)->protocol);
                if (our
    #ifdef CONFIG_IP_MROUTE
                    ||
                    (!ipv4_is_local_multicast(daddr) &&
                     IN_DEV_MFORWARD(in_dev))
    #endif
                   ) {
                    rcu_read_unlock();
                    return ip_route_input_mc(skb, daddr, saddr,
                                             tos, dev, our);
                }
            }
            rcu_read_unlock();
            return -EINVAL;
        }
        /* Slow path: the route has to be looked up in the FIB */
        return ip_route_input_slow(skb, daddr, saddr, tos, dev);
    }
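The cache-hit test in the loop compares several fields at once by XOR-ing each pair and OR-ing the results, so that a single comparison against zero covers all of them. The following user-space sketch isolates that idiom. The cache_key struct and input_key_matches function are hypothetical names for illustration, and the mark, namespace, and expiry checks from the kernel code are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduced cache key: only the fields compared with XOR/OR. */
struct cache_key {
    uint32_t dst;
    uint32_t src;
    int      iif;
    int      oif;
    uint8_t  tos;
};

/*
 * x ^ y is zero only when x == y, so OR-ing all the XOR results yields zero
 * only when every pair matches. oif is OR-ed in directly because a cached
 * input route is expected to have oif == 0.
 */
static bool input_key_matches(const struct cache_key *k, uint32_t daddr,
                              uint32_t saddr, int iif, uint8_t tos)
{
    return ((k->dst ^ daddr) |
            (k->src ^ saddr) |
            (uint32_t)(k->iif ^ iif) |
            (uint32_t)k->oif |
            (uint32_t)(k->tos ^ tos)) == 0;
}
```

Compared with a chain of && tests, this form evaluates every field unconditionally and branches once, which is a common choice on hot paths such as a per-packet cache lookup.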

2.2 ip_route_input_slow

    static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
                                   u8 tos, struct net_device *dev)
    {
        /* omitted */
        struct net *net = dev_net(dev);

        /* IP is disabled on this device */
        if (!in_dev)
            goto out;

        /* The source address must not be multicast, limited broadcast, or loopback */
        if (ipv4_is_multicast(saddr) || ipv4_is_lbcast(saddr) ||
            ipv4_is_loopback(saddr))
            goto martian_source;

        /* Address check: limited broadcast, or both addresses zero */
        if (daddr == htonl(0xFFFFFFFF) || (saddr == 0 && daddr == 0))
            goto brd_input;

        /* omitted */

        /*
         * The source and destination addresses have been sanity-checked to
         * decide how different packets are handled. Now route the packet.
         */
        if ((err = fib_lookup(net, &fl, &res)) != 0) {
            /* No route found: first check whether forwarding is enabled */
            if (!IN_DEV_FORWARD(in_dev))
                goto e_hostunreach;
            goto no_route;
        }
        /* A route was found; remember that res must be released */
        free_res = 1;

        RT_CACHE_STAT_INC(in_slow_tot);

        /* Process the route according to its type */
        if (res.type == RTN_BROADCAST)
            goto brd_input;

        if (res.type == RTN_LOCAL) {
            int result;

            /* The packet is for the local host: verify that the source address is valid */
            result = fib_validate_source(saddr, daddr, tos,
                                         net->loopback_dev->ifindex,
                                         dev, &spec_dst, &itag, skb->mark);
            if (result < 0)
                goto martian_source;
            if (result)
                flags |= RTCF_DIRECTSRC;
            spec_dst = daddr;
            goto local_input;
        }

        if (!IN_DEV_FORWARD(in_dev))
            goto e_hostunreach;
        if (res.type != RTN_UNICAST)
            goto martian_destination;

        /* The route points to a remote host: add it to the cache */
        err = ip_mkroute_input(skb, &res, &fl, in_dev, daddr, saddr, tos);
    done:
        in_dev_put(in_dev);
        if (free_res)
            fib_res_put(&res);
    out:
        return err;

        /* The destination is a broadcast address, or the route found is of broadcast type */
    brd_input:
        /* omitted ...... */

        /* The route found points to the local host */
    local_input:
        /* Allocate a cache route entry and fill it in */
        rth = dst_alloc(&ipv4_dst_ops);
        if (!rth)
            goto e_nobufs;
        /* omitted ...... */
        /* Called once the route lookup is complete, to hand the packet to the upper layer */
        rth->u.dst.input = ip_local_deliver;
        /* omitted ...... */
        rth->rt_type = res.type;
        hash = rt_hash(daddr, saddr, fl.iif, rt_genid(net));
        /* Insert the new route entry into the cache */
        err = rt_intern_hash(hash, rth, NULL, skb);
        goto done;

        /* No route was found: add an unreachable route entry to the cache */
    no_route:
        /* omitted ...... */
    }
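The first checks in ip_route_input_slow reject source addresses that can never be legitimate senders ("martians"). A minimal user-space sketch of those tests follows. The helper names are hypothetical, and unlike the kernel helpers, which work on network byte order values, this sketch assumes addresses in host byte order for readability.

```c
#include <stdbool.h>
#include <stdint.h>

/* All addresses below are assumed to be in host byte order. */

/* 224.0.0.0/4: the top four bits are 1110. */
static bool is_multicast(uint32_t a)
{
    return (a & 0xf0000000u) == 0xe0000000u;
}

/* 255.255.255.255: the limited broadcast address. */
static bool is_lbcast(uint32_t a)
{
    return a == 0xffffffffu;
}

/* 127.0.0.0/8: the loopback network. */
static bool is_loopback(uint32_t a)
{
    return (a & 0xff000000u) == 0x7f000000u;
}

/* A packet must not claim any of these as its source address. */
static bool is_martian_source(uint32_t saddr)
{
    return is_multicast(saddr) || is_lbcast(saddr) || is_loopback(saddr);
}
```

For example, 224.0.0.1 (0xe0000001) and 127.0.0.1 (0x7f000001) are rejected as sources, while 10.0.0.1 (0x0a000001) passes this particular check.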

All of the functions above are reached from ip_route_input. When the network adapter hands a received packet to the IP layer, the IP layer first performs a route lookup to decide where the packet should go. Based on checks of the source and destination addresses, among other things, routes are classified as multicast, broadcast, unicast, or local routes, and these routes also have different retention policies in the cache. A key step is the assignment to the function pointer rth->u.dst.input: the IP layer ultimately dispatches the packet through this function.
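Dispatching through a function pointer stored in the route entry can be sketched as follows. This is a user-space illustration under stated assumptions: route_entry, packet, local_deliver, forward, and dispatch are hypothetical stand-ins. In the kernel listing above, the local route gets ip_local_deliver as its input handler; the forwarding handler here merely stands in for whatever a forwarded route would install.

```c
#include <stddef.h>

struct packet;  /* forward declaration so route_entry can refer to it */

/* Hypothetical stand-in for the dst entry: it carries the input handler. */
struct route_entry {
    int (*input)(struct packet *pkt);
};

struct packet {
    struct route_entry *dst;  /* attached by the route lookup */
    int delivered_locally;
    int forwarded;
};

/* Stand-in for local delivery: hand the packet to the upper layer. */
static int local_deliver(struct packet *pkt)
{
    pkt->delivered_locally = 1;
    return 0;
}

/* Stand-in for forwarding: send the packet on toward a remote host. */
static int forward(struct packet *pkt)
{
    pkt->forwarded = 1;
    return 0;
}

/*
 * The receive path does not branch on the route type. It simply calls
 * whichever handler the route lookup installed.
 */
static int dispatch(struct packet *pkt)
{
    if (pkt->dst == NULL || pkt->dst->input == NULL)
        return -1;
    return pkt->dst->input(pkt);
}
```

The design choice is that the decision is made once, at lookup time, and cached with the route; every later packet on the same flow pays only for an indirect call.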

0x04 Summary

Routing is admittedly a bit of a detour from the original topic. The next section covers the TCP layer. (I am still learning, so please go easy on me and simply point out any mistakes.)
