Linux kernel Analysis

Source: Internet
Author: User

Kernel version: 2.6.34

The previous article on the routing table, http://blog.csdn.net/qy532846454/article/details/6423496, described the structure of the routing table and how it is created. This article adds detail on how the routing tables are used.

Routing can be divided into two parts: the routing cache (rt_hash_table) and the routing tables.

The routing cache, as the name suggests, exists to speed up route lookups. Entries are inserted into the cache by the kernel itself, not by an administrator. Routing table entries, by contrast, come from configuration rather than being generated by the kernel on its own. In the kernel, the routing cache is organized as the rt_hash_table structure.

The following fragment is from the IP layer [net/ipv4/route.c]. An incoming packet is first looked up in the routing cache. If a matching entry already exists, skb_dst_set(skb, &rth->u.dst) is called and the function returns; otherwise the routing table is queried.

    hash = rt_hash(daddr, saddr, iif, rt_genid(net));

    rcu_read_lock();
    for (rth = rcu_dereference(rt_hash_table[hash].chain); rth;
         rth = rcu_dereference(rth->u.dst.rt_next)) {
        if (((rth->fl.fl4_dst ^ daddr) |
             (rth->fl.fl4_src ^ saddr) |
             (rth->fl.iif ^ iif) |
             rth->fl.oif |
             (rth->fl.fl4_tos ^ tos)) == 0 &&
            rth->fl.mark == skb->mark &&
            net_eq(dev_net(rth->u.dst.dev), net) &&
            !rt_is_expired(rth)) {
            dst_use(&rth->u.dst, jiffies);
            RT_CACHE_STAT_INC(in_hit);
            rcu_read_unlock();
            skb_dst_set(skb, &rth->u.dst);
            return 0;
        }
        RT_CACHE_STAT_INC(in_hlist_search);
    }
    rcu_read_unlock();
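The key comparison in the loop above relies on a small trick: XOR of two equal values is 0, so OR-ing several XOR results is 0 only when every field matches. The following is a minimal user-space sketch of that idea, not kernel code. The names `flow_key`, `cache_entry`, `key_matches` and `chain_lookup` are hypothetical stand-ins, and the mark, namespace and expiry checks are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the flow key kept in each cache entry. */
struct flow_key {
    uint32_t dst;   /* destination address */
    uint32_t src;   /* source address */
    int      iif;   /* input interface index */
    int      oif;   /* output interface index, 0 for input routes */
    uint8_t  tos;   /* type of service */
};

/* Hypothetical cache entry: a key plus a link to the next entry in the bucket. */
struct cache_entry {
    struct flow_key key;
    struct cache_entry *next;
};

/*
 * XOR yields 0 only for equal values, so OR-ing all the XOR results
 * gives 0 only when every field matches. oif is OR-ed in directly
 * because an input route must have oif == 0.
 */
static int key_matches(const struct flow_key *k, uint32_t dst, uint32_t src,
                       int iif, uint8_t tos)
{
    return ((k->dst ^ dst) |
            (k->src ^ src) |
            (uint32_t)(k->iif ^ iif) |
            (uint32_t)k->oif |
            (uint32_t)(k->tos ^ tos)) == 0;
}

/* Walk one bucket chain and return the first matching entry, or NULL. */
static struct cache_entry *chain_lookup(struct cache_entry *head, uint32_t dst,
                                        uint32_t src, int iif, uint8_t tos)
{
    struct cache_entry *e;

    for (e = head; e != NULL; e = e->next)
        if (key_matches(&e->key, dst, src, iif, tos))
            return e;
    return NULL;
}
```

Combining the fields with bitwise operators lets the whole key be tested with a single comparison against zero instead of a chain of separate equality tests.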

When the cache lookup in ip_route_input() does not find an entry, multicast addresses are handled next. If the destination is a multicast address, the following check succeeds: ipv4_is_multicast(daddr).

ip_route_input_mc() is then executed. Its main job is to create the route cache entry rth and insert it into the cache. Of the creation and initialization of rth, only the assignment of the input function is shown here and the rest is omitted. It shows that multicast packets continue upward through ip_local_deliver().

    rth->u.dst.input = ip_local_deliver;
    hash = rt_hash(daddr, saddr, dev->ifindex, rt_genid(dev_net(dev)));
    return rt_intern_hash(hash, rth, NULL, skb, dev->ifindex);
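For reference, the multicast test amounts to checking whether the address falls in 224.0.0.0/4, that is, whether its top four bits are 1110. The sketch below illustrates that check in user space. The function name `is_multicast_host_order` is hypothetical, and unlike the kernel helper, which operates on network byte order values, it takes the address in host byte order to stay independent of endianness.

```c
#include <stdint.h>

/*
 * Returns 1 when a host-byte-order IPv4 address is in 224.0.0.0/4.
 * Multicast addresses have 1110 as their top four bits.
 */
static int is_multicast_host_order(uint32_t addr)
{
    return (addr & 0xf0000000u) == 0xe0000000u;
}
```

With this sketch, 224.0.0.1 (0xe0000001) is classified as multicast, while 1.2.3.4 (0x01020304) is not.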

The routing table can also be divided into two: RT_TABLE_LOCAL and RT_TABLE_MAIN.

RT_TABLE_LOCAL stores the routing entries whose destination address is the local host, that is, the IP addresses configured on each network card.

RT_TABLE_MAIN stores the routing entries that lead to other hosts.

For incoming packets, RT_TABLE_MAIN only matters when the host acts as a router; a host that cannot forward packets has no use for it on the receive path. RT_TABLE_LOCAL is sufficient for such a host. The IP address configured on each NIC is added to RT_TABLE_LOCAL: if address 1.2.3.4 is configured on eth1, for example, RT_TABLE_LOCAL will contain a route entry for 1.2.3.4. Only addresses of local interfaces, such as lo and eth1, are added. When the IP module is initialized, ip_init() -> ip_rt_init() -> ip_fib_init() registers two notifiers, fib_netdev_notifier and fib_inetaddr_notifier. They are invoked when a network adapter address is configured, so that the change is reflected in RT_TABLE_LOCAL.

    register_netdevice_notifier(&fib_netdev_notifier);
    register_inetaddr_notifier(&fib_inetaddr_notifier);
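The notifier pattern used here can be pictured as a list of callbacks that the address-configuration code walks whenever something changes. The following is a simplified user-space model, not the kernel's notifier API. All names in it (`notifier`, `register_notifier`, `notify_all`, `local_table_event`, `is_local`, `ADDR_UP`, `ADDR_DOWN`) are hypothetical, and the local table is reduced to a small array of addresses.

```c
#include <stddef.h>
#include <stdint.h>

#define ADDR_UP   1   /* hypothetical event: an address was configured */
#define ADDR_DOWN 2   /* hypothetical event: an address was removed */
#define MAX_LOCAL 8

/* Hypothetical notifier block: a callback plus a link to the next one. */
struct notifier {
    void (*call)(int event, uint32_t addr);
    struct notifier *next;
};

static struct notifier *chain_head;
static uint32_t local_table[MAX_LOCAL];  /* toy stand-in for the local table */
static int local_count;

/* Add a callback to the front of the chain. */
static void register_notifier(struct notifier *n)
{
    n->next = chain_head;
    chain_head = n;
}

/* Called by the "address configuration" side; runs every callback. */
static void notify_all(int event, uint32_t addr)
{
    struct notifier *n;

    for (n = chain_head; n != NULL; n = n->next)
        n->call(event, addr);
}

/* Callback that keeps the toy local table in sync with address changes. */
static void local_table_event(int event, uint32_t addr)
{
    int i;

    if (event == ADDR_UP && local_count < MAX_LOCAL) {
        local_table[local_count++] = addr;
    } else if (event == ADDR_DOWN) {
        for (i = 0; i < local_count; i++) {
            if (local_table[i] == addr) {
                /* Fill the hole with the last entry and shrink the table. */
                local_table[i] = local_table[--local_count];
                break;
            }
        }
    }
}

/* Returns 1 if addr is currently in the toy local table. */
static int is_local(uint32_t addr)
{
    int i;

    for (i = 0; i < local_count; i++)
        if (local_table[i] == addr)
            return 1;
    return 0;
}
```

The point of the indirection is that the code configuring an address does not need to know about the routing table; it only raises an event, and whichever subsystems registered a callback react to it.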

When no entry is found in the routing cache, the routing table is queried. Again taking the IP layer code [net/ipv4/route.c] as the example, fib_lookup() searches the two tables, local and main.

    if ((err = fib_lookup(net, &fl, &res)) != 0) {
        if (!IN_DEV_FORWARD(in_dev))
            goto e_hostunreach;
        goto no_route;
    }

If the host is configured to support forwarding, a cache entry is generated for this query whether or not a route was found in the routing table. The entry is keyed on the source IP, the destination IP, and the receiving interface, and is inserted into the route cache:

    hash = rt_hash(daddr, saddr, fl.iif, rt_genid(net));
    err = rt_intern_hash(hash, rth, NULL, skb, fl.iif);

The difference is in what the entry contains. If the routing table query fails, meaning the packet is neither addressed to the local host nor forwardable by it, the entry inserted into the route cache has u.dst.input set to ip_error. u.dst.input is the function through which the packet is passed on once IP layer processing is done, and ip_error() discards the packet and sends the corresponding ICMP error message. Destinations that are absent from the routing table are therefore also inserted into the routing cache. This can be seen as a form of route learning: the next packet for the same destination is resolved directly from the cache.

    rth->u.dst.input = ip_error;
    rth->u.dst.error = -err;
    rth->rt_flags &= ~RTCF_LOCAL;

However, if the host does not support forwarding, that is, it has no routing capability, a route cache entry is added only when the lookup succeeds; when it fails, no cache entry is generated. A miss in the local table means the packet is not addressed to the local host, so caching that result would serve no purpose for the host's own packet handling. The host only needs to know which packets are addressed to it and can ignore the rest.

Route lookup is wrapped up by ip_route_input(), which performs the routing cache query, then the routing table query, and then updates the routing cache. The routing cache may be updated on every packet arrival. The routing table is different: it can only be updated through the RTM mechanism. The local table is updated when a NIC address is configured, and the main table is populated by manual insertion (inet_rtm_newroute).
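The three outcomes described above (successful lookup, failed lookup with forwarding enabled, failed lookup with forwarding disabled) can be summarized as a small decision function. This is only an illustrative sketch; the enum values and the function `classify` are hypothetical names that do not appear in the kernel.

```c
#include <stdbool.h>

/* Hypothetical outcomes of handling an input packet on the slow path. */
enum slow_path_result {
    CACHE_NORMAL_ENTRY,   /* lookup succeeded: cache the real route */
    CACHE_ERROR_ENTRY,    /* lookup failed, forwarding on: cache an ip_error entry */
    NO_CACHE_ENTRY        /* lookup failed, forwarding off: nothing is cached */
};

/* Decide what, if anything, goes into the route cache. */
static enum slow_path_result classify(bool lookup_ok, bool forwarding_enabled)
{
    if (lookup_ok)
        return CACHE_NORMAL_ENTRY;
    return forwarding_enabled ? CACHE_ERROR_ENTRY : NO_CACHE_ENTRY;
}
```

Note that forwarding only affects the result when the lookup fails; a successful lookup is cached in both configurations.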

ip_route_input()

- Routing cache query

- Routing table query: ip_route_input_slow() -> fib_lookup()
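The two-step flow in this outline can be modeled in a few lines of user-space C: try the cache first, and on a miss fall back to the table and remember the answer. This is a toy sketch under strong simplifying assumptions: the cache has one slot per bucket with no chaining, the key is the destination address only, the "routing table" is a single hard-coded local address, and negative results are cached as in the forwarding-enabled case. All names (`entry`, `cache`, `table_lookup`, `route_input`, `hits`, `misses`) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define BUCKETS 16

/* Hypothetical cache slot: the key and the answer learned for it. */
struct entry {
    uint32_t dst;
    int      is_local;   /* result learned from the table lookup */
    int      used;       /* 0 while the slot is empty */
};

static struct entry cache[BUCKETS];   /* one entry per bucket, no chaining */
static int hits, misses;

/* Toy stand-in for the routing table: a single local address, 1.2.3.4. */
static const uint32_t local_addr = 0x01020304u;

/* Slow path: consult the "table". */
static int table_lookup(uint32_t dst)
{
    return dst == local_addr;
}

/* Fast path first; on a miss, query the table and remember the answer. */
static int route_input(uint32_t dst)
{
    struct entry *e = &cache[dst % BUCKETS];

    if (e->used && e->dst == dst) {
        hits++;
        return e->is_local;
    }
    misses++;
    e->dst = dst;
    e->is_local = table_lookup(dst);
    e->used = 1;
    return e->is_local;
}
```

Calling `route_input` twice with the same destination takes the slow path the first time and the cache the second time, which is the behavior the outline describes.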
