Linux Kernel VPN implementation source code analysis (3)

Source: Internet
Author: User

The previous article briefly introduced the ipip module and tunnel initialization. This article walks through how packets are sent and received with the ipip protocol.

Before looking at the send and receive paths, note that the MTU needs attention. Encapsulation adds one outer IP header to every packet, so the tunnel MTU must be smaller than the MTU of the underlying link by the length of that header. The code is as follows.

    static int ipip_tunnel_change_mtu(struct net_device *dev, int new_mtu)
    {
        /* Reject values outside the range an IPIP tunnel can carry. */
        if (new_mtu < 68 || new_mtu > 0xFFF8 - sizeof(struct iphdr))
            return -EINVAL;
        dev->mtu = new_mtu;
        return 0;
    }

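The check new_mtu < 68 exists because 68 bytes is the minimum IPv4 MTU: RFC 791 requires every link to carry a 68-byte datagram without fragmentation (a 60-byte maximum IP header plus an 8-byte fragment). This should not be confused with the 60-byte minimum Ethernet frame, below which the link layer pads the frame. The upper bound, 0xFFF8 - sizeof(struct iphdr), keeps the encapsulated packet, outer header included, within 0xFFF8 (65528) bytes.

The receive function is short and straightforward.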
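To make the two bounds concrete, here is a minimal userspace sketch of the same range check. The names ipip_mtu_is_valid and ipip_inner_mtu are made up for illustration, and the sketch assumes a 20-byte outer IPv4 header without options, which is what sizeof(struct iphdr) evaluates to.

```c
#include <assert.h>

/* Assumption: outer IPv4 header without options (sizeof(struct iphdr)). */
#define OUTER_IPHDR_LEN 20
#define IPV4_MIN_MTU    68      /* minimum IPv4 MTU from RFC 791 */
#define IPIP_MAX_TOTAL  0xFFF8  /* upper bound used in the kernel check */

/* Hypothetical mirror of the range check in ipip_tunnel_change_mtu().
 * Returns 1 if new_mtu would be accepted, 0 otherwise. */
static int ipip_mtu_is_valid(int new_mtu)
{
    return new_mtu >= IPV4_MIN_MTU &&
           new_mtu <= IPIP_MAX_TOTAL - OUTER_IPHDR_LEN;
}

/* Largest inner packet that fits on a link with the given MTU
 * once the outer header has been added. */
static int ipip_inner_mtu(int link_mtu)
{
    assert(link_mtu >= OUTER_IPHDR_LEN);
    return link_mtu - OUTER_IPHDR_LEN;
}
```

With a typical 1500-byte Ethernet MTU, ipip_inner_mtu(1500) gives 1480, which is the value usually seen on an ipip tunnel device.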

    static int ipip_rcv(struct sk_buff *skb)
    {
        struct ip_tunnel *tunnel;
        const struct iphdr *iph = ip_hdr(skb);

        read_lock(&ipip_lock);
        if ((tunnel = ipip_tunnel_lookup(dev_net(skb->dev),
                                         iph->saddr, iph->daddr)) != NULL) {
            if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb)) {
                read_unlock(&ipip_lock);
                kfree_skb(skb);
                return 0;
            }

            secpath_reset(skb);

            /* Logically strip the outer header by moving the offsets. */
            skb->mac_header = skb->network_header;
            skb_reset_network_header(skb);
            skb->protocol = htons(ETH_P_IP);
            skb->pkt_type = PACKET_HOST;

            tunnel->dev->stats.rx_packets++;
            tunnel->dev->stats.rx_bytes += skb->len;
            skb->dev = tunnel->dev;
            skb_dst_drop(skb);
            nf_reset(skb);
            ipip_ecn_decapsulate(iph, skb);
            /* Hand the inner packet back to the stack. */
            netif_rx(skb);
            read_unlock(&ipip_lock);
            return 0;
        }
        read_unlock(&ipip_lock);

        return -1;
    }

ipip_tunnel_lookup finds the tunnel from the source and destination addresses of the outer header. If no tunnel matches, ipip_rcv returns -1 and the packet is not handled here. Otherwise the outer IP header is stripped. Nothing is physically removed: the code only executes skb->mac_header = skb->network_header; skb_reset_network_header(skb), so the header is removed logically by moving the header offsets. The packet is then marked as an ordinary IP packet (skb->protocol = htons(ETH_P_IP)), and netif_rx hands it back to the stack, where the IP layer processes the inner packet. That completes the receive path.
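The logical stripping can be pictured with a small userspace model. This is only a sketch: struct toy_skb and its helpers are hypothetical stand-ins for the few sk_buff fields involved, and it assumes the data offset already points past the outer header when the handler runs, which is how the IPv4 local delivery path leaves it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the sk_buff fields touched by ipip_rcv(). */
struct toy_skb {
    const uint8_t *head;    /* start of the buffer */
    size_t mac_header;      /* offset of the link-layer header */
    size_t network_header;  /* offset of the network-layer header */
    size_t data;            /* offset of the current payload */
};

/* Mirrors the two assignments in ipip_rcv(): no byte is copied or freed,
 * only the offsets move so that the inner header becomes "the" IP header. */
static void toy_decapsulate(struct toy_skb *skb)
{
    assert(skb != NULL);
    skb->mac_header = skb->network_header;  /* outer IP header now plays the MAC role */
    skb->network_header = skb->data;        /* like skb_reset_network_header() */
}

/* Pointer to whatever is currently considered the network header. */
static const uint8_t *toy_network_header(const struct toy_skb *skb)
{
    return skb->head + skb->network_header;
}
```

For a frame laid out as 14 bytes of Ethernet header, 20 bytes of outer IP header and then the inner packet, the offsets go from (0, 14, 34) to (14, 34, 34) and the buffer contents stay untouched.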


The tunnel lookup function, ipip_tunnel_lookup, is implemented as follows:

    static struct ip_tunnel *ipip_tunnel_lookup(struct net *net,
                                                __be32 remote, __be32 local)
    {
        unsigned h0 = HASH(remote);
        unsigned h1 = HASH(local);
        struct ip_tunnel *t;
        struct ipip_net *ipn = net_generic(net, ipip_net_id);

        /* 1. Tunnels with both remote and local address set. */
        for (t = ipn->tunnels_r_l[h0 ^ h1]; t; t = t->next) {
            if (local == t->parms.iph.saddr &&
                remote == t->parms.iph.daddr && (t->dev->flags & IFF_UP))
                return t;
        }
        /* 2. Tunnels with only the remote address set. */
        for (t = ipn->tunnels_r[h0]; t; t = t->next) {
            if (remote == t->parms.iph.daddr && (t->dev->flags & IFF_UP))
                return t;
        }
        /* 3. Tunnels with only the local address set. */
        for (t = ipn->tunnels_l[h1]; t; t = t->next) {
            if (local == t->parms.iph.saddr && (t->dev->flags & IFF_UP))
                return t;
        }
        /* 4. The wildcard (fallback) tunnel. */
        if ((t = ipn->tunnels_wc[0]) != NULL && (t->dev->flags & IFF_UP))
            return t;
        return NULL;
    }

In essence this is a hash lookup keyed on the tunnel endpoint addresses, that is, the source and destination addresses of the outer header before it is stripped, not those of the original inner packet. The tables are tried from most to least specific: remote and local, remote only, local only, and finally the wildcard tunnel.

The transmit function is shown below.
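The lookup order can be reproduced in a small userspace model. This is a sketch, not the kernel code: the toy_ types and functions are hypothetical, addresses are treated as opaque 32-bit values with 0 meaning "not set", and an `up` flag stands in for the IFF_UP test. The HASH macro folds an address into one of 16 buckets in the same spirit as the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_SIZE 16
#define HASH(addr) (((addr) ^ ((addr) >> 4)) & 0xF)

/* Hypothetical, trimmed-down tunnel entry. */
struct toy_tunnel {
    struct toy_tunnel *next;
    uint32_t saddr;  /* local endpoint, 0 = any */
    uint32_t daddr;  /* remote endpoint, 0 = any */
    int up;          /* stands in for (dev->flags & IFF_UP) */
};

struct toy_ipip_net {
    struct toy_tunnel *tunnels_r_l[HASH_SIZE];
    struct toy_tunnel *tunnels_r[HASH_SIZE];
    struct toy_tunnel *tunnels_l[HASH_SIZE];
    struct toy_tunnel *tunnels_wc[1];
};

/* Pick the chain a tunnel belongs to, based on which endpoints are set. */
static struct toy_tunnel **toy_bucket(struct toy_ipip_net *ipn,
                                      uint32_t remote, uint32_t local)
{
    if (remote && local)
        return &ipn->tunnels_r_l[HASH(remote) ^ HASH(local)];
    if (remote)
        return &ipn->tunnels_r[HASH(remote)];
    if (local)
        return &ipn->tunnels_l[HASH(local)];
    return &ipn->tunnels_wc[0];
}

static void toy_link(struct toy_ipip_net *ipn, struct toy_tunnel *t)
{
    struct toy_tunnel **tp = toy_bucket(ipn, t->daddr, t->saddr);
    t->next = *tp;
    *tp = t;
}

/* Most specific match first, wildcard tunnel last. */
static struct toy_tunnel *toy_lookup(struct toy_ipip_net *ipn,
                                     uint32_t remote, uint32_t local)
{
    struct toy_tunnel *t;

    assert(ipn != NULL);
    for (t = ipn->tunnels_r_l[HASH(remote) ^ HASH(local)]; t; t = t->next)
        if (local == t->saddr && remote == t->daddr && t->up)
            return t;
    for (t = ipn->tunnels_r[HASH(remote)]; t; t = t->next)
        if (remote == t->daddr && t->up)
            return t;
    for (t = ipn->tunnels_l[HASH(local)]; t; t = t->next)
        if (local == t->saddr && t->up)
            return t;
    t = ipn->tunnels_wc[0];
    return (t != NULL && t->up) ? t : NULL;
}
```

A fully specified tunnel wins over a remote-only one, and a tunnel whose device is down is skipped, so the packet falls through to the wildcard tunnel if one exists.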

    static int ipip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
    {
        struct ip_tunnel *tunnel = netdev_priv(dev);
        struct net_device_stats *stats = &tunnel->dev->stats;
        struct iphdr *tiph = &tunnel->parms.iph;
        u8 tos = tunnel->parms.iph.tos;
        __be16 df = tiph->frag_off;
        struct rtable *rt;              /* Route to the other host */
        struct net_device *tdev;        /* Device to other host */
        struct iphdr *old_iph = ip_hdr(skb);
        struct iphdr *iph;              /* Our new IP header */
        unsigned int max_headroom;      /* The extra header space needed */
        __be32 dst = tiph->daddr;
        int mtu;

        if (tunnel->recursion++) {
            stats->collisions++;
            goto tx_error;
        }

        if (skb->protocol != htons(ETH_P_IP))
            goto tx_error;

        if (tos & 1)
            tos = old_iph->tos;

        if (!dst) {
            /* NBMA tunnel */
            if ((rt = skb_rtable(skb)) == NULL) {
                stats->tx_fifo_errors++;
                goto tx_error;
            }
            if ((dst = rt->rt_gateway) == 0)
                goto tx_error_icmp;
        }

        {
            struct flowi fl = { .oif = tunnel->parms.link,
                                .nl_u = { .ip4_u =
                                          { .daddr = dst,
                                            .saddr = tiph->saddr,
                                            .tos = RT_TOS(tos) } },
                                .proto = IPPROTO_IPIP };
            if (ip_route_output_key(dev_net(dev), &rt, &fl)) {
                stats->tx_carrier_errors++;
                goto tx_error_icmp;
            }
        }
        tdev = rt->u.dst.dev;

        if (tdev == dev) {
            ip_rt_put(rt);
            stats->collisions++;
            goto tx_error;
        }

        if (tiph->frag_off)
            mtu = dst_mtu(&rt->u.dst) - sizeof(struct iphdr);
        else
            mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;

        if (mtu < 68) {
            stats->collisions++;
            ip_rt_put(rt);
            goto tx_error;
        }
        if (skb_dst(skb))
            skb_dst(skb)->ops->update_pmtu(skb_dst(skb), mtu);

        df |= (old_iph->frag_off & htons(IP_DF));

        if ((old_iph->frag_off & htons(IP_DF)) && mtu < ntohs(old_iph->tot_len)) {
            icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
            ip_rt_put(rt);
            goto tx_error;
        }

        if (tunnel->err_count > 0) {
            if (time_before(jiffies,
                            tunnel->err_time + IPTUNNEL_ERR_TIMEO)) {
                tunnel->err_count--;
                dst_link_failure(skb);
            } else
                tunnel->err_count = 0;
        }

        /*
         * Okay, now see if we can stuff it in the buffer as-is.
         */
        max_headroom = (LL_RESERVED_SPACE(tdev) + sizeof(struct iphdr));

        if (skb_headroom(skb) < max_headroom || skb_shared(skb) ||
            (skb_cloned(skb) && !skb_clone_writable(skb, 0))) {
            struct sk_buff *new_skb = skb_realloc_headroom(skb, max_headroom);
            if (!new_skb) {
                ip_rt_put(rt);
                stats->tx_dropped++;
                dev_kfree_skb(skb);
                tunnel->recursion--;
                return 0;
            }
            if (skb->sk)
                skb_set_owner_w(new_skb, skb->sk);
            dev_kfree_skb(skb);
            skb = new_skb;
            old_iph = ip_hdr(skb);
        }

        skb->transport_header = skb->network_header;
        skb_push(skb, sizeof(struct iphdr));
        skb_reset_network_header(skb);
        memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
        IPCB(skb)->flags &= ~(IPSKB_XFRM_TUNNEL_SIZE | IPSKB_XFRM_TRANSFORMED |
                              IPSKB_REROUTED);
        skb_dst_drop(skb);
        skb_dst_set(skb, &rt->u.dst);

        /*
         * Push down and install the IPIP header.
         */
        iph = ip_hdr(skb);
        iph->version = 4;
        iph->ihl = sizeof(struct iphdr) >> 2;
        iph->frag_off = df;
        iph->protocol = IPPROTO_IPIP;
        iph->tos = INET_ECN_encapsulate(tos, old_iph->tos);
        iph->daddr = rt->rt_dst;
        iph->saddr = rt->rt_src;

        if ((iph->ttl = tiph->ttl) == 0)
            iph->ttl = old_iph->ttl;

        nf_reset(skb);

        IPTUNNEL_XMIT();
        tunnel->recursion--;
        return 0;

    tx_error_icmp:
        dst_link_failure(skb);
    tx_error:
        stats->tx_errors++;
        dev_kfree_skb(skb);
        tunnel->recursion--;
        return 0;
    }

In general, sending consists of two steps: finding the route, and building a new IP packet according to the ipip protocol.

The route lookup is not covered in detail here; the focus is on building the new ipip packet. First, skb_headroom checks whether the headroom left in the skb can hold the outer IP header plus the link-layer reserve. If it cannot, or if the skb is shared or is a clone that cannot be written to, skb_realloc_headroom allocates a new buffer with enough headroom. The same could be done with skb_copy_expand. Once the space is available, the fields of the outer header are filled in.

At this point most of the logic of the ipip protocol has been covered. One more function is worth noting.
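As an illustration of the second step, the following userspace sketch prepends an outer IPv4 header to an inner packet in a plain byte buffer. It is a simplified model under stated assumptions: ipip_encapsulate and ipv4_checksum are hypothetical helper names, the outer header has no options, TOS and identification are left at zero, and the total length and checksum are written directly here rather than by a later stage of the transmit path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IPV4_HDR_LEN 20
#define PROTO_IPIP   4  /* value of IPPROTO_IPIP */

/* Internet checksum (RFC 1071) over an even-length header. */
static uint16_t ipv4_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Copy `inner` behind a freshly built outer header in `out`.
 * Returns the total length, or 0 if the result does not fit. */
static size_t ipip_encapsulate(uint8_t *out, size_t out_cap,
                               const uint8_t *inner, size_t inner_len,
                               const uint8_t src[4], const uint8_t dst[4],
                               uint8_t ttl, int dont_fragment)
{
    size_t total = IPV4_HDR_LEN + inner_len;
    uint16_t csum;

    assert(out != NULL && inner != NULL);
    if (total > out_cap || total > 0xFFFF)
        return 0;  /* the kernel would reallocate headroom or drop instead */

    memcpy(out + IPV4_HDR_LEN, inner, inner_len);
    memset(out, 0, IPV4_HDR_LEN);
    out[0] = 0x45;                           /* version 4, ihl 5 */
    out[2] = (uint8_t)(total >> 8);          /* total length, big endian */
    out[3] = (uint8_t)(total & 0xFF);
    out[6] = dont_fragment ? 0x40 : 0x00;    /* DF bit */
    out[8] = ttl;
    out[9] = PROTO_IPIP;
    memcpy(out + 12, src, 4);                /* tunnel source */
    memcpy(out + 16, dst, 4);                /* tunnel destination */
    csum = ipv4_checksum(out, IPV4_HDR_LEN);
    out[10] = (uint8_t)(csum >> 8);
    out[11] = (uint8_t)(csum & 0xFF);
    return total;
}
```

A quick sanity check on the result is that running ipv4_checksum over the finished 20-byte header yields 0, which is how a receiver validates it.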

    static int
    ipip_tunnel_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
    {
        int err = 0;
        struct ip_tunnel_parm p;
        struct ip_tunnel *t;
        struct net *net = dev_net(dev);
        struct ipip_net *ipn = net_generic(net, ipip_net_id);

        switch (cmd) {
        case SIOCGETTUNNEL:
            t = NULL;
            if (dev == ipn->fb_tunnel_dev) {
                if (copy_from_user(&p, ifr->ifr_ifru.ifru_data, sizeof(p))) {
                    err = -EFAULT;
                    break;
                }
                t = ipip_tunnel_locate(net, &p, 0);
            }
            if (t == NULL)
                t = netdev_priv(dev);
            memcpy(&p, &t->parms, sizeof(p));
            if (copy_to_user(ifr->ifr_ifru.ifru_data, &p, sizeof(p)))
                err = -EFAULT;
            break;

        case SIOCADDTUNNEL:
        case SIOCCHGTUNNEL:
            err = -EPERM;
            if (!capable(CAP_NET_ADMIN))
                goto done;

            err = -EFAULT;
            if (copy_from_user(&p, ifr->ifr_ifru.ifru_data, sizeof(p)))
                goto done;

            err = -EINVAL;
            if (p.iph.version != 4 || p.iph.protocol != IPPROTO_IPIP ||
                p.iph.ihl != 5 || (p.iph.frag_off & htons(~IP_DF)))
                goto done;
            if (p.iph.ttl)
                p.iph.frag_off |= htons(IP_DF);

            t = ipip_tunnel_locate(net, &p, cmd == SIOCADDTUNNEL);

            if (dev != ipn->fb_tunnel_dev && cmd == SIOCCHGTUNNEL) {
                if (t != NULL) {
                    if (t->dev != dev) {
                        err = -EEXIST;
                        break;
                    }
                } else {
                    if (((dev->flags & IFF_POINTOPOINT) && !p.iph.daddr) ||
                        (!(dev->flags & IFF_POINTOPOINT) && p.iph.daddr)) {
                        err = -EINVAL;
                        break;
                    }
                    /* Re-hash the tunnel under its new endpoints. */
                    t = netdev_priv(dev);
                    ipip_tunnel_unlink(ipn, t);
                    t->parms.iph.saddr = p.iph.saddr;
                    t->parms.iph.daddr = p.iph.daddr;
                    memcpy(dev->dev_addr, &p.iph.saddr, 4);
                    memcpy(dev->broadcast, &p.iph.daddr, 4);
                    ipip_tunnel_link(ipn, t);
                    netdev_state_change(dev);
                }
            }

            if (t) {
                err = 0;
                if (cmd == SIOCCHGTUNNEL) {
                    t->parms.iph.ttl = p.iph.ttl;
                    t->parms.iph.tos = p.iph.tos;
                    t->parms.iph.frag_off = p.iph.frag_off;
                    if (t->parms.link != p.link) {
                        t->parms.link = p.link;
                        ipip_tunnel_bind_dev(dev);
                        netdev_state_change(dev);
                    }
                }
                if (copy_to_user(ifr->ifr_ifru.ifru_data, &t->parms, sizeof(p)))
                    err = -EFAULT;
            } else
                err = (cmd == SIOCADDTUNNEL ? -ENOBUFS : -ENOENT);
            break;

        case SIOCDELTUNNEL:
            err = -EPERM;
            if (!capable(CAP_NET_ADMIN))
                goto done;

            if (dev == ipn->fb_tunnel_dev) {
                err = -EFAULT;
                if (copy_from_user(&p, ifr->ifr_ifru.ifru_data, sizeof(p)))
                    goto done;
                err = -ENOENT;
                if ((t = ipip_tunnel_locate(net, &p, 0)) == NULL)
                    goto done;
                err = -EPERM;
                if (t->dev == ipn->fb_tunnel_dev)
                    goto done;
                dev = t->dev;
            }
            unregister_netdevice(dev);
            err = 0;
            break;

        default:
            err = -EINVAL;
        }

    done:
        return err;
    }

This function is where user space interacts with the module. Put plainly, it is the interface behind Linux commands such as ifconfig and ip: creating tunnels, deleting tunnels, changing tunnel addresses, and so on. This article does not go into the details.

That covers the whole ipip protocol. The principle is simple, isn't it? The ip_gre protocol is similar. GRE was originally developed by Cisco; its Linux implementation lives in ip_gre.c and closely follows the ipip implementation, so it is not covered here. Openswan's IPsec implementation is somewhat more complex: it adds encryption steps, and the key exchange that precedes them deserves careful study. This concludes the source code analysis of the Linux kernel VPN implementation. Stay tuned for follow-up articles.
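As a small illustration of what the add and change commands accept, the sketch below mirrors the parameter checks from the SIOCADDTUNNEL / SIOCCHGTUNNEL branch. It is a userspace model only: struct toy_tunnel_iph and toy_validate_parms are hypothetical, and frag_off is kept in host byte order here, whereas the kernel compares against htons() values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_IP_DF      0x4000  /* "don't fragment" flag, host byte order */
#define TOY_PROTO_IPIP 4       /* value of IPPROTO_IPIP */

/* Hypothetical subset of the header template passed in by user space. */
struct toy_tunnel_iph {
    uint8_t  version;
    uint8_t  ihl;
    uint8_t  protocol;
    uint8_t  ttl;
    uint16_t frag_off;  /* host byte order in this sketch */
};

/* Returns 0 if the template is acceptable, -EINVAL otherwise.
 * As in the kernel code, a fixed TTL forces the DF bit on. */
static int toy_validate_parms(struct toy_tunnel_iph *iph)
{
    assert(iph != NULL);
    if (iph->version != 4 || iph->protocol != TOY_PROTO_IPIP ||
        iph->ihl != 5 || (iph->frag_off & ~TOY_IP_DF))
        return -EINVAL;
    if (iph->ttl)
        iph->frag_off |= TOY_IP_DF;
    return 0;
}
```

Only an IPv4 template with a 20-byte header and protocol 4 passes, and no fragment bits other than DF may be set.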
