Packet transmission is the process of sending packets from the local host to other systems.
A transmission can be initiated by an L4 protocol or by packet forwarding.
In Understanding Linux Network Internals: IPv4 packet reception (forwarding and local delivery), we saw that forwarding ends by calling dst_output, which interacts with the neighboring subsystem and then hands the packet to the device driver. Transmissions that start from an L4 protocol end up on the same path (a call to dst_output). This article discusses how the IPv4 layer handles transmissions initiated by L4 protocols.
The Big Picture
Let's look at the big picture of transmission first, so that we have a rough idea of the overall process.
L4 protocols (such as TCP and UDP) and some special L3 protocols (ICMP, raw IP, and so on) all end up calling dst_output to send the datagram to the driver. Leaving aside transmissions initiated by forwarding, the processing before dst_output falls into the four cases shown in the diagram. Case 1a and case 1b mainly serve UDP, ICMP, and raw IP. They call ip_append_data and ip_append_page (a variant of ip_append_data) respectively, which store the data in a buffer without transmitting it. Only when the buffer needs to be flushed does ip_push_pending_frames indirectly call dst_output to complete the transmission. Case 2 serves TCP and SCTP: ip_queue_xmit processes the packet, and dst_output is then called. In case 3, raw IP and IGMP call dst_output directly.
This classification covers only the general situation; there are some special cases. For example, when TCP needs to send ACK and reset (RST) segments, it uses ip_send_reply, which indirectly calls ip_append_data and ip_push_pending_frames. TCP also calls ip_build_and_send_pkt when it transmits a SYN ACK.

Main tasks of the kernel during transmission
1. Look up the next hop: involves the routing subsystem
2. Initialize the IP header: fill in some fields
3. Process options: set the required options (covered in other blog posts)
4. Fragmentation: when an IP packet is too large, it must be fragmented before transmission
5. Checksum
6. Netfilter checks
7. Update statistics
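The IP header checksum that the kernel fills in during transmission can be illustrated in user space. The following is a minimal sketch of the standard Internet checksum (RFC 1071) applied to an IPv4 header. It is not the kernel's optimized implementation, and the function name ipv4_header_checksum is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071) over an IPv4 header.
 * hdr points to the header bytes in network order; len is the header
 * length in bytes (20 when there are no options). The checksum field
 * (bytes 10 and 11) must be zero when computing the value to store.
 * The result is a plain number: store its high byte in hdr[10] and its
 * low byte in hdr[11]. The name is hypothetical, not a kernel API. */
uint16_t ipv4_header_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    /* Add the header up as big-endian 16-bit words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    if (i < len)                      /* odd trailing byte, if any */
        sum += (uint32_t)hdr[i] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A receiver can run the same function over the header with the checksum field left in place; a result of 0 means the header is intact.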
ip_queue_xmit
ip_queue_xmit is the function used by TCP and SCTP.
/* Used by TCP and SCTP.
 * skb:      packet descriptor
 * ipfragok: flag, mainly used by SCTP, that says whether the packet may be fragmented
 */
int ip_queue_xmit(struct sk_buff *skb, int ipfragok)
{
	struct sock *sk = skb->sk;
	struct inet_sock *inet = inet_sk(sk);	/* socket the packet is sent from */
	struct ip_options_rcu *inet_opt = NULL;
	struct rtable *rt;
	struct iphdr *iph;
	int res;

	/* Skip all of this if the packet is already routed,
	 * f.e. by something like SCTP.
	 */
	rcu_read_lock();
	rt = skb_rtable(skb);
	/* If the buffer already carries valid routing information,
	 * there is no need to look up the routing table. */
	if (rt != NULL)
		goto packet_routed;

	/* Make sure we can route this packet. */
	rt = (struct rtable *)__sk_dst_check(sk, 0);
	inet_opt = rcu_dereference(inet->inet_opt);	/* IP options to apply */
	if (rt == NULL) {
		__be32 daddr;

		/* Use correct destination address if we have options. */
		daddr = inet->daddr;
		if (inet_opt && inet_opt->opt.srr)
			daddr = inet_opt->opt.faddr;

		{
			struct flowi fl = { .oif = sk->sk_bound_dev_if,
					    .mark = sk->sk_mark,
					    .nl_u = { .ip4_u =
						      { .daddr = daddr,
							.saddr = inet->saddr,
							.tos = RT_CONN_FLAGS(sk) } },
					    .proto = sk->sk_protocol,
					    .flags = inet_sk_flowi_flags(sk),
					    .uli_u = { .ports =
						       { .sport = inet->sport,
							 .dport = inet->dport } } };

			/* If this fails, retransmit mechanism of transport layer will
			 * keep trying until route appears or the connection times
			 * itself out.
			 */
			security_sk_classify_flow(sk, &fl);
			if (ip_route_output_flow(sock_net(sk), &rt, &fl, sk, 0))
				goto no_route;
		}
		sk_setup_caps(sk, &rt->u.dst);
	}
	skb_dst_set(skb, dst_clone(&rt->u.dst));

packet_routed:
	if (inet_opt && inet_opt->opt.is_strictroute && rt->rt_dst != rt->rt_gateway)
		goto no_route;

	/* OK, we know where to send it, allocate and build IP header. */
	/* skb->data is moved back so that it points to the IP header (not the payload). */
	skb_push(skb, sizeof(struct iphdr) + (inet_opt ? inet_opt->opt.optlen : 0));
	skb_reset_network_header(skb);

	/* Build the IP header */
	iph = ip_hdr(skb);
	*((__be16 *)iph) = htons((4 << 12) | (5 << 8) | (inet->tos & 0xff));
	if (ip_dont_fragment(sk, &rt->u.dst) && !ipfragok)
		iph->frag_off = htons(IP_DF);
	else
		iph->frag_off = 0;
	iph->ttl      = ip_select_ttl(inet, &rt->u.dst);
	iph->protocol = sk->sk_protocol;
	iph->saddr    = rt->rt_src;
	iph->daddr    = rt->rt_dst;
	/* Transport layer set skb->h.foo itself. */

	if (inet_opt && inet_opt->opt.optlen) {
		iph->ihl += inet_opt->opt.optlen >> 2;
		ip_options_build(skb, &inet_opt->opt, inet->daddr, rt, 0);
	}

	ip_select_ident_more(iph, &rt->u.dst, sk,
			     (skb_shinfo(skb)->gso_segs ?: 1) - 1);

	skb->priority = sk->sk_priority;
	skb->mark = sk->sk_mark;

	res = ip_local_out(skb);
	rcu_read_unlock();
	return res;

no_route:
	rcu_read_unlock();
	IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTNOROUTES);
	kfree_skb(skb);
	return -EHOSTUNREACH;
}
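The ip_queue_xmit listing sets the version, header length, and TOS with a single 16-bit store, and later enlarges the header length when options are present. The user-space sketch below shows the arithmetic behind those two statements. The helper names ipv4_first_word and ipv4_ihl_words are invented for illustration and do not exist in the kernel.

```c
#include <stdint.h>

/* Pack version, header length (in 32-bit words) and TOS into the first
 * 16 bits of an IPv4 header. The value returned here is in host order;
 * the kernel passes the equivalent expression through htons() before
 * storing it. Hypothetical helper, not a kernel function. */
uint16_t ipv4_first_word(unsigned version, unsigned ihl_words, unsigned tos)
{
    return (uint16_t)(((version & 0xf) << 12) |
                      ((ihl_words & 0xf) << 8) |
                      (tos & 0xff));
}

/* Header length in 32-bit words for optlen bytes of IP options.
 * Assumes optlen is a multiple of 4, which is what the
 * "iph->ihl += optlen >> 2" statement in the listing relies on. */
unsigned ipv4_ihl_words(unsigned optlen)
{
    return 5 + (optlen >> 2);   /* 5 words = 20-byte base header */
}
```

With no options and a TOS of 0 this gives the familiar leading bytes 0x45 0x00 of an IPv4 packet; eight bytes of options raise the header length field from 5 to 7.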
ip_push_pending_frames
In cases 1a and 1b of the big picture, we saw that some L4 protocols place data in a buffer through ip_append_data or ip_append_page and then explicitly call ip_push_pending_frames to transmit it. Buffering has two advantages. First, later functions can use the buffered data to build fragments. Second, it is more efficient to accumulate data and transmit it once the buffer is full (reaching the PMTU).
If the L4 layer wants the data to be transmitted immediately, it calls ip_push_pending_frames right after the ip_append_data call that placed the data in the buffer.
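The append-then-push pattern can be sketched with a toy model. The code below is a deliberately simplified user-space analogy: the type pending_queue and the functions pending_append and pending_push are hypothetical names for this example, and the real kernel functions work on lists of sk_buff structures rather than a flat byte array.

```c
#include <stddef.h>
#include <string.h>

#define PENDING_MAX 2048   /* hypothetical buffer size for this sketch */

/* Hypothetical stand-in for a socket's queue of not-yet-sent data. */
struct pending_queue {
    unsigned char data[PENDING_MAX];
    size_t len;            /* bytes currently queued */
    unsigned long sent;    /* number of transmissions triggered so far */
    size_t last_sent;      /* size of the most recent transmission */
};

/* Queue more payload without transmitting it (the role that
 * ip_append_data plays). Returns 0 on success, -1 if it does not fit. */
int pending_append(struct pending_queue *q, const void *buf, size_t n)
{
    if (n > PENDING_MAX - q->len)
        return -1;
    memcpy(q->data + q->len, buf, n);
    q->len += n;
    return 0;
}

/* Flush everything queued as one transmission (the role that
 * ip_push_pending_frames plays). Returns the number of bytes flushed. */
size_t pending_push(struct pending_queue *q)
{
    size_t n = q->len;

    if (n == 0)
        return 0;          /* nothing pending, nothing to send */
    q->sent++;
    q->last_sent = n;
    q->len = 0;
    return n;
}
```

Several pending_append calls followed by one pending_push model the buffered case, while calling pending_push right after each pending_append models the immediate transmission described above.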
ip_append_data
ip_append_data has the following main tasks:
1. Organize the buffers. The L4 data is organized into buffers in a way that makes later fragmentation easier. It also makes it easier for L2 and L3 to add their headers later.
2. Optimize memory allocation. This takes into account information from the upper-layer protocol as well as the capabilities of the egress device.
3. Handle the L4 checksum.
I have not fully understood this part of ip_append_data yet and have not had time to study it closely recently. I will update this section when I have time; for now this is just a placeholder.
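To make the link between buffer organization and fragmentation more concrete, here is a rough sketch of how many IP packets a given amount of L4 data needs for a given MTU. It assumes every fragment carries an IP header of the same length and follows the IPv4 rule that every fragment except the last carries a multiple of 8 bytes. The function name frag_count is hypothetical, and this is not the kernel's fragmentation code.

```c
#include <stddef.h>

/* Number of IP packets needed to carry payload_len bytes of L4 data
 * (L4 header included) when each packet is limited to mtu bytes and
 * has an IP header of ip_hdr_len bytes.
 * Returns 0 if there is nothing to send or the MTU leaves no room for
 * even one 8-byte unit. Hypothetical helper for illustration only. */
size_t frag_count(size_t payload_len, size_t mtu, size_t ip_hdr_len)
{
    size_t room, left = payload_len, count = 0;

    if (mtu < ip_hdr_len + 8)
        return 0;
    room = mtu - ip_hdr_len;           /* payload bytes per packet */

    while (left > 0) {
        size_t len = left < room ? left : room;

        if (len < left)                /* not the last fragment */
            len &= ~(size_t)7;         /* offsets count 8-byte units */
        left -= len;
        count++;
    }
    return count;
}
```

For example, with an MTU of 1500 and a 20-byte header, 1480 bytes of data fit in one packet, 1481 bytes need two, and 4000 bytes need three.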
The transmission of an IPv4 packet always ends with a call to dst_output, which indirectly calls ip_finish_output2 to interact with the neighboring subsystem. Finally, dev_queue_xmit is called to pass the datagram to the device driver.
Understanding Linux Network Internals: transmission of IPv4 packets