TCP: Transmission Control Protocol (1)

Source: Internet
Author: User

For more information about relevant protocols, see TCP/IP study notes (8) TCP transmission control protocol

The management of TCP transmission control blocks, socket option handling, ioctl, error handling, and buffer management involve the following files:

include/linux/tcp.h defines the TCP segment format, the TCP transmission control block structure, and related macros and function prototypes

include/net/sock.h defines the basic transmission control block structure, macros, and function prototypes

include/net/inet_connection_sock.h defines connection request blocks and other related interfaces, macros, and functions

include/net/inet_hashtables.h defines the hash tables used to manage transmission control blocks

net/ipv4/af_inet.c implements the interfaces between the network layer and the transport layer

net/ipv4/tcp_ipv4.c implements the interface between the transmission control block and the network layer

net/ipv4/tcp.c implements the interface between the transmission control block and the application layer

net/core/stream.c implements TCP stream memory management

TCP transmission control block

TCP transmission control blocks play a core role in the entire TCP process, including establishing connections, data transmission, congestion control, and connection termination. During the entire TCP connection process, the following three types of TCP transmission control blocks are used in sequence:

1. The first type is tcp_request_sock. It is used while the connection is being established and is short-lived.

2. The second type is tcp_sock, which is used once the connection has been established, while the TCP state is ESTABLISHED. This transmission control block has the longest life cycle, and it controls both the sending and the receiving of segments.

3. The third type is tcp_timewait_sock, which is used while the connection is being terminated and is also short-lived.

inet_connection_sock_af_ops

It encapsulates a set of operations related to the transport layer, including the interfaces for sending to the network layer and the setsockopt interface of the transport layer. The instance used by TCP is ipv4_specific.
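The idea of an operations table can be sketched in user-space C as a struct of function pointers that a connection reaches through one pointer. This is only an illustration: every demo_* name, the DEMO_OPT_TTL option, and the 20-byte header length are assumptions made for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

struct demo_conn;

/* Hypothetical, much reduced stand-in for a per-address-family ops table. */
struct demo_af_ops {
    int (*queue_xmit)(struct demo_conn *c, const void *seg, size_t len);
    int (*setsockopt)(struct demo_conn *c, int optname, int value);
    unsigned short net_header_len;   /* bytes the network layer adds */
};

struct demo_conn {
    const struct demo_af_ops *af_ops; /* selected once, at creation time */
    size_t bytes_sent;
    int ttl;
};

#define DEMO_OPT_TTL 1 /* hypothetical option number */

/* "Send" a segment: here we only account for the bytes handed down. */
static int demo_v4_queue_xmit(struct demo_conn *c, const void *seg, size_t len)
{
    (void)seg;
    c->bytes_sent += len + c->af_ops->net_header_len;
    return 0;
}

/* Accept only the one option this sketch knows about. */
static int demo_v4_setsockopt(struct demo_conn *c, int optname, int value)
{
    if (optname != DEMO_OPT_TTL || value < 1 || value > 255)
        return -1;
    c->ttl = value;
    return 0;
}

/* The IPv4 instance of the table, playing the role ipv4_specific plays. */
static const struct demo_af_ops demo_ipv4_specific = {
    .queue_xmit     = demo_v4_queue_xmit,
    .setsockopt     = demo_v4_setsockopt,
    .net_header_len = 20,
};

/* Transport-layer code never names the IPv4 functions directly. */
static int demo_send(struct demo_conn *c, const void *seg, size_t len)
{
    assert(c != NULL && c->af_ops != NULL);
    return c->af_ops->queue_xmit(c, seg, len);
}
```

Because callers go through af_ops only, a second table for another address family could be swapped in without touching demo_send().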

tcp_options_received

It is mainly used to save the TCP options received from the peer, such as the timestamp and SACK options. It also records which features the peer supports, for example whether the peer supports window scaling and SACK.
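How received options end up in such a structure can be sketched with a small user-space parser. The option kinds and lengths used (MSS 2/4, window scale 3/3, SACK-permitted 4/2, timestamps 8/10) are the standard TCP values; the structure and function names are hypothetical and far simpler than the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced record of what the peer announced. */
struct demo_options_received {
    unsigned saw_tstamp : 1;  /* timestamp option present in this segment */
    unsigned sack_ok    : 1;  /* peer sent SACK-permitted */
    unsigned wscale_ok  : 1;  /* peer sent a window scale option */
    uint8_t  snd_wscale;      /* shift count announced by the peer */
    uint16_t mss;             /* MSS announced by the peer */
    uint32_t rcv_tsval;
    uint32_t rcv_tsecr;
};

static uint32_t demo_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Walk the option bytes; return 0 on success, -1 if they are malformed. */
static int demo_parse_options(const uint8_t *opt, size_t len,
                              struct demo_options_received *rx)
{
    size_t i = 0;

    memset(rx, 0, sizeof(*rx));
    while (i < len) {
        uint8_t kind = opt[i];
        uint8_t olen;

        if (kind == 0)                 /* end of option list */
            return 0;
        if (kind == 1) {               /* no-operation padding */
            i++;
            continue;
        }
        if (i + 1 >= len)
            return -1;                 /* length byte is missing */
        olen = opt[i + 1];
        if (olen < 2 || i + olen > len)
            return -1;                 /* option overruns the buffer */

        switch (kind) {
        case 2:                        /* maximum segment size */
            if (olen == 4)
                rx->mss = (uint16_t)((opt[i + 2] << 8) | opt[i + 3]);
            break;
        case 3:                        /* window scale, capped at 14 */
            if (olen == 3) {
                rx->wscale_ok = 1;
                rx->snd_wscale = opt[i + 2] > 14 ? 14 : opt[i + 2];
            }
            break;
        case 4:                        /* SACK permitted */
            if (olen == 2)
                rx->sack_ok = 1;
            break;
        case 8:                        /* timestamps */
            if (olen == 10) {
                rx->saw_tstamp = 1;
                rx->rcv_tsval = demo_be32(&opt[i + 2]);
                rx->rcv_tsecr = demo_be32(&opt[i + 6]);
            }
            break;
        default:                       /* unknown options are skipped */
            break;
        }
        i += olen;
    }
    return 0;
}
```

Unknown option kinds are skipped by their length byte, so a segment carrying an option this sketch does not know is still parsed for the ones it does.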

tcp_skb_cb

The TCP layer keeps a private control block inside the SKB, namely the cb member of the sk_buff structure. TCP uses this field to store a tcp_skb_cb structure, and code in the TCP layer accesses it through the TCP_SKB_CB macro, which makes the code easier to read. The private control block is generally filled in before the layer processes a received segment or sends one. For example, tcp_v4_rcv() is the receive entry function of the TCP layer: once a TCP segment has been received and the necessary checks have been performed on it, the tcp_skb_cb of the segment is set. On the send path, it is mostly set when a TCP segment is generated or when a TCP segment is split.
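The private control block pattern can be sketched in user-space C as follows. The names demo_buff, demo_tcp_cb, and the helper functions are hypothetical. The kernel's macro casts the cb array directly (the kernel is built with -fno-strict-aliasing); this portable sketch copies in and out with memcpy instead, and reuses the same kind of compile-time size check that BUILD_BUG_ON performs in tcp_init().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CB_SIZE 48 /* assumption for the sketch: size of the scratch area */

/* Generic buffer: the cb area is opaque, each layer interprets it privately. */
struct demo_buff {
    char cb[DEMO_CB_SIZE];
    unsigned int len;
};

/* What a TCP-like layer might keep in the scratch area (hypothetical subset). */
struct demo_tcp_cb {
    uint32_t seq;      /* first sequence number of the segment */
    uint32_t end_seq;  /* seq + SYN + FIN + payload length */
    uint8_t  flags;    /* bit 0: SYN, bit 1: FIN */
};

/* Compile-time guard: the private block must fit into the scratch area. */
_Static_assert(sizeof(struct demo_tcp_cb) <= DEMO_CB_SIZE,
               "demo_tcp_cb does not fit into cb[]");

/* Fill the control block when a segment enters the layer. */
static void demo_tcp_fill_cb(struct demo_buff *b, uint32_t seq,
                             int syn, int fin, uint32_t payload_len)
{
    struct demo_tcp_cb cb;

    memset(&cb, 0, sizeof(cb));
    cb.seq = seq;
    /* SYN and FIN each consume one sequence number; unsigned wrap is intended. */
    cb.end_seq = seq + (syn ? 1u : 0u) + (fin ? 1u : 0u) + payload_len;
    cb.flags = (uint8_t)((syn ? 1u : 0u) | (fin ? 2u : 0u));
    memcpy(b->cb, &cb, sizeof(cb));
}

/* Read the control block back, the role the access macro plays in the kernel. */
static struct demo_tcp_cb demo_tcp_cb_get(const struct demo_buff *b)
{
    struct demo_tcp_cb cb;

    memcpy(&cb, b->cb, sizeof(cb));
    return cb;
}
```

Later processing steps can then call demo_tcp_cb_get() instead of parsing the header again, which is the benefit the paragraph above describes.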

TCP Initialization

The initialization function tcp_init() of the transport layer TCP module is called by inet_init(), the initialization function of the IPv4 protocol family.

void __init tcp_init(void)
{
    struct sk_buff *skb = NULL;
    unsigned long nr_pages, limit;
    int i, max_share, cnt;

    BUILD_BUG_ON(sizeof(struct tcp_skb_cb) > sizeof(skb->cb));

    percpu_counter_init(&tcp_sockets_allocated, 0);
    percpu_counter_init(&tcp_orphan_count, 0);
    tcp_hashinfo.bind_bucket_cachep =
        kmem_cache_create("tcp_bind_bucket",
                          sizeof(struct inet_bind_bucket), 0,
                          SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);

    /* Size and allocate the main established and bind bucket
     * hash tables.
     *
     * The methodology is similar to that of the buffer cache.
     */
    tcp_hashinfo.ehash =
        alloc_large_system_hash("TCP established",
                                sizeof(struct inet_ehash_bucket),
                                thash_entries,
                                (totalram_pages >= 128 * 1024) ?
                                13 : 15,
                                0,
                                &tcp_hashinfo.ehash_size,
                                NULL,
                                thash_entries ? 0 : 512 * 1024);
    tcp_hashinfo.ehash_size = 1 << tcp_hashinfo.ehash_size;
    for (i = 0; i < tcp_hashinfo.ehash_size; i++) {
        INIT_HLIST_NULLS_HEAD(&tcp_hashinfo.ehash[i].chain, i);
        INIT_HLIST_NULLS_HEAD(&tcp_hashinfo.ehash[i].twchain, i);
    }
    if (inet_ehash_locks_alloc(&tcp_hashinfo))
        panic("TCP: failed to alloc ehash_locks");
    tcp_hashinfo.bhash =
        alloc_large_system_hash("TCP bind",
                                sizeof(struct inet_bind_hashbucket),
                                tcp_hashinfo.ehash_size,
                                (totalram_pages >= 128 * 1024) ?
                                13 : 15,
                                0,
                                &tcp_hashinfo.bhash_size,
                                NULL,
                                64 * 1024);
    tcp_hashinfo.bhash_size = 1 << tcp_hashinfo.bhash_size;
    for (i = 0; i < tcp_hashinfo.bhash_size; i++) {
        spin_lock_init(&tcp_hashinfo.bhash[i].lock);
        INIT_HLIST_HEAD(&tcp_hashinfo.bhash[i].chain);
    }

    cnt = tcp_hashinfo.ehash_size;

    tcp_death_row.sysctl_max_tw_buckets = cnt / 2;
    sysctl_tcp_max_orphans = cnt / 2;
    sysctl_max_syn_backlog = max(128, cnt / 256);

    /* Set the pressure threshold to be a fraction of global memory that
     * is up to 1/2 at 256 MB, decreasing toward zero with the amount of
     * memory, with a floor of 128 pages, and a ceiling that prevents an
     * integer overflow.
     */
    nr_pages = totalram_pages - totalhigh_pages;
    limit = min(nr_pages, 1UL<<(28-PAGE_SHIFT)) >> (20-PAGE_SHIFT);
    limit = (limit * (nr_pages >> (20-PAGE_SHIFT))) >> (PAGE_SHIFT-11);
    limit = max(limit, 128UL);
    limit = min(limit, INT_MAX * 4UL / 3 / 2);
    sysctl_tcp_mem[0] = limit / 4 * 3;
    sysctl_tcp_mem[1] = limit;
    sysctl_tcp_mem[2] = sysctl_tcp_mem[0] * 2;

    /* Set per-socket limits to no more than 1/128 the pressure threshold */
    limit = ((unsigned long)sysctl_tcp_mem[1]) << (PAGE_SHIFT - 7);
    max_share = min(4UL*1024*1024, limit);

    sysctl_tcp_wmem[0] = SK_MEM_QUANTUM;
    sysctl_tcp_wmem[1] = 16*1024;
    sysctl_tcp_wmem[2] = max(64*1024, max_share);

    sysctl_tcp_rmem[0] = SK_MEM_QUANTUM;
    sysctl_tcp_rmem[1] = 87380;
    sysctl_tcp_rmem[2] = max(87380, max_share);

    printk(KERN_INFO "TCP: Hash tables configured "
           "(established %d bind %d)\n",
           tcp_hashinfo.ehash_size, tcp_hashinfo.bhash_size);

    tcp_register_congestion_control(&tcp_reno);
}
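The memory limit arithmetic in tcp_init() is easier to follow with concrete numbers. The sketch below repeats the same steps in user-space C for a given number of low-memory pages. It assumes 4 KiB pages (a page shift of 12), and the demo_* names are hypothetical; the formulas are copied from the function above.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

struct demo_tcp_limits {
    uint64_t mem[3];    /* low, pressure, and high thresholds, in pages */
    uint64_t wmem_max;  /* per-socket send buffer ceiling, in bytes */
    uint64_t rmem_max;  /* per-socket receive buffer ceiling, in bytes */
};

static uint64_t demo_min(uint64_t a, uint64_t b) { return a < b ? a : b; }
static uint64_t demo_max(uint64_t a, uint64_t b) { return a > b ? a : b; }

/* nr_pages corresponds to totalram_pages - totalhigh_pages in tcp_init(). */
static struct demo_tcp_limits demo_compute_limits(uint64_t nr_pages)
{
    struct demo_tcp_limits out;
    uint64_t limit, max_share;

    /* Pressure threshold: grows with memory, floor of 128 pages. */
    limit = demo_min(nr_pages, UINT64_C(1) << (28 - DEMO_PAGE_SHIFT))
            >> (20 - DEMO_PAGE_SHIFT);
    limit = (limit * (nr_pages >> (20 - DEMO_PAGE_SHIFT)))
            >> (DEMO_PAGE_SHIFT - 11);
    limit = demo_max(limit, 128);
    limit = demo_min(limit, (uint64_t)INT_MAX * 4 / 3 / 2);

    out.mem[0] = limit / 4 * 3;
    out.mem[1] = limit;
    out.mem[2] = out.mem[0] * 2;

    /* Per-socket ceiling: at most 1/128 of the pressure threshold, in bytes. */
    limit = out.mem[1] << (DEMO_PAGE_SHIFT - 7);
    max_share = demo_min(UINT64_C(4) * 1024 * 1024, limit);

    out.wmem_max = demo_max(64 * 1024, max_share);
    out.rmem_max = demo_max(87380, max_share);
    return out;
}
```

With these assumptions, 1 GiB of low memory (262144 pages) yields thresholds of 98304, 131072, and 196608 pages and a 4 MiB per-socket ceiling, while 64 MiB (16384 pages) yields 1536, 2048, and 3072 pages with ceilings of 65536 and 87380 bytes.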

Management of TCP transmission control blocks

After a TCP transmission control block has been created successfully, it needs to be managed properly. TCP has many states, and some of them last only briefly: as the two ends of the connection interact, such a state soon changes to another one. The LISTEN and ESTABLISHED states, in contrast, are long-lived. To manage and look up transmission control blocks in different states efficiently, TCP stores them in several hash tables according to their state.

inet_hashinfo

struct inet_hashinfo {
    /* This is for sockets with full identity only.  Sockets here will
     * always be without wildcards and will have the following invariant:
     *
     *          TCP_ESTABLISHED <= sk->sk_state < TCP_CLOSE
     *
     * TIME_WAIT sockets use a separate chain (twchain).
     */
    struct inet_ehash_bucket    *ehash;
    spinlock_t                  *ehash_locks;
    unsigned int                ehash_size;
    unsigned int                ehash_locks_mask;

    /* Ok, let's try this, I give up, we do need a local binding
     * TCP hash as well as the others for fast bind/connect.
     */
    struct inet_bind_hashbucket *bhash;

    unsigned int                bhash_size;
    /* 4 bytes hole on 64 bit */

    struct kmem_cache           *bind_bucket_cachep;

    /* All the above members are written once at bootup and
     * never written again _or_ are predominantly read-access.
     *
     * Now align to a new cache line as all the following members
     * might be often dirty.
     */
    /* All sockets in TCP_LISTEN state will be in here.  This is the only
     * table where wildcard'd TCP sockets can exist.  Hash function here
     * is just local port number.
     */
    struct inet_listen_hashbucket listening_hash[INET_LHTABLE_SIZE]
                    ____cacheline_aligned_in_smp;

    atomic_t                    bsockets;
};

struct inet_ehash_bucket *ehash;

unsigned int ehash_size;

ehash points to a hash table of inet_ehash_bucket structures with ehash_size buckets. It manages the transmission control blocks in every TCP state except LISTEN.

struct inet_ehash_bucket {
    struct hlist_nulls_head chain;
    struct hlist_nulls_head twchain;
};

chain and twchain are the list heads that link transmission control blocks into a bucket; TIME_WAIT sockets are linked on twchain.

struct inet_bind_hashbucket *bhash;

unsigned int bhash_size;

The bhash hash table, with bhash_size buckets, is mainly used to store information about bound ports.

struct inet_bind_hashbucket {
    spinlock_t lock;
    struct hlist_head chain;
};

chain is the list head that links the port binding information blocks in the bucket, and lock protects that list.

struct inet_listen_hashbucket listening_hash[INET_LHTABLE_SIZE];

A hash table used to manage the transmission control blocks in the LISTEN state.

After a transmission control block is created, the hash interface of the transport interface layer is called to add it to the ehash hash table, where it stays until the transmission control block is released. In TCP, the function that implements the hash interface is tcp_v4_hash().

When a transmission control block is no longer needed, the unhash interface of the transport interface layer is called to remove it from the ehash hash table. In TCP, the function that implements the unhash interface is tcp_unhash().

After the listen system call is made, the socket enters the LISTEN state, and __inet_hash() is called to add its transmission control block to the listening_hash hash table, so that sockets in the LISTEN state can be found quickly.
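The division of work between the two tables can be sketched with a small user-space model: connected sockets are hashed by their full 4-tuple, listening sockets by local port only, and a lookup tries the exact match first before falling back to a listener. The table sizes, the mixing function, and all demo_* names are assumptions made for the illustration; locking and TIME_WAIT handling are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_EHASH_SIZE 16 /* hypothetical; must be a power of two */
#define DEMO_LHASH_SIZE 32 /* hypothetical; must be a power of two */

enum demo_state { DEMO_ESTABLISHED, DEMO_LISTEN };

struct demo_sock {
    uint32_t laddr, raddr;   /* laddr 0 means "any local address" */
    uint16_t lport, rport;
    enum demo_state state;
    struct demo_sock *next;  /* singly linked bucket chain */
};

struct demo_hashinfo {
    struct demo_sock *ehash[DEMO_EHASH_SIZE];
    struct demo_sock *listening_hash[DEMO_LHASH_SIZE];
};

static void demo_hashinfo_init(struct demo_hashinfo *hi)
{
    memset(hi, 0, sizeof(*hi));
}

/* Toy mixing function over the full 4-tuple (not the kernel's hash). */
static unsigned int demo_ehashfn(uint32_t laddr, uint16_t lport,
                                 uint32_t raddr, uint16_t rport)
{
    uint32_t h = laddr ^ raddr ^ (((uint32_t)lport << 16) | rport);

    h ^= h >> 16;
    h ^= h >> 8;
    return h & (DEMO_EHASH_SIZE - 1);
}

/* Pick the bucket by state: listeners hash on the local port only. */
static struct demo_sock **demo_bucket(struct demo_hashinfo *hi,
                                      const struct demo_sock *sk)
{
    if (sk->state == DEMO_LISTEN)
        return &hi->listening_hash[sk->lport & (DEMO_LHASH_SIZE - 1)];
    return &hi->ehash[demo_ehashfn(sk->laddr, sk->lport,
                                   sk->raddr, sk->rport)];
}

static void demo_hash(struct demo_hashinfo *hi, struct demo_sock *sk)
{
    struct demo_sock **head = demo_bucket(hi, sk);

    sk->next = *head;
    *head = sk;
}

static void demo_unhash(struct demo_hashinfo *hi, struct demo_sock *sk)
{
    struct demo_sock **pp;

    for (pp = demo_bucket(hi, sk); *pp != NULL; pp = &(*pp)->next) {
        if (*pp == sk) {
            *pp = sk->next;
            sk->next = NULL;
            return;
        }
    }
}

/* Exact 4-tuple match first, then a listener bound to the local port. */
static struct demo_sock *demo_lookup(const struct demo_hashinfo *hi,
                                     uint32_t laddr, uint16_t lport,
                                     uint32_t raddr, uint16_t rport)
{
    struct demo_sock *sk;

    for (sk = hi->ehash[demo_ehashfn(laddr, lport, raddr, rport)];
         sk != NULL; sk = sk->next)
        if (sk->laddr == laddr && sk->lport == lport &&
            sk->raddr == raddr && sk->rport == rport)
            return sk;

    for (sk = hi->listening_hash[lport & (DEMO_LHASH_SIZE - 1)];
         sk != NULL; sk = sk->next)
        if (sk->lport == lport && (sk->laddr == 0 || sk->laddr == laddr))
            return sk;

    return NULL;
}
```

A segment for an existing connection therefore reaches its own socket, while a segment from an unknown peer reaches the listener on that port, or nothing if no socket is listening there.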
