TCP Connection Tracking in the Linux 2.6.1x Kernel

Copyleft: this document belongs to yfydz. It may be freely copied and reproduced under the GPL. Any commercial use is strictly prohibited.
MSN: yfydz_no1@hotmail.com

Source: http://yfydz.cublog.cn

1. Preface

In Linux kernels from 2.6.1x onward, TCP connection tracking was substantially reworked. A check of TCP flag combinations was added, packets are now validated against their sequence number, acknowledgment number, and window value, and the SACK option is supported. The state transition table was also revised and improved, and the amount of code grew considerably. The 2.6 kernel code discussed below is from version 2.6.17.11.

2. Validating packets by acknowledgment number, sequence number, and window

This idea was proposed early on, first in "Real Stateful TCP Packet Filtering in IP Filter" (http://www.nluug.nl/events/sane2000/papers.html), and was used in IP Filter, the firewall of FreeBSD, OpenBSD, NetBSD, and other operating systems.

Principle: the three-way handshake at the start of a TCP connection exchanges the MSS and other information. The window field tells the peer how large the local receive buffer is, and the peer may not send more data at one time than that size. In other words, the sequence number of one side cannot advance by more than the window advertised by the other side, and the acknowledgment number likewise cannot advance by more than the window advertised by the other side. A normal TCP connection satisfies this condition; a packet that does not satisfy it is invalid.

Two TCP options need attention when this check is used. The first is the SACK (selective acknowledgment) option, RFC 2018 and RFC 2883, which lets the sender retransmit only the lost segments after a loss instead of resending everything. The second is the window scale option, RFC 1323, which extends the window value from 16 bits to 30 bits.

A new data structure describes this state:

/* include/linux/netfilter/nf_conntrack_tcp.h */
struct ip_ct_tcp_state {
    u_int32_t td_end;     /* max of seq + len */
    u_int32_t td_maxend;  /* max of ack + max(win, 1) */
    u_int32_t td_maxwin;  /* max(win) */
    u_int8_t  td_scale;   /* window scale factor */
    u_int8_t  loose;      /* used when connection picked up from the middle */
    u_int8_t  flags;      /* per direction options */
};

The function that decides whether the sequence number and acknowledgment number of a TCP packet fall within the allowed window is tcp_in_window:

static int tcp_in_window(struct ip_ct_tcp *state,
                         enum ip_conntrack_dir dir,
                         unsigned int index,
                         const struct sk_buff *skb,
                         struct iphdr *iph,
                         struct tcphdr *tcph)
{
    struct ip_ct_tcp_state *sender = &state->seen[dir];
    struct ip_ct_tcp_state *receiver = &state->seen[!dir];
    __u32 seq, ack, sack, end, win, swin;
    int res;

    // The first SYN packet sent by the client does not reach this function; it is accepted directly.
    // Only the second and later packets of a connection are processed here.
    /*
     * Get the required data from the packet.
     */
    // Sequence number
    seq = ntohl(tcph->seq);
    // Acknowledgment number
    ack = sack = ntohl(tcph->ack_seq);
    // Advertised window
    win = ntohs(tcph->window);
    // Sequence number at the end of this segment
    end = segment_seq_plus_len(seq, skb->len, iph, tcph);
    // If the receiver permits SACK, look for a SACK option in the TCP options.
    if (receiver->flags & IP_CT_TCP_FLAG_SACK_PERM)
        tcp_sack(skb, iph, tcph, &sack);

    // The ellipses (...) stand for debug output, which is omitted here.
    ...
    if (sender->td_end == 0) {
        // Initial state: nothing has been recorded for this direction yet
        /*
         * Initialize sender data.
         */
        if (tcph->syn && tcph->ack) {
            // Sent by the server
            /*
             * Outgoing SYN-ACK in reply to a SYN.
             */
            sender->td_end =
            sender->td_maxend = end;
            sender->td_maxwin = (win == 0 ? 1 : win);
            // Parse the TCP options for SACK and window scaling support
            tcp_options(skb, iph, tcph, sender);
            /*
             * RFC 1323:
             * Both sides must send the Window Scale option
             * to enable window scaling in either direction.
             */
            if (!(sender->flags & IP_CT_TCP_FLAG_WINDOW_SCALE
                  && receiver->flags & IP_CT_TCP_FLAG_WINDOW_SCALE))
                // Window scaling is not in effect
                sender->td_scale =
                receiver->td_scale = 0;
        } else {
            /*
             * We are in the middle of a connection,
             * its history is lost for us.
             * Let's try to use the data from the packet.
             */
            sender->td_end = end;
            sender->td_maxwin = (win == 0 ? 1 : win);
            sender->td_maxend = end + sender->td_maxwin;
        }
    } else if (((state->state == TCP_CONNTRACK_SYN_SENT
                 && dir == IP_CT_DIR_ORIGINAL)
                || (state->state == TCP_CONNTRACK_SYN_RECV
                    && dir == IP_CT_DIR_REPLY))
               && after(end, sender->td_end)) {
        // The sender is resending its opening packet
        /*
         * RFC 793: "if a TCP is reinitialized ... then it need
         * not wait at all; it must only be sure to use sequence
         * numbers larger than those recently used."
         */
        sender->td_end =
        sender->td_maxend = end;
        sender->td_maxwin = (win == 0 ? 1 : win);

        tcp_options(skb, iph, tcph, sender);
    }
 
    // For packets without ACK, and for RST packets with a zero ack value,
    // use the receiver's end sequence number as the acknowledgment number
    if (!(tcph->ack)) {
        /*
         * If there is no ACK, just pretend it was set and OK.
         */
        ack = sack = receiver->td_end;
    } else if (((tcp_flag_word(tcph) & (TCP_FLAG_ACK|TCP_FLAG_RST)) ==
                (TCP_FLAG_ACK|TCP_FLAG_RST))
               && (ack == 0)) {
        /*
         * Broken TCP stacks, that set ACK in RST packets as well
         * with zero ack value.
         */
        ack = sack = receiver->td_end;
    }

    // Packet carries no data, or is the opening packet
    if (seq == end
        && (!tcph->rst
            || (seq == 0 && state->state == TCP_CONNTRACK_SYN_SENT)))
        /*
         * Packets contains no data: we assume it is valid
         * and check the ack value only.
         * However RST segments are always validated by their
         * SEQ number, except when seq == 0 (reset sent answering
         * SYN).
         */
        seq = end = sender->td_end;

    ...
    // Check whether the sequence and acknowledgment numbers are within the valid range
    if (sender->loose || receiver->loose ||
        (before(seq, sender->td_maxend + 1) &&
         after(end, sender->td_end - receiver->td_maxwin - 1) &&
         before(sack, receiver->td_end + 1) &&
         after(ack, receiver->td_end - MAXACKWINDOW(sender)))) {
        // Valid packet
        /*
         * Take into account window scaling (RFC 1323).
         */
        // Apply the window scale factor
        if (!tcph->syn)
            win <<= sender->td_scale;

        /*
         * Update sender data.
         */
        // Adjust the sender's window
        swin = win + (sack - ack);
        if (sender->td_maxwin < swin)
            sender->td_maxwin = swin;
        if (after(end, sender->td_end))
            sender->td_end = end;
        /*
         * Update receiver data.
         */
        // Adjust the receiver's parameters
        if (after(end, sender->td_maxend))
            receiver->td_maxwin += end - sender->td_maxend;
        if (after(sack + win, receiver->td_maxend - 1)) {
            receiver->td_maxend = sack + win;
            if (win == 0)
                receiver->td_maxend++;
        }

        /*
         * Check retransmissions.
         */
        // Determine whether this is a retransmitted packet
        if (index == TCP_ACK_SET) {
            if (state->last_dir == dir
                && state->last_seq == seq
                && state->last_ack == ack
                && state->last_end == end)
                state->retrans++;
            else {
                state->last_dir = dir;
                state->last_seq = seq;
                state->last_ack = ack;
                state->last_end = end;
                state->retrans = 0;
            }
        }
        /*
         * Close the window of disabled window tracking
         */
        if (sender->loose)
            sender->loose--;

        res = 1;
    } else {
        ...
        // Default policy for out-of-window packets: 0 rejects the packet, a nonzero value accepts it.
        // The policy can be set through the /proc file system.
        res = ip_ct_tcp_be_liberal;
    }
    ...
    return res;
}

3. TCP state transition table

This is the new transition table in 2.6.1x:

static const enum tcp_conntrack tcp_conntracks[2][6][TCP_CONNTRACK_MAX] = {
    {
/* ORIGINAL */
/*            sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sLI */
/* syn */    {sSS, sSS, sIG, sIG, sIG, sIG, sIG, sSS, sSS, sIV},
/* synack */ {sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV},
/* fin */    {sIV, sIV, sFW, sFW, sLA, sLA, sLA, sTW, sCL, sIV},
/* ack */    {sES, sIV, sES, sES, sCW, sCW, sTW, sTW, sCL, sIV},
/* rst */    {sIV, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sIV},
/* none */   {sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV}
    },
    {
/* REPLY */
/*            sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sLI */
/* syn */    {sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV},
/* synack */ {sIV, sSR, sSR, sIG, sIG, sIG, sIG, sIG, sIG, sIV},
/* fin */    {sIV, sIV, sFW, sFW, sLA, sLA, sLA, sTW, sCL, sIV},
/* ack */    {sIV, sIG, sSR, sES, sCW, sCW, sTW, sTW, sCL, sIV},
/* rst */    {sIV, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sIV},
/* none */   {sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV}
    }
};

This is the transition table from the earlier 2.4.26 kernel:

static enum tcp_conntrack tcp_conntracks[2][5][TCP_CONNTRACK_MAX] = {
    {
/* ORIGINAL */
/*          sNO, sES, sSS, sSR, sFW, sTW, sCL, sCW, sLA, sLI */
/* syn */  {sSS, sES, sSS, sSR, sSS, sSS, sSS, sSS, sSS, sLI},
/* fin */  {sTW, sFW, sSS, sTW, sFW, sTW, sCL, sTW, sLA, sLI},
/* ack */  {sES, sES, sSS, sES, sFW, sTW, sCL, sCW, sLA, sES},
/* rst */  {sCL, sCL, sSS, sCL, sCL, sTW, sCL, sCL, sCL, sCL},
/* none */ {sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV}
    },
    {
/* REPLY */
/*          sNO, sES, sSS, sSR, sFW, sTW, sCL, sCW, sLA, sLI */
/* syn */  {sSR, sES, sSR, sSR, sSR, sSR, sSR, sSR, sSR, sSR},
/* fin */  {sCL, sCW, sSS, sTW, sTW, sTW, sCL, sCW, sLA, sLI},
/* ack */  {sCL, sES, sSS, sSR, sFW, sTW, sCL, sCW, sCL, sLI},
/* rst */  {sCL, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sLA, sLI},
/* none */ {sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV}
    }
};

How to read this array was covered in earlier articles and is not repeated here. Comparing the two tables shows that a separate row for SYN-ACK packets has been added and that many more entries are now sIV (invalid state), which makes state tracking stricter. The drawback is that ACK packets are still treated too leniently, so an ACK scan cannot be prevented.

4. Valid TCP flag combinations

The valid combinations of TCP flags are defined by the following array. Each index of the array is one possible flag combination. Valid combinations are set to 1; every combination that is not listed keeps the value 0 and is invalid.

static const u8 tcp_valid_flags[(TH_FIN|TH_SYN|TH_RST|TH_PUSH|TH_ACK|TH_URG) + 1] =
{
    [TH_SYN]                        = 1,
    [TH_SYN|TH_ACK]                 = 1,
    [TH_SYN|TH_PUSH]                = 1,
    [TH_SYN|TH_ACK|TH_PUSH]         = 1,
    [TH_RST]                        = 1,
    [TH_RST|TH_ACK]                 = 1,
    [TH_RST|TH_ACK|TH_PUSH]         = 1,
    [TH_FIN|TH_ACK]                 = 1,
    [TH_ACK]                        = 1,
    [TH_ACK|TH_PUSH]                = 1,
    [TH_ACK|TH_URG]                 = 1,
    [TH_ACK|TH_URG|TH_PUSH]         = 1,
    [TH_FIN|TH_ACK|TH_PUSH]         = 1,
    [TH_FIN|TH_ACK|TH_URG]          = 1,
    [TH_FIN|TH_ACK|TH_URG|TH_PUSH]  = 1,
};

5. Netlink interface

Four function pointers related to the netlink interface have been added to the protocol tracking structure struct ip_conntrack_protocol. They are used to pass protocol tracking information over the netlink interface.

    /* convert protoinfo to nfnetlink attributes */
    int (*to_nfattr)(struct sk_buff *skb, struct nfattr *nfa,
                     const struct ip_conntrack *ct);

    /* convert nfnetlink attributes to protoinfo */
    int (*from_nfattr)(struct nfattr *tb[], struct ip_conntrack *ct);

    int (*tuple_to_nfattr)(struct sk_buff *skb,
                           const struct ip_conntrack_tuple *t);
    int (*nfattr_to_tuple)(struct nfattr *tb[],
                           struct ip_conntrack_tuple *t);

For TCP, the corresponding functions are:

    .to_nfattr       = tcp_to_nfattr,
    .from_nfattr     = nfattr_to_tcp,
    .tuple_to_nfattr = ip_ct_port_tuple_to_nfattr,
    .nfattr_to_tuple = ip_ct_port_nfattr_to_tuple,

6. Conclusion

TCP connection tracking in 2.6.1x is considerably more thorough than in 2.4. These new checks further improve the security of the system.
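
As a closing illustration of the window test described in section 2, the four comparisons at the heart of tcp_in_window can be tried outside the kernel. What follows is a minimal user-space sketch, not the kernel code. The names struct dir_state, seq_before, seq_after and segment_in_window are hypothetical and chosen only for this example, and the floor of 66000 on the ACK window is an assumption modelled on the kernel's MAXACKWINDOW macro. The loose counter, SACK parsing and window scaling are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per-direction state, reduced to the fields the window test reads
 * (hypothetical stand-in for struct ip_ct_tcp_state). */
struct dir_state {
    uint32_t td_end;    /* highest seq + len seen from this side */
    uint32_t td_maxend; /* highest ack + window granted by the peer */
    uint32_t td_maxwin; /* largest window this side has advertised */
};

/* Wraparound-safe comparisons: the signed difference stays correct
 * when sequence numbers roll over from 0xFFFFFFFF to 0. */
static bool seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

static bool seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Assumed lower bound for how far back an ACK may point. */
#define ACK_WINDOW_FLOOR 66000u

/* Returns true when the segment [seq, end) with the given ack/sack
 * values passes the four bounds checks for sender snd and receiver rcv. */
static bool segment_in_window(const struct dir_state *snd,
                              const struct dir_state *rcv,
                              uint32_t seq, uint32_t end,
                              uint32_t ack, uint32_t sack)
{
    assert(snd != NULL && rcv != NULL);

    uint32_t ack_window = snd->td_maxwin > ACK_WINDOW_FLOOR
                              ? snd->td_maxwin : ACK_WINDOW_FLOOR;

    return seq_before(seq, snd->td_maxend + 1) &&               /* not beyond what the peer allows */
           seq_after(end, snd->td_end - rcv->td_maxwin - 1) &&  /* not too old */
           seq_before(sack, rcv->td_end + 1) &&                 /* acks only data that was sent */
           seq_after(ack, rcv->td_end - ack_window);            /* ack is not too stale */
}
```

The signed-difference casts are the reason plain < and > are not used: they keep the comparisons correct across sequence number wraparound, so a segment that starts just below 2^32 and ends just above 0 is still judged to be in order.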

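The table lookup from section 4 can be tried in isolation as well. This sketch assumes that the flag byte is byte 13 of the TCP header and that the two ECN bits (ECE and CWR) are masked off before the lookup. tcp_flags_valid is a hypothetical helper name, TH_ALL is introduced here for brevity, and the TH_* values follow the standard TCP header bit layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* TCP flag bits as they appear in byte 13 of the TCP header. */
#define TH_FIN  0x01
#define TH_SYN  0x02
#define TH_RST  0x04
#define TH_PUSH 0x08
#define TH_ACK  0x10
#define TH_URG  0x20
#define TH_ALL  (TH_FIN | TH_SYN | TH_RST | TH_PUSH | TH_ACK | TH_URG)

/* Same entries as the tcp_valid_flags table in the article.
 * Slots that are not listed are zero-initialized, i.e. invalid. */
static const uint8_t valid_flags[TH_ALL + 1] = {
    [TH_SYN]                              = 1,
    [TH_SYN | TH_ACK]                     = 1,
    [TH_SYN | TH_PUSH]                    = 1,
    [TH_SYN | TH_ACK | TH_PUSH]           = 1,
    [TH_RST]                              = 1,
    [TH_RST | TH_ACK]                     = 1,
    [TH_RST | TH_ACK | TH_PUSH]           = 1,
    [TH_FIN | TH_ACK]                     = 1,
    [TH_ACK]                              = 1,
    [TH_ACK | TH_PUSH]                    = 1,
    [TH_ACK | TH_URG]                     = 1,
    [TH_ACK | TH_URG | TH_PUSH]           = 1,
    [TH_FIN | TH_ACK | TH_PUSH]           = 1,
    [TH_FIN | TH_ACK | TH_URG]            = 1,
    [TH_FIN | TH_ACK | TH_URG | TH_PUSH]  = 1,
};

/* Hypothetical helper: tcp_header must point at a TCP header
 * of at least 14 bytes. */
static bool tcp_flags_valid(const uint8_t *tcp_header)
{
    assert(tcp_header != NULL);
    /* Masking with TH_ALL drops ECE (0x40) and CWR (0x80), so only
     * the six classic flags select the table slot. */
    return valid_flags[tcp_header[13] & TH_ALL] != 0;
}
```

Because unlisted slots default to 0, combinations such as SYN together with FIN, a bare FIN, or no flags at all are rejected without any explicit entry, which is what makes the table a whitelist.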