Design of TCP segment reassembly for packet capture
Function
-------
In a packet capture, TCP segments may arrive out of order, be duplicated, or be lost.
Before the upper-layer protocol can be analyzed, the TCP segments must be reassembled.
Segment reassembly re-sorts TCP data by sequence number, drops duplicated data, and reports data loss.
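
As an illustration of the cases above (in order, duplicated, out of order), a segment can be classified by comparing its sequence number with the next expected one. The following is a minimal sketch, not the author's code: the names `seg_kind`, `seq_diff`, and `classify` are hypothetical. The comparison uses a signed difference so that it keeps working when the 32-bit sequence number wraps around.

```cpp
#include <cstdint>

// Hypothetical names, introduced only for this sketch.
enum class seg_kind { in_order, duplicate, overlap, out_of_order };

// Signed distance from b to a in 32-bit sequence space. The result is
// positive when a is "after" b, even across a wraparound. Relies on
// two's-complement conversion (guaranteed since C++20).
inline int32_t seq_diff(uint32_t a, uint32_t b) {
    return static_cast<int32_t>(a - b);
}

// Classify the segment [seq, seq + len) against the next expected number.
inline seg_kind classify(uint32_t expected, uint32_t seq, uint32_t len) {
    int32_t d = seq_diff(seq, expected);
    if (d == 0) return seg_kind::in_order;
    if (d > 0) return seg_kind::out_of_order;  // a hole precedes it
    // The segment starts before the expected number.
    if (seq_diff(seq + len, expected) <= 0) return seg_kind::duplicate;
    return seg_kind::overlap;                  // only its tail is new
}
```

For instance, with 1000 as the next expected sequence number, a 200-byte segment starting at 900 is classified as an overlap, because only its last 100 bytes are new.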
Input
-------
Reassembly handles one unidirectional data stream at a time, so a TCP connection needs two reassemblers, one for each direction.
The data to be reassembled is assumed to have already passed checksum verification.
The TCP window size is ignored, because reassembly happens after capture.
Put simply, reassembly only cares about TCP sequence numbers, acknowledgment numbers, and data, plus a few special TCP flags.
The special TCP flags are SYN, ACK, RST, and FIN, and each requires special handling.
Reassembly needs a starting sequence number, which is obtained from the SYN packet. The ACK flag indicates that the acknowledgment number is valid.
RST and FIN set the end mark of the stream. The stream is closed only once all data before the end mark has been received.
A TCP connection is considered closed only when both directions of the stream are closed.
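
The flag handling described above can be sketched as a small per-direction state record. This is only an illustration under stated assumptions: `stream_state`, `connection_state`, and their members are hypothetical names, and data is assumed to arrive in order so that the sketch can focus on the flags.

```cpp
#include <cstdint>

// Hypothetical per-direction state; all names are assumptions.
struct stream_state {
    bool     has_end  = false;  // FIN or RST seen
    uint32_t next_seq = 0;      // next expected data byte
    uint32_t end_seq  = 0;      // sequence number of the end mark

    // SYN consumes one sequence number, so data starts at seq + 1.
    void on_syn(uint32_t seq) { next_seq = seq + 1; }

    // FIN/RST: remember where the stream ends. `seq` is the sequence
    // number of the flag itself, i.e. just past the packet's payload.
    void on_end(uint32_t seq) { has_end = true; end_seq = seq; }

    // In-order data advances the expected sequence number.
    void on_data(uint32_t len) { next_seq += len; }

    // Closed only when the end mark is set AND all data before it arrived.
    bool closed() const {
        return has_end && static_cast<int32_t>(next_seq - end_seq) >= 0;
    }
};

// A connection is closed only when both directions are closed.
struct connection_state {
    stream_state clt, svr;
    bool closed() const { return clt.closed() && svr.closed(); }
};
```

A FIN that arrives before the last data segment sets the end mark without closing the stream. `closed()` becomes true only after the missing bytes have been counted.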
Output
-------
Reassembly does not process the upper-layer protocol, because that requires combining both directions of the stream, or even multiple TCP streams.
It only caches data and offers push and pop operations. Popping means that the cached data is deleted once it has been handed over for processing.
The output is the sequence of reassembled data segments. Popped data does not need to be merged into one buffer, so the original segment boundaries can be kept.
A strict TCP reassembly would pop data only after it has been acknowledged.
Packet capture processing does not work this way. As long as the sequence numbers are contiguous, data is processed directly without being cached.
This is due to the following considerations:
* Waiting for acknowledgment requires caching. Most TCP segments arrive in order and do not need to be sorted, so processing them directly greatly improves efficiency.
* The data will be acknowledged later anyway. Data remains unacknowledged only when the TCP connection is interrupted, and that does not affect processing.
* When only one direction of a stream is captured, the acknowledgment number can simply be ignored.
(In this case, packet loss may cause a large number of segments to be cached, so the number of cached segments must be limited.)
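
The cache limit mentioned in the last point could look like the sketch below. It is an assumption-laden illustration: `pending_cache` and its members are hypothetical, and segments are keyed by their offset from the start of the stream rather than by raw sequence number so that ordering is not disturbed by 32-bit wraparound.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical bounded cache for out-of-order segments.
class pending_cache {
public:
    explicit pending_cache(std::size_t max_segments) : m_max(max_segments) {}

    // Returns false when the limit is reached. The caller should then treat
    // the missing range as lost instead of waiting for it any longer.
    bool add(uint32_t offset, std::string data) {
        if (m_segs.size() >= m_max && m_segs.count(offset) == 0) return false;
        // emplace keeps the first copy if the offset is already cached
        m_segs.emplace(offset, std::move(data));
        return true;
    }

    std::size_t size() const { return m_segs.size(); }

    // Offset of the earliest cached segment; only valid when size() > 0.
    uint32_t first_offset() const { return m_segs.begin()->first; }

private:
    std::size_t m_max;
    std::map<uint32_t, std::string> m_segs;  // ordered by stream offset
};
```

With a limit of two segments, a third out-of-order segment is refused, while a retransmission of an already cached offset is accepted and silently dropped.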
Interface Definition
----------
* Cache (push) a TCP segment.
  A TCP segment that arrives out of order must be cached.
  In most cases segments arrive in order, so they do not need to be cached and the upper-layer protocol can be analyzed directly.
  The Boolean return value indicates whether the data was cached.
* Pop the next TCP segment.
  After a TCP segment is accepted or a sequence number is acknowledged,
  several TCP segments may become available to pop for upper-layer protocol analysis.
  Popping also deletes the data, so the pop operation may be a combination of operations,
  such as checking whether data exists, retrieving it, deleting it, and moving to the next segment.
  A "lost" TCP segment may also be popped, indicating that the data of a segment has been lost.
* Acknowledge a sequence number.
  Acknowledging a sequence number whose data has not arrived indicates data loss.
  Packet loss causes the subsequent TCP segments to be cached. Once the acknowledgment confirms the loss, the lost sequence range can be skipped.
* Start, end, or reset the stream at a sequence number.
  These correspond to the SYN, FIN, and RST flags.
* Query whether the stream is closed.
  A TCP connection is closed when both directions of the stream are closed.
  To decide whether one direction is closed, check whether the FIN flag has been received
  and whether all data before the FIN sequence number has been received, because the FIN packet may arrive ahead of earlier data.
* Force end. When the packets that end the TCP connection are lost, the connection has to be ended forcibly.
  Forcibly ending a data stream pops all cached data.
Sample code:

    class ctcpsegments
    {
    public:
        void syn (u_int32_t seq);    // start of stream (SYN)
        void ack (u_int32_t ack);    // acknowledge a sequence number
        void rst (u_int32_t seq);    // end of stream (RST)
        void fin (u_int32_t seq);    // end of stream (FIN)
        // returns true if the segment had to be cached
        bool push (const u_char *data,
                   unsigned int len,
                   u_int32_t seq);
        // Maybe several methods: canpop (), gettop (), deltop (), next ()...
        const tcpsegment_t & pop ();
        void forcefin ();            // ack all the buffered and set fin
        bool isclosed () const;
    };
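
To make the interface more concrete, here is one possible, simplified implementation. It is a sketch under several assumptions, not the author's code: the class name `tcp_segments_sketch`, the layout of `tcpsegment_t`, and all private members are hypothetical; `syn()` must be called first; the stream is assumed to be shorter than 2 GiB so that offsets from the initial sequence number never wrap; and `pop()` returns by value so that returning and deleting happen in one step.

```cpp
#include <cstdint>
#include <deque>
#include <map>
#include <string>
#include <utility>

// Hypothetical segment type: `lost` marks a hole of `len` bytes without data.
struct tcpsegment_t {
    uint32_t    seq;
    uint32_t    len;
    std::string data;
    bool        lost;
};

// One-direction reassembler. Offsets are counted from the first data byte.
class tcp_segments_sketch {
public:
    void syn(uint32_t seq) { m_base = seq + 1; m_next = 0; }
    // `seq` is the sequence number of the FIN/RST flag itself.
    void fin(uint32_t seq) { m_has_end = true; m_end = seq - m_base; }
    void rst(uint32_t seq) { fin(seq); }

    // Returns true if the segment arrived out of order and was cached.
    bool push(const unsigned char* data, unsigned len, uint32_t seq) {
        if (len == 0) return false;
        const char* bytes = reinterpret_cast<const char*>(data);
        uint32_t off = seq - m_base;
        if (ahead(off, m_next)) {             // a hole precedes this segment
            m_cache.emplace(off, std::string(bytes, len));
            return true;
        }
        uint32_t behind = m_next - off;       // bytes already delivered
        if (behind >= len) return false;      // pure duplicate
        emit_data(std::string(bytes + behind, len - behind));
        drain();
        return false;
    }

    // Acknowledging bytes that never arrived turns the hole into a lost segment.
    void ack(uint32_t ack_seq) {
        uint32_t off = ack_seq - m_base;
        if (m_has_end && ahead(off, m_end)) off = m_end;  // ACK of the FIN itself
        while (ahead(off, m_next)) {
            uint32_t hole_end = off;
            if (!m_cache.empty() && ahead(off, m_cache.begin()->first))
                hole_end = m_cache.begin()->first;
            emit_lost(hole_end - m_next);
            drain();
        }
    }

    bool canpop() const { return !m_ready.empty(); }
    tcpsegment_t pop() {                      // caller must check canpop() first
        tcpsegment_t s = std::move(m_ready.front());
        m_ready.pop_front();
        return s;
    }

    // Treat everything cached as acknowledged, then end the stream here.
    void forcefin() {
        m_has_end = false;
        if (!m_cache.empty()) {
            const auto& last = *m_cache.rbegin();
            ack(m_base + last.first + static_cast<uint32_t>(last.second.size()));
        }
        m_has_end = true;
        m_end = m_next;
    }

    bool isclosed() const { return m_has_end && !ahead(m_end, m_next); }

private:
    static bool ahead(uint32_t a, uint32_t b) { return static_cast<int32_t>(a - b) > 0; }
    void emit_data(std::string data) {
        uint32_t len = static_cast<uint32_t>(data.size());
        m_ready.push_back({m_base + m_next, len, std::move(data), false});
        m_next += len;
    }
    void emit_lost(uint32_t len) {
        m_ready.push_back({m_base + m_next, len, std::string(), true});
        m_next += len;
    }
    void drain() {  // move cached segments that have become contiguous
        while (!m_cache.empty() && !ahead(m_cache.begin()->first, m_next)) {
            auto it = m_cache.begin();
            uint32_t behind = m_next - it->first;
            if (behind < it->second.size()) emit_data(it->second.substr(behind));
            m_cache.erase(it);
        }
    }

    uint32_t m_base = 0, m_next = 0, m_end = 0;
    bool     m_has_end = false;
    std::map<uint32_t, std::string> m_cache;  // out-of-order segments by offset
    std::deque<tcpsegment_t>        m_ready;  // in-order segments awaiting pop()
};
```

In this sketch, in-order data goes straight to the ready queue without waiting for an acknowledgment, which mirrors the direct-processing choice described in the Output section. The cache limit is left out for brevity.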
Example
-----------
Suppose there is a ctcpconnection object that processes the TCP stream between a pair of IP:port addresses.
A ctcpconnection holds two ctcpsegments objects, which reassemble the TCP data streams of the client and the server respectively.

    class ctcpconnection
    {
        ...
        ctcpsegments m_clt, m_svr;   // client-side and server-side streams
    };

For example, a TCP packet sent from the client to the server is processed as follows:

1. Process the server data that this packet acknowledges:

    m_svr.ack (nack);
    while (m_svr.canpop ())
        dealsvrdata (m_svr.pop ());

2. Process the data carried by the current TCP packet:

    m_clt.push (...);
    while (m_clt.canpop ())
        dealcltdata (m_clt.pop ());

3. Process the FIN/RST flags:

    if (tcp_fin & ctcpflags)
        m_clt.fin (nseq);
    if (tcp_rst & ctcpflags)
        m_clt.rst (nseq);