Identifying Layer 7 data streams (connection tracking) in a Linux streaming-server load balancer
1. Layer 7 support in nf_conntrack is really unnecessary
Once the idea catches fire, you feel an urge to switch quickly from "data streams identified by the 5-tuple" to "data streams identified by a fixed-offset field of the application-layer protocol". The sooner the better! So a certain person added several fields to nf_conn in the Linux 3.17 kernel, which supports conntrack zones:
bool l7;     // whether to match on Layer 7
u32 offset;  // offset of the stream ID within the application-layer data
u32 offlen;  // length of the application-layer stream ID
These three fields are set in the CT target, so the configured zone also means the following:
All packets belonging to zone $id are identified by a fixed-length stream ID located at a fixed offset in the application-layer data, instead of by the traditional 5-tuple. The tuple is redefined as well: a bool l7 is added to indicate whether it holds an application-layer stream ID, along with an array sid of length MAX_IDLEN, so a stream ID can be at most MAX_IDLEN bytes.
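As a rough user-space sketch of the redefined tuple (not the actual kernel patch): the struct name flow_tuple, the helper flow_tuple_equal, and the MAX_IDLEN value of 16 are all assumptions made for illustration. Only the l7 flag and the sid array come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_IDLEN 16  /* assumed value; the article does not state one */

/* Simplified flow key: either a classic 5-tuple or a stream ID (hypothetical layout). */
struct flow_tuple {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint8_t  protonum;
    bool     l7;              /* true: identify the flow by sid */
    uint8_t  sid[MAX_IDLEN];  /* application-layer stream ID, zero padded */
};

/* Compare two keys: by sid when both are Layer 7 keys, otherwise by 5-tuple. */
static bool flow_tuple_equal(const struct flow_tuple *a,
                             const struct flow_tuple *b)
{
    assert(a != NULL && b != NULL);
    if (a->l7 != b->l7)
        return false;
    if (a->l7)
        return memcmp(a->sid, b->sid, MAX_IDLEN) == 0;
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport &&
           a->protonum == b->protonum;
}
```

With l7 set, two keys that differ only in saddr still compare equal, which is exactly the behavior the zone configuration asks for.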
With the basic data definitions in place, changing the code logic is not difficult. The main change is in resolve_normal_ct: fetch l7 from the tmpl template, and if it is nonzero, the flow must be identified by the application-layer stream ID. In that case, locate the payload at [iphdr + iphdrlen + transphdrlen], use the offset and offlen fields to read offlen bytes of data from it, and compute a hash over those bytes to use as the lookup key. Before calling __nf_conntrack_find_get, copy the application-layer data into the tuple's sid and set the tuple's l7 flag. The conntrack lookup then compares the tuples' sid values instead of the 5-tuple. Finally, at conntrack confirm time, the conntrack is inserted under the sid that its offset and offlen select from the payload (already stored in the tuple structure, in its char sid[MAX_IDLEN]; field).
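The extraction step can be sketched in user space roughly as follows. This is only an illustration under several assumptions: the packet is a raw IPv4 packet carrying UDP (8-byte transport header), MAX_IDLEN is 16, the names extract_sid and sid_hash are made up, and FNV-1a stands in for whatever hash function the kernel patch would really use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_IDLEN 16  /* assumed value */

/*
 * Copy offlen bytes of stream ID, found offset bytes into the UDP payload
 * of an IPv4 packet, into sid (zero padded). Returns 0 on success and -1
 * if the packet is malformed or too short.
 */
static int extract_sid(const uint8_t *pkt, size_t pkt_len,
                       uint32_t offset, uint32_t offlen,
                       uint8_t sid[MAX_IDLEN])
{
    const size_t udphdrlen = 8;
    size_t iphdrlen, start;

    if (offlen == 0 || offlen > MAX_IDLEN || pkt_len < 20 || offset > pkt_len)
        return -1;
    iphdrlen = (size_t)(pkt[0] & 0x0f) * 4;  /* IHL is in 32-bit words */
    if (iphdrlen < 20)
        return -1;
    start = iphdrlen + udphdrlen + offset;   /* iphdr + iphdrlen + transphdrlen + offset */
    if (start > pkt_len || pkt_len - start < offlen)
        return -1;
    memset(sid, 0, MAX_IDLEN);
    memcpy(sid, pkt + start, offlen);
    return 0;
}

/* 32-bit FNV-1a over the stream ID; a stand-in for the real hash. */
static uint32_t sid_hash(const uint8_t *sid, size_t len)
{
    uint32_t h = 2166136261u;

    assert(sid != NULL);
    for (size_t i = 0; i < len; i++) {
        h ^= sid[i];
        h *= 16777619u;
    }
    return h;
}
```

Because only payload bytes feed the hash, rewriting the source address in the IP header leaves the lookup key unchanged.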
Modifying, compiling, and testing took less than two hours (the iMac I bought is so powerful!!). It was a relaxed affair: play a bit, eat something, drink some tea, and get it done. That person is me!
Thinking afterwards about the point of what you have done is also a process of reflection! I suddenly realized that everything I had done was pointless. The conntrack structure does not store any application-layer information. I have extended it before so that it can hold many things, such as routing and socket information, but nothing actually needs them; they were just idle tinkering. The most important information stored in a conntrack is the NAT information, that is, the tuple information, and this tuple is the traditional 5-tuple. If I identify a tuple by an application-layer sessionID, what happens to NAT? If the client's IP address changes, NAT has to be set up again even though the sessionID is unchanged, so nothing is gained. My intention was to avoid the chain of repeated work caused by changes in IP address and port, but it cannot be avoided: once the IP address and port change, the address and port information must be rewritten all over again.
If the code above had been written on paper, I would certainly tear it up and throw it in the trash...
2. reuseport with a hash over an arbitrary Layer 7 payload is powerful
Recent Linux kernels support the reuseport option for UDP, and this mechanism can be used for UDP load balancing. If you are not familiar with it, look it up. It balances load by computing a hash from the packet's fixed 5-tuple and then delivering the packet to a fixed socket chosen by that hash. As long as the IP address does not change, everything is fine. In a mobile environment, however, the IP address does change, which means the 5-tuple changes and the recomputed hash changes too (if it does not, that is a collision!). The next UDP packet from the same client may therefore be delivered to a different socket, which is not what a UDP-based persistent-connection service expects. The core code of __udp4_lib_lookup is as follows:
begin:
    result = NULL;
    badness = -1;
    sk_nulls_for_each_rcu(sk, node, &hslot->head) {
        // When hashing by sessionID, the server would do best not to look at sport/saddr at all!
        score = compute_score(sk, net, saddr, hnum, sport,
                              daddr, dport, dif);
        if (score > badness) {
            result = sk;
            badness = score;
            reuseport = sk->sk_reuseport;
            if (reuseport) {
                // 5-tuple flow version: compute a hash from the 4-tuple.
                // hash = inet_ehashfn(net, daddr, hnum, saddr, htons(sport));
                // sid flow version: compute the hash from the sessionID.
                // The problem is how to get the sid in here... this needs an overhaul.
                hash = sid_based_hash(sid /* other arguments left unspecified */);
                matches = 1;
            }
        } else if (score == badness && reuseport) {
            matches++;
            // Whether this sk replaces the previously matched sk depends on the hash value.
            if (((u64)hash * matches) >> 32 == 0) {
                result = sk;
            }
            hash = hash * 1664525 + 1013904223;
        }
    }
    /*
     * If the nulls value we got at the end of this lookup is
     * not the expected one, we must restart lookup.
     * We probably met an item that was moved to another chain.
     */
    if (get_nulls_value(node) != slot)
        goto begin;
The comment mentions an overhaul: the skb has to be passed in here so that, based on the reuseport flag set through setsockopt together with the sid offset and the sid offlen, the sid can be extracted and the hash computed from it. The change itself is easy, though. Just recompile the kernel.
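To get a feel for why a sid-derived hash pins a flow to one socket, the selection part of the loop can be replayed in user space. This is only a sketch: pick_socket is a made-up name, all sockets are assumed to have the same score, and hash is assumed to be whatever value was computed from the sessionID.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Replay the reuseport selection from the lookup loop for nsocks sockets
 * that all have the same score. Returns the index of the chosen socket.
 */
static unsigned pick_socket(uint32_t hash, unsigned nsocks)
{
    unsigned result = 0;   /* socket 0 is the first best match */
    unsigned matches = 1;

    assert(nsocks > 0);
    for (unsigned i = 1; i < nsocks; i++) {
        matches++;
        /* Socket i replaces the current choice only if the scaled hash is 0. */
        if ((((uint64_t)hash * matches) >> 32) == 0)
            result = i;
        /* Same pseudo-random step as in the kernel loop above. */
        hash = hash * 1664525u + 1013904223u;
    }
    return result;
}
```

The choice depends only on the starting hash and the number of sockets, so as long as the hash comes from the sessionID rather than from saddr and sport, a client that changes its IP address keeps landing on the same socket.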
Using the sessionID to identify a flow in UDP reuseport is a good fit, because by this point the packet has already reached the transport layer. Apart from packets that are re-encapsulated, essentially all of them are destined for a UDP service on the local machine, and a packet that has made it this far has completely finished everything that depends on 5-tuple identification, such as NAT. The next step is delivery to the application layer, so at this point we can identify the flow by the application-layer sid and ensure that the request reaches the same UDP service thread even if the client's IP address changes. This also offers a practical solution for the mobile age: when the 5-tuple changes so frequently, how can the application-layer session be kept alive...