Analysis of the implementation principle of TCP Segmentation Offload (TSO)

Source: Internet
Author: User

It was too hot this morning, and I suddenly remembered that three weeks ago someone asked me about a TSO problem and I described how it works. The principle is very simple: rely on the NIC hardware to segment the data and calculate the checksums, thereby freeing up CPU cycles. In fact, stating either half is enough. Once the hardware does the segmentation, only the hardware can calculate the checksums, because you do not know the details of how the hardware will segment, so you cannot calculate each segment's checksum before segmentation happens ....
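To make the dependency between segmentation and checksums concrete, here is a minimal software sketch of what the NIC is doing. The function names `internet_checksum` and `tso_segment` are made up for illustration, and the checksum here covers only the payload; a real TCP checksum also covers the pseudo-header and the TCP header, which is left out to keep the sketch short.

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold the carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tso_segment(payload: bytes, mss: int, start_seq: int):
    """Split one large send into MSS-sized pieces (hypothetical helper).

    Simplification: the checksum covers the payload only.
    """
    segments = []
    for offset in range(0, len(payload), mss):
        chunk = payload[offset:offset + mss]
        segments.append({
            "seq": (start_seq + offset) & 0xFFFFFFFF,   # sequence numbers wrap at 32 bits
            "payload": chunk,
            "checksum": internet_checksum(chunk),       # only known once the chunk exists
        })
    return segments


segs = tso_segment(bytes(range(256)) * 10, mss=1000, start_seq=1000)
for s in segs:
    print(s["seq"], len(s["payload"]), hex(s["checksum"]))
```

Each checksum depends on exactly where the cut falls, which is why whoever does the cutting has to do the summing as well.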
Almost everyone knows the principle of TSO. How to implement it is not hard either; the difficulty is in the details. After finishing work, I wanted to draw this principle out. Of course, it differs enormously from an actual implementation. It is only a schematic diagram, but if you study it carefully you should be able to build a better TSO than mine.
This design belongs to digital logic and sequential circuits, a very large field that ordinary software programmers cannot easily handle. I am only a dabbler in it myself. So I still prefer to give a result up front rather than ask the reader to do preparation beforehand, because people often get tired and give up while they are still preparing.
The basic knowledge is not difficult: gate circuits such as AND gates and NOT gates, comparators, decoders, flip-flops and so on. Any textbook on computer organization covers them thoroughly. The key is how to combine them, which is another kind of programming. At this point I recall what my high school physics teacher Danqing said about circuits some 15 years ago: let the current flow. In the eyes of people with formal training, that sentence is completely at odds with the basic principles of circuit design. They are more inclined to model first, then analyze, then write the code in a hardware description language such as VHDL, and finally produce the circuit. I think that suits the design work itself, but it is not suited to showing a layman how wonderful it is. For a layman, the only thing he knows is to let the current flow, run through the gates, rush through the transistors, and, OK, the high level becomes a low level .... As I see it, that is all that happens.
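As a small illustration of "combining them", here is a sketch that builds a multi-bit comparator out of nothing but AND, OR and NOT. The function names and the 4-bit width are assumptions chosen for the example; bits are plain integers 0 and 1, most significant bit first.

```python
def NOT(a): return a ^ 1
def AND(a, b): return a & b
def OR(a, b): return a | b


def XNOR(a, b):
    # 1 when both bits are equal: (a AND b) OR (NOT a AND NOT b)
    return OR(AND(a, b), AND(NOT(a), NOT(b)))


def greater_than(a_bits, b_bits):
    """Magnitude comparator: 1 if a > b. Bit lists are MSB first."""
    gt, eq = 0, 1
    for a, b in zip(a_bits, b_bits):
        # a wins at this bit only if all higher bits were equal
        gt = OR(gt, AND(eq, AND(a, NOT(b))))
        eq = AND(eq, XNOR(a, b))
    return gt


def to_bits(n, width=4):
    """Helper for the demo only: integer to MSB-first bit list."""
    return [(n >> i) & 1 for i in reversed(range(width))]


print(greater_than(to_bits(10), to_bits(9)))   # 10 > 9
print(greater_than(to_bits(7), to_bits(8)))    # 7 > 8
```

A comparator like this is the kind of part a segmentation circuit needs in order to decide whether the remaining data is still larger than one segment.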
On a sheet of white paper I drew a bunch of gate circuits and combined them at random, and slowly I found that the circuit was the framework of TSO. I remember helping someone hardwire route forwarding last week, but that kind of hardwiring may be too expensive to be worthwhile; after all, software implementations are good enough now. So only the core transport network needs hardwired forwarding. TSO, however, is a first choice in the server world. There are far more servers than core forwarding devices, and their CPUs need the load taken off. Indeed, having the CPU compute things that follow a fixed pattern is a bit of a waste; it should spend its energy on less predictable things. So TCP segmentation is naturally done by the NIC. You, I, all of us have met TSO, but we only turn it on or turn it off. If you want to know exactly how it offloads, look here, and let the current flow for a while:



TCP segmentation and IP fragmentation are very different, and you must understand this first. Then you can read the diagram above.
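The difference can be sketched side by side. This is a simplified model with assumed names (`tcp_segments`, `ip_fragments`) and an assumed 20-byte IP header without options: every TCP segment is a complete segment with its own header and sequence number, while the fragments of one IP datagram share a single TCP header that travels only in the first fragment, and their offsets are counted in 8-byte units.

```python
def tcp_segments(payload_len: int, mss: int, start_seq: int):
    """Each piece is a full TCP segment: own header, own sequence number."""
    return [
        {"seq": start_seq + off,
         "len": min(mss, payload_len - off),
         "has_tcp_header": True}
        for off in range(0, payload_len, mss)
    ]


def ip_fragments(ip_payload_len: int, mtu: int, ip_header_len: int = 20):
    """Pieces of ONE IP datagram; the TCP header is just bytes in the first one."""
    max_data = (mtu - ip_header_len) // 8 * 8    # data per fragment, multiple of 8
    frags = []
    for off in range(0, ip_payload_len, max_data):
        length = min(max_data, ip_payload_len - off)
        frags.append({
            "frag_offset": off // 8,             # offset field is in 8-byte units
            "len": length,
            "more_fragments": off + length < ip_payload_len,
            "has_tcp_header": off == 0,
        })
    return frags


# 3000 bytes of TCP payload, MSS 1460
for s in tcp_segments(3000, 1460, start_seq=0):
    print("segment ", s)

# the same 3000 bytes plus one 20-byte TCP header sent as a single IP datagram
for f in ip_fragments(3020, mtu=1500):
    print("fragment", f)
```

Losing one fragment makes the whole datagram unusable at the receiver, whereas losing one segment only requires that segment to be retransmitted.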
The analysis above is only a special case; in fact, all hardware acceleration mechanisms work in much the same way. When I read the Intel Gigabit/10 Gigabit NIC manuals, I imagined that inside the chip there are masses of circuit components like this, implementing RSS, hardware hash classification and so on. This is what I call a flood: the river rushes along the gullies and swallows the land in an instant. How we dig the ditches and fill them in is not the purpose of this article; this article only describes the possibility. This is also the essential difference between a dedicated circuit and a general-purpose CPU. The CPU has an instruction set, which means it cares about how it is called from outside. A dedicated circuit focuses on its internal execution logic and exposes almost no interface. The only thing you can do is set a few register values, such as MTU, packet length and header length. As for the rest of the execution logic, the outside has no right to intervene. This is the essential difference between serial programming and parallel execution.
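The "registers are the only interface" idea can be modelled in a few lines. This is a toy sketch, not the circuit from the diagram and not any real NIC's register layout: the class name `TsoEngine`, its methods, and the assumption that header length means IP plus TCP header bytes are all mine. Software writes three registers, and after that each clock tick runs fixed logic that the caller cannot change.

```python
class TsoEngine:
    """Toy model of a fixed-function segmentation circuit (hypothetical)."""

    def __init__(self):
        self.mtu = 0
        self.packet_len = 0    # headers + payload of the large packet
        self.header_len = 0    # assumed: IP + TCP header bytes
        self._offset = 0       # internal counter, not writable from outside

    def write_registers(self, mtu: int, packet_len: int, header_len: int):
        if mtu <= header_len:
            raise ValueError("MTU must leave room for payload")
        self.mtu, self.packet_len, self.header_len = mtu, packet_len, header_len
        self._offset = 0

    def tick(self):
        """One clock cycle: return (payload_offset, payload_len), or None when done."""
        remaining = (self.packet_len - self.header_len) - self._offset
        if remaining <= 0:
            return None
        mss = self.mtu - self.header_len
        size = mss if remaining > mss else remaining   # the comparator's decision
        out = (self._offset, size)
        self._offset += size                           # the flip-flops latch the new offset
        return out


engine = TsoEngine()
engine.write_registers(mtu=1500, packet_len=4040, header_len=40)
while (seg := engine.tick()) is not None:
    print(seg)
```

With these register values the payload is 4000 bytes and each segment can carry 1460, so the engine emits three segments and then goes idle.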
A word about the instruction system. Inside the control logic there is a unified instruction dispatch system, which really just emits a series of combinations of 0s and 1s. These combinations act on various gate circuits; the gates accept the different inputs and produce different outputs, which in turn become the inputs of other gates and produce further outputs, and so on ... Isn't that how it is? It is hard to argue otherwise.
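A rough sketch of that idea, with an invented 2-bit opcode encoding: a 2-to-4 decoder made of AND and NOT turns the opcode bits into four enable lines, and each enable line gates the output of one operation. Nothing here corresponds to a real instruction set; the names `decode2to4` and `execute` are illustrative only.

```python
def decode2to4(s1: int, s0: int):
    """2-to-4 decoder: exactly one of the four output lines is 1."""
    n1, n0 = s1 ^ 1, s0 ^ 1               # NOT gates
    return (n1 & n0, n1 & s0, s1 & n0, s1 & s0)


def execute(opcode, a: int, b: int) -> int:
    """Assumed encoding: 00 = AND, 01 = OR, 10 = XOR, 11 = NOT a."""
    s1, s0 = opcode
    en_and, en_or, en_xor, en_not = decode2to4(s1, s0)
    # every unit computes in parallel; the enable lines pick which result gets through
    return (en_and & (a & b)) | (en_or & (a | b)) | (en_xor & (a ^ b)) | (en_not & (a ^ 1))


for opcode in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(opcode, execute(opcode, 1, 0))
```

All four operations are evaluated on every call, just as all the gates in a circuit see the current at once; the opcode bits only decide which output is allowed to propagate.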
Let the current flow for a while. If that feels too abstract, watch how a flood unfolds. Where the river bursts its banks determines what disaster follows; the key is the slope of the terrain and the kinds of terrain it connects to. All of this happens at the same time, just as with current. When the flow passes a bend or an arch bridge there is some delay and diversion, which is analogous to the various gates in a circuit.
Eat, eat, really annoying!

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
