System.IO.Pipelines: High-performance IO in .NET


System.IO.Pipelines is a new library designed to make it easier to do high-performance IO in .NET. It is a library targeting .NET Standard that works on all .NET implementations.

Pipelines was born from the .NET Core team's work to make Kestrel one of the fastest web servers in the industry. What started as an implementation detail internal to Kestrel evolved into a reusable API, made available to all .NET developers in .NET Core 2.1 as a first-class BCL API (System.IO.Pipelines).

What problem does it solve?

Correctly parsing data from a stream or socket involves a lot of boilerplate code, and there are many corner cases that lead to complex code that's difficult to maintain. Achieving both high performance and correctness while dealing with this complexity is hard. Pipelines aims to solve this complexity.

How complex is it?

Let's start with a simple problem. We want to write a TCP server that receives line-delimited messages (delimited by \n) from a client.

TCP Server using NetworkStream

Disclaimer: As with all performance-sensitive work, each scenario should be measured within the context of your application. Depending on the scale your network application needs to handle, the overhead of the various techniques involved may or may not matter.

The typical code written in .NET before pipelines looks something like this:

async Task ProcessLinesAsync(NetworkStream stream)
{
    var buffer = new byte[1024];
    await stream.ReadAsync(buffer, 0, buffer.Length);

    // Process a single line from the buffer
    ProcessLine(buffer);
}

This code might work when testing locally, but it has several potential bugs:

    • A single ReadAsync call may not receive the entire message (end of line).
    • It ignores the return value of stream.ReadAsync(), which is the amount of data actually filled into the buffer. (The buffer is not necessarily full.)
    • A single ReadAsync call cannot handle multiple messages.

These are some of the common pitfalls of reading streaming data. To work around them, we need to make a few changes:

    • We need to buffer the incoming data until we find a new line.
    • We need to parse all of the lines returned in the buffer.
async Task ProcessLinesAsync(NetworkStream stream)
{
    var buffer = new byte[1024];
    var bytesBuffered = 0;
    var bytesConsumed = 0;

    while (true)
    {
        var bytesRead = await stream.ReadAsync(buffer, bytesBuffered, buffer.Length - bytesBuffered);
        if (bytesRead == 0)
        {
            // EOF
            break;
        }

        // Keep track of the amount of buffered bytes
        bytesBuffered += bytesRead;

        var linePosition = -1;

        do
        {
            // Look for an EOL in the buffered data
            linePosition = Array.IndexOf(buffer, (byte)'\n', bytesConsumed, bytesBuffered - bytesConsumed);

            if (linePosition >= 0)
            {
                // Calculate the length of the line based on the offset
                var lineLength = linePosition - bytesConsumed;

                // Process the line
                ProcessLine(buffer, bytesConsumed, lineLength);

                // Move bytesConsumed to skip past the line we consumed (including \n)
                bytesConsumed += lineLength + 1;
            }
        }
        while (linePosition >= 0);
    }
}

This time, this might work in local development, but a line may be longer than 1 KiB (1024 bytes), so we need to resize the input buffer until we find a new line.

Also, we're allocating buffers on the heap as longer lines come in. We can improve this by using ArrayPool&lt;byte&gt; to avoid repeated buffer allocations as we parse longer lines from the client.

async Task ProcessLinesAsync(NetworkStream stream)
{
    byte[] buffer = ArrayPool<byte>.Shared.Rent(1024);
    var bytesBuffered = 0;
    var bytesConsumed = 0;

    while (true)
    {
        // Calculate the amount of bytes remaining in the buffer
        var bytesRemaining = buffer.Length - bytesBuffered;

        if (bytesRemaining == 0)
        {
            // Double the buffer size and copy the previously buffered data into the new buffer
            var newBuffer = ArrayPool<byte>.Shared.Rent(buffer.Length * 2);
            Buffer.BlockCopy(buffer, 0, newBuffer, 0, buffer.Length);
            // Return the old buffer to the pool
            ArrayPool<byte>.Shared.Return(buffer);
            buffer = newBuffer;
            bytesRemaining = buffer.Length - bytesBuffered;
        }

        var bytesRead = await stream.ReadAsync(buffer, bytesBuffered, bytesRemaining);
        if (bytesRead == 0)
        {
            // EOF
            break;
        }

        // Keep track of the amount of buffered bytes
        bytesBuffered += bytesRead;

        var linePosition = -1;

        do
        {
            // Look for an EOL in the buffered data
            linePosition = Array.IndexOf(buffer, (byte)'\n', bytesConsumed, bytesBuffered - bytesConsumed);

            if (linePosition >= 0)
            {
                // Calculate the length of the line based on the offset
                var lineLength = linePosition - bytesConsumed;

                // Process the line
                ProcessLine(buffer, bytesConsumed, lineLength);

                // Move bytesConsumed to skip past the line we consumed (including \n)
                bytesConsumed += lineLength + 1;
            }
        }
        while (linePosition >= 0);
    }
}

This code works, but now we're resizing the buffer, which results in more buffer copies. It also uses more memory, since the code doesn't shrink the buffer after lines are processed. To avoid this, we can store a sequence of buffers instead of resizing each time the 1 KiB buffer size is exceeded.

Additionally, we don't grow the 1 KiB buffer until it's completely empty. That means we can end up passing smaller and smaller buffers to ReadAsync, which results in more calls into the operating system.

To mitigate this, we'll allocate a new buffer when there are fewer than 512 bytes remaining in the existing buffer:


public class BufferSegment
{
    public byte[] Buffer { get; set; }
    public int Count { get; set; }

    public int Remaining => Buffer.Length - Count;
}

async Task ProcessLinesAsync(NetworkStream stream)
{
    const int minimumBufferSize = 512;

    var segments = new List<BufferSegment>();
    var bytesConsumed = 0;
    var bytesConsumedBufferIndex = 0;
    var segment = new BufferSegment { Buffer = ArrayPool<byte>.Shared.Rent(1024) };

    segments.Add(segment);

    while (true)
    {
        // Calculate the amount of bytes remaining in the buffer
        if (segment.Remaining < minimumBufferSize)
        {
            // Allocate a new segment
            segment = new BufferSegment { Buffer = ArrayPool<byte>.Shared.Rent(1024) };
            segments.Add(segment);
        }

        var bytesRead = await stream.ReadAsync(segment.Buffer, segment.Count, segment.Remaining);
        if (bytesRead == 0)
        {
            break;
        }

        // Keep track of the amount of buffered bytes
        segment.Count += bytesRead;

        while (true)
        {
            // Look for an EOL in the list of segments
            var (segmentIndex, segmentOffset) = IndexOf(segments, (byte)'\n', bytesConsumedBufferIndex, bytesConsumed);

            if (segmentIndex >= 0)
            {
                // Process the line
                ProcessLine(segments, segmentIndex, segmentOffset);

                bytesConsumedBufferIndex = segmentIndex;
                bytesConsumed = segmentOffset + 1;
            }
            else
            {
                break;
            }
        }

        // Drop fully consumed segments from the list so we don't look at them again
        for (var i = bytesConsumedBufferIndex; i >= 0; --i)
        {
            var consumedSegment = segments[i];
            // Return all segments unless this is the current segment
            if (consumedSegment != segment)
            {
                ArrayPool<byte>.Shared.Return(consumedSegment.Buffer);
                segments.RemoveAt(i);
            }
        }
    }
}

(int segmentIndex, int segmentOffset) IndexOf(List<BufferSegment> segments, byte value, int startBufferIndex, int startSegmentOffset)
{
    var first = true;

    for (var i = startBufferIndex; i < segments.Count; ++i)
    {
        var segment = segments[i];

        // Start from the correct offset
        var offset = first ? startSegmentOffset : 0;

        var index = Array.IndexOf(segment.Buffer, value, offset, segment.Count - offset);

        if (index >= 0)
        {
            // Return the buffer index and the index within that segment where the EOL was found
            return (i, index);
        }

        first = false;
    }

    return (-1, -1);
}

This code just got a lot more complicated. While searching for the delimiter, we also keep track of the sequence of filled buffers. To do this, we use a List&lt;BufferSegment&gt; to represent the buffered data while looking for the new line delimiter. As a result, ProcessLine and IndexOf now accept a List&lt;BufferSegment&gt; as a parameter instead of a byte[], an offset, and a count. Our parsing logic now needs to handle one or more buffer segments.

Our server now handles partial messages and uses pooled memory to reduce overall memory consumption, but there are still a couple more changes we need to make:

    1. The byte[] we rent from ArrayPool<byte> are just regular managed arrays. This means that whenever we do a ReadAsync or WriteAsync, those buffers are pinned for the lifetime of the asynchronous operation (in order to interoperate with the native IO APIs on the operating system). This has performance implications for the GC, since pinned memory can't be moved, which can lead to heap fragmentation. Depending on how long asynchronous operations are pending, the pool implementation may need to change.
    2. Throughput can be optimized by decoupling the reading logic from the processing logic. This creates a batching effect that lets the parsing logic consume larger chunks of buffers, instead of reading more data only after parsing a single line. This introduces some additional complexity:
      • We need two loops that run independently of each other: one that reads from the socket and one that parses the buffers.
      • We need a way to signal the parsing logic when data becomes available.
      • We need to decide what happens if the loop reading from the socket is "too fast". We need a way to throttle the reading loop if the parsing logic can't keep up. This is commonly referred to as "flow control" or "back pressure".
      • We need to make sure things are thread-safe. We're now sharing a set of buffers between the reading loop and the parsing loop, and those loops run independently on different threads.
      • The memory management logic is now spread across two different pieces of code: the code renting from the buffer pool reads from the socket, and the code returning buffers to the pool is the parsing logic.
      • We need to be extremely careful with how we return buffers after the parsing logic is done with them. If we're not careful, it's possible to return a buffer that's still being written to by the socket reading logic.

The complexity has gone through the roof (and we haven't even covered all of the cases). High-performance networking usually means writing very complex code in order to eke out more performance from the system.

The goal of System.IO.Pipelines is to make this type of code easier to write.

TCP Server using System.IO.Pipelines

Let's take a look at what this example looks like with System.IO.Pipelines:

async Task ProcessLinesAsync(Socket socket)
{
    var pipe = new Pipe();
    Task writing = FillPipeAsync(socket, pipe.Writer);
    Task reading = ReadPipeAsync(pipe.Reader);

    await Task.WhenAll(reading, writing);
}

async Task FillPipeAsync(Socket socket, PipeWriter writer)
{
    const int minimumBufferSize = 512;

    while (true)
    {
        // Allocate at least 512 bytes from the PipeWriter
        Memory<byte> memory = writer.GetMemory(minimumBufferSize);
        try
        {
            int bytesRead = await socket.ReceiveAsync(memory, SocketFlags.None);
            if (bytesRead == 0)
            {
                break;
            }
            // Tell the PipeWriter how much was read from the socket
            writer.Advance(bytesRead);
        }
        catch (Exception ex)
        {
            LogError(ex);
            break;
        }

        // Make the data available to the PipeReader
        FlushResult result = await writer.FlushAsync();

        if (result.IsCompleted)
        {
            break;
        }
    }

    // Tell the PipeReader that there's no more data coming
    writer.Complete();
}

async Task ReadPipeAsync(PipeReader reader)
{
    while (true)
    {
        ReadResult result = await reader.ReadAsync();
        ReadOnlySequence<byte> buffer = result.Buffer;
        SequencePosition? position = null;

        do
        {
            // Look for an EOL in the buffered data
            position = buffer.PositionOf((byte)'\n');

            if (position != null)
            {
                // Process the line
                ProcessLine(buffer.Slice(0, position.Value));

                // Skip the line + the \n character
                buffer = buffer.Slice(buffer.GetPosition(1, position.Value));
            }
        }
        while (position != null);

        // Tell the PipeReader how much of the buffer we consumed
        reader.AdvanceTo(buffer.Start, buffer.End);

        // Stop reading if there's no more data coming
        if (result.IsCompleted)
        {
            break;
        }
    }

    // Mark the PipeReader as complete
    reader.Complete();
}

The pipelines version of our line reader has 2 loops:

    • FillPipeAsync reads from the Socket and writes into the PipeWriter.
    • ReadPipeAsync reads from the PipeReader and parses incoming lines.

Unlike the original examples, there are no explicit buffers allocated anywhere. This is one of pipelines' core features. All buffer management is delegated to the PipeReader/PipeWriter implementations.

This makes it easier for consuming code to focus solely on the business logic instead of complex buffer management.

In the first loop, we first call PipeWriter.GetMemory(int) to get some memory from the underlying writer; then we call PipeWriter.Advance(int) to tell the PipeWriter how much data we actually wrote to the buffer. We then call PipeWriter.FlushAsync() to make the data available to the PipeReader.

In the second loop, we're consuming the buffers written by the PipeWriter, which ultimately come from the socket. When the call to PipeReader.ReadAsync() returns, we get a ReadResult that contains two important pieces of information: the data that was read, in the form of a ReadOnlySequence&lt;byte&gt;, and a bool IsCompleted that lets the reader know whether the writer is done writing (EOF). After finding the end-of-line (EOL) delimiter and parsing the line, we slice the buffer to skip what we've already processed, and then call PipeReader.AdvanceTo to tell the PipeReader how much data we consumed.

At the end of each loop, we complete both the reader and the writer. This lets the underlying Pipe release all of the memory it allocated.

System.IO.Pipelines

Besides handling memory management, the other core pipelines feature is the ability to peek at the data in the Pipe without actually consuming it.

PipeReader has two core APIs: ReadAsync and AdvanceTo. ReadAsync gets the data in the Pipe, and AdvanceTo tells the PipeReader that these buffers are no longer needed so they can be discarded (for example, returned to the underlying buffer pool).
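The consumed/examined split behind AdvanceTo can be seen with a small in-memory Pipe. This is a minimal sketch (not from the original post): passing buffer.Start as "consumed" and buffer.End as "examined" keeps the bytes buffered while telling the PipeReader not to return from the next ReadAsync until more data arrives.

```csharp
using System;
using System.IO.Pipelines;
using System.Text;

var pipe = new Pipe();

await pipe.Writer.WriteAsync(Encoding.ASCII.GetBytes("hel"));
ReadResult first = await pipe.Reader.ReadAsync();

// No '\n' yet: consume nothing, but mark everything as examined.
pipe.Reader.AdvanceTo(first.Buffer.Start, first.Buffer.End);

await pipe.Writer.WriteAsync(Encoding.ASCII.GetBytes("lo\n"));
ReadResult second = await pipe.Reader.ReadAsync();

// The buffer now contains the unconsumed bytes plus the new ones: "hello\n".
long buffered = second.Buffer.Length;
Console.WriteLine(buffered); // 6
pipe.Reader.AdvanceTo(second.Buffer.End);
```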

Here's an example of an HTTP parser reading partial data and keeping the buffered data in the Pipe until a valid start line is received.

ReadOnlySequence<T>

The Pipe implementation stores a linked list of buffers that get passed between the PipeWriter and the PipeReader. PipeReader.ReadAsync exposes a ReadOnlySequence&lt;T&gt;, a new BCL type that represents a view over one or more segments of ReadOnlyMemory&lt;T&gt;, similar to how Span&lt;T&gt; and Memory&lt;T&gt; provide a view over arrays and strings.

The Pipe internally maintains pointers to where the reader and writer are in the overall set of allocated data and updates them as data is written or read. The SequencePosition represents a single point in the linked list of buffers and can be used to efficiently slice the ReadOnlySequence&lt;T&gt;.

Since the ReadOnlySequence&lt;T&gt; can support one or more segments, it's typical for high-performance processing logic to split fast and slow paths based on whether there is a single segment or multiple segments.

For example, here's a routine that converts an ASCII ReadOnlySequence&lt;byte&gt; into a string:

string GetAsciiString(ReadOnlySequence<byte> buffer)
{
    if (buffer.IsSingleSegment)
    {
        return Encoding.ASCII.GetString(buffer.First.Span);
    }

    return string.Create((int)buffer.Length, buffer, (span, sequence) =>
    {
        foreach (var segment in sequence)
        {
            Encoding.ASCII.GetChars(segment.Span, span);
            span = span.Slice(segment.Length);
        }
    });
}
Back pressure and flow control

In a perfect world, reading and parsing work as a team: the reading thread consumes data from the network and puts it into buffers, while the parsing thread is responsible for constructing the appropriate data structures. Normally, parsing takes more time than just copying blocks of data from the network. As a result, the reading thread can easily overwhelm the parsing thread. The reading thread then has to either slow down or allocate more memory to store the data for the parsing thread. For optimal performance, there's a balance between frequent pauses and allocating more memory.

To solve this problem, the Pipe has two settings to control the flow of data: PauseWriterThreshold and ResumeWriterThreshold. PauseWriterThreshold determines how much data should be buffered before calls to PipeWriter.FlushAsync pause. ResumeWriterThreshold controls how much the reader has to consume before writing can resume.

PipeWriter.FlushAsync "blocks" (asynchronously) when the amount of data in the Pipe crosses PauseWriterThreshold, and "unblocks" when it drops below ResumeWriterThreshold. Two values are used to prevent thrashing around the limit.
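A minimal sketch of these thresholds in action (the threshold values here are arbitrary, chosen only for illustration): writing past PauseWriterThreshold leaves FlushAsync incomplete until the reader consumes enough data.

```csharp
using System;
using System.IO.Pipelines;
using System.Threading.Tasks;

var pipe = new Pipe(new PipeOptions(
    pauseWriterThreshold: 16,
    resumeWriterThreshold: 8,
    useSynchronizationContext: false));

// Write 32 bytes, which is over the pause threshold.
Memory<byte> memory = pipe.Writer.GetMemory(32);
memory.Span.Slice(0, 32).Fill(0);
pipe.Writer.Advance(32);
ValueTask<FlushResult> flush = pipe.Writer.FlushAsync();
bool pausedImmediately = !flush.IsCompleted;
Console.WriteLine(pausedImmediately); // True: the writer is paused

// Consuming the data on the reader side unblocks the writer,
// because the unconsumed amount drops below resumeWriterThreshold.
ReadResult result = await pipe.Reader.ReadAsync();
pipe.Reader.AdvanceTo(result.Buffer.End);
await flush;
```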

IO scheduling

When using async/await, continuations are usually invoked either on thread pool threads or on the current SynchronizationContext.

When doing IO, it's important to have fine-grained control over where that IO is performed so you can take advantage of CPU caches more effectively, which is critical for high-performance applications like web servers. Pipelines exposes a PipeScheduler that determines where asynchronous callbacks run. This gives the caller precise control over exactly which threads are used for IO.

An example of this in practice is the Kestrel libuv transport, where IO callbacks run on dedicated event-loop threads.
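As a minimal sketch (a hypothetical configuration, not Kestrel's actual setup), a Pipe can be constructed with PipeScheduler.Inline so continuations run synchronously on the thread that triggered them, instead of being dispatched to the thread pool (the default, PipeScheduler.ThreadPool):

```csharp
using System;
using System.IO.Pipelines;

var pipe = new Pipe(new PipeOptions(
    readerScheduler: PipeScheduler.Inline,
    writerScheduler: PipeScheduler.Inline,
    useSynchronizationContext: false));

// The pipe behaves the same; only where continuations run changes.
await pipe.Writer.WriteAsync(new byte[] { (byte)'a' });
ReadResult result = await pipe.Reader.ReadAsync();
long length = result.Buffer.Length;
Console.WriteLine(length); // 1
pipe.Reader.AdvanceTo(result.Buffer.End);
```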

Other benefits of the PipeReader model:
    • Some underlying systems support a "buffer-less wait": a buffer never needs to be allocated until there's actually data available in the underlying system. For example, on Linux with epoll, you can wait until data is ready before actually supplying a buffer to do the read. This avoids the problem where a huge number of threads waiting for data would immediately require a huge amount of memory to be reserved.
    • The default Pipe makes it easy to write unit tests against networking code, because the parsing logic is decoupled from the networking code: unit tests run the parsing logic against in-memory buffers rather than consuming directly from the network. It also makes it easy to test tricky patterns such as sending partial data. ASP.NET Core uses it to test various aspects of Kestrel's HTTP parser.
    • Systems that allow exposing the underlying OS buffers (like the Registered IO APIs on Windows) to user code are a natural fit for pipelines, since buffers are always provided by the PipeReader implementation.
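To illustrate the unit-testing point, here's a minimal sketch (not from the original post) that drives a line-parsing loop, shaped like the ReadPipeAsync loop above, with an in-memory Pipe and a message deliberately split across two writes, no socket required:

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.IO.Pipelines;
using System.Text;
using System.Threading.Tasks;

var pipe = new Pipe();
Task<List<string>> parsing = ReadLinesAsync(pipe.Reader);

// Simulate a network that delivers "hello\n" in two fragments.
await pipe.Writer.WriteAsync(Encoding.ASCII.GetBytes("hel"));
await pipe.Writer.WriteAsync(Encoding.ASCII.GetBytes("lo\nworld\n"));
pipe.Writer.Complete();

List<string> lines = await parsing;
Console.WriteLine(string.Join(",", lines)); // hello,world

// Same shape as ReadPipeAsync, collecting lines instead of calling ProcessLine.
static async Task<List<string>> ReadLinesAsync(PipeReader reader)
{
    var lines = new List<string>();
    while (true)
    {
        ReadResult result = await reader.ReadAsync();
        ReadOnlySequence<byte> buffer = result.Buffer;
        SequencePosition? position;

        while ((position = buffer.PositionOf((byte)'\n')) != null)
        {
            lines.Add(Encoding.ASCII.GetString(buffer.Slice(0, position.Value).ToArray()));
            buffer = buffer.Slice(buffer.GetPosition(1, position.Value));
        }

        reader.AdvanceTo(buffer.Start, buffer.End);

        if (result.IsCompleted)
        {
            break;
        }
    }
    reader.Complete();
    return lines;
}
```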
Other related types

As part of making System.IO.Pipelines, we also added a number of new primitive BCL types:

    • MemoryPool<T>, IMemoryOwner<T>, MemoryManager<T> - .NET Core 1.0 added ArrayPool<T>, and in .NET Core 2.1 we now have a more general abstraction for a pool that works over any Memory<T>. This provides an extensibility point that lets you plug in more advanced allocation strategies, as well as control how buffers are managed (for example, by providing pre-pinned buffers instead of purely managed arrays).
    • IBufferWriter<T> - represents a sink for writing synchronous buffered data. (PipeWriter implements this.)
    • IValueTaskSource - ValueTask<T> has existed since .NET Core 1.1, but gained some superpowers in .NET Core 2.1 to allow allocation-free awaiting of asynchronous operations. See https://github.com/dotnet/corefx/issues/27445 for more details.
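As a minimal sketch of the MemoryPool&lt;T&gt;/IMemoryOwner&lt;T&gt; pattern: Rent returns an owner whose Memory&lt;T&gt; is valid until the owner is disposed, at which point the buffer goes back to the pool.

```csharp
using System;
using System.Buffers;

int rentedLength;
using (IMemoryOwner<byte> owner = MemoryPool<byte>.Shared.Rent(1024))
{
    Memory<byte> memory = owner.Memory;
    rentedLength = memory.Length;   // at least the requested 1024 bytes
    memory.Span[0] = 42;            // write through the rented memory
}
// owner.Memory must not be touched after Dispose.
Console.WriteLine(rentedLength >= 1024); // True
```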
How do I use pipelines?

The APIs exist in the System.IO.Pipelines NuGet package.

Here's an example of a .NET Core 2.1 server application that uses pipelines to handle line-based messages (our example above): https://github.com/davidfowl/TcpEcho. It should run with 'dotnet run' (or by running it in Visual Studio). It listens on a socket on port 8087 and writes received messages to the console. You can use a client such as netcat or PuTTY to connect to port 8087 and send line-based messages to see it working.
