Analysis of IO scheduling in Golang and Erlang

Source: Internet
Author: User

The previous article, a comparative analysis of the two schedulers, left a question open at the end. When a system has highly concurrent IO access, such as a network server that typically handles hundreds or thousands of connections at once, each connection may be served by its own user task, and there will be a large number of blocking IO operations. If a separate OS thread is assigned to each blocking operation, the system easily degenerates into a multi-OS-thread system, and the benefits of lightweight tasks are lost. This article attempts to answer that question by analyzing how Go and Erlang optimize IO, especially network IO, and how this affects the performance of the scheduler and of the whole system.

Go's IO optimization mechanism: netpoller

Because Go is mainly aimed at distributed, Internet-facing programs, concurrent network IO matters more to it than general IO such as file reads and writes. General IO is handled as described in the previous article: the OS thread that is blocked in the syscall is detached from the scheduler so that other goroutines can keep running. In typical scenarios there are not many goroutines reading and writing files at the same time, so this approach does not really degrade the scheduler. The main IO optimizations are therefore in the net library.
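The handling of a blocking syscall can be observed from user code. The following is a minimal sketch, not taken from the Go sources: it assumes a Unix-like system, uses a pipe in place of a file, and the name blockingReadDemo is made up for this illustration. With only one scheduler context (GOMAXPROCS(1)), one goroutine parks its OS thread in a blocking read, yet the other goroutine keeps running and eventually supplies the data.

```go
package main

import (
	"fmt"
	"runtime"
	"syscall"
	"time"
)

// result carries the outcome of the blocking read back to the caller.
type result struct {
	data string
	err  error
}

// blockingReadDemo (hypothetical helper) blocks one goroutine in read(2)
// and shows that another goroutine still makes progress with a single P.
func blockingReadDemo(msg string) (string, error) {
	prev := runtime.GOMAXPROCS(1) // a single scheduler context
	defer runtime.GOMAXPROCS(prev)

	var p [2]int
	if err := syscall.Pipe(p[:]); err != nil {
		return "", err
	}
	r, w := p[0], p[1]
	defer syscall.Close(r)
	defer syscall.Close(w)

	started := make(chan struct{})
	done := make(chan result, 1)
	go func() {
		close(started)
		buf := make([]byte, 64)
		for {
			// The descriptor is in blocking mode, so this call blocks
			// the OS thread that runs this goroutine.
			n, err := syscall.Read(r, buf)
			if err == syscall.EINTR {
				continue // interrupted by a signal; retry
			}
			if err != nil {
				done <- result{err: err}
				return
			}
			done <- result{data: string(buf[:n])}
			return
		}
	}()

	<-started
	// Give the reader time to enter the blocking read.
	time.Sleep(10 * time.Millisecond)

	// This goroutine is still scheduled although the reader's thread is blocked.
	if _, err := syscall.Write(w, []byte(msg)); err != nil {
		return "", err
	}
	res := <-done
	return res.data, res.err
}

func main() {
	fmt.Println(blockingReadDemo("ping"))
}
```

The price of this approach is one blocked OS thread per outstanding operation, which is acceptable for occasional file IO but not for thousands of connections.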

Erlang likewise handles network IO through an efficient path whose implementation differs from that of general IO; it is described later.

The Go implementation relies on the non-blocking IO mode provided by the OS, combined with an IO event notification mechanism such as epoll or kqueue. To bridge the gap between the asynchronous mechanism of the OS and the interface Go exposes, Go adds some encapsulation in its library and provides a mechanism in the runtime layer called the netpoller ("network poller") to optimize network IO. Specifically:

  • First, whenever a connection is opened or accepted in Go, its file descriptor is set to non-blocking mode. (Go library)
  • When an operation such as Read or Write is then invoked, it returns immediately without blocking, whether or not it succeeds. A return value of EAGAIN indicates that the IO event has not yet arrived and the caller must wait. At this point the Go library function calls AddFd() of PollServer, which adds the file descriptor to the netpoller's monitoring pool and blocks the current goroutine. (Go library, netpoll.goc)
  • When there is an idle P and M in the system, the runtime first looks in the local ready queue and, if it is empty, calls the netpoller. The netpoller uses the epoll or kqueue mechanism provided by the OS to check for arrived IO events and hands the corresponding goroutines back to the runtime to be executed again. (runtime/proc.c: findrunnable())
  • Finally, when the goroutine returns to the Go library context, it calls Read, Write, or the other IO operation again, and this time the call returns successfully. (Go library)
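The four steps can be reproduced by hand with raw system calls. The sketch below is an illustration only, not the runtime's code: it is Linux-specific (it uses epoll through the standard syscall package), a pipe stands in for a network connection, and netpollDemo is a name invented here. In the real runtime the waiting happens inside the scheduler rather than in the calling function.

```go
package main

import (
	"fmt"
	"syscall"
)

// netpollDemo (hypothetical helper) walks one read through the steps above.
// It reports whether the first read hit EAGAIN and returns the data read
// after the poller signalled readiness.
func netpollDemo(msg string) (bool, string, error) {
	var p [2]int
	if err := syscall.Pipe(p[:]); err != nil {
		return false, "", err
	}
	r, w := p[0], p[1]
	defer syscall.Close(r)
	defer syscall.Close(w)

	// Step 1: switch the descriptor to non-blocking mode.
	if err := syscall.SetNonblock(r, true); err != nil {
		return false, "", err
	}

	// Step 2: the read returns at once; EAGAIN means "no data yet".
	buf := make([]byte, 64)
	_, err := syscall.Read(r, buf)
	if err != syscall.EAGAIN {
		return false, "", fmt.Errorf("expected EAGAIN, got %v", err)
	}

	// Step 2, continued: add the descriptor to the poller's monitoring pool.
	epfd, err := syscall.EpollCreate1(0)
	if err != nil {
		return true, "", err
	}
	defer syscall.Close(epfd)
	ev := syscall.EpollEvent{Events: syscall.EPOLLIN, Fd: int32(r)}
	if err := syscall.EpollCtl(epfd, syscall.EPOLL_CTL_ADD, r, &ev); err != nil {
		return true, "", err
	}

	// The peer sends data, so the IO event arrives.
	if _, err := syscall.Write(w, []byte(msg)); err != nil {
		return true, "", err
	}

	// Step 3: ask the poller which descriptors are ready (1 s timeout).
	events := make([]syscall.EpollEvent, 1)
	for {
		ready, err := syscall.EpollWait(epfd, events, 1000)
		if err == syscall.EINTR {
			continue // interrupted by a signal; poll again
		}
		if err != nil {
			return true, "", err
		}
		if ready == 0 {
			return true, "", fmt.Errorf("timed out waiting for the IO event")
		}
		break
	}

	// Step 4: the retried read now succeeds.
	n, err := syscall.Read(r, buf)
	if err != nil {
		return true, "", err
	}
	return true, string(buf[:n]), nil
}

func main() {
	fmt.Println(netpollDemo("ping"))
}
```

Only one thread is needed to wait on any number of descriptors this way, which is what keeps the goroutine-per-connection model cheap.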

Erlang's IO Optimization Mechanism I: "Async Threads Pool"

In Erlang, all IO operations must be provided as port drivers. A port driver contains a set of C callback functions that respond to accesses from user processes, and a user process interacts with the port through the ordinary message-passing mechanism. The Erlang virtual machine schedules a port as a special kind of task.

The real system calls, such as read, write, and flush, are wrapped in the port's callback functions. Such a call can cause the scheduler thread that is executing the port to be blocked by the OS, which reduces the parallelism of the system.

Erlang solves this problem by providing a set of OS threads as an asynchronous thread pool. A port registers its blocking IO operations, in the form of function pointers, in the operation queue of the asynchronous thread pool. Each asynchronous thread runs a loop that takes an IO task from the queue and performs the blocking operation.
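The structure of such a pool can be sketched in a few lines. The following is a Go model of the idea, not Erlang's implementation: the names asyncPool, submit, shutdown, and runJobs are invented for this illustration, a channel plays the role of the operation queue, and time.Sleep stands in for a blocking read or write.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// asyncJob plays the role of the function pointer a port registers.
type asyncJob func()

// asyncPool is a hypothetical model of an asynchronous thread pool.
type asyncPool struct {
	queue chan asyncJob // the operation queue
	wg    sync.WaitGroup
}

func newAsyncPool(threads, queueLen int) *asyncPool {
	p := &asyncPool{queue: make(chan asyncJob, queueLen)}
	for i := 0; i < threads; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			// Pin the worker to its own OS thread, like a real async thread.
			runtime.LockOSThread()
			// Loop: take the next IO task and perform the blocking operation.
			for job := range p.queue {
				job()
			}
		}()
	}
	return p
}

// submit registers a blocking operation with the pool.
func (p *asyncPool) submit(job asyncJob) { p.queue <- job }

// shutdown stops accepting work and waits for queued jobs to finish.
func (p *asyncPool) shutdown() {
	close(p.queue)
	p.wg.Wait()
}

// runJobs submits n blocking jobs and sums the values they report back.
func runJobs(n int) int {
	pool := newAsyncPool(4, n)
	results := make(chan int, n)
	for i := 1; i <= n; i++ {
		i := i
		pool.submit(func() {
			time.Sleep(time.Millisecond) // stand-in for a blocking syscall
			results <- i                 // report completion to the caller
		})
	}
	pool.shutdown()
	close(results)
	sum := 0
	for v := range results {
		sum += v
	}
	return sum
}

func main() {
	fmt.Println(runJobs(10))
}
```

The submitting side never blocks on the IO itself; it only enqueues work and later receives the completion, which is the property that keeps the scheduler threads free.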

This approach is similar to how Go handles non-network IO and blocking syscalls: a separate OS thread is used to perform the blocking operation.

File IO in Erlang is basically implemented this way.

Because Erlang maps each scheduler to an OS thread, its scheduling is often described as 1:1, but that is not really accurate. Given the asynchronous handling of blocking IO and the load-balancing mechanism described in the previous article, Erlang in effect implements M:N scheduling. The official Erlang documentation does not put it that way; it only says that simply increasing the number of schedulers does not affect performance.

Erlang's IO Optimization Mechanism II: "System level activities"

As mentioned earlier, both Erlang and Go are languages designed for server-side programming, so each offers a dedicated mechanism, separate from general IO, for handling network IO.

Erlang's approach is to provide a special scheduling unit, called system level activities, to dispatch asynchronous IO events. The idea is very similar to Go's netpoller:

  • First, the descriptor of the network connection is set to non-blocking mode;
  • If an IO operation is called before the corresponding event has arrived, it registers the event it is waiting for in the IO event chain of the Erlang virtual machine;
  • The check_io action checks whether registered IO events have arrived (using the OS poll operation) and wakes up the user task (process or port) that is blocked waiting for the event.

Erlang also uses "stealing" when handling IO events. Specifically, when a driver function performs an IO operation and the corresponding IO event has not arrived, it actively calls select_steal() to steal other registered IO events; if such an event has already been triggered, it completes the corresponding read/write operation and notifies the upper layer for subsequent processing.

Asynchronous IO mechanism in Libtask

As a predecessor of the Go language, the Libtask library also implements an asynchronous IO mechanism, and its implementation is more concise.

As in Go, Libtask wraps IO operations for user-level tasks and implements asynchronous IO through interfaces such as fdread, fdwrite, fdwait, and fdnoblock. (In the examples shipped with Libtask all IO operations are network IO, so only network IO is analyzed here.)

  • The connection descriptor is first set to non-blocking mode by calling fdnoblock();
  • When fdread or fdwrite is then called and returns EAGAIN, it calls fdwait, which registers the IO event and yields the current task;
  • After the first IO event registration, Libtask creates a system task, fdtask, which checks for arrived IO events by invoking the poll system call and puts the corresponding tasks back on the ready queue.
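The interplay between the ready queue, fdwait, and fdtask is easy to lose in prose, so here is a toy model of it. This is a simulation written in Go for illustration, not Libtask's C code: the types conn, task, and sched are invented, arriving data is simulated by setting a string field, and fdtaskPoll replaces the real poll system call with a scan over the waiting tasks.

```go
package main

import "fmt"

// conn is a hypothetical non-blocking descriptor; data holds bytes that have arrived.
type conn struct{ data string }

// task stands in for a user-level task that reads from one conn.
type task struct {
	name string
	c    *conn
}

// sched is a toy cooperative scheduler.
type sched struct {
	ready         []*task // tasks that can run
	waiting       []*task // tasks parked in fdwait
	fdtaskStarted bool
	log           []string
}

// fdread never blocks; ok == false models the EAGAIN case.
func fdread(c *conn) (string, bool) {
	if c.data == "" {
		return "", false
	}
	d := c.data
	c.data = ""
	return d, true
}

// fdwait registers the task for an IO event and yields it: the task is
// not put back on the ready queue. The first registration starts fdtask.
func (s *sched) fdwait(t *task) {
	if !s.fdtaskStarted {
		s.fdtaskStarted = true
		s.log = append(s.log, "fdtask started")
	}
	s.waiting = append(s.waiting, t)
}

// fdtaskPoll models one pass of fdtask: tasks whose descriptor has data
// are moved back to the ready queue.
func (s *sched) fdtaskPoll() {
	var still []*task
	for _, t := range s.waiting {
		if t.c.data != "" {
			s.ready = append(s.ready, t)
		} else {
			still = append(still, t)
		}
	}
	s.waiting = still
}

// runOne runs the next ready task for one read attempt.
func (s *sched) runOne() {
	if len(s.ready) == 0 {
		return
	}
	t := s.ready[0]
	s.ready = s.ready[1:]
	if d, ok := fdread(t.c); ok {
		s.log = append(s.log, t.name+": read "+d)
		return
	}
	s.log = append(s.log, t.name+": EAGAIN, calling fdwait")
	s.fdwait(t)
}

// libtaskDemo replays the three steps for a single reader task.
func libtaskDemo() []string {
	c := &conn{}
	s := &sched{ready: []*task{{name: "reader", c: c}}}
	s.runOne()       // no data yet: the task registers and yields
	s.fdtaskPoll()   // nothing has arrived, the task stays parked
	c.data = "hello" // the IO event arrives
	s.fdtaskPoll()   // the task goes back on the ready queue
	s.runOne()       // the retried read succeeds
	return s.log
}

func main() {
	for _, line := range libtaskDemo() {
		fmt.Println(line)
	}
}
```

The model makes one point visible: a waiting task costs only an entry in a list, and a single polling task serves all of them.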

Summary and References

The analysis above shows how IO optimization affects the performance of the scheduler and even of the language itself. This is closely tied to the application domain that both languages target: server-side programming.

In general, an application has to go through syscalls to access the functionality of the OS, which brings the scheduling mechanism of the underlying OS into play. A user-space task scheduler must therefore control the uncertainty introduced by the kernel scheduler. In particular, special and heavily used syscalls such as IO operations need targeted optimizations to sustain high-concurrency performance.

Go and Erlang differ in implementation, but the core idea is similar: socket-based operations are optimized through asynchronous IO, while for general file reads and writes the executing thread and the running user task are simply blocked, and the scheduler then binds other runnable tasks to other OS threads so that execution continues.

In addition to the Erlang/OTP and Go source code, this article draws on the following material:

    • Morsing "The Go Netpoller" http://morsmachine.dk/netpoller
    • Ramblings "How Erlang does scheduling" http://jlouisramblings.blogspot.com/2013/01/how-erlang-does-scheduling.html