Off-heap Memory in Apache Flink and the Curious JIT compiler

Tags: apache flink

https://flink.apache.org/news/2015/09/16/off-heap-memory.html

Running data-intensive code in the JVM and making it well-behaved is tricky. Systems that put billions of data objects naively onto the JVM heap face unpredictable OutOfMemoryErrors and garbage collection stalls. Of course, you still want to keep your data in memory as much as possible, for speed and responsiveness of the processing applications. In this context, "off-heap" has become almost something like a magic word to solve these problems.

In this blog post, we'll look at how Flink exploits off-heap memory.
The feature is part of the upcoming release, but you can try it out with the latest nightly builds. We'll also give a few interesting insights into the behavior of Java's JIT compiler for highly optimized methods and loops.

Why actually bother with off-heap memory?

Given that Flink already has a sophisticated way of managing on-heap memory, why do we even bother with off-heap memory? It is true that "out of memory" has been much less of a problem for Flink because of its heap memory management techniques. Nonetheless, there are a few good reasons to offer the possibility to move Flink's managed memory out of the JVM heap:

    • Very large JVMs (100s of GBytes heap memory) tend to be tricky. It takes long to start them (allocate and initialize the heap) and garbage collection stalls can be huge (minutes). While newer incremental garbage collectors (like G1) mitigate the problem to some extent, an even better solution is to just make the heap much smaller and allocate Flink's managed memory chunks outside the heap.

    • I/O and network efficiency: In many cases, we write MemorySegments to disk (spilling) or to the network (data transfer). Off-heap memory can be written/transferred with zero copies, while heap memory always incurs an additional memory copy.

    • Off-heap memory can actually be owned by other processes. That way, cached data survives process crashes (due to user code exceptions) and can be used for recovery. Flink does not exploit that yet, but it is interesting future work.

Flink's traditional "on-heap" memory management mechanism already solves many of Java's "out of memory" and GC problems, so why bother with "off-heap" at all?

1. A very large JVM heap takes a long time to start, and the cost of GC can be very high.
2. Writing to disk or the network from the heap requires at least one extra copy, while off-heap memory allows zero-copy writes.
3. Off-heap memory can be shared between processes, so a JVM process crash does not lose the data.

The opposite question is also valid. Why would Flink ever not use off-heap memory?

    • On-heap is easier and interplays better with tools. Some container environments and monitoring tools get confused when the monitored heap size does not remotely reflect the amount of memory used by the process.

    • Short-lived memory segments are cheaper on the heap. Flink sometimes needs to allocate short-lived buffers, which is cheaper to do on the heap than off-heap.

    • Some operations are actually a bit faster on heap memory (or the JIT compiler understands them better).

Why doesn't Flink just use off-heap memory all the time?

As a general rule, the more powerful the mechanism, the more trouble it brings.

So in the general case, using on-heap memory is enough.

The Off-heap Memory implementation

Given that all memory-intensive internal algorithms are already implemented against the MemorySegment, our switch to off-heap memory is actually trivial.
You can compare it to replacing all ByteBuffer.allocate(numBytes) calls with ByteBuffer.allocateDirect(numBytes).
In Flink's case it meant that we made the MemorySegment abstract and added the HeapMemorySegment and OffHeapMemorySegment subclasses.
The OffHeapMemorySegment takes the off-heap memory pointer from a java.nio.DirectByteBuffer and implements its specialized access methods using sun.misc.Unsafe.
We also made a few adjustments to the startup scripts and the deployment code to make sure that the JVM is permitted enough off-heap memory (direct memory, -XX:MaxDirectMemorySize).

Within the memory management mechanism, there is not much difference between using off-heap and on-heap memory.

By analogy with NIO: ByteBuffer.allocate(numBytes) allocates heap memory, while ByteBuffer.allocateDirect(numBytes) allocates off-heap memory.
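A minimal, hypothetical snippet illustrating that NIO analogy (not Flink code):

    import java.nio.ByteBuffer;

    public class BufferAllocationDemo {
        public static void main(String[] args) {
            // Heap buffer: backed by a byte[] on the JVM heap, managed by the GC.
            ByteBuffer heap = ByteBuffer.allocate(1024);

            // Direct buffer: backed by memory outside the heap, so I/O can avoid
            // the extra copy into a temporary direct buffer.
            ByteBuffer offHeap = ByteBuffer.allocateDirect(1024);

            System.out.println("heap buffer backed by array: " + heap.hasArray());    // true
            System.out.println("direct (off-heap) buffer:    " + offHeap.isDirect()); // true
        }
    }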

In Flink's case, MemorySegment was made abstract and two subclasses were added: HeapMemorySegment and OffHeapMemorySegment.

The OffHeapMemorySegment obtains its off-heap memory in the form of a java.nio.DirectByteBuffer and operates on that memory through the sun.misc.Unsafe interface.
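As a rough, hypothetical sketch of that hierarchy (greatly simplified; the real Flink classes have many more accessors and safety checks, and the off-heap variant takes its pointer from a DirectByteBuffer rather than allocating the memory itself):

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    // Simplified sketch of the class hierarchy described above, not the real Flink code.
    abstract class MemorySegment {
        abstract byte get(int index);
        abstract void put(int index, byte value);
    }

    final class HeapMemorySegment extends MemorySegment {
        private final byte[] memory;
        HeapMemorySegment(int size) { this.memory = new byte[size]; }
        byte get(int index) { return memory[index]; }
        void put(int index, byte value) { memory[index] = value; }
    }

    final class OffHeapMemorySegment extends MemorySegment {
        private static final Unsafe UNSAFE = unsafe();
        private final long address;  // Flink takes this pointer from a DirectByteBuffer

        OffHeapMemorySegment(int size) { this.address = UNSAFE.allocateMemory(size); }
        byte get(int index) { return UNSAFE.getByte(address + index); }
        void put(int index, byte value) { UNSAFE.putByte(address + index, value); }
        void free() { UNSAFE.freeMemory(address); }

        private static Unsafe unsafe() {
            try {
                Field f = Unsafe.class.getDeclaredField("theUnsafe");
                f.setAccessible(true);
                return (Unsafe) f.get(null);
            } catch (ReflectiveOperationException e) {
                throw new RuntimeException(e);
            }
        }
    }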

Understanding the JIT and tuning the implementation

The MemorySegment was previously a standalone class: it was final (had no subclasses). Via Class Hierarchy Analysis (CHA), the JIT compiler is able to determine that all of the accessor method calls go to one specific implementation. That way, all method calls can be perfectly de-virtualized and inlined, which is essential to performance and the basis for all further optimizations (like vectorization of the calling loop).

With two different memory segment implementations loaded at the same time, the JIT compiler can no longer perform the same level of optimization, which results in a noticeable difference in performance: a slowdown of about 2.7x in the example from the original post.

This is a performance optimization issue.

The point made here is that if MemorySegment is a standalone class with no subclasses, the code is more efficient: at JIT-compile time every method it calls resolves to one known implementation and can be optimized ahead of time.
With two subclasses, the concrete type is only known when the code actually runs, so that optimization can no longer be done in advance.

In actual measurements, the performance gap is around 2.7x.
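A toy illustration of the CHA effect (hypothetical class names, not Flink's benchmark): as long as only one subclass of the abstract segment has ever been loaded, the JIT can devirtualize and inline the call in the hot loop; once a second implementation is used, the call site becomes polymorphic and the same loop typically gets slower.

    // Toy demo of CHA-based devirtualization; timings are rough and JVM-dependent.
    abstract class Seg {
        abstract long get(int i);
    }

    final class HeapSeg extends Seg {
        private final long[] data = new long[1024];
        long get(int i) { return data[i]; }
    }

    final class OffHeapSeg extends Seg {
        private final long[] data = new long[1024];  // stand-in for off-heap access
        long get(int i) { return data[i] + 1; }
    }

    public class ChaDemo {
        static long sum(Seg seg) {
            long s = 0;
            for (int r = 0; r < 10_000; r++) {
                for (int i = 0; i < 1024; i++) {
                    s += seg.get(i);  // devirtualized while only one Seg subclass is loaded
                }
            }
            return s;
        }

        public static void main(String[] args) {
            Seg heap = new HeapSeg();
            long t0 = System.nanoTime();
            long result = sum(heap);
            System.out.println("only HeapSeg loaded:   "
                    + (System.nanoTime() - t0) / 1_000_000 + " ms (" + result + ")");

            // Using a second subclass invalidates the CHA-based devirtualization.
            sum(new OffHeapSeg());

            t0 = System.nanoTime();
            result = sum(heap);
            System.out.println("two subclasses loaded: "
                    + (System.nanoTime() - t0) / 1_000_000 + " ms (" + result + ")");
        }
    }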

Solutions:

Approach 1: make sure that only one memory segment implementation is ever loaded.

We restructured the code a bit to make sure that all places that produce long-lived and short-lived memory segments instantiate the same MemorySegment subclass (heap or off-heap segment). Using factories rather than directly instantiating the memory segment classes, this is straightforward.

If the code only ever instantiates one of the subclasses and the other is never instantiated at all, the JIT notices this and can still optimize; using factories to create the segments makes this easy to enforce, as sketched below.
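Building on the sketched classes above (hypothetical code; Flink's real factory differs), the idea is that the heap/off-heap decision is made once, so only one subclass is ever instantiated in a given JVM:

    // Hypothetical factory for Approach 1: one configuration-time decision,
    // after which every allocation goes through the same MemorySegment subclass.
    final class SegmentFactory {
        private final boolean offHeap;

        SegmentFactory(boolean offHeap) { this.offHeap = offHeap; }

        MemorySegment allocate(int size) {
            return offHeap ? new OffHeapMemorySegment(size) : new HeapMemorySegment(size);
        }
    }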

Approach 2: write one segment that handles both heap and off-heap memory.

We created a class HybridMemorySegment which transparently handles both heap and off-heap memory. It can be initialized either with a byte array (heap memory), or with a pointer to a memory region outside the heap (off-heap memory).

The second approach is to use HybridMemorySegment, which handles both heap and off-heap memory in a single class, so no subclasses are needed.
There is also a tricky way to handle both kinds of memory transparently, sketched below.
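Roughly, the trick can be pictured like this (a simplified, hypothetical sketch, not the actual HybridMemorySegment): sun.misc.Unsafe's accessors take an object reference plus an offset, and they also accept a null reference with an absolute address, so a single code path can serve both kinds of memory.

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    // Sketch of the hybrid idea: Unsafe.getByte(Object, long) reads from a byte[]
    // when given (array, arrayBaseOffset + index) and from raw memory when given
    // (null, absoluteAddress + index), so one implementation covers both cases.
    final class HybridSegmentSketch {
        private static final Unsafe UNSAFE = unsafe();
        private static final long BYTE_ARRAY_BASE = UNSAFE.arrayBaseOffset(byte[].class);

        private final byte[] heapMemory;  // non-null for heap segments, null for off-heap
        private final long address;       // array base offset, or absolute off-heap address

        // Heap-backed segment.
        HybridSegmentSketch(byte[] memory) {
            this.heapMemory = memory;
            this.address = BYTE_ARRAY_BASE;
        }

        // Off-heap segment around a raw pointer (e.g. taken from a DirectByteBuffer).
        HybridSegmentSketch(long pointer) {
            this.heapMemory = null;
            this.address = pointer;
        }

        byte get(int index) {
            // Same call for both cases; heapMemory is simply null for off-heap segments.
            return UNSAFE.getByte(heapMemory, address + index);
        }

        void put(int index, byte value) {
            UNSAFE.putByte(heapMemory, address + index, value);
        }

        private static Unsafe unsafe() {
            try {
                Field f = Unsafe.class.getDeclaredField("theUnsafe");
                f.setAccessible(true);
                return (Unsafe) f.get(null);
            } catch (ReflectiveOperationException e) {
                throw new RuntimeException(e);
            }
        }
    }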

For the full details, see the original post: Off-heap Memory in Apache Flink and the Curious JIT Compiler (https://flink.apache.org/news/2015/09/16/off-heap-memory.html).
