Liaoliang on Spark Performance Optimization, Season 9: Spark Tungsten Memory Usage Fully Explained

Source: Internet
Author: User

Content:

1. What exactly a page is;

2. The two concrete implementations of a page;

3. A detailed look at how pages are used in the source code.

========== What is a page in Tungsten? ==========

1. Spark actually has no class named Page. In essence, a page is a data structure (like a stack or a list). At the OS level, a page is a block of memory that stores data, and the OS manages many such pages. To read a piece of data, you first locate the page that holds it and then extract the data from that page according to specific rules (for example, the offset and length of the data).
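The two-step lookup described above (find the page, then read the data by offset and length) can be sketched in plain Java. This is only an illustration of the idea: the class `PageTableSketch`, its `byte[]` pages, and its method names are hypothetical and are not Spark classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration, not Spark code: a "page" is just a block of bytes,
// and a record is identified by (page number, offset, length).
final class PageTableSketch {
    private final byte[][] pages;

    PageTableSketch(int pageCount, int pageSize) {
        pages = new byte[pageCount][pageSize];
    }

    // Copy a record into the given page, starting at the given offset.
    void write(int pageNumber, int offset, byte[] record) {
        System.arraycopy(record, 0, pages[pageNumber], offset, record.length);
    }

    // Step 1: locate the page. Step 2: extract the bytes by offset and length.
    byte[] read(int pageNumber, int offset, int length) {
        byte[] page = pages[pageNumber];
        return Arrays.copyOfRange(page, offset, offset + length);
    }

    public static void main(String[] args) {
        PageTableSketch table = new PageTableSketch(4, 64);
        byte[] record = "spark".getBytes(StandardCharsets.UTF_8);
        table.write(2, 10, record);
        byte[] back = table.read(2, 10, record.length);
        System.out.println(new String(back, StandardCharsets.UTF_8));
    }
}
```

The caller only needs to remember three numbers per record, which is what makes a page-based layout cheap to index.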

2. What exactly is a page in Spark? When reading the source, start from TaskMemoryManager.java and follow it to MemoryBlock.java: you will find that a MemoryBlock is the page.

import javax.annotation.Nullable;

import org.apache.spark.unsafe.Platform;

public class MemoryBlock extends MemoryLocation {

  private final long length;

  /**
   * Optional page number; used when this MemoryBlock represents a page allocated by a
   * TaskMemoryManager. This field is public so that it can be modified by the TaskMemoryManager,
   * which lives in a different package.
   */
  public int pageNumber = -1;

  public MemoryBlock(@Nullable Object obj, long offset, long length) {
    super(obj, offset);
    this.length = length;
  }

  /**
   * Returns the size of the memory block.
   */
  public long size() {
    return length;
  }

  /**
   * Creates a memory block pointing to the memory used by the long array.
   */
  public static MemoryBlock fromLongArray(final long[] array) {
    return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, array.length * 8);
  }
}

MemoryBlock represents the page. The data inside may be on-heap or off-heap, which is why the first parameter of the constructor above may be null: for on-heap data it is the object that holds the data, and for off-heap data there is no object.
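The "object or no object" distinction can be seen without touching Spark or `Unsafe`. The sketch below is an analogy only, using the JDK's `ByteBuffer`: a heap buffer is backed by a `byte[]` object, while a direct buffer lives outside the Java heap. The class `HeapVsOffHeapSketch` and its helper names are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical analogy, not Spark code: the same read/write API over memory that
// either has a backing Java object (on-heap) or has none (off-heap).
final class HeapVsOffHeapSketch {

    // Allocate a "page" either on the Java heap or in native (direct) memory.
    static ByteBuffer allocatePage(int size, boolean offHeap) {
        return offHeap ? ByteBuffer.allocateDirect(size) : ByteBuffer.allocate(size);
    }

    // On-heap pages expose their backing object; off-heap pages have no object,
    // which plays the role of the null first constructor argument.
    static Object baseObject(ByteBuffer page) {
        return page.isDirect() ? null : page.array();
    }

    public static void main(String[] args) {
        for (boolean offHeap : new boolean[] {false, true}) {
            ByteBuffer page = allocatePage(64, offHeap);
            page.putLong(8, 42L); // absolute write at byte offset 8
            System.out.println((offHeap ? "off-heap" : "on-heap")
                + " base=" + (baseObject(page) == null ? "null" : "byte[]")
                + " value=" + page.getLong(8));
        }
    }
}
```

In both cases the data is reached through an offset; the only difference is whether that offset is relative to an object or stands on its own.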


In on-heap mode, memory is allocated by HeapMemoryAllocator.

import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;
import javax.annotation.concurrent.GuardedBy;

import org.apache.spark.unsafe.Platform;

/**
 * A simple {@link MemoryAllocator} that can allocate up to 16GB using a JVM long primitive array.
 */
public class HeapMemoryAllocator implements MemoryAllocator {

  @GuardedBy("this")
  private final Map<Long, LinkedList<WeakReference<MemoryBlock>>> bufferPoolsBySize =
    new HashMap<>();

  private static final int POOLING_THRESHOLD_BYTES = 1024 * 1024;

  /**
   * Returns true if allocations of the given size should go through the pooling mechanism and
   * false otherwise.
   */
  private boolean shouldPool(long size) {
    // Very small allocations are less likely to benefit from pooling.
    return size >= POOLING_THRESHOLD_BYTES;
  }

  @Override
  public MemoryBlock allocate(long size) throws OutOfMemoryError {
    if (shouldPool(size)) {
      synchronized (this) {
        // Try to reuse a previously freed block of exactly this size.
        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(size);
        if (pool != null) {
          while (!pool.isEmpty()) {
            final WeakReference<MemoryBlock> blockReference = pool.pop();
            final MemoryBlock memory = blockReference.get();
            if (memory != null) {
              assert (memory.size() == size);
              return memory;
            }
          }
          bufferPoolsBySize.remove(size);
        }
      }
    }
    // Round the byte size up to a whole number of 8-byte words.
    long[] array = new long[(int) ((size + 7) / 8)];
    return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, size);
  }

  // free(MemoryBlock) is omitted from this excerpt.
}
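Two ideas in the allocator above are easy to try in isolation: rounding a byte count up to whole 8-byte words with `(size + 7) / 8`, and reusing large buffers through a pool of weak references. The following is a simplified sketch under stated assumptions, not Spark's implementation: `PooledHeapAllocatorSketch` is a hypothetical class, it hands out raw `long[]` arrays instead of `MemoryBlock`, and its `free` method is a guess at a reasonable counterpart to `allocate`.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

// Hypothetical, simplified sketch of size rounding and weak-reference pooling.
final class PooledHeapAllocatorSketch {
    static final int POOLING_THRESHOLD_BYTES = 1024 * 1024;

    private final Map<Long, LinkedList<WeakReference<long[]>>> pools = new HashMap<>();

    // Number of 8-byte words needed to hold sizeInBytes, rounding up.
    static int wordsFor(long sizeInBytes) {
        return (int) ((sizeInBytes + 7) / 8);
    }

    // Only large requests are pooled; small ones are cheap to allocate afresh.
    static boolean shouldPool(long sizeInBytes) {
        return sizeInBytes >= POOLING_THRESHOLD_BYTES;
    }

    synchronized long[] allocate(long sizeInBytes) {
        if (shouldPool(sizeInBytes)) {
            LinkedList<WeakReference<long[]>> pool = pools.get(sizeInBytes);
            if (pool != null) {
                while (!pool.isEmpty()) {
                    long[] cached = pool.pop().get();
                    if (cached != null) {
                        return cached; // reuse a buffer the GC has not reclaimed
                    }
                }
                pools.remove(sizeInBytes);
            }
        }
        return new long[wordsFor(sizeInBytes)];
    }

    // Hand a buffer back; small buffers are simply left to the garbage collector.
    synchronized void free(long[] buffer, long sizeInBytes) {
        if (shouldPool(sizeInBytes)) {
            pools.computeIfAbsent(sizeInBytes, k -> new LinkedList<>())
                 .add(new WeakReference<>(buffer));
        }
    }

    public static void main(String[] args) {
        System.out.println("13 bytes need " + wordsFor(13) + " words");
        PooledHeapAllocatorSketch allocator = new PooledHeapAllocatorSketch();
        long size = 2L * 1024 * 1024;
        long[] first = allocator.allocate(size);
        allocator.free(first, size);
        System.out.println("reused: " + (allocator.allocate(size) == first));
    }
}
```

Because the pool holds only weak references, a pooled buffer never prevents the garbage collector from reclaiming memory under pressure; the trade-off is that a reused buffer is not zeroed.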

In off-heap mode, memory is allocated by UnsafeMemoryAllocator.

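import org.apache.spark.unsafe.Platform;

/**
 * A simple {@link MemoryAllocator} that uses {@code Unsafe} to allocate off-heap memory.
 */
public class UnsafeMemoryAllocator implements MemoryAllocator {

  @Override
  public MemoryBlock allocate(long size) throws OutOfMemoryError {
    // No base object off-heap: the block is identified by its raw address alone.
    long address = Platform.allocateMemory(size);
    return new MemoryBlock(null, address, size);
  }

  // free(MemoryBlock) is omitted from this excerpt.
}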

========== How to use a page ==========

1. TaskMemoryManager wraps pages in order to locate data. For on-heap data it first finds the object and then uses the offset within that object to reach the exact location; for off-heap data the location is addressed directly.

2. A key question is how to identify where a piece of data lives. This calls for a specific addressing scheme.

TaskMemoryManager is already written to cope with machines that have 32 TB of memory.
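One such scheme packs a page number and an offset within the page into a single 64-bit long. The sketch below assumes the split that TaskMemoryManager documents, with 13 bits for the page number (so 8192 pages) and the remaining 51 bits for the offset. The class `PageAddressSketch` and its method names are illustrative and are not Spark's API.

```java
// Hypothetical sketch of packing (page number, offset in page) into one long.
final class PageAddressSketch {
    static final int PAGE_NUMBER_BITS = 13;                  // 2^13 = 8192 pages
    static final int OFFSET_BITS = 64 - PAGE_NUMBER_BITS;    // 51 bits of offset
    static final long OFFSET_MASK = (1L << OFFSET_BITS) - 1; // lower 51 bits set

    // Upper 13 bits hold the page number, lower 51 bits hold the offset.
    static long encode(int pageNumber, long offsetInPage) {
        return (((long) pageNumber) << OFFSET_BITS) | (offsetInPage & OFFSET_MASK);
    }

    // Unsigned shift so that page numbers with the top bit set decode correctly.
    static int decodePageNumber(long address) {
        return (int) (address >>> OFFSET_BITS);
    }

    static long decodeOffset(long address) {
        return address & OFFSET_MASK;
    }

    public static void main(String[] args) {
        long address = encode(5, 1234L);
        System.out.println("page=" + decodePageNumber(address)
            + " offset=" + decodeOffset(address));
    }
}
```

Storing one long per record instead of an object reference plus an offset keeps pointers compact and works the same way for on-heap and off-heap pages: the page number selects the page, and the offset is then interpreted relative to that page.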

Teacher Liaoliang's contact card:

China's foremost Spark expert

Sina Weibo: Http://weibo.com/ilovepains

WeChat public account: Dt_spark

Blog: http://blog.sina.com.cn/ilovepains

Mobile: 18610086859

qq:1740415547



This article comes from the "a Flower proud Cold" blog; please do not reprint it.
