Spark's TorrentBroadcast: Implementation

Source: Internet
Author: User

Serialization and deserialization

As mentioned earlier, the key to TorrentBroadcast is its specially crafted serialization and deserialization. The 1.1 version of TorrentBroadcast implements its own readObject and writeObject methods, but the 1.4.1 version does not implement its own readObject method. So how does it serialize and deserialize itself?

Here obj is the object being broadcast.

    private val numBlocks: Int = writeBlocks(obj)
    protected def getValue() = { _value }
    @transient private lazy val _value: T = readBroadcastBlock()

A TorrentBroadcast object can be thought of as passing through three main stages: construction, serialization, and deserialization.

Constructors

When the TorrentBroadcast object is constructed, numBlocks is initialized, which executes writeBlocks. writeBlocks serializes obj, splits it into chunks, stores the chunks in the BlockManager, and so on.

The _value field, however, is lazy. It is not initialized when the TorrentBroadcast object is constructed, so readBroadcastBlock is not executed at that point.

Serialization

When an action is invoked on an RDD on the driver side, Task objects are generated. The objects a Task references are serialized along with it, and a Task object is then deserialized for each task.

TorrentBroadcast needs to ensure that the object being broadcast is not serialized with the task. Two points in the following code are worth noting:

    private class TorrentBroadcast[T: ClassTag](obj: T, id: Long)
      extends Broadcast[T](id) with Logging with Serializable {...}

    @transient private lazy val _value: T = readBroadcastBlock()

A constructor parameter in Scala does not necessarily become a field of the object. obj is a constructor parameter that is used only while constructing the object and is not used in any method, so it is not a field of TorrentBroadcast and is therefore not serialized.

_value does reference the broadcast data, but it is marked @transient and is therefore not serialized either.
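Both points can be checked with plain Java serialization. The following is a minimal sketch, assuming Scala 2 compiler behavior; Handle and ConstructorParamDemo are hypothetical names used only for illustration and are not part of Spark.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Hypothetical stand-in for TorrentBroadcast: `payload` is used only during
// construction, so the compiler does not keep it as a field.
class Handle(payload: Array[Byte]) extends Serializable {
  // Computed once in the constructor; this small Int is a real field.
  private val size: Int = payload.length

  // Marked @transient, so Java serialization skips the cached value.
  @transient private lazy val cached: Array[Byte] = new Array[Byte](size)

  def value: Array[Byte] = cached
}

object ConstructorParamDemo {
  /** Serializes `obj` with plain Java serialization and returns the byte count. */
  def serializedSize(obj: AnyRef): Int = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(obj)
    out.close()
    bytes.size()
  }

  def main(args: Array[String]): Unit = {
    val handle = new Handle(new Array[Byte](1024 * 1024))
    handle.value // force the lazy val before serializing
    // Neither the 1 MB constructor argument nor the cached value is written.
    println(s"serialized size: ${serializedSize(handle)} bytes")
  }
}
```

The serialized form should stay far below the 1 MB payload, because only the size field is written.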

Deserialization

The key to deserialization is that _value is not deserialized. Therefore, if no task on an executor calls the value method of the TorrentBroadcast, the broadcast data is never fetched on that executor.

The key to implementing this behavior lies in Scala's lazy val.

First, consider this: a lazy val may be accessed concurrently by multiple threads, and any of these accesses can trigger its initialization. The initialization must therefore be thread-safe: the lazy val is initialized only once, and the result is visible to all threads. The simplest way to achieve this is to synchronize every access on this, but that is inefficient. Scala's implementation of lazy val uses a much more efficient method, but even so, accessing a lazy val is slower than accessing an ordinary val.

Here is an example of the double-checked locking idiom that Scala generates:

    lazy val myLazyField = create();

will be compiled into:

    public volatile int bitmap$0;
    private Object myLazyField;

    public String myLazyField() {
        if ((bitmap$0 & 1) == 0) {
            synchronized (this) {
                if ((bitmap$0 & 1) == 0) {
                    myLazyField = ...
                    bitmap$0 = bitmap$0 | 1;
                }
            }
        }
        return myLazyField;
    }

That is, a volatile variable records whether the lazy val has been initialized, and the initialization itself is done with double-checked locking.
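The same idiom can be written by hand in Scala. This is only an illustrative sketch of double-checked locking, not the code the compiler actually emits; LazyCell and its create parameter are hypothetical names.

```scala
object DoubleCheckedDemo {
  /** Hand-written lazy cell; `create` is a hypothetical initializer supplied by the caller. */
  final class LazyCell[A <: AnyRef](create: () => A) {
    // null means "not initialized yet"; @volatile publishes the value to all threads.
    @volatile private var value: A = null.asInstanceOf[A]

    def get: A = {
      var result = value // first check, without taking the lock
      if (result == null) {
        this.synchronized {
          result = value // second check, under the lock
          if (result == null) {
            result = create()
            value = result
          }
        }
      }
      result
    }
  }

  def main(args: Array[String]): Unit = {
    val cell = new LazyCell[String](() => "initialized")
    println(cell.get)
  }
}
```

After the first successful initialization, callers take only the volatile read on the fast path and never enter the synchronized block.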

This raises two new questions:

1. Does the default serialization process trigger the initialization of a lazy val?

2. If the lazy val is accessed before the TorrentBroadcast object is serialized, so that initialization has already been triggered, the broadcast data is then held in a field of TorrentBroadcast. Will it be serialized as well?

The answer to question 1 is no, initialization is not triggered. The answer to question 2 is that _value must be marked @transient, which is exactly what TorrentBroadcast does.
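Both answers can be observed with a small experiment. This is a minimal sketch using plain Java serialization rather than Spark's serializer; Holder, LazyValDemo and initCount are hypothetical names introduced for illustration.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical holder whose lazy initializer records each time it runs.
class Holder extends Serializable {
  @transient lazy val data: String = {
    LazyValDemo.initCount.incrementAndGet()
    "broadcast data"
  }
}

object LazyValDemo {
  // Counts how often the lazy initializer has run in this JVM.
  val initCount = new AtomicInteger(0)

  /** Java-serializes `obj` and reads it back. */
  def roundTrip[A <: AnyRef](obj: A): A = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    try in.readObject().asInstanceOf[A] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val copy = roundTrip(new Holder)
    println(s"initializations after round trip: ${initCount.get}")
    copy.data // first access on the deserialized copy
    println(s"initializations after first access: ${initCount.get}")
  }
}
```

Serialization reads fields directly instead of calling the accessor, so the counter should only move when data is first accessed on the copy.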

Because every access to a lazy val carries this cost, if the object returned by Broadcast.value is used frequently in a function, for example inside a loop, it is better to take a reference to the object once outside the loop to reduce the overhead.

However, this thread-safety mechanism of lazy val is wasted on TorrentBroadcast. Because the broadcast variable is serialized with the task, each thread has its own Task object, so the broadcast object is not shared between threads. In fact, to ensure that different tasks running in the same JVM get the same broadcast data, the readBroadcastBlock method synchronizes on the TorrentBroadcast companion object.

Next, let's look at how the broadcast object is split into blocks.

Block storage of broadcast objects

This step is done when the TorrentBroadcast object is initialized. It is triggered by:

    val numBlocks: Int = writeBlocks(obj)

Let's take a look at the writeBlocks method.

writeBlocks

    private def writeBlocks(value: T): Int = {
      // Store a copy of the broadcast variable in the driver so that tasks run on the driver
      // do not create a duplicate copy of the broadcast variable's value.
      SparkEnv.get.blockManager.putSingle(broadcastId, value, StorageLevel.MEMORY_AND_DISK,
        tellMaster = false)
      val blocks =
        TorrentBroadcast.blockifyObject(value, blockSize, SparkEnv.get.serializer, compressionCodec)
      // The type of blocks is Array[ByteBuffer]
      blocks.zipWithIndex.foreach { case (block, i) =>
        SparkEnv.get.blockManager.putBytes(
          BroadcastBlockId(id, "piece" + i), // stored with a BroadcastBlockId as the BlockId
          block,
          StorageLevel.MEMORY_AND_DISK_SER,
          tellMaster = true)
      }
      blocks.length
    }

As the comments in the code say, writeBlocks first puts the broadcast object into the driver's BlockManager so that running a task on the driver does not create an extra copy of the broadcast object. Without this step, a task running on the driver would, just as on an executor, build a new broadcast object through the value method, leaving the driver with two copies of the object. However, tasks are actually run on the driver side only rarely, so it would be better to decide from the conf whether this step is necessary.

Next, the blockifyObject method of the companion object splits the object into blocks, producing an array of ByteBuffer. These blocks are then put into the BlockManager. Two points are worth noting here:

1. When a block is stored in the BlockManager, the ID used is BroadcastBlockId(id, "piece" + i). That is, given the ID of the broadcast object and the total number of blocks, the IDs used to store all of the blocks can be reconstructed. That is why TorrentBroadcast needs the numBlocks field. The id field is a val in the abstract class Broadcast, so the IDs of all blocks can be derived from the fields of the TorrentBroadcast object. The same IDs are used when restoring the broadcast object from these blocks.

2. When the pieces are stored in the BlockManager, tellMaster is true, which lets the master know which BlockManager holds each block. This is how the BlockManager on an executor can initially get a block from the driver's BlockManager. Conversely, tellMaster is false for the putSingle call in the first statement of writeBlocks, because the object stored by putSingle is not meant to be fetched by other BlockManagers.
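The first point can be sketched as follows. PieceId is a hypothetical, reduced stand-in for Spark's BroadcastBlockId, and the name format shown is an assumption made for illustration only.

```scala
object PieceIdDemo {
  // Hypothetical mirror of BroadcastBlockId, reduced to what matters here.
  final case class PieceId(broadcastId: Long, field: String) {
    // Assumed naming scheme, for illustration only.
    def name: String = s"broadcast_${broadcastId}_$field"
  }

  /** Rebuilds every piece ID from just the broadcast id and the block count. */
  def pieceIds(broadcastId: Long, numBlocks: Int): Seq[PieceId] =
    (0 until numBlocks).map(i => PieceId(broadcastId, "piece" + i))

  def main(args: Array[String]): Unit =
    pieceIds(broadcastId = 7L, numBlocks = 3).map(_.name).foreach(println)
}
```

Because the IDs are a pure function of the two fields, the reader side needs no extra metadata to know which blocks to ask for.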

blockifyObject

The job of blockifyObject is to serialize the broadcast object, compress it if compression is enabled, and then write the resulting stream of bytes into a series of byte arrays.

Its return type is Array[ByteBuffer]. The byte arrays are wrapped as ByteBuffer for the convenience of the BlockManager, whose putBytes method accepts a ByteBuffer as a parameter.

    def blockifyObject[T: ClassTag](
        obj: T,
        blockSize: Int,
        serializer: Serializer,
        compressionCodec: Option[CompressionCodec]): Array[ByteBuffer] = {
      val bos = new ByteArrayChunkOutputStream(blockSize)
      val out: OutputStream = compressionCodec.map(c => c.compressedOutputStream(bos)).getOrElse(bos)
      val ser = serializer.newInstance()
      val serOut = ser.serializeStream(out)
      serOut.writeObject[T](obj).close()
      bos.toArrays.map(ByteBuffer.wrap)
    }

The key to its implementation is ByteArrayChunkOutputStream, which extends Java's OutputStream. Its main body is as follows:

    private[spark] class ByteArrayChunkOutputStream(chunkSize: Int) extends OutputStream {
      private val chunks = new ArrayBuffer[Array[Byte]]
      private var lastChunkIndex = -1
      private var position = chunkSize

      override def write(b: Int): Unit = {
        allocateNewChunkIfNeeded()
        chunks(lastChunkIndex)(position) = b.toByte
        position += 1
      }

      override def write(bytes: Array[Byte], off: Int, len: Int): Unit = {...}

      def toArrays: Array[Array[Byte]] = {...}
    }

That is, it internally stores the bytes being written in a series of arrays, each of length chunkSize.
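The chunking idea can be tried out without Spark. The sketch below is a simplified assumption-laden version: it uses Java serialization in place of Spark's Serializer, omits compression, and the names ChunkedOutputStream, blockify and unblockify are hypothetical.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream, OutputStream}
import java.nio.ByteBuffer
import scala.collection.mutable.ArrayBuffer

object BlockifyDemo {
  /** Simplified chunked stream: bytes are collected into arrays of at most `chunkSize`. */
  class ChunkedOutputStream(chunkSize: Int) extends OutputStream {
    private val chunks = new ArrayBuffer[Array[Byte]]
    private var position = chunkSize // "full", so the first write allocates a chunk

    override def write(b: Int): Unit = {
      if (position == chunkSize) {
        chunks += new Array[Byte](chunkSize)
        position = 0
      }
      chunks.last(position) = b.toByte
      position += 1
    }

    /** Returns the chunks, trimming the unused tail of the last one. */
    def toArrays: Array[Array[Byte]] =
      if (chunks.isEmpty) Array.empty[Array[Byte]]
      else (chunks.init :+ chunks.last.take(position)).toArray
  }

  /** Serializes `obj` and splits the bytes into blocks of at most `blockSize`. */
  def blockify(obj: AnyRef, blockSize: Int): Array[ByteBuffer] = {
    val bos = new ChunkedOutputStream(blockSize)
    val out = new ObjectOutputStream(bos)
    out.writeObject(obj)
    out.close()
    bos.toArrays.map(ByteBuffer.wrap)
  }

  /** Concatenates the blocks in order and deserializes the original object. */
  def unblockify[A](blocks: Array[ByteBuffer]): A = {
    val joined = new ByteArrayOutputStream()
    blocks.foreach(b => joined.write(b.array()))
    val in = new ObjectInputStream(new ByteArrayInputStream(joined.toByteArray))
    try in.readObject().asInstanceOf[A] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val blocks = blockify("hello" * 100, blockSize = 64)
    println(s"${blocks.length} blocks, restored ok: ${unblockify[String](blocks) == "hello" * 100}")
  }
}
```

Every block except possibly the last is exactly blockSize bytes, which is what lets the pieces be stored and fetched independently.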

Assembling and restoring the broadcast object

On the executor side (or on the driver side, if a task is executed in the driver), the pieces that were split up need to be assembled and restored into the broadcast object. This is triggered by the first access to the lazy val _value.

    @transient private lazy val _value: T = readBroadcastBlock()

readBroadcastBlock first searches the local BlockManager for a previously stored copy of the broadcast object. So if another task on the same executor has already accessed _value, the object that was placed in the local BlockManager can be fetched directly.

If there is no local copy, readBlocks is called to get the blocks that make up the object, the object is restored with unBlockifyObject, and the result is put into the BlockManager so that other tasks on the same executor do not have to assemble and restore it again.

    private def readBroadcastBlock(): T = Utils.tryOrIOException {
      TorrentBroadcast.synchronized {
        setConf(SparkEnv.get.conf)
        // Read this broadcast object from the local BlockManager, by broadcastId
        SparkEnv.get.blockManager.getLocal(broadcastId).map(_.data.next()) match {
          case Some(x) => // found locally
            x.asInstanceOf[T]

          case None => // not found locally
            logInfo("Started reading broadcast variable " + id)
            val startTimeMs = System.currentTimeMillis()
            val blocks = readBlocks() // no local block for broadcastId, so fetch the pieces
            logInfo("Reading broadcast variable " + id + " took" + Utils.getUsedTimeMs(startTimeMs))

            val obj = TorrentBroadcast.unBlockifyObject[T](
              blocks, SparkEnv.get.serializer, compressionCodec)
            // Store the merged copy in BlockManager so other tasks on this executor don't
            // need to re-fetch it.
            SparkEnv.get.blockManager.putSingle( // after reading, put it into the BlockManager
              broadcastId, obj, StorageLevel.MEMORY_AND_DISK, tellMaster = false)
            obj
        }
      }
    }

One detail here: the assembled object is put into the BlockManager with putSingle at storage level MEMORY_AND_DISK. This means that when the MemoryStore cannot hold the broadcast object, two tasks on the same executor may get two different objects (the BlockManager code would need to be studied to confirm this). If that happens and the broadcast object is thread-safe, memory is wasted. If it does not happen, all tasks on an executor share one broadcast object, which can cause thread-safety problems. In any case, a broadcast object should be treated as read-only; modifying it may cause problems.

TorrentBroadcast obtains the blocks that make up the serialized object through readBlocks.
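The lookup-then-assemble flow of readBroadcastBlock can be sketched in isolation. In this simplified model, localStore is a hypothetical in-memory stand-in for the executor's BlockManager that never evicts or spills, and fetchAndAssemble is a hypothetical callback standing in for readBlocks plus unBlockifyObject.

```scala
import scala.collection.mutable

object ReadFlowDemo {
  // Hypothetical stand-in for the executor-local BlockManager.
  private val localStore = mutable.Map.empty[Long, AnyRef]

  // Counts how often the expensive fetch-and-assemble path ran.
  var assembleCount = 0

  /** Returns the cached value for `id`, assembling and caching it on first use. */
  def readBroadcast[A <: AnyRef](id: Long)(fetchAndAssemble: () => A): A =
    ReadFlowDemo.synchronized {
      localStore.get(id) match {
        case Some(cached) =>
          cached.asInstanceOf[A] // another task already assembled it
        case None =>
          val obj = fetchAndAssemble()
          assembleCount += 1
          localStore.put(id, obj) // later tasks on this "executor" reuse it
          obj
      }
    }

  def main(args: Array[String]): Unit = {
    val first = readBroadcast(1L)(() => List(1, 2, 3))
    val second = readBroadcast(1L)(() => List(1, 2, 3))
    println(s"same instance: ${first eq second}, assembled $assembleCount time(s)")
  }
}
```

With a store that never drops entries, every caller gets the same instance, which is why the shared object has to be treated as read-only.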

    /** Fetch torrent blocks from the driver and/or other executors. */
    private def readBlocks(): Array[ByteBuffer] = {
      // Fetched blocks are stored in the local BlockManager and reported to the driver,
      // so that other executors can fetch these blocks from this executor.
      val blocks = new Array[ByteBuffer](numBlocks)
      val bm = SparkEnv.get.blockManager

      // Shuffle so that executors do not all download blocks in the same order,
      // which would leave the driver as the bottleneck.
      for (pid <- Random.shuffle(Seq.range(0, numBlocks))) {
        val pieceId = BroadcastBlockId(id, "piece" + pid) // assemble the BroadcastBlockId
        logDebug(s"Reading piece $pieceId of $broadcastId")
        // Try locally first, because a previous attempt may already have fetched some blocks
        def getLocal: Option[ByteBuffer] = bm.getLocalBytes(pieceId)
        def getRemote: Option[ByteBuffer] = bm.getRemoteBytes(pieceId).map { block =>
          // A block fetched from a remote node is stored in the local BlockManager
          SparkEnv.get.blockManager.putBytes(
            pieceId,
            block,
            StorageLevel.MEMORY_AND_DISK_SER,
            tellMaster = true)
          block
        }
        val block: ByteBuffer = getLocal.orElse(getRemote).getOrElse(
          throw new SparkException(s"Failed to get $pieceId of $broadcastId"))
        blocks(pid) = block
      }
      blocks
    }

readBlocks is quite easy to understand. The one odd thing is that putBytes here uses the storage level MEMORY_AND_DISK_SER; it is not clear why a serialized level is needed for data that is already bytes.

Summary

TorrentBroadcast's implementation has some clever details, but the overall code is simple and easy to understand. There is so little code because BlockManager already provides most of the infrastructure.
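The fetch pattern in readBlocks (random order, local first, remote as fallback, cache what was fetched) can be sketched with plain maps. Here local and remote are hypothetical stand-ins for the local BlockManager and for the rest of the cluster; no real networking or Spark API is involved.

```scala
import scala.collection.mutable
import scala.util.Random

object ReadBlocksDemo {
  /**
   * Fetches `numBlocks` pieces in random order, preferring the local store and
   * caching every piece that had to come from the remote store.
   */
  def readBlocks(
      numBlocks: Int,
      local: mutable.Map[Int, Array[Byte]],
      remote: Map[Int, Array[Byte]]): Array[Array[Byte]] = {
    val blocks = new Array[Array[Byte]](numBlocks)
    for (pid <- Random.shuffle(Seq.range(0, numBlocks))) {
      def getLocal: Option[Array[Byte]] = local.get(pid)
      def getRemote: Option[Array[Byte]] = remote.get(pid).map { block =>
        local.put(pid, block) // keep a copy so later reads (and peers) can use it
        block
      }
      blocks(pid) = getLocal.orElse(getRemote).getOrElse(
        throw new IllegalStateException(s"Failed to get piece $pid"))
    }
    blocks
  }

  def main(args: Array[String]): Unit = {
    val remote = Map(0 -> Array[Byte](1), 1 -> Array[Byte](2), 2 -> Array[Byte](3))
    val local = mutable.Map(1 -> Array[Byte](9))
    val blocks = readBlocks(3, local, remote)
    println(blocks.map(_.head).mkString(", "))
  }
}
```

Although the pieces are fetched in shuffled order, each one is written to its own index, so the result is always in piece order.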
