Giraph source code analysis (9): Analysis of the Aggregator mechanism

Source: Internet
Author: User
This is an original article; please credit the source when reprinting! Welcome to join the Giraph technical exchange group: 228591158.

For Aggregator usage in Giraph, refer to the official documentation: http://giraph.apache.org/aggregators.html. This article focuses on how Giraph implements Aggregators.

Basic principle: in each superstep, each Worker computes a local aggregated value. When the superstep's computation is complete, the local aggregated values are sent to the Master node for summarizing. After MasterCompute.compute() is executed, the global aggregated values are sent back to all Workers.

Disadvantage: when an application (or algorithm) uses many Aggregators, the Master node must perform the aggregation for all of them. Because the Master has to receive, process, and send a large amount of data, it becomes a system bottleneck, both in computation and in network communication.

Improvement: sharded aggregators are used. Each aggregator is assigned to an owner Worker, which receives and aggregates the partial values sent by all other Workers. Each owner Worker then sends its final aggregated values to the Master, so the Master performs no aggregation at all and only receives the final value of each aggregator. After MasterCompute.compute() is executed, the Master does not send all aggregators directly to all Workers; it sends each aggregator back to its owner Worker, and each owner Worker then distributes it to all the other Workers.

First, the communication protocols between Master <--> Worker and Worker <--> Worker. In each class, received messages are parsed and stored in the doRequest (ServerData serverData) method.
1). org.apache.giraph.comm.requests.SendWorkerAggregatorsRequest class. Worker --> Worker Owner.
Function: each Worker sends its partially aggregated values for the current superstep to the owner of the Aggregator.
2). org.apache.giraph.comm.requests.SendAggregatorsToMasterRequest class. Worker Owner --> Master.
Function: each owner Worker sends the final aggregated values of its Aggregators to the Master.
3). org.apache.giraph.comm.requests.SendAggregatorsToOwnerRequest class. Master --> Worker Owner.
Function: the Master sends the final aggregated values (or the aggregators themselves) to the owner of each Aggregator.
4). org.apache.giraph.comm.requests.SendAggregatorsToWorkerRequest class. Worker Owner --> Worker.
Function: sends the final aggregated values to the other Workers. The sender is the owner of the Aggregator; the receivers are all Workers except the sender.
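To make the division of labor among these four request types concrete, here is a minimal, self-contained simulation of one superstep of the sharded protocol for sum aggregators. It is plain Java with no Giraph classes; the class and method names are illustrative, and messaging is reduced to map updates.

```java
import java.util.HashMap;
import java.util.Map;

public class ShardedAggregatorFlow {
    /**
     * One superstep of the sharded protocol for sum aggregators:
     * each worker sends its partial values to the owners
     * (SendWorkerAggregatorsRequest), the owners merge them and forward
     * the final totals to the master (SendAggregatorsToMasterRequest).
     */
    public static Map<String, Long> runSuperstep(String[] aggregators,
                                                 long[][] partial) {
        Map<String, Long> ownerTotals = new HashMap<>();
        for (long[] workerPartials : partial) {            // each worker...
            for (int a = 0; a < aggregators.length; a++) { // ...sends each value
                ownerTotals.merge(aggregators[a], workerPartials[a], Long::sum);
            }
        }
        // Owners forward their final values to the master unchanged,
        // so the master never aggregates anything itself
        return new HashMap<>(ownerTotals);
    }

    public static void main(String[] args) {
        String[] aggs = {"A1", "A2"};
        // Worker w contributes w+1 to A1 and 10*(w+1) to A2
        long[][] partial = {{1, 10}, {2, 20}, {3, 30}, {4, 40}};
        Map<String, Long> masterView = runSuperstep(aggs, partial);
        for (String name : aggs) {
            System.out.println(name + " global value = " + masterView.get(name));
        }
    }
}
```

In real Giraph the two remaining request types carry the results back: the Master returns each final value to its owner (SendAggregatorsToOwnerRequest), which then broadcasts it to the other Workers (SendAggregatorsToWorkerRequest).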


1. The Master distributes the Aggregators to their owner Workers.

The core of the finishSuperstep (MasterClient masterClient) method is as follows:

/**
 * Finalize aggregators for current superstep and share them with workers
 */
public void finishSuperstep(MasterClient masterClient) {
  for (AggregatorWrapper<Writable> aggregator : aggregatorMap.values()) {
    if (aggregator.isChanged()) {
      // if master compute changed the value, use the one he chose
      aggregator.setPreviousAggregatedValue(
          aggregator.getCurrentAggregatedValue());
      // reset aggregator for the next superstep
      aggregator.resetCurrentAggregator();
    }
  }
  /*
   * Send the aggregators to their owner Workers. Sent content:
   * 1) name of the aggregator
   * 2) class of the aggregator
   * 3) value of the aggregator
   */
  try {
    for (Map.Entry<String, AggregatorWrapper<Writable>> entry :
        aggregatorMap.entrySet()) {
      masterClient.sendAggregator(entry.getKey(),
          entry.getValue().getAggregatorClass(),
          entry.getValue().getPreviousAggregatedValue());
    }
    masterClient.finishSendingAggregatedValues();
  } catch (IOException e) {
    throw new IllegalStateException("finishSuperstep: " +
        "IOException occurred while sending aggregators", e);
  }
}
Question 1: How is the owner Worker of an aggregator determined?

A: The owner Worker is determined from the aggregator's name. The calculation is as follows:

/**
 * Determine the owner Worker from the aggregator name and the list of workers.
 *
 * @param aggregatorName Name of the aggregator
 * @param workers List of workers
 * @return Worker which owns the aggregator
 */
public static WorkerInfo getOwner(String aggregatorName,
    List<WorkerInfo> workers) {
  // take the hashCode() of aggregatorName modulo the number of workers
  int index = Math.abs(aggregatorName.hashCode() % workers.size());
  // return the Worker to which the aggregator belongs
  return workers.get(index);
}
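The same rule can be sketched standalone, with a plain int worker index standing in for WorkerInfo (a simplification; the class and method names here are illustrative). Because abs() is applied after the modulo, the index always falls within [0, numWorkers), and every worker can compute the owner independently, so no coordination is needed to agree on ownership.

```java
import java.util.Arrays;
import java.util.List;

public class AggregatorOwnerDemo {
    /**
     * Same rule as getOwner() above, with an int index standing in for
     * WorkerInfo. Note abs() is applied after the modulo, so the result
     * magnitude is already below numWorkers.
     */
    public static int getOwnerIndex(String aggregatorName, int numWorkers) {
        return Math.abs(aggregatorName.hashCode() % numWorkers);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("pageRankSum", "minValue",
            "maxValue", "vertexCount");
        int numWorkers = 4;
        // Deterministic: the same name always maps to the same worker
        for (String n : names) {
            System.out.println(n + " -> Worker" + getOwnerIndex(n, numWorkers));
        }
    }
}
```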
 
Question 2: How does a Worker determine whether it has received all its own aggregators?

A: Along with the aggregators, the Master sends each Worker the total number of requests addressed to it. The SendAggregatorsToOwnerRequest class is used to encapsulate and parse these messages.

2. The Worker receives the Aggregators sent by the Master, then forwards the received aggregated values to all other Workers, so that every Worker obtains the global aggregated values of the previous superstep.

As mentioned above, each Worker has a ServerData object. The two Aggregator-related member variables of the ServerData class are as follows:

// Save the aggregators owned by this worker in the current superstep
private final OwnerAggregatorServerData ownerAggregatorData;
// Save the global aggregators of the previous superstep
private final AllAggregatorServerData allAggregatorData;

We can see that ownerAggregatorData stores the aggregators this Worker owns (sent to it by the Master in the current superstep), and allAggregatorData saves the global aggregated values of the previous superstep. Both are initialized in the doRequest (ServerData serverData) method of the SendAggregatorsToOwnerRequest class, as follows:

public void doRequest(ServerData serverData) {
  DataInput input = getDataInput();
  AllAggregatorServerData aggregatorData = serverData.getAllAggregatorData();
  try {
    // Number of aggregators received. The CountingOutputStream class keeps
    // a counter: each time an aggregator is added to the output stream the
    // count is increased by 1, and when flushing the count is written at
    // the front of the stream.
    int numAggregators = input.readInt();
    for (int i = 0; i < numAggregators; i++) {
      String aggregatorName = input.readUTF();
      String aggregatorClassName = input.readUTF();
      if (aggregatorName.equals(AggregatorUtils.SPECIAL_COUNT_AGGREGATOR)) {
        LongWritable count = new LongWritable(0);
        // total number of requests the Master sends to this Worker
        count.readFields(input);
        aggregatorData.receivedRequestCountFromMaster(count.get(),
            getSenderTaskId());
      } else {
        Class<Aggregator<Writable>> aggregatorClass =
            AggregatorUtils.getAggregatorClass(aggregatorClassName);
        aggregatorData.registerAggregatorClass(aggregatorName, aggregatorClass);
        Writable aggregatorValue =
            aggregatorData.createAggregatorInitialValue(aggregatorName);
        aggregatorValue.readFields(input);
        // store the received global aggregated value of the previous
        // superstep in allAggregatorData
        aggregatorData.setAggregatorValue(aggregatorName, aggregatorValue);
        // ownerAggregatorData only registers the aggregators owned by
        // this Worker
        serverData.getOwnerAggregatorData().registerAggregator(
            aggregatorName, aggregatorClass);
      }
    }
  } catch (IOException e) {
    throw new IllegalStateException("doRequest: " +
        "IOException occurred while processing request", e);
  }
  // one request processed: decrease the count by 1 and add the received
  // data to the list kept by AllAggregatorServerData
  aggregatorData.receivedRequestFromMaster(getData());
}

Before starting computation, each Worker calls the prepareSuperstep() method of the BspServiceWorker class to distribute the aggregated values and to receive those sent by other Workers. The call relationship is as follows:


The prepareSuperstep() method of the BspServiceWorker class is as follows:

@Override
public void prepareSuperstep() {
  if (getSuperstep() != INPUT_SUPERSTEP) {
    /*
     * aggregatorHandler is of type WorkerAggregatorHandler; see the class
     * hierarchy of MasterAggregatorHandler in the previous article.
     * workerAggregatorRequestProcessor is declared with the interface type
     * WorkerAggregatorRequestProcessor; it is used to send aggregated
     * values between workers.
     */
    aggregatorHandler.prepareSuperstep(workerAggregatorRequestProcessor);
  }
}

The prepareSuperstep (WorkerAggregatorRequestProcessor requestProcessor) method of the WorkerAggregatorHandler class is as follows:

public void prepareSuperstep(
    WorkerAggregatorRequestProcessor requestProcessor) {
  AllAggregatorServerData allAggregatorData =
      serviceWorker.getServerData().getAllAggregatorData();
  /*
   * Wait until all data sent by the Master to this Worker has been
   * received. The return value is all the data (aggregators) that the
   * Master sent to this Worker.
   */
  Iterable<byte[]> dataToDistribute =
      allAggregatorData.getDataFromMasterWhenReady(
          serviceWorker.getMasterInfo());
  // send the data received from the Master to all other Workers
  requestProcessor.distributeAggregators(dataToDistribute);
  // wait until the aggregators sent by the other owner Workers have been
  // received, then fill in the maps for the next superstep
  allAggregatorData.fillNextSuperstepMapsWhenReady(
      getOtherWorkerIdsSet(), previousAggregatedValueMap,
      currentAggregatorMap);
  // clear the list of data received from the Master, preparing
  // allAggregatorData to receive the aggregators of the next superstep
  allAggregatorData.reset();
}
How does the Worker determine that all requests sent by the Master have been received? This mainly illustrates how threads cooperate in a distributed environment. The AllAggregatorServerData class defines a variable masterBarrier of type TaskIdsPermitBarrier to decide whether all requests sent by the Master have arrived. The TaskIdsPermitBarrier class mainly uses methods such as wait() and notifyAll() for control. When the received aggregatorName equals AggregatorUtils.SPECIAL_COUNT_AGGREGATOR, requirePermits(long permits, int taskId) is called to record the arrived task id in arrivedTaskIds and to increase waitingOnPermits, the number of requests still to wait for.

/**
 * Require more permits. This will increase the number of times permits
 * were required. Doesn't wait for permits to become available.
 *
 * @param permits Number of permits to require
 * @param taskId Task id which required permits
 */
public synchronized void requirePermits(long permits, int taskId) {
  arrivedTaskIds.add(taskId);
  waitingOnPermits += permits;
  notifyAll();
}

After receiving a Request, the releaseOnePermit () method is called to reduce waitingOnPermits by 1.
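A minimal, self-contained barrier in the spirit of TaskIdsPermitBarrier can show how the pieces fit together (this is a simplified sketch, not Giraph's actual class: the real one also tracks which task ids announced permits and logs progress; pendingPermits() is added here for inspection only). requirePermits() raises the number of expected arrivals, releaseOnePermit() lowers it, and waitForRequiredPermits() blocks until the announcements have been made and the count returns to zero.

```java
import java.util.HashSet;
import java.util.Set;

public class PermitBarrier {
    private long waitingOnPermits = 0;
    private final Set<Integer> arrivedTaskIds = new HashSet<>();

    /** Announce that 'permits' more requests are expected; never blocks. */
    public synchronized void requirePermits(long permits, int taskId) {
        arrivedTaskIds.add(taskId);
        waitingOnPermits += permits;
        notifyAll();
    }

    /** One expected request has arrived. */
    public synchronized void releaseOnePermit() {
        waitingOnPermits--;
        notifyAll();
    }

    /**
     * Block until at least 'expectedTaskIds' senders have announced
     * themselves and every announced request has arrived. Announcements
     * and arrivals may interleave in any order, so waitingOnPermits can
     * be transiently negative.
     */
    public synchronized void waitForRequiredPermits(int expectedTaskIds)
            throws InterruptedException {
        while (arrivedTaskIds.size() < expectedTaskIds
            || waitingOnPermits != 0) {
            wait();
        }
    }

    /** Inspection helper for this sketch (not in the real class). */
    public synchronized long pendingPermits() {
        return waitingOnPermits;
    }

    public static void main(String[] args) throws InterruptedException {
        PermitBarrier barrier = new PermitBarrier();
        // the "master" announces 3 incoming requests from task id 0
        new Thread(() -> barrier.requirePermits(3, 0)).start();
        // the 3 "requests" arrive on another thread
        new Thread(() -> {
            for (int i = 0; i < 3; i++) {
                barrier.releaseOnePermit();
            }
        }).start();
        barrier.waitForRequiredPermits(1); // returns once all 3 have arrived
        System.out.println("all requests received");
    }
}
```

The guarded loop around wait() is essential: notifyAll() wakes all waiters, and each must re-check the condition before proceeding.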

3. In the Vertex.compute() method, each Worker aggregates its own values. After the computation is complete, it calls the finishSuperstep (WorkerAggregatorRequestProcessor requestProcessor) method of the WorkerAggregatorHandler class to send the local aggregated values to the owner Worker of each aggregator. The owner Worker collects the partial aggregated values sent by all other Workers and, after aggregating them, sends the result to the Master for use by MasterCompute.compute() in the next superstep. The finishSuperstep method is as follows:

/**
 * Send aggregators to their owners and in the end to the master
 *
 * @param requestProcessor Request processor for aggregators
 */
public void finishSuperstep(
    WorkerAggregatorRequestProcessor requestProcessor) {
  OwnerAggregatorServerData ownerAggregatorData =
      serviceWorker.getServerData().getOwnerAggregatorData();
  // First send partial aggregated values to their owners and determine
  // which aggregators belong to this worker
  for (Map.Entry<String, Aggregator<Writable>> entry :
      currentAggregatorMap.entrySet()) {
    boolean sent = requestProcessor.sendAggregatedValue(entry.getKey(),
        entry.getValue().getAggregatedValue());
    if (!sent) {
      // If it's my aggregator, add it directly
      ownerAggregatorData.aggregate(entry.getKey(),
          entry.getValue().getAggregatedValue());
    }
  }
  // Flush
  requestProcessor.flush();
  // Wait to receive partial aggregated values from all other workers
  Iterable<Map.Entry<String, Writable>> myAggregators =
      ownerAggregatorData.getMyAggregatorValuesWhenReady(
          getOtherWorkerIdsSet());
  // Send final aggregated values to master
  AggregatedValueOutputStream aggregatorOutput =
      new AggregatedValueOutputStream();
  for (Map.Entry<String, Writable> entry : myAggregators) {
    int currentSize = aggregatorOutput.addAggregator(entry.getKey(),
        entry.getValue());
    if (currentSize > maxBytesPerAggregatorRequest) {
      requestProcessor.sendAggregatedValuesToMaster(aggregatorOutput.flush());
    }
  }
  requestProcessor.sendAggregatedValuesToMaster(aggregatorOutput.flush());
  // Wait for master to receive aggregated values before proceeding
  serviceWorker.getWorkerClient().waitAllRequests();
  ownerAggregatorData.reset();
}

The call relationship is as follows:


4. After the global synchronization, the Master calls the prepareSuperstep (masterClient) method of the MasterAggregatorHandler class to collect the aggregated values. The method is as follows:

public void prepareSuperstep(MasterClient masterClient) {
  // collect the aggregated values of the previous superstep,
  // preparing for MasterCompute.compute()
  for (AggregatorWrapper<Writable> aggregator : aggregatorMap.values()) {
    // if it is a persistent aggregator, accumulate the previous value
    if (aggregator.isPersistent()) {
      aggregator.aggregateCurrent(aggregator.getPreviousAggregatedValue());
    }
    aggregator.setPreviousAggregatedValue(
        aggregator.getCurrentAggregatedValue());
    aggregator.resetCurrentAggregator();
    progressable.progress();
  }
}
Then the MasterCompute.compute() method is called (it may modify the aggregated values). If, based on the aggregated values, the haltComputation() method of the MasterCompute class is called there, the entire Job is to be ended: the Master notifies all Workers to finish. If haltComputation() is not called in this method, the Master returns to step 1 and continues the iteration.

Note: there are three conditions that end a Job's iteration:
1) The maximum number of supersteps is reached
2) There are no active vertices and no messages in transit
3) MasterCompute halts the computation
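These three conditions can be sketched as the master's iteration loop. This is illustrative only, not Giraph's actual control flow; all names and the encoding of "active vertices per step" as an array are assumptions.

```java
public class SuperstepLoop {
    /**
     * Illustrative only: run supersteps until one of the three stop
     * conditions from the text holds; returns the superstep at which
     * the job stopped.
     */
    public static long run(long maxSupersteps, long[] activeVerticesPerStep,
                           long haltAtStep) {
        long superstep = 0;
        while (true) {
            if (superstep >= maxSupersteps) {
                break; // 1) maximum number of supersteps reached
            }
            long active = superstep < activeVerticesPerStep.length
                ? activeVerticesPerStep[(int) superstep] : 0;
            if (active == 0) {
                break; // 2) no active vertices and no messages in transit
            }
            if (superstep == haltAtStep) {
                break; // 3) MasterCompute halted the computation
            }
            superstep++;
        }
        return superstep;
    }

    public static void main(String[] args) {
        // stops at superstep 2, where no vertices are active
        System.out.println(run(10, new long[]{5, 3, 0}, 99)); // prints 2
    }
}
```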

Conclusion: sharded aggregators solve the problem of the Master node becoming a system bottleneck when many Aggregators are used. The Master only needs simple data communication with the owner Workers of the Aggregators, which compute and distribute the global aggregated values, greatly reducing the Master's workload.

Appendix: the preceding execution process is illustrated below.

Lab conditions:

1) One Master and four Workers.

2) Two Aggregators, denoted A1 and A2.

1. The Master sends the Aggregators to their owner Workers; the Worker that receives an Aggregator becomes its Owner. The Master sends A1 to Worker1 and A2 to Worker3, so Worker1 is the Owner of A1 and Worker3 is the Owner of A2. This step is completed in the finishSuperstep (MasterClient masterClient) method of the MasterAggregatorHandler class, using the SendAggregatorsToOwnerRequest communication protocol. Note: each owner Worker may own multiple Aggregators.


The Master distributes the Aggregators

2. The Workers receive the Aggregators sent by the Master and forward them to the other Workers. Worker1 sends A1 to Worker2, Worker3, and Worker4, and Worker3 sends A2 to Worker1, Worker2, and Worker4. This step is completed in the prepareSuperstep (WorkerAggregatorRequestProcessor requestProcessor) method of the WorkerAggregatorHandler class, using the SendAggregatorsToWorkerRequest communication protocol. After this step, every Worker holds aggregators A1 and A2 (specifically, the global final aggregated values of the previous superstep).


3. Each Worker calls the Vertex.compute() method to start computation and collects the local aggregated values of each Aggregator. For aggregator A1, the local values on Worker1, Worker2, Worker3, and Worker4 are denoted A11, A12, A13, and A14; for aggregator A2, the local values are denoted A21, A22, A23, and A24. After the computation is complete, each Worker sends its local aggregated values to the Owner of each aggregator, and the Owner merges them as they arrive: A11, A12, A13, and A14 are sent to Worker1 and aggregated into A1'; A21, A22, A23, and A24 are sent to Worker3 and aggregated into A2'.
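For a sum aggregator, the owner-side merge is just a fold over the received partial values. A minimal, self-contained sketch (sum is chosen for illustration; any commutative and associative aggregation works the same way, and the class name is made up):

```java
public class OwnerMerge {
    /**
     * The owner worker folds the partial values with the aggregator's
     * operation; sum is used here for illustration.
     */
    public static long aggregate(long[] partials) {
        long acc = 0;                 // the sum aggregator's neutral element
        for (long p : partials) {
            acc += p;
        }
        return acc;
    }

    public static void main(String[] args) {
        long[] a1Partials = {1, 2, 3, 4};     // A11..A14 from Worker1..Worker4
        long[] a2Partials = {10, 20, 30, 40}; // A21..A24
        System.out.println("A1' = " + aggregate(a1Partials)); // A1' = 10
        System.out.println("A2' = " + aggregate(a2Partials)); // A2' = 100
    }
}
```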

The formulas are: A1' = A11 ⊕ A12 ⊕ A13 ⊕ A14 and A2' = A21 ⊕ A22 ⊕ A23 ⊕ A24, where ⊕ denotes the aggregator's aggregation operation.


This part uses the SendWorkerAggregatorsRequest communication protocol. Worker1 and Worker3 then send the summarized values A1' and A2' to the Master for use by MasterCompute.compute() in the next superstep, using the SendAggregatorsToMasterRequest communication protocol. This part is completed in the finishSuperstep (WorkerAggregatorRequestProcessor requestProcessor) method of the WorkerAggregatorHandler class. The process is shown in the figure.

4. The Master receives A1' sent by Worker1 and A2' sent by Worker3; this step is completed in the prepareSuperstep (masterClient) method of the MasterAggregatorHandler class. Then the MasterCompute.compute() method is called; it may modify the aggregated values, producing, say, A1'' and A2''. If, based on the aggregated values, the haltComputation() method of the MasterCompute class is called to terminate the computation, the entire Job ends and the Master notifies all Workers to finish. If haltComputation() is not called in this method, the Master returns to step 1 and continues the iteration, sending A1'' to Worker1 and A2'' to Worker3.



Complete!

