1. Create an additive variable
public <T> Accumulator<T> accumulator(T initialValue, AccumulatorParam<T> param)
Create an accumulator variable of a given type, which tasks can "add" values to using the += method (add in Java). Only the driver program can access the accumulator's value.
Using this SparkContext method, you can create an additive variable (accumulator). The built-in overloads only cover T = int or double, so you cannot create an accumulator with T = long without supplying your own AccumulatorParam.
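For the built-in case, here is a minimal sketch using the int overload JavaSparkContext.accumulator(int); the class name IntAccumulatorDemo is made up for illustration. The rest of this post covers long, which needs a custom AccumulatorParam.

import java.util.Arrays;

import org.apache.spark.Accumulator;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.VoidFunction;

public class IntAccumulatorDemo {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("IntAccumulatorDemo").setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Built-in overload: accumulator(int) returns Accumulator<Integer>, no AccumulatorParam needed.
        final Accumulator<Integer> acc = sc.accumulator(0);

        sc.parallelize(Arrays.asList(1, 2, 3, 4)).foreach(new VoidFunction<Integer>() {
            @Override
            public void call(Integer x) throws Exception {
                acc.add(x);
            }
        });

        // Only the driver can read the value; this prints 10.
        System.out.println(acc.value());
        sc.stop();
    }
}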
2. AccumulatorParam Introduction
Concept:
initialValue: the initial value of the accumulator, i.e. the initialValue passed when SparkContext.accumulator is called.
zeroValue: the initial value used by the AccumulatorParam, i.e. the return value of its zero method.
Suppose the sample data set is simple = {1, 2, 3, 4}.
Execution order:
1. Call zero(initialValue), which returns zeroValue.
2. Call addAccumulator(zeroValue, 1), which returns v1.
   Call addAccumulator(v1, 2), which returns v2.
   Call addAccumulator(v2, 3), which returns v3.
   Call addAccumulator(v3, 4), which returns v4.
3. Call addInPlace(initialValue, v4).
So the end result is zeroValue + 1 + 2 + 3 + 4 + initialValue.
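To make the arithmetic concrete, here is a minimal driver-side sketch that replays this call sequence by hand (plain Java, no Spark). The class name AccumulatorTrace is made up, and it assumes initialValue = 10 and a zero method that returns 0, as in the implementation in the next section.

public class AccumulatorTrace {
    public static void main(String[] args) {
        long initialValue = 10L;          // passed to SparkContext.accumulator(initialValue, param)
        long zeroValue = 0L;              // zero(initialValue) returns 0
        long v = zeroValue;
        for (long x : new long[] {1L, 2L, 3L, 4L}) {
            v = v + x;                    // addAccumulator(v, x)
        }
        long result = initialValue + v;   // addInPlace(initialValue, v4)
        System.out.println(result);       // 0 + 1 + 2 + 3 + 4 + 10 = 20
    }
}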
3. Implement AccumulatorParam
import org.apache.spark.AccumulatorParam;

public class LongAccumulator implements AccumulatorParam<Long> {

    // Runs last, after all addAccumulator calls have finished: it adds the
    // accumulated value back onto init (the driver-side initial value).
    @Override
    public Long addInPlace(Long init, Long value) {
        System.out.println(init + ":" + value);
        return init + value;
    }

    /**
     * init is the init argument of SparkContext.accumulator(init).
     * The return value here is the starting value of the accumulation;
     * note that it does not have to equal init.
     *
     * If init = 10 and zero(init) = 0, the process is:
     *   v1 := 0 + step
     *   v1 := v1 + step
     *   ...
     *   finally v1 := v1 + init
     */
    @Override
    public Long zero(Long init) {
        System.out.println(init);
        return 0L;
    }

    @Override
    public Long addAccumulator(Long value, Long step) {
        System.out.println(value + "," + step);
        return value + step;
    }
}
Next, use it:
import java.util.Arrays;
import java.util.List;

import org.apache.spark.Accumulator;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.VoidFunction;

public class AccumulatorDemo {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("AccumulatorDemo").setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Create the accumulator with initial value 0L and the custom LongAccumulator param.
        // acc must be final so the anonymous VoidFunction below can reference it.
        final Accumulator<Long> acc = sc.accumulator(0L, new LongAccumulator());

        List<Long> seq = Arrays.asList(1L, 2L, 3L, 4L);
        JavaRDD<Long> rdd = sc.parallelize(seq);

        rdd.foreach(new VoidFunction<Long>() {
            @Override
            public void call(Long arg0) throws Exception {
                acc.add(arg0);
            }
        });

        System.out.println(acc.value());
    }
}
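Run locally, this should print 10: zero(0L) returns 0L, the addAccumulator calls on the executor side build up 1, 3, 6 and 10, and addInPlace merges that partial sum back into the initial value 0L. The exact number of zero/addInPlace calls depends on how many partitions the RDD has, but the final value is the same.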
Spark custom additive variable (Accumulator): AccumulatorParam