5 Spark Introduction: Key-Value Pairs and foldByKey

Source: Internet
Author: User

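The foldByKey function merges the values V of a JavaPairRDD<K, V> key by key, starting from an initial value. In the Java API the method is declared as:

    public JavaPairRDD<K, V> foldByKey(V zeroValue, Function2<V, V, V> func)

As you can see, the first parameter is zeroValue, the initial value that is merged with the original V values, and the second parameter is a Function2 that performs the merge.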

Take a pair RDD such as Array(("A", 0), ("A", 2), ("B", 1), ("B", 2), ("C", 1)).

When foldByKey(2, func) is called and func is x + y, the process is as follows: 2 is first added to the value of the first element with key "A", giving ("A", 2); that intermediate result is then combined with the remaining elements of key "A", giving ("A", 4). In the same way, the result for key "B" is ("B", 5) and the result for key "C" is ("C", 3).
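The walk-through above can be reproduced without a cluster. The following is a minimal sketch in plain Java that models the fold for a single partition; the class FoldByKeySketch and its helpers are hypothetical names used for illustration only, and this is not how Spark implements foldByKey internally.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

class FoldByKeySketch {
    // Hypothetical helper that builds one (key, value) pair.
    static Map.Entry<String, Integer> entry(String key, int value) {
        return new SimpleEntry<>(key, value);
    }

    // Models foldByKey on ONE partition (an assumption of this sketch):
    // the first value of each key is merged with zeroValue,
    // every later value is merged with the running result for that key.
    static Map<String, Integer> foldByKey(List<Map.Entry<String, Integer>> pairs,
                                          int zeroValue,
                                          BinaryOperator<Integer> func) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            Integer acc = result.getOrDefault(p.getKey(), zeroValue);
            result.put(p.getKey(), func.apply(acc, p.getValue()));
        }
        return result;
    }

    // The sample data from the text, folded with zeroValue 2 and x + y.
    static Map<String, Integer> example() {
        List<Map.Entry<String, Integer>> pairs = Arrays.asList(
                entry("A", 0), entry("A", 2),
                entry("B", 1), entry("B", 2),
                entry("C", 1));
        return foldByKey(pairs, 2, (x, y) -> x + y);
    }

    public static void main(String[] args) {
        System.out.println(example()); // {A=4, B=5, C=3}
    }
}
```

Key "A" goes 2 + 0 = 2 and then 2 + 2 = 4, which matches the ("A", 4) result described above.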

Look at the code:

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.sql.SparkSession;
import scala.Tuple2;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * @author wuweifeng wrote on 2018/4/18.
 */
public class Test {
    public static void main(String[] args) {
        SparkSession sparkSession = SparkSession.builder().appName("JavaWordCount").master("local").getOrCreate();
        // Wrap the session's SparkContext so an ordinary Java list can be turned into an RDD
        JavaSparkContext javaSparkContext = new JavaSparkContext(sparkSession.sparkContext());
        List<Tuple2<String, Integer>> data = new ArrayList<>();
        data.add(new Tuple2<>("A", 10));
        data.add(new Tuple2<>("A", 20));
        data.add(new Tuple2<>("B", 2));
        data.add(new Tuple2<>("B", 3));
        data.add(new Tuple2<>("C", 5));

        JavaPairRDD<String, Integer> originRDD = javaSparkContext.parallelizePairs(data);
        // The initial value is 2: it is passed to the function together with the first
        // element of each key, and that result is then combined with the next element
        Map<String, Integer> map = originRDD.foldByKey(2, new Function2<Integer, Integer, Integer>() {
            @Override
            public Integer call(Integer v1, Integer v2) throws Exception {
                return v1 * v2;
            }
        }).collectAsMap();
        // {A=400, C=10, B=12} (the order of the keys may vary)
        System.out.println(map);
    }
}
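Note that zeroValue is combined only with the first value of each key, not with every value. More precisely, Spark applies it once per key within each partition, so the results above hold when all values of a key sit in the same partition, as they do here with master("local").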
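One caveat is worth sketching: in Spark the zero value is applied once per key in each partition, and the per-partition results are then merged with the function alone. The plain-Java model below illustrates the effect with the data and the v1 * v2 function from the program above. The class PartitionedFoldSketch, its helpers, and the way the data is split into two partitions are assumptions made for illustration; it is a simplified model, not Spark code.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

class PartitionedFoldSketch {
    // Hypothetical helper that builds one (key, value) pair.
    static Map.Entry<String, Integer> entry(String key, int value) {
        return new SimpleEntry<>(key, value);
    }

    // Simplified model: fold every partition separately, starting each key
    // from zeroValue, then merge the partition results with func only.
    static Map<String, Integer> foldByKey(List<List<Map.Entry<String, Integer>>> partitions,
                                          int zeroValue,
                                          BinaryOperator<Integer> func) {
        Map<String, Integer> merged = new LinkedHashMap<>();
        for (List<Map.Entry<String, Integer>> partition : partitions) {
            Map<String, Integer> local = new LinkedHashMap<>();
            for (Map.Entry<String, Integer> p : partition) {
                Integer acc = local.getOrDefault(p.getKey(), zeroValue);
                local.put(p.getKey(), func.apply(acc, p.getValue()));
            }
            // zeroValue is not used again when partition results are merged
            local.forEach((key, value) -> merged.merge(key, value, func));
        }
        return merged;
    }

    // All five pairs in a single partition.
    static Map<String, Integer> onePartition() {
        List<Map.Entry<String, Integer>> all = Arrays.asList(
                entry("A", 10), entry("A", 20), entry("B", 2), entry("B", 3), entry("C", 5));
        return foldByKey(Collections.singletonList(all), 2, (x, y) -> x * y);
    }

    // The same pairs split across two partitions (an assumed split).
    static Map<String, Integer> twoPartitions() {
        List<Map.Entry<String, Integer>> p1 = Arrays.asList(
                entry("A", 10), entry("B", 2), entry("C", 5));
        List<Map.Entry<String, Integer>> p2 = Arrays.asList(
                entry("A", 20), entry("B", 3));
        return foldByKey(Arrays.asList(p1, p2), 2, (x, y) -> x * y);
    }

    public static void main(String[] args) {
        System.out.println(onePartition());  // {A=400, B=12, C=10}
        System.out.println(twoPartitions()); // {A=800, B=24, C=10}
    }
}
```

With two partitions the factor 2 enters twice for keys "A" and "B", which is why a neutral zero value (0 for addition, 1 for multiplication) is the safer choice when the result must not depend on partitioning.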

