Spark 2.x in depth, part six: how the RDD Java API calls the Scala API

Source: Internet
Author: User

The RDD Java API actually calls the Scala API under the hood. To understand how the Java API invokes the Scala API, we first implement simplified versions of RDD and SparkContext, one in Scala and one for Java.


First, a simple Scala implementation of RDD and SparkContext

    class RDD[T](value: Seq[T]) {
      // The RDD map operation
      def map[U](f: T => U): RDD[U] = {
        new RDD(value.map(f))
      }

      // Expose the underlying elements as a Scala Iterator
      def iterator: Iterator[T] = value.iterator
    }

    class SparkContext {
      // Create an RDD
      def createRDD(): RDD[Integer] = new RDD[Integer](Seq(1, 2, 3))
    }


Second, a simple implementation of the Java versions of RDD and SparkContext

First, define an interface in Java. The map operation in Scala takes a function, and in Java that function corresponds to an interface:

    package com.twq.javaapi.java7.function;

    import java.io.Serializable;

    public interface Function<T1, R> extends Serializable {
        R call(T1 v1) throws Exception;
    }

The Java versions of RDD and SparkContext implemented here are actually written in Scala, but this Scala code can be called from Java code:

    import java.util.{Iterator => JIterator}
    import scala.collection.JavaConverters._
    import com.twq.javaapi.java7.function.{Function => JFunction}

    // Each JavaRDD wraps a Scala RDD and delegates to that RDD's API
    class JavaRDD[T](val rdd: RDD[T]) {
      def map[R](f: JFunction[T, R]): JavaRDD[R] =
        // This is the key step: call the map method of the Scala RDD,
        // adapting the Java interface into the Scala function that RDD.map requires
        new JavaRDD(rdd.map(x => f.call(x)))

      // Convert the Scala Iterator into a java.util.Iterator
      def iterator: JIterator[T] = rdd.iterator.asJava
    }

    // Each JavaSparkContext wraps a Scala SparkContext
    class JavaSparkContext(sc: SparkContext) {
      def this() = this(new SparkContext())

      // Delegate to the Scala SparkContext to implement the JavaSparkContext functionality
      def createRDD(): JavaRDD[Integer] = new JavaRDD[Integer](sc.createRDD())
    }

Third, write Java code that calls the RDD Java API

    package com.twq.javaapi.java7;

    import com.twq.javaapi.java7.function.Function;
    import com.twq.rdd.api.JavaRDD;
    import com.twq.rdd.api.JavaSparkContext;

    import java.util.Iterator;

    /**
     * Created by tangweiqun on 2017/9/16.
     */
    public class SelfImplJavaRDDTest {
        public static void main(String[] args) {
            // Initialize the JavaSparkContext
            JavaSparkContext jsc = new JavaSparkContext();
            // Call the JavaSparkContext API to create an RDD
            JavaRDD<Integer> firstRDD = jsc.createRDD();
            // Apply the JavaRDD map operation to the newly created firstRDD
            JavaRDD<String> strRDD = firstRDD.map(new Function<Integer, String>() {
                @Override
                public String call(Integer v1) throws Exception {
                    return v1 + "test";
                }
            });
            // Print the contents of the resulting RDD; the output is:
            // 1test
            // 2test
            // 3test
            Iterator<String> result = strRDD.iterator();
            while (result.hasNext()) {
                System.out.println(result.next());
            }
        }
    }
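Because the Function interface declares a single abstract method, Java 8 and later can pass a lambda where the test above uses an anonymous inner class. The following is a minimal sketch of that call-site style. To stay self-contained it uses a hypothetical stand-in class, MiniJavaRDD, written in plain Java in place of the Scala-backed JavaRDD; the class names LambdaMapSketch and MiniJavaRDD and the run() helper are assumptions made for illustration only.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class LambdaMapSketch {

    // Same shape as the Function interface defined in the article.
    interface Function<T1, R> extends Serializable {
        R call(T1 v1) throws Exception;
    }

    // Hypothetical stand-in for JavaRDD so that the example runs on its own.
    static class MiniJavaRDD<T> {
        private final List<T> values;

        MiniJavaRDD(List<T> values) {
            this.values = values;
        }

        <R> MiniJavaRDD<R> map(Function<T, R> f) {
            List<R> out = new ArrayList<>();
            for (T v : values) {
                try {
                    out.add(f.call(v));
                } catch (Exception e) {
                    // call() declares a checked exception; rethrow it unchecked.
                    throw new RuntimeException(e);
                }
            }
            return new MiniJavaRDD<>(out);
        }

        Iterator<T> iterator() {
            return values.iterator();
        }
    }

    // Maps 1, 2, 3 with a lambda and collects the results into a list.
    static List<String> run() {
        MiniJavaRDD<Integer> firstRDD = new MiniJavaRDD<>(Arrays.asList(1, 2, 3));
        // The lambda replaces the anonymous Function<Integer, String> class.
        MiniJavaRDD<String> strRDD = firstRDD.map(v1 -> v1 + "test");
        List<String> collected = new ArrayList<>();
        Iterator<String> result = strRDD.iterator();
        while (result.hasNext()) {
            collected.add(result.next());
        }
        return collected;
    }

    public static void main(String[] args) {
        for (String s : run()) {
            System.out.println(s);
        }
    }
}
```

The lambda compiles to the same interface type as the anonymous class, so the wrapper side needs no changes to accept it.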


The above shows how the RDD Java API calls the Scala API. Only the map operation is implemented here, but other operations, such as flatMap, are implemented in a similar way.
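To make the flatMap remark concrete, here is a rough sketch of the same wrap-and-delegate pattern applied to flatMap. It is written entirely in Java so that it runs on its own, which differs from the article's Scala-based wrapper. All names in it (FlatMapSketch, FlatMapFunction, CoreRDD, MiniJavaRDD, run) are hypothetical: CoreRDD plays the role of the Scala RDD and takes a java.util.function.Function as its native function type, while MiniJavaRDD plays the role of JavaRDD.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class FlatMapSketch {

    // Hypothetical Java-side interface: one input element yields many outputs.
    interface FlatMapFunction<T, R> extends Serializable {
        Iterator<R> call(T t) throws Exception;
    }

    // Hypothetical stand-in for the Scala RDD; its flatMap takes the
    // "native" function type (java.util.function.Function in this sketch).
    static class CoreRDD<T> {
        private final List<T> values;

        CoreRDD(List<T> values) {
            this.values = values;
        }

        <U> CoreRDD<U> flatMap(java.util.function.Function<T, Iterator<U>> f) {
            List<U> out = new ArrayList<>();
            for (T v : values) {
                Iterator<U> it = f.apply(v);
                while (it.hasNext()) {
                    out.add(it.next());
                }
            }
            return new CoreRDD<>(out);
        }

        Iterator<T> iterator() {
            return values.iterator();
        }
    }

    // Wrapper in the role of JavaRDD: adapts the Java interface and delegates.
    static class MiniJavaRDD<T> {
        private final CoreRDD<T> rdd;

        MiniJavaRDD(CoreRDD<T> rdd) {
            this.rdd = rdd;
        }

        <R> MiniJavaRDD<R> flatMap(FlatMapFunction<T, R> f) {
            // Adapt FlatMapFunction to the function type CoreRDD.flatMap expects.
            CoreRDD<R> mapped = rdd.<R>flatMap(x -> {
                try {
                    return f.call(x);
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            return new MiniJavaRDD<>(mapped);
        }

        Iterator<T> iterator() {
            return rdd.iterator();
        }
    }

    // Splits each line into words and collects them into one flat list.
    static List<String> run() {
        MiniJavaRDD<String> lines =
            new MiniJavaRDD<>(new CoreRDD<>(Arrays.asList("a b", "c")));
        MiniJavaRDD<String> words =
            lines.flatMap(line -> Arrays.asList(line.split(" ")).iterator());
        List<String> collected = new ArrayList<>();
        Iterator<String> it = words.iterator();
        while (it.hasNext()) {
            collected.add(it.next());
        }
        return collected;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [a, b, c]
    }
}
```

As with map, the wrapper method does only two things: it converts the Java-facing interface into the function type the wrapped class expects, and it wraps the result again.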


The next step is to study each of the RDD Java APIs in detail.


We can also work through the Spark core RDD API to get a detailed understanding of every API in Scala ...
