1> Manifest Context Bound
1. In Scala, arrays must be created with a concrete element type; instantiating an Array[T] for an unconstrained generic type T is a compile error. This is what the Manifest context bound addresses: the method requires a Manifest[T] object, for which the compiler supplies an implicit value.
2. When makePair(2, 3) is called, the compiler locates the implicit Manifest[Int] and actually calls makePair(2, 3)(intManifest); internally, new Array(2)(intManifest) is invoked, which returns an array of the primitive element type, int[2].
3. An implicit Manifest type is ...
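A minimal sketch of the pattern described above, assuming a makePair defined with a Manifest context bound (the exact original definition is not shown in the text, so this is illustrative):

def makePair[T: Manifest](first: T, second: T): Array[T] = {
  // The context bound [T: Manifest] asks the compiler for an implicit Manifest[T],
  // which makes it legal to instantiate an Array[T] for a generic T.
  val r = new Array[T](2)
  r(0) = first
  r(1) = second
  r
}

val pair = makePair(2, 3)   // the compiler supplies Manifest[Int]; the result is backed by a primitive int[2]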
/**
 * Return a new RDD by applying a function to all elements of this RDD.
 */
def map[U: ClassTag](f: T => U): RDD[U] = new MappedRDD(this, sc.clean(f))

/**
 * Return a new RDD by first applying a function to all elements of this
 * RDD, and then flattening the results.
 */
def flatMap[U: ClassTag](f: T => TraversableOnce[U]): RDD[U] =
  new FlatMappedRDD(this, sc.clean(f))

/**
 * Return a new RDD containing only the elements that satisfy a predicate.
 */
def ...
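A brief usage sketch of these three transformations (assuming a SparkContext named sc; the values in the comments are illustrative):

val nums = sc.parallelize(1 to 4)
val doubled  = nums.map(_ * 2)                     // 2, 4, 6, 8
val expanded = nums.flatMap(n => Seq(n, n * 10))   // 1, 10, 2, 20, 3, 30, 4, 40
val evens    = nums.filter(_ % 2 == 0)             // 2, 4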
Zip
def zip[U](other: RDD[U])(implicit arg0: ClassTag[U]): RDD[(T, U)]
The zip function combines two RDDs element-wise into an RDD of key/value pairs. By default it requires that the two RDDs have the same number of partitions and the same number of elements; otherwise an exception is thrown.
scala> var rdd1 = sc.makeRDD(1 to 10, 2)
rdd1: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at makeRDD at <console>:21

scala> var rdd1 = sc.makeRDD(1 to 5, 2)
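A short sketch of what zip produces, using a hypothetical pair of equal-length, equally partitioned RDDs:

val a = sc.makeRDD(1 to 3, 1)                 // RDD[Int]
val b = sc.makeRDD(Seq("a", "b", "c"), 1)     // RDD[String]
val pairs = a.zip(b)                          // RDD[(Int, String)]
pairs.collect()                               // Array((1,a), (2,b), (3,c))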
mapPartitions applies the function to each partition, that is, it treats the contents of each partition as a whole. Its definition is:

def mapPartitions[U: ClassTag](f: Iterator[T] => Iterator[U], preservesPartitioning: Boolean = false): RDD[U]

f is the input function, which processes the contents of one partition. The contents of each partition are passed to f as an Iterator[T], and f returns an Iterator[U]. The final RDD is the combination of the results from all partitions.
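A minimal mapPartitions sketch (assuming a SparkContext named sc): the function runs once per partition rather than once per element, here summing each partition:

val nums = sc.parallelize(1 to 9, 3)
val partitionSums = nums.mapPartitions(iter => Iterator(iter.sum))
partitionSums.collect()   // e.g. Array(6, 15, 24) with three evenly filled partitions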
# Path where the image data resides:
#   mnist-image/train/
#   mnist-image/test/
# cla: category name (0, 1, 2, ..., 9)
# return: all data for one category -- a [sample count * (image width x image height)] matrix
def read_and_convert(imgFileList):
    dataLabel = []                      # store class labels
    dataNum = len(imgFileList)
    dataMat = np.zeros((dataNum, 400))  # dataNum * 400 matrix
    for i in range(dataNum):
        imgNameStr = imgFileList[i]
        imgName = get_img_name_str(imgNameStr)  # gets "digit_instanceNumber.png"
        # print("imgName: ...
// Regular expressions used to break down selectors; see below
childRe  = /^\s*>/,
classTag = 'Zepto' + (+new Date())

function process(sel, fn) {
  // Decompose the selector into three parts: the selector itself,
  // the filter function name, and the filter's argument.
  // For example:
  //   (1) filterRe.exec(":eq(2)")  =>  [":eq(2)", "", "eq", "2"]
  //   (2) filterRe.e...
... where the edge is located. The first step, finding an edge's location by hashing its vertex, is similar to building an index for a query; this can be understood with the help of the official diagram.

Efficient data structures

There is dedicated data-structure support for storing, reading, and writing primitive types; a typical example is the map used in EdgePartition:

/**
 * A fast hash map implementation for primitive, non-null keys. This hash map supports
 * insertions and updates, but not deletions. This map is about an ...
 */
mapPartitionsWithContext can pass some state information about the processing to the user-specified input function. There is also mapPartitionsWithIndex, which passes the index of the partition to the user-specified input function.

mapValues

mapValues, as the name implies, applies the input function to the value of each key-value pair in the RDD; the keys of the original RDD remain unchanged and, together with the new values, form the elements of the new RDD. Therefore, this function applies only to RDDs whose elements are key-value pairs.
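A small mapValues sketch (assuming a SparkContext named sc):

val pairs = sc.parallelize(Seq(("a", 1), ("b", 2)))
val bumped = pairs.mapValues(_ + 10)   // keys unchanged: ("a", 11), ("b", 12)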
... .flatMap(line => line.split(" "))
flatMap converts the original MappedRDD into a FlatMappedRDD.
def flatMap[U: ClassTag](f: T => TraversableOnce[U]): RDD[U] = new FlatMappedRDD(this, sc.clean(f))

Step 3: val wordCount = splittedText.map(word => (word, 1))
Each word is used to generate a corresponding key-value pair, converting the FlatMappedRDD from the previous step into a MappedRDD. Step ...
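Putting the steps together, a minimal word-count sketch; the input path, the rawFile variable name, and the final reduceByKey step are illustrative additions not shown in the excerpt above:

val rawFile = sc.textFile("README.md")                        // hypothetical input file
val splittedText = rawFile.flatMap(line => line.split(" "))   // FlatMappedRDD of words
val wordCount = splittedText.map(word => (word, 1))           // MappedRDD of (word, 1) pairs
val counts = wordCount.reduceByKey(_ + _)                     // sum the counts per word
counts.take(5).foreach(println)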
    ... - yFields(0).toInt
    if (ret == 0) {
      ret = yFields(1).toInt - xFields(1).toInt
    }
    ret
  }
}, ClassTag.Object.asInstanceOf[ClassTag[String]])
*/

The second way to use sortBy is to convert the original data first: the function passed as sortBy's first parameter performs that conversion of the data.

val retRDD: RDD[String] = linesRDD.sortBy(line => {
  // f: (T) => K -- here T is String and K is the SecondarySort type
  val fields = line.split(" ")
  val ...
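A compact alternative sketch of the same idea; the sample data and the tuple key are illustrative, relying on the built-in Ordering for (Int, Int) instead of a custom SecondarySort class:

val linesRDD = sc.parallelize(Seq("3 7", "3 2", "1 9"))
// Sort by the first field, then by the second, both descending, by negating the keys.
val sorted = linesRDD.sortBy(line => {
  val fields = line.split(" ")
  (-fields(0).toInt, -fields(1).toInt)
})
sorted.collect().foreach(println)   // "3 7", "3 2", "1 9"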
... that is, the partition operation, is critical for Spark computing; it is precisely the different partition operations that make parallel processing possible.

PartitionStrategy

Different strategies bring different benefits. Hashing is used to divide the entire graph into multiple regions.
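A one-line sketch of choosing a strategy (assuming an existing GraphX Graph named graph):

import org.apache.spark.graphx.PartitionStrategy
// Repartition the edges with one of the built-in strategies.
val repartitioned = graph.partitionBy(PartitionStrategy.EdgePartition2D)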
outerJoinVertices: the outer join operation on vertices
Graph operations and computations: GraphOps
Common graph algorithms are abstracted centrally into the GraphOps class; a Graph is implicitly converted to GraphOps.
implicit def graphToGraphOps[VD: ClassTag, ED: ClassTag](g: Graph[VD, ED]): GraphOps[VD, ED] = g.ops
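A small sketch of the effect of this conversion: numVertices and degrees below are defined on GraphOps rather than on Graph, yet can be called directly on a graph (assuming a SparkContext named sc; the edge list is made up for illustration):

import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

val edges: RDD[Edge[Int]] = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1)))
val graph = Graph.fromEdges(edges, defaultValue = 0)
println(graph.numVertices)                  // GraphOps method, available via the implicit conversion
graph.degrees.collect().foreach(println)    // also a GraphOps method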
... according to the implicit Ordering[T], i.e. in descending order, exactly the opposite of takeOrdered.
def top(num: Int)(implicit ord: Ordering[T]): Array[T] = withScope {
  takeOrdered(num)(ord.reverse)
}
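A quick illustration of the difference (assuming a SparkContext named sc):

val nums = sc.parallelize(Seq(5, 1, 4, 2, 3))
nums.top(2)          // Array(5, 4)  -- largest elements, descending
nums.takeOrdered(2)  // Array(1, 2)  -- smallest elements, ascending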
8) The saveAsTextFile function saves the RDD as a text file
def saveAsTextFile(path: String): Unit = withScope {
  val nullWritableClassTag = implicitly[ClassTag[NullWritable]]
  val textClassTag = implicitly[ClassTag[Text]]
  val r = th...
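A minimal usage sketch; the output path is hypothetical, and one part file is written per partition:

sc.parallelize(Seq("a", "b", "c"), 2).saveAsTextFile("/tmp/rdd-output")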
... inherit from the TagSupport or BodyTagSupport classes, which can be found in the javax.servlet.jsp.tagext package.
When the JSP engine sees that a JSP page contains a custom tag, it calls the doStartTag method to handle the start of the tag and the doEndTag method to handle the end of the tag.
The following table describes the processing required for different types of tags:

Tag type | Methods of the tag handler class
T...
After listening to lesson four of Liaoliang's Spark 3000-disciple series, on Scala pattern matching and type parameters, the summary is as follows:

Pattern matching:

def data(array: Array[String]) {
  array match {
    case Array(a, b, c) => println(a + b + c)
    case Array("Spark", _*) => // matches an array whose first element is "Spark"
    case _ => ...
  }
}

The after-class assignment is: read the source code of Spark's RDD, HadoopRDD, SparkContext, Master, and Worker, and analyze all of the pattern matching and type parameters used there.