Spark Scala Demo

Source: Internet
Author: User
Tags: save file
Create a SparkContext
import org.apache.spark.{SparkConf, SparkContext}
val conf = new SparkConf().setAppName("AppName")
val sc = new SparkContext(conf)
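
For readers who want to try the context creation outside a cluster, the following is a minimal sketch rather than a definitive setup. The object name SparkContextSketch, the helper createContext, and the "local[*]" master are assumptions made for illustration; on a real cluster the master is normally supplied by spark-submit instead of being set in code.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SparkContextSketch {
  // Builds a context. The default master "local[*]" is an assumption that
  // runs Spark inside this JVM using all available cores.
  def createContext(appName: String, master: String = "local[*]"): SparkContext = {
    val conf = new SparkConf().setAppName(appName).setMaster(master)
    new SparkContext(conf)
  }

  def main(args: Array[String]): Unit = {
    val sc = createContext("AppName")
    try {
      // A tiny job to confirm the context works: add up the numbers 1 to 10.
      val total = sc.parallelize(1 to 10).reduce(_ + _)
      println(s"total = $total")
    } finally {
      sc.stop() // release the resources held by the context
    }
  }
}
```

Stopping the context in a finally block matters because a JVM normally holds only one active SparkContext at a time.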

Read an HDFS file
sc.textFile(path)

The textFile parameter is a path, which can be:
1. A file path, in which case only the specified file is loaded.
2. A directory path, in which case all files directly under that directory are loaded (files in subdirectories are excluded).
3. A wildcard pattern that matches multiple files, or a form that loads all files under multiple directories.
4. A path with a scheme prefix: file:// reads from the local file system and hdfs:// reads from HDFS. Without a prefix, the path is resolved against the default file system of the Hadoop configuration, which on a cluster is normally HDFS.
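
The path forms listed above can be tried locally with a sketch like the one below. It is illustrative only: the object TextFilePathsSketch, its helpers, the temporary directory, and the file names a.txt and b.txt are all made up for the example, and the file:// prefix is built by string concatenation, which assumes a Unix-style absolute path. The comma-separated form is standard Hadoop input-path behavior rather than something the list above spells out.

```scala
import java.nio.file.{Files, Path}
import org.apache.spark.{SparkConf, SparkContext}

object TextFilePathsSketch {
  // Creates a fresh temp directory holding two small text files (3 lines in total).
  def makeSampleDir(): Path = {
    val dir = Files.createTempDirectory("textfile-demo")
    Files.write(dir.resolve("a.txt"), "a1\na2\n".getBytes("UTF-8"))
    Files.write(dir.resolve("b.txt"), "b1\n".getBytes("UTF-8"))
    dir
  }

  // Number of lines that sc.textFile yields for a given path expression.
  def countLines(sc: SparkContext, path: String): Long =
    sc.textFile(path).count()

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("TextFilePaths").setMaster("local[*]"))
    try {
      // file:// forces the local file system regardless of the default one.
      val base = "file://" + makeSampleDir().toAbsolutePath
      println(countLines(sc, s"$base/a.txt"))             // one file
      println(countLines(sc, base))                       // every file in the directory
      println(countLines(sc, s"$base/*.txt"))             // wildcard
      println(countLines(sc, s"$base/a.txt,$base/b.txt")) // comma-separated list
    } finally {
      sc.stop()
    }
  }
}
```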
Save a file
saveAsTextFile(path)

def saveAsTextFile(path: String): Unit
def saveAsTextFile(path: String, codec: Class[_ <: CompressionCodec]): Unit
saveAsTextFile stores the RDD in the file system in text file format.
The codec parameter specifies the compression codec class to use.
rdd.saveAsTextFile("hdfs:///tmp/test/", classOf[com.hadoop.compression.lzo.LzopCodec])
Add file:// in front of the path to write to the local file system, or hdfs:// to write to HDFS. Without a prefix, the file is written to the default file system, which on a cluster is normally HDFS.
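
Both overloads can be exercised locally with a sketch along these lines. It swaps in GzipCodec, which ships with Hadoop, because the LZO codec mentioned above needs a separately installed library. The object SaveAsTextFileSketch, the helper roundTrip, and the temporary output location are assumptions for illustration, and the file:// prefix assumes a Unix-style absolute path.

```scala
import java.nio.file.Files
import org.apache.hadoop.io.compress.GzipCodec
import org.apache.spark.{SparkConf, SparkContext}

object SaveAsTextFileSketch {
  // Saves the lines as text (optionally gzip-compressed), reads them back,
  // and returns them sorted so the result does not depend on partition order.
  def roundTrip(sc: SparkContext, lines: Seq[String], compress: Boolean): List[String] = {
    // saveAsTextFile fails if the target directory already exists,
    // so point it at a child path that has not been created yet.
    val out = "file://" + Files.createTempDirectory("save-demo").toAbsolutePath + "/out"
    val rdd = sc.parallelize(lines, numSlices = 2)
    if (compress) rdd.saveAsTextFile(out, classOf[GzipCodec])
    else rdd.saveAsTextFile(out)
    // textFile reads the whole output directory and decompresses .gz part files.
    sc.textFile(out).collect().sorted.toList
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("SaveAsTextFile").setMaster("local[*]"))
    try {
      println(roundTrip(sc, Seq("spark", "scala", "demo"), compress = true))
    } finally {
      sc.stop()
    }
  }
}
```

The output path names a directory, not a single file: Spark writes one part file per partition into it.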

Classification and function of Spark operators

Value-type transformation operators
1. Input partition and output partition one-to-one: map, flatMap, mapPartitions, glom
2. Input partition and output partition many-to-one: union, cartesian
3. Input partition and output partition many-to-many: groupBy
4. Output partition is a subset of the input partition: filter, distinct, subtract, sample, takeSample
5. Cache type: cache, persist

Key-value type transformation operators
1. Input partition and output partition one-to-one: mapValues
2. Aggregation of a single RDD: combineByKey, reduceByKey, partitionBy
3. Aggregation of two RDDs: cogroup
4. Join: join, leftOuterJoin, rightOuterJoin

Action operators
1. No output: foreach
2. Output to HDFS: saveAsTextFile, saveAsObjectFile
3. Output to Scala collections and data types: collect, collectAsMap, reduceByKeyLocally, lookup, count, top, reduce, fold, aggregate
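
To make the three operator families concrete, here is a small sketch that chains a few of them. It covers only a handful of the operators named above; the object OperatorTourSketch, its helper names, and the sample data are invented for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object OperatorTourSketch {
  // Value-type transformations (flatMap, filter, map), then a key-value
  // aggregation (reduceByKey), then an action (collectAsMap).
  def wordCounts(sc: SparkContext, lines: Seq[String]): Map[String, Int] =
    sc.parallelize(lines)
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)
      .map(word => (word, 1))
      .reduceByKey(_ + _)
      .collectAsMap()
      .toMap

  // glom turns each partition into one array, which makes the partitioning visible.
  def partitionsOf(sc: SparkContext, data: Seq[Int], parts: Int): List[List[Int]] =
    sc.parallelize(data, parts).glom().collect().map(_.toList).toList

  // leftOuterJoin keeps every key of the left RDD; a missing right value becomes None.
  def agesWithCity(
      sc: SparkContext,
      ages: Seq[(String, Int)],
      cities: Seq[(String, String)]): Map[String, (Int, Option[String])] =
    sc.parallelize(ages)
      .leftOuterJoin(sc.parallelize(cities))
      .collectAsMap()
      .toMap

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("OperatorTour").setMaster("local[*]"))
    try {
      println(wordCounts(sc, Seq("a b a", "b c")))
      println(partitionsOf(sc, Seq(1, 2, 3, 4), parts = 2))
      println(agesWithCity(sc, Seq(("ann", 30), ("bob", 25)), Seq(("ann", "Oslo"))))
    } finally {
      sc.stop()
    }
  }
}
```

Transformations such as flatMap and reduceByKey are lazy; nothing runs until an action such as collect or collectAsMap is called.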








