Using the flatMap function in Spark: Spark learning (basics)

Source: Internet
Author: User

In Spark, map and flatMap are two commonly used functions:
map: applies a function to each element in the collection.
flatMap: applies a function to each element in the collection and then flattens the results.
A simple example helps to explain flattening:

val arr = sc.parallelize(Array(("A", 1), ("B", 2), ("C", 3)))
// x._1 + x._2 is a String such as "A1"; flatMap flattens it into its characters
arr.flatMap(x => x._1 + x._2).foreach(println)

The output is:

A
1
B
2
C
3

If you use map instead:

val arr = sc.parallelize(Array(("A", 1), ("B", 2), ("C", 3)))
// map keeps each concatenated String as a single element
arr.map(x => x._1 + x._2).foreach(println)

The output is:

A1
B2
C3
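
The difference between the two outputs can be reproduced without a cluster. The sketch below is only an illustration: a plain Scala Seq stands in for the RDD (an assumption, not part of the original example), and the names FlattenSketch, mapped, flatMapped, and mapThenFlatten are hypothetical. It suggests that flatMap behaves like map followed by flatten, with each string unpacked into its characters.

```scala
object FlattenSketch {
  // Plain Scala collection standing in for the RDD (assumption for illustration).
  val pairs: Seq[(String, Int)] = Seq(("A", 1), ("B", 2), ("C", 3))

  // map keeps one result per input element: "A1", "B2", "C3".
  val mapped: Seq[String] = pairs.map(x => x._1 + x._2)

  // flatMap views each resulting String as a sequence of Chars and
  // concatenates them: 'A', '1', 'B', '2', 'C', '3'.
  val flatMapped: Seq[Char] = pairs.flatMap(x => x._1 + x._2)

  // Equivalent formulation: map first, then flatten the per-element sequences.
  val mapThenFlatten: Seq[Char] = mapped.map(_.toSeq).flatten
}
```

Comparing FlattenSketch.flatMapped with FlattenSketch.mapThenFlatten should show the same six characters in the same order.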

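So the "flat" in flatMap roughly means that map is applied once and each result is then unpacked into its individual elements. Here every mapped string, such as "A1", is broken into its characters.

Actual usage scenario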

This scenario is a problem I ran into while writing code: counting how many times each pair of adjacent characters occurs in a string. For example, in the string A;B;C;D;B;C, the adjacent pairs (A,B), (C,D), and (D,B) each occur once, and (B,C) occurs twice.
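
As a rough illustration of this counting rule, the sketch below extracts the adjacent pairs from the single string A;B;C;D;B;C using plain Scala collections. The names AdjacentPairs, pairsOf, and countPairs are hypothetical, and sliding(2) is swapped in here as one possible way to walk neighboring elements; it is not the approach used in the Spark code in this article.

```scala
object AdjacentPairs {
  // Split on ';' and pair each token with its right-hand neighbor.
  // A line with fewer than two tokens yields no pairs.
  def pairsOf(line: String): Seq[(String, String)] =
    line.split(";").toSeq.sliding(2).collect { case Seq(a, b) => (a, b) }.toSeq

  // Count how often each adjacent pair occurs in the line.
  def countPairs(line: String): Map[(String, String), Int] =
    pairsOf(line).groupBy(identity).map { case (pair, hits) => pair -> hits.size }
}
```

Calling AdjacentPairs.countPairs("A;B;C;D;B;C") should report (B,C) twice and the other three pairs once each.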
Suppose you have the following data:

A;B;C;D;B;D;C
B;D;A;E;D;C
A;B

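The adjacent character pairs can be counted as follows:

// data is assumed to be an RDD[String] with one line per record
data.map(_.split(";")).flatMap(x => {
      // emit ("X,Y", 1) for every pair of adjacent elements in the line
      for (i <- 0 until x.length - 1) yield (x(i) + "," + x(i + 1), 1)
    }).reduceByKey(_ + _).foreach(println)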

The output is:

(A,E,1)
(E,D,1)
(D,A,1)
(C,D,1)
(B,C,1)
(B,D,2)
(D,C,2)
(D,B,1)
(A,B,2)

This example makes full use of the flattening behavior of flatMap.
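
To see why flattening matters here, the sketch below contrasts map and flatMap on local data. It is an illustration under stated assumptions: a plain Scala Seq stands in for the RDD, the three sample lines are a reconstruction of the data shown earlier, groupBy plus sum is a local stand-in for reduceByKey, and the names WhyFlatMap, pairRecords, nested, flat, and counts are hypothetical.

```scala
object WhyFlatMap {
  // Sample lines standing in for the RDD `data` (assumption for illustration).
  val lines: Seq[String] = Seq("A;B;C;D;B;D;C", "B;D;A;E;D;C", "A;B")

  // Build ("X,Y", 1) records for every adjacent pair in one split line.
  def pairRecords(tokens: Array[String]): Seq[(String, Int)] =
    for (i <- 0 until tokens.length - 1) yield (tokens(i) + "," + tokens(i + 1), 1)

  // map keeps one nested collection per line, so the records are not yet
  // a single list of key/value pairs.
  val nested: Seq[Seq[(String, Int)]] = lines.map(_.split(";")).map(pairRecords)

  // flatMap merges the per-line collections into one flat list of records.
  val flat: Seq[(String, Int)] = lines.map(_.split(";")).flatMap(pairRecords)

  // Local stand-in for reduceByKey(_ + _): group by key and sum the counts.
  val counts: Map[String, Int] =
    flat.groupBy(_._1).map { case (key, recs) => key -> recs.map(_._2).sum }
}
```

With map alone, the result is a collection of three per-line collections; only the flat list of (key, 1) records has the shape that a by-key reduction needs.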
