The difference between map and flatMap in Spark can be seen through a small experiment.
Step one: Put the test data on HDFS
hadoop dfs -put Data1/test1.txt /tmp/test1.txt
The test data has two lines of text:
Step two: Create an RDD in Spark that reads the HDFS file /tmp/test1.txt
Step three: View the return value of the map function
Get the RDD returned by the map function:
View the return value of the map function: each line of the file yields its own array object
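The shape of the map result can be mimicked without a cluster. What follows is only a minimal sketch in plain Python, assuming each line is split on whitespace (the function used in the experiment is not shown) and using two made-up lines in place of the real contents of test1.txt. In PySpark the corresponding call would be along the lines of `rdd.map(lambda line: line.split())`.

```python
# Hypothetical stand-in for the two lines of /tmp/test1.txt
# (the real contents are not shown in this write-up).
lines = ["hello spark", "hello hadoop hdfs"]

# map semantics: apply the function to every line and keep
# exactly one result object per input line.
map_result = [line.split() for line in lines]

print(map_result)       # a list containing one word list per line
print(len(map_result))  # equals the number of input lines
```

The outer collection has as many elements as the file has lines, and each element is itself an array of words.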
Step four: View the return value of the flatMap function
Get the RDD returned by the flatMap function:
View the return value of the flatMap function: all lines of the file together yield a single array object
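The flatMap result can be sketched the same way. This is again plain Python rather than Spark code, with the same assumptions: a whitespace split and two made-up input lines standing in for test1.txt. In PySpark the corresponding call would be along the lines of `rdd.flatMap(lambda line: line.split())`.

```python
from itertools import chain

# Hypothetical stand-in for the two lines of /tmp/test1.txt
# (the real contents are not shown in this write-up).
lines = ["hello spark", "hello hadoop hdfs"]

# flatMap semantics: apply the function to every line, then
# concatenate the per-line results into one flat sequence.
flat_map_result = list(chain.from_iterable(line.split() for line in lines))

print(flat_map_result)  # a single flat list of words from all lines
```

Here the line boundaries are gone: the words from every line sit side by side in one array.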
Summary:
- The map function in Spark applies the specified operation to each input and returns one object for each input;
- The flatMap function, by contrast, combines two operations; it "maps first, then flattens":
Action 1: The same as the map function: apply the specified operation to each input and return one object for each input
Action 2: Finally, merge all of those objects into one object
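The two actions can be written out as separate steps. The helper `flat_map` below is a hypothetical name introduced for illustration, not part of Spark; it is a rough sketch of the "map first, then flatten" idea on ordinary Python lists, and the sample lines are made up.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def flat_map(func: Callable[[T], Iterable[U]], items: Iterable[T]) -> List[U]:
    """Hypothetical helper: map first, then flatten one level."""
    # Action 1: same as map, one result object per input.
    mapped = [func(item) for item in items]
    # Action 2: merge all result objects into a single list.
    return [element for group in mapped for element in group]


# Made-up input lines, used only for illustration.
lines = ["hello spark", "hello hadoop hdfs"]
words = flat_map(str.split, lines)
print(words)
```

Note that only one level of nesting is removed: if the function returns nested collections, the inner ones are kept as they are.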