1. Prepare the test data set
First, create a text file to serve as the test data. Columns are separated by a TAB character, except between the third and fourth columns, which are separated by spaces: one space in row 1, two spaces in row 2, three spaces in row 3, and four spaces in row 4.
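The layout described above can be sketched in plain Scala without Spark. The cell values `a`, `b`, `c`, `d` and the object name `TestData` are assumptions for illustration, since the original file contents are not reproduced here; only the separators follow the description.

```scala
// Minimal sketch of the test data layout (cell values are hypothetical).
object TestData {
  // Row n (1..4): TAB between columns 1-2 and 2-3,
  // and n spaces between columns 3 and 4.
  val lines: Seq[String] =
    (1 to 4).map(n => "a\tb\tc" + (" " * n) + "d")

  def main(args: Array[String]): Unit =
    // Show each line with tabs made visible as "\t"
    lines.foreach(l => println(l.replace("\t", "\\t")))
}
```

In a Spark shell, a sequence like this could be turned into an RDD with `sc.parallelize(TestData.lines)` instead of reading it from a file.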
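2. \\s matches any whitespace character
val rdd7 = sc.textFile("G:\\zhengze.txt")
// split each line on every single whitespace character (space or TAB)
val rdd9 = rdd7.flatMap(_.split("\\s"))
// prints the RDD's description, not its elements
println(rdd9)
rdd9.foreach(x => print(x + "*"))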
The result is shown in the figure:
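To see what this split does on a single line without running Spark, here is a small sketch in plain Scala. The sample line and the object name `WhitespaceSplit` are assumptions; the line mimics row 2 of the test data (TABs, then two spaces before the last column).

```scala
// Sketch: "\\s" matches exactly one whitespace character per match.
object WhitespaceSplit {
  val line = "a\tb\tc  d"                          // hypothetical row: two spaces before "d"
  val parts: Array[String] = line.split("\\s")     // Array("a", "b", "c", "", "d")
  val collapsed: Array[String] = line.split("\\s+") // Array("a", "b", "c", "d")

  def main(args: Array[String]): Unit = {
    parts.foreach(x => print(x + "*"))             // prints a*b*c**d*
    println()
  }
}
```

Because each space is its own delimiter, two adjacent spaces leave an empty string between them. If that is not wanted, `\\s+` treats a run of whitespace as one delimiter.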
3. \\S matches any non-whitespace character
val rdd7 = sc.textFile("G:\\zhengze.txt")
// every non-whitespace character acts as a delimiter, so only whitespace survives
val rdd9 = rdd7.flatMap(_.split("\\S"))
println(rdd9)
rdd9.foreach(x => print(x + "*"))
The results are shown in the following illustration:
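A plain-Scala sketch of the same split on one short string may make the output easier to read. The sample string and the object name `NonWhitespaceSplit` are assumptions for illustration.

```scala
// Sketch: with "\\S", every non-whitespace character is a delimiter.
object NonWhitespaceSplit {
  val line = "ab cd"                            // hypothetical sample
  // Delimiters are 'a', 'b', 'c', 'd'. The pieces between them are
  // "", "", " ", "", ""; String.split drops the trailing empty strings.
  val parts: Array[String] = line.split("\\S")  // Array("", "", " ")

  def main(args: Array[String]): Unit = {
    parts.foreach(x => print(x + "*"))          // prints "** *"
    println()
  }
}
```

Only the whitespace between the delimiters is kept, along with the empty strings that appear between adjacent non-whitespace characters.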
4. \\d matches any digit
val rdd7 = sc.textFile("G:\\zhengze.txt")
// split each line on every single digit 0-9
val rdd9 = rdd7.flatMap(_.split("\\d"))
println(rdd9)
rdd9.foreach(x => print(x + "*"))
The results are shown in the following illustration:
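The effect of splitting on digits can be checked on a single string in plain Scala. The sample string and the object name `DigitSplit` are assumptions for illustration.

```scala
// Sketch: "\\d" matches one digit per match, so "22" is two delimiters.
object DigitSplit {
  val line = "a1b22c"                           // hypothetical sample
  val parts: Array[String] = line.split("\\d")  // Array("a", "b", "", "c")

  def main(args: Array[String]): Unit = {
    parts.foreach(x => print(x + "*"))          // prints a*b**c*
    println()
  }
}
```

The empty string in the result comes from the two adjacent digits in `22`, in the same way that adjacent spaces produce empty strings when splitting on `\\s`.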
5. \\D matches any non-digit character
The result is as shown in the figure:
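No code is listed for this case, so here is a minimal plain-Scala sketch, assuming the same pattern as the earlier sections with `"\\D"` as the split regex. The sample string and the object name `NonDigitSplit` are assumptions for illustration.

```scala
// Sketch: with "\\D", every non-digit character is a delimiter.
object NonDigitSplit {
  val line = "a1b22c"                           // hypothetical sample
  // Delimiters are 'a', 'b', 'c'. The leading delimiter yields a leading
  // empty string; the trailing empty string after 'c' is dropped.
  val parts: Array[String] = line.split("\\D")  // Array("", "1", "22")

  def main(args: Array[String]): Unit = {
    parts.foreach(x => print(x + "*"))          // prints *1*22*
    println()
  }
}
```

In the Spark version, the corresponding call would presumably be `rdd7.flatMap(_.split("\\D"))`, which keeps only the runs of digits from each line.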