Scala is widely used for implementing data mining algorithms. The language is functional in style, which fits the typical data mining scenario: applying a series of transformations to an original data set. It also provides many powerful functions for collection operations. This article takes the List type as an example and describes the common collection transformation operations.
First, the commonly used operators (in Scala, operators are actually methods)
++  ++[B](that: GenTraversableOnce[B]): List[B]  Appends another list to the end of this list
++:  ++:[B >: A, That](that: collection.Traversable[B])(implicit bf: CanBuildFrom[List[A], B, That]): That  Prepends another list to the head of this list
+:  +:(elem: A): List[A]  Prepends an element to the head of the list
:+  :+(elem: A): List[A]  Appends an element to the end of the list
::  ::(x: A): List[A]  Prepends an element to the head of the list
:::  :::(prefix: List[A]): List[A]  Prepends another list to the head of this list
:\  :\[B](z: B)(op: (A, B) ⇒ B): B  Equivalent to foldRight
val left = List(1,2,3)
val right = List(4,5,6)

// The following operations are equivalent
left ++ right // List(1,2,3,4,5,6)
left ++: right // List(1,2,3,4,5,6)
right.++:(left) // List(1,2,3,4,5,6)
right.:::(left) // List(1,2,3,4,5,6)

// The following operations are equivalent
0 +: left // List(0,1,2,3)
left.+:(0) // List(0,1,2,3)

// The following operations are equivalent
left :+ 4 // List(1,2,3,4)
left.:+(4) // List(1,2,3,4)

// The following operations are equivalent
0 :: left // List(0,1,2,3)
left.::(0) // List(0,1,2,3)
If you are a bit dizzy at this point, as I was, wondering why there are so many strange operators, here is a hint: any operator whose name ends with a colon is right-associative. That is, 0 :: List(1,2,3) = List(1,2,3).::(0) = List(0,1,2,3). Here you can see that :: is actually a method of the List on the right, not of the Int on the left.
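The right-associativity rule is not special to List. As a minimal sketch, the hypothetical Stack class below (not part of the Scala library, introduced only for illustration) defines its own method ending in a colon, and the compiler treats it the same way:

```scala
object RightAssocDemo {
  // Hypothetical wrapper type, used only to illustrate the rule
  final case class Stack(items: List[Int]) {
    // The method name ends with ':', so `x +: stack` is rewritten to `stack.+:(x)`
    def +:(x: Int): Stack = Stack(x :: items)
  }

  // Evaluated from the right: Stack(Nil).+:(2).+:(1)
  val viaOperator: Stack = 1 +: 2 +: Stack(Nil)
  val viaMethodCall: Stack = Stack(Nil).+:(2).+:(1)

  // Built-in example: the empty list Nil is the receiver of the first `::` call
  val built: List[Int] = 1 :: 2 :: 3 :: Nil
}
```

Both viaOperator and viaMethodCall hold Stack(List(1, 2)), and built is List(1, 2, 3).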
Second, the common transformation operations
1.map
map[B](f: (A) ⇒ B): List[B]
Defines a transformation and applies it to each element of the list. The original list is unchanged, and a new list is returned.
Example 1: square transformation
val nums = List(1,2,3)
val square = (x: Int) => x*x
val squareNums1 = nums.map(num => num*num) // List(1,4,9)
val squareNums2 = nums.map(math.pow(_,2)) // List(1.0,4.0,9.0), since math.pow returns Double
val squareNums3 = nums.map(square) // List(1,4,9)
Example 2: keep a few columns of text data
val text = List("Homeway,25,Male","XSDYM,23,Female")
val usersList = text.map(_.split(",")(0)) // List("Homeway","XSDYM")
val usersWithAgeList = text.map(line => {
  val fields = line.split(",")
  val user = fields(0)
  val age = fields(1).toInt
  (user,age)
}) // List(("Homeway",25),("XSDYM",23))
2.flatMap, flatten
flatten: flatten[B]: List[B]  Flattens a list of lists
flatMap: flatMap[B](f: (A) ⇒ GenTraversableOnce[B]): List[B]  Applies map, then flattens the result
Defines a transformation f and applies it to each element of the list. Each application of f returns a list, and all of these lists are finally concatenated.
val text = List("a,b,c","d,e,f")
val textMapped = text.map(_.split(",").toList) // List(List("a","b","c"), List("d","e","f"))
val textFlattened = textMapped.flatten // List("a","b","c","d","e","f")
val textFlatMapped = text.flatMap(_.split(",").toList) // List("a","b","c","d","e","f")
3.reduce
reduce[A1 >: A](op: (A1, A1) ⇒ A1): A1
Defines a binary operation f that combines two elements of the list into one. The list is traversed and eventually merged into a single element.
Example: list sum
val sum1 = nums.reduce((a, b) => a+b) // 6, with nums = List(1,2,3)
val sum2 = nums.reduce(_+_) // 6
4.reduceLeft, reduceRight
reduceLeft: reduceLeft[B >: A](f: (B, A) ⇒ B): B
reduceRight: reduceRight[B >: A](op: (A, B) ⇒ B): B
reduceLeft applies the reduce function from the left side of the list to the right; reduceRight applies it from the right side to the left.
Example:
val nums = List(2.0,2.0,3.0)
val resultLeftReduce = nums.reduceLeft(math.pow) // = pow(pow(2.0,2.0), 3.0) = 64.0
val resultRightReduce = nums.reduceRight(math.pow) // = pow(2.0, pow(2.0,3.0)) = 256.0
5.fold, foldLeft, foldRight
fold: fold[A1 >: A](z: A1)(op: (A1, A1) ⇒ A1): A1  reduce with an initial value: starting from the initial value z, merges elements pairwise from left to right until the list is reduced to a single element
foldLeft: foldLeft[B](z: B)(f: (B, A) ⇒ B): B  reduceLeft with an initial value
foldRight: foldRight[B](z: B)(op: (A, B) ⇒ B): B  reduceRight with an initial value
val nums = List(2,3,4)
val sum = nums.fold(1)(_+_) // = 1+2+3+4 = 10
val nums = List(2.0,3.0)
val result1 = nums.foldLeft(4.0)(math.pow) // = pow(pow(4.0,2.0), 3.0) = 4096.0
val result2 = nums.foldRight(1.0)(math.pow) // = pow(2.0, pow(3.0,1.0)) = 8.0
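Unlike reduce, foldLeft allows the result type to differ from the element type, which is handy for aggregations. The following is a minimal sketch of counting word frequencies; the word list and the object name are made up for illustration:

```scala
object FoldLeftDemo {
  // Illustrative input data
  val words = List("a", "b", "a", "c", "a")

  // The accumulator is a Map[String, Int], while the elements are Strings
  val counts: Map[String, Int] =
    words.foldLeft(Map.empty[String, Int]) { (acc, word) =>
      // Increment the count for this word, starting from 0 if it is new
      acc.updated(word, acc.getOrElse(word, 0) + 1)
    }
  // counts == Map("a" -> 3, "b" -> 1, "c" -> 1)
}
```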
6.sortBy, sortWith, sorted
sortBy: sortBy[B](f: (A) ⇒ B)(implicit ord: math.Ordering[B]): List[A]  Sorts by the key that f computes for each element
sortWith: sortWith(lt: (A, A) ⇒ Boolean): List[A]  Sorts using a custom comparison function
sorted: sorted[B >: A](implicit ord: math.Ordering[B]): List[A]  Sorts by the elements' natural ordering
val nums = List(1,3,2,4)
val sorted = nums.sorted // List(1,2,3,4)
val users = List(("Homeway",25),("XSDYM",23))
val sortedWith = users.sortWith { case (user1,user2) => user1._2 < user2._2 } // List(("XSDYM",23),("Homeway",25))
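sortBy is listed above but has no example of its own. As a small sketch, the following sorts a user list by age and by name; the third user and the object name are invented for illustration:

```scala
object SortByDemo {
  // ("Mr.Wang", 30) is a made-up entry, added so the ordering is visible
  val users = List(("Homeway", 25), ("XSDYM", 23), ("Mr.Wang", 30))

  // Sort by the second tuple field (age), ascending
  val byAge = users.sortBy(_._2)

  // Pass a reversed Ordering explicitly to sort in descending order
  val byAgeDesc = users.sortBy(_._2)(Ordering[Int].reverse)

  // Sort by the first tuple field (name), using String ordering
  val byName = users.sortBy(_._1)
}
```

Here byAge starts with ("XSDYM", 23) and byAgeDesc starts with ("Mr.Wang", 30).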
7.filter, filterNot
filter: filter(p: (A) ⇒ Boolean): List[A]
filterNot: filterNot(p: (A) ⇒ Boolean): List[A]
filter keeps the elements of the list that satisfy the condition p; filterNot keeps the elements that do not satisfy p.
val odd = nums.filter(_ % 2 != 0) // List(1,3), with nums = List(1,3,2,4)
val even = nums.filterNot(_ % 2 != 0) // List(2,4)
8.count
count(p: (A) ⇒ Boolean): Int
Counts the elements of the list that satisfy the condition p; equivalent to filter(p).length
val nums = List(-1,-2,0,1,2)
val plusCnt1 = nums.count(_ > 0) // 2
val plusCnt2 = nums.filter(_ > 0).length // 2
9.diff, union, intersect
diff: diff(that: collection.Seq[A]): List[A]  Keeps the elements of this list that are not in the other list, that is, removes from this list its intersection with the other list
union: union(that: collection.Seq[A]): List[A]  Concatenates this list with another list
intersect: intersect(that: collection.Seq[A]): List[A]  Intersection with another list
val nums1 = List(1,2,3)
val nums2 = List(2,3,4)
val diff1 = nums1 diff nums2 // List(1)
val diff2 = nums2.diff(nums1) // List(4)
val union1 = nums1 union nums2 // List(1,2,3,2,3,4)
val union2 = nums2 ++ nums1 // List(2,3,4,1,2,3)
val intersection = nums1 intersect nums2 // List(2,3)
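Because a List can contain duplicates, these operations behave like multiset operations rather than set operations. The sketch below (values and object name chosen for illustration) shows how repeated elements are handled:

```scala
object MultisetDemo {
  // diff removes only as many occurrences as appear in the other list
  val diffDup = List(1, 1, 2, 3) diff List(1)

  // intersect keeps an element as many times as it occurs in both lists
  val intersectDup = List(1, 1, 2) intersect List(1, 3)

  // Concatenation keeps duplicates; add distinct for a set-like union
  val setLikeUnion = (List(1, 2, 3) ++ List(2, 3, 4)).distinct
}
```

Here diffDup is List(1, 2, 3), intersectDup is List(1), and setLikeUnion is List(1, 2, 3, 4).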
10.distinct
distinct: List[A]  Keeps the distinct elements of the list; repeated elements are kept only once
val distincted = list.distinct // List("A","B","C")
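A self-contained sketch of distinct, using a made-up input list, shows that the first occurrence of each element is kept and the original order is preserved:

```scala
object DistinctDemo {
  // Illustrative input with repeated elements
  val letters = List("A", "B", "A", "C", "B")

  // Later duplicates are dropped; the order of first occurrences is kept
  val unique = letters.distinct // List("A", "B", "C")
}
```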
11.groupBy, grouped
groupBy: groupBy[K](f: (A) ⇒ K): Map[K, List[A]]  Groups the list by the key that f produces for each element
grouped: grouped(size: Int): Iterator[List[A]]  Splits the list into consecutive groups of a fixed size
val data = List(("Homeway","Male"),("XSDYM","Female"),("Mr.Wang","Male"))
val group1 = data.groupBy(_._2) // = Map("Male" -> List(("Homeway","Male"),("Mr.Wang","Male")), "Female" -> List(("XSDYM","Female")))
val group2 = data.groupBy { case (name,sex) => sex } // = Map("Male" -> List(("Homeway","Male"),("Mr.Wang","Male")), "Female" -> List(("XSDYM","Female")))
val fixSizeGroup = data.grouped(2).toList // = List(List(("Homeway","Male"),("XSDYM","Female")), List(("Mr.Wang","Male")))
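A common follow-up to groupBy is to aggregate each group. The following is a minimal sketch, with illustrative data and object name, that counts the members of each group:

```scala
object GroupCountDemo {
  // Illustrative (name, gender) records
  val data = List(("Homeway", "Male"), ("XSDYM", "Female"), ("Mr.Wang", "Male"))

  // Group by the second field, then replace each group by its size
  val countByGender: Map[String, Int] =
    data.groupBy(_._2).map { case (gender, members) => gender -> members.size }
  // countByGender == Map("Male" -> 2, "Female" -> 1)
}
```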
12.scan
scan[B >: A, That](z: B)(op: (B, B) ⇒ B)(implicit cbf: CanBuildFrom[List[A], B, That]): That
Works like fold, but returns the list of all intermediate results, starting with the initial value
val nums = List(1,2,3)
val result = nums.scan(10)(_+_) // List(10,10+1,10+1+2,10+1+2+3) = List(10,11,13,16)
13.scanLeft, scanRight
scanLeft: scanLeft[B, That](z: B)(op: (B, A) ⇒ B)(implicit bf: CanBuildFrom[List[A], B, That]): That
scanRight: scanRight[B, That](z: B)(op: (A, B) ⇒ B)(implicit bf: CanBuildFrom[List[A], B, That]): That
scanLeft: performs the scan from left to right; scanRight: performs the scan from right to left
val nums = List(1.0,2.0,3.0)
val resultLeft = nums.scanLeft(2.0)(math.pow) // List(2.0, pow(2.0,1.0), pow(pow(2.0,1.0),2.0), pow(pow(pow(2.0,1.0),2.0),3.0)) = List(2.0,2.0,4.0,64.0)
val resultRight = nums.scanRight(2.0)(math.pow) // List(pow(1.0,pow(2.0,pow(3.0,2.0))), pow(2.0,pow(3.0,2.0)), pow(3.0,2.0), 2.0) = List(1.0,512.0,9.0,2.0)
14.take, takeRight, takeWhile
take: take(n: Int): List[A]  Extracts the first n elements of the list
takeRight: takeRight(n: Int): List[A]  Extracts the last n elements of the list
takeWhile: takeWhile(p: (A) ⇒ Boolean): List[A]  Extracts elements from left to right until the condition p no longer holds
val nums = List(1,1,1,1,4,4,4,4)
val left = nums.take(4) // List(1,1,1,1)
val right = nums.takeRight(4) // List(4,4,4,4)
val headNums = nums.takeWhile(_ == nums.head) // List(1,1,1,1)
15.drop, dropRight, dropWhile
drop: drop(n: Int): List[A]  Discards the first n elements and returns the remaining elements
dropRight: dropRight(n: Int): List[A]  Discards the last n elements and returns the remaining elements
dropWhile: dropWhile(p: (A) ⇒ Boolean): List[A]  Discards elements from left to right until the condition p no longer holds
val nums = List(1,1,1,1,4,4,4,4)
val right = nums.dropRight(4) // List(1,1,1,1)
val tailNums = nums.dropWhile(_ == nums.head) // List(4,4,4,4)
16.span, splitAt, partition
span: span(p: (A) ⇒ Boolean): (List[A], List[A])  Applies the condition p from left to right until p no longer holds, and splits the list into two lists at that point
splitAt: splitAt(n: Int): (List[A], List[A])  Splits the list into the first n elements and the rest
partition: partition(p: (A) ⇒ Boolean): (List[A], List[A])  Splits the list into two parts: the first contains the elements that satisfy the condition p, the second contains the elements that do not
val nums = List(1,1,1,2,3,2,1)
val (prefix1,suffix1) = nums.span(_ == 1) // prefix1 = List(1,1,1), suffix1 = List(2,3,2,1)
val (prefix2,suffix2) = nums.splitAt(3) // prefix2 = List(1,1,1), suffix2 = List(2,3,2,1)
val (prefix3,suffix3) = nums.partition(_ == 1) // prefix3 = List(1,1,1,1), suffix3 = List(2,3,2)
17.padTo
padTo(len: Int, elem: A): List[A]
Extends the list to the specified length. If the list is shorter than len, it is padded with elem; otherwise it is returned unchanged.
val nums = List(1,1,1)
val padded = nums.padTo(6,2) // List(1,1,1,2,2,2)
18.combinations, permutations
combinations: combinations(n: Int): Iterator[List[A]]  Takes combinations of n elements from the list, without duplicate combinations; the result is an iterator
permutations: permutations: Iterator[List[A]]  Arranges the elements of the list in every order, without duplicate permutations; the result is an iterator
val nums = List(1,1,3)
val permutations = nums.permutations.toList // List(List(1,1,3), List(1,3,1), List(3,1,1))
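The section above has no example for combinations. As a small sketch (the input lists and object name are chosen for illustration), the following takes all 2-element combinations and shows that repeated elements do not produce repeated results:

```scala
object CombinationsDemo {
  val letters = List("a", "b", "c")

  // All ways to choose 2 of the 3 letters, ignoring order
  val pairs = letters.combinations(2).toList

  // All orderings of 3 distinct letters: 3! = 6 of them
  val orderings = letters.permutations.toList

  // With a repeated element, duplicate combinations are not generated
  val withDuplicates = List(1, 1, 3).combinations(2).toList
}
```

Both methods return an Iterator, so the results are only materialized when toList is called.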
19.zip, zipAll, zipWithIndex, unzip, unzip3
zip: zip[B](that: GenIterable[B]): List[(A, B)]  Zips this list with another list, pairing the elements at corresponding positions; the result is as long as the shorter of the two lists
zipAll: zipAll[B](that: collection.Iterable[B], thisElem: A, thatElem: B): List[(A, B)]  Zips this list with another list, pairing the elements at corresponding positions; if the lengths differ, thisElem fills in when this list is the shorter one, and thatElem fills in when the other list is the shorter one
zipWithIndex: zipWithIndex: List[(A, Int)]  Zips each element of the list with its index, forming pairs
unzip: unzip[A1, A2](implicit asPair: (A) ⇒ (A1, A2)): (List[A1], List[A2])  Reverses the zip operation
unzip3: unzip3[A1, A2, A3](implicit asTriple: (A) ⇒ (A1, A2, A3)): (List[A1], List[A2], List[A3])  Reverses a zip of three-element tuples
val alphabet = List("A","B","C")
val nums = List(1,2)
val zipped = alphabet zip nums // List(("A",1),("B",2))
val zippedAll = alphabet.zipAll(nums,"*",-1) // List(("A",1),("B",2),("C",-1))
val zippedIndex = alphabet.zipWithIndex // List(("A",0),("B",1),("C",2))
val (list1,list2) = zipped.unzip // list1 = List("A","B"), list2 = List(1,2)
val (l1,l2,l3) = List((1,"one",'1'),(2,"two",'2'),(3,"three",'3')).unzip3 // l1 = List(1,2,3), l2 = List("one","two","three"), l3 = List('1','2','3')
20.slice
slice(from: Int, until: Int): List[A]  Extracts the elements from position from up to position until (excluding the element at until)
val sliced = nums.slice(2,4) // List(3,4)
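Since the list used in the line above is not shown, here is a self-contained sketch with an assumed input list. It also shows that slice tolerates out-of-range indices instead of throwing:

```scala
object SliceDemo {
  // Assumed input; positions are zero-based
  val nums = List(1, 2, 3, 4, 5)

  // Elements at positions 2 and 3; position 4 is excluded
  val middle = nums.slice(2, 4) // List(3, 4)

  // An upper bound past the end is clipped to the list length
  val clipped = nums.slice(3, 100) // List(4, 5)

  // An empty range yields an empty list
  val none = nums.slice(4, 2) // List()
}
```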
21.sliding
sliding(size: Int, step: Int): Iterator[List[A]]  Groups the list into windows of a fixed size, advancing by step each time (step defaults to 1), and returns the result as an iterator
val groupStep1 = nums.sliding(2).toList // List(List(1,1), List(1,2), List(2,2), List(2,3), List(3,3), List(3,4), List(4,4)), assuming nums = List(1,1,2,2,3,3,4,4)
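To make the effect of the step parameter concrete, the sketch below uses an assumed input list and compares the default step with a step equal to the window size. The moving average at the end is only an illustration of a typical use:

```scala
object SlidingDemo {
  // Assumed input list
  val nums = List(1, 2, 3, 4, 5)

  // Default step of 1: overlapping windows
  val overlapping = nums.sliding(3).toList
  // List(List(1, 2, 3), List(2, 3, 4), List(3, 4, 5))

  // Step equal to size: non-overlapping windows, the last one may be shorter
  val chunks = nums.sliding(2, 2).toList
  // List(List(1, 2), List(3, 4), List(5))

  // A simple moving average over windows of 3 elements
  val movingAvg = nums.sliding(3).map(w => w.sum / 3.0).toList
  // List(2.0, 3.0, 4.0)
}
```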
22.updated
updated(index: Int, elem: A): List[A]  Returns a copy of the list with the element at position index replaced by elem
val fixed = nums.updated(3,4) // List(1,2,3,4)
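Because List is immutable, updated returns a new list and leaves the original untouched. The following sketch uses an assumed input list (the original one is not shown) and also checks what happens with an index that is out of range:

```scala
import scala.util.Try

object UpdatedDemo {
  // Assumed input list whose last element needs fixing
  val nums = List(1, 2, 3, 3)

  // Replace the element at index 3 with 4
  val fixed = nums.updated(3, 4) // List(1, 2, 3, 4)

  // The original list is unchanged
  val original = nums // List(1, 2, 3, 3)

  // An out-of-range index throws IndexOutOfBoundsException, captured here with Try
  val outOfRange = Try(nums.updated(10, 0))
}
```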
Scala operator and collection conversion operations examples