Scala is one of the most powerful programming languages for data mining algorithms. The language is functional in style, which fits the typical data mining workflow: applying a series of transformations to an original dataset. It also provides many powerful functions for operating on collections. This article uses the List type as an example to describe the common collection transformation operations.
I. Common operators (operators are actually functions)
++    ++[B](that: GenTraversableOnce[B]): List[B]
Appends another list to the end of this list.
++:   ++:[B >: A, That](that: collection.Traversable[B])(implicit bf: CanBuildFrom[List[A], B, That]): That
Prepends a list to the head of this list.
+:    +:(elem: A): List[A]
Prepends an element to the head of the list.
:+    :+(elem: A): List[A]
Appends an element to the tail of the list.
::    ::(x: A): List[A]
Prepends an element to the head of the list.
:::   :::(prefix: List[A]): List[A]
Prepends another list to the head of this list.
:\    :\[B](z: B)(op: (A, B) ⇒ B): B
Equivalent to foldRight.
val left = List(1, 2, 3)
val right = List(4, 5, 6)
// The following operations are equivalent
left ++ right     // List(1, 2, 3, 4, 5, 6)
left ++: right    // List(1, 2, 3, 4, 5, 6)
right.++:(left)   // List(1, 2, 3, 4, 5, 6)
right.:::(left)   // List(1, 2, 3, 4, 5, 6)
// The following operations are equivalent
0 +: left         // List(0, 1, 2, 3)
left.+:(0)        // List(0, 1, 2, 3)
// The following operations are equivalent
left :+ 4         // List(1, 2, 3, 4)
left.:+(4)        // List(1, 2, 3, 4)
// The following operations are equivalent
0 :: left         // List(0, 1, 2, 3)
left.::(0)        // List(0, 1, 2, 3)
By this point you are probably a little dizzy, as I was: why so many strange operators? Here is a hint: any operator whose name ends with a colon is right-associative, that is, 0 :: List(1, 2, 3) = List(1, 2, 3).::(0) = List(0, 1, 2, 3). You can see from this that :: is actually a method of the List on the right-hand side, not of the Int on the left.
II. Common transformation operations
1.map
map[B](f: (A) ⇒ B): List[B]
Defines a transformation f and applies it to each element of the list. The original list is unchanged, and a new list is returned.
Example 1: squaring
val nums = List(1, 2, 3)
val square = (x: Int) => x * x
val squareNums1 = nums.map(num => num * num) // List(1, 4, 9)
val squareNums2 = nums.map(math.pow(_, 2))   // List(1.0, 4.0, 9.0), because math.pow returns Double
val squareNums3 = nums.map(square)           // List(1, 4, 9)
Example 2: keeping only some columns of text data
val text = List("Homeway,25,male", "Xsdym,23,female")
val usersList = text.map(_.split(",")(0)) // List("Homeway", "Xsdym")
val usersWithAgeList = text.map(line => {
  val fields = line.split(",")
  val user = fields(0)
  val age = fields(1).toInt
  (user, age)
}) // List(("Homeway", 25), ("Xsdym", 23))
2.flatMap, flatten
flatten: flatten[B]: List[B]
Flattens a list of lists into a single list.
flatMap: flatMap[B](f: (A) ⇒ GenTraversableOnce[B]): List[B]
Flattens the result after applying map.
Define a transformation f and apply it to each element of the list. Each call to f returns a list, and all of those lists are concatenated at the end.
val text = List("A,B,C", "D,E,F")
val textMapped = text.map(_.split(",").toList)         // List(List("A", "B", "C"), List("D", "E", "F"))
val textFlattened = textMapped.flatten                 // List("A", "B", "C", "D", "E", "F")
val textFlatMapped = text.flatMap(_.split(",").toList) // List("A", "B", "C", "D", "E", "F")
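As a small illustration of the description above, the following sketch checks that flatMap gives the same result as map followed by flatten, and shows a common case where f returns an Option, so that values which fail to parse simply disappear. The object name FlatMapDemo, the sample strings, and the helper parseInt are made up for this example.

```scala
import scala.util.Try

object FlatMapDemo {
  val lines = List("a b", "c", "d e f")

  // map produces one list per line; flatten concatenates them.
  val viaMapFlatten: List[String] = lines.map(_.split(" ").toList).flatten
  // flatMap performs both steps at once.
  val viaFlatMap: List[String] = lines.flatMap(_.split(" ").toList)

  // Hypothetical helper: Some(n) for a valid integer, None otherwise.
  def parseInt(s: String): Option[Int] = Try(s.trim.toInt).toOption

  // Here an Option behaves like a list of zero or one element.
  val parsed: List[Int] = List("1", "x", "3").flatMap(s => parseInt(s))

  def main(args: Array[String]): Unit = {
    println(viaFlatMap) // List(a, b, c, d, e, f)
    println(parsed)     // List(1, 3)
  }
}
```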
3.reduce
reduce[A1 >: A](op: (A1, A1) ⇒ A1): A1
Define a function f that combines two list elements into one. The list is traversed and eventually merged into a single element.
Example: summing a list
val nums = List(1, 2, 3)
val sum1 = nums.reduce((a, b) => a + b) // 6
val sum2 = nums.reduce(_ + _)           // 6
val sum3 = nums.sum                     // 6
4.reduceLeft, reduceRight
reduceLeft: reduceLeft[B >: A](f: (B, A) ⇒ B): B
reduceRight: reduceRight[B >: A](op: (A, B) ⇒ B): B
reduceLeft applies the reduce function from the left of the list to the right; reduceRight applies it from the right of the list to the left.
Example
val nums = List(2.0, 2.0, 3.0)
val resultLeftReduce = nums.reduceLeft(math.pow)   // = pow(pow(2.0, 2.0), 3.0) = 64.0
val resultRightReduce = nums.reduceRight(math.pow) // = pow(2.0, pow(2.0, 3.0)) = 256.0
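The direction only matters when the operation is not associative. The sketch below uses subtraction to make the grouping visible, and also shows reduceOption, which avoids the exception that reduce throws on an empty list. The object and value names are made up for this example.

```scala
object ReduceDirectionDemo {
  val nums = List(10, 3, 2)

  // Left to right: (10 - 3) - 2
  val fromLeft: Int = nums.reduceLeft((acc, x) => acc - x)
  // Right to left: 10 - (3 - 2)
  val fromRight: Int = nums.reduceRight((x, acc) => x - acc)

  // reduce has no initial value, so it fails on an empty list;
  // reduceOption returns None instead of throwing.
  val emptySum: Option[Int] = List.empty[Int].reduceOption(_ + _)

  def main(args: Array[String]): Unit = {
    println(fromLeft)  // 5
    println(fromRight) // 9
    println(emptySum)  // None
  }
}
```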
5.fold, foldLeft, foldRight
fold: fold[A1 >: A](z: A1)(op: (A1, A1) ⇒ A1): A1
reduce with an initial value: starting from the initial value, merges two elements into one from left to right, and eventually merges the list into a single element.
foldLeft: foldLeft[B](z: B)(f: (B, A) ⇒ B): B
reduceLeft with an initial value.
foldRight: foldRight[B](z: B)(op: (A, B) ⇒ B): B
reduceRight with an initial value.
val nums = List(2, 3, 4)
val sum = nums.fold(1)(_ + _) // = 1 + 2 + 3 + 4 = 10
val doubles = List(2.0, 3.0)
val result1 = doubles.foldLeft(4.0)(math.pow)  // = pow(pow(4.0, 2.0), 3.0) = 4096.0
val result2 = doubles.foldRight(1.0)(math.pow) // = pow(2.0, pow(3.0, 1.0)) = 8.0
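One practical difference from reduce is that the accumulator of foldLeft and foldRight may have a different type from the list elements. The following sketch, with made-up names and data, counts word frequencies into a Map and uses list construction to show the traversal direction of each fold.

```scala
object FoldDemo {
  val words = List("a", "b", "a", "c", "a")

  // The accumulator is a Map while the elements are Strings.
  val counts: Map[String, Int] =
    words.foldLeft(Map.empty[String, Int]) { (acc, w) =>
      acc.updated(w, acc.getOrElse(w, 0) + 1)
    }

  val nums = List(1, 2, 3)
  // Prepending while walking left to right reverses the list.
  val reversed: List[Int] = nums.foldLeft(List.empty[Int])((acc, x) => x :: acc)
  // Prepending while walking right to left rebuilds it in the original order.
  val copied: List[Int] = nums.foldRight(List.empty[Int])((x, acc) => x :: acc)

  def main(args: Array[String]): Unit = {
    println(counts("a")) // 3
    println(reversed)    // List(3, 2, 1)
    println(copied)      // List(1, 2, 3)
  }
}
```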
6.sortBy, sortWith, sorted
sortBy: sortBy[B](f: (A) ⇒ B)(implicit ord: math.Ordering[B]): List[A]
Sorts by the values produced by applying the function f to each element.
sorted: sorted[B >: A](implicit ord: math.Ordering[B]): List[A]
Sorts by the elements themselves.
sortWith: sortWith(lt: (A, A) ⇒ Boolean): List[A]
Sorts using a custom comparison function.
val nums = List(1, 3, 2, 4)
val sorted = nums.sorted // List(1, 2, 3, 4)
val users = List(("Homeway", 25), ("Xsdym", 23))
val sortedByAge = users.sortBy { case (user, age) => age }                     // List(("Xsdym", 23), ("Homeway", 25))
val sortedWith = users.sortWith { case (user1, user2) => user1._2 < user2._2 } // List(("Xsdym", 23), ("Homeway", 25))
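Two variations come up often in practice: sorting in descending order and sorting by more than one key. The sketch below is one way to do both, relying on the standard Ordering for tuples; the third user and all value names are assumptions added for illustration.

```scala
object SortDemo {
  val users = List(("Homeway", 25), ("Xsdym", 23), ("Mr.wang", 25))

  // Sort by age descending, then by name ascending: tuples compare
  // field by field, and negating the age flips its order.
  val byAgeDescThenName: List[(String, Int)] =
    users.sortBy { case (name, age) => (-age, name) }

  // Descending order by passing a reversed Ordering explicitly.
  val descending: List[Int] = List(1, 3, 2, 4).sorted(Ordering[Int].reverse)

  def main(args: Array[String]): Unit = {
    println(byAgeDescThenName) // List((Homeway,25), (Mr.wang,25), (Xsdym,23))
    println(descending)        // List(4, 3, 2, 1)
  }
}
```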
7.filter, filterNot
filter: filter(p: (A) ⇒ Boolean): List[A]
filterNot: filterNot(p: (A) ⇒ Boolean): List[A]
filter keeps the elements of the list that satisfy the condition p; filterNot keeps the elements that do not satisfy p.
val nums = List(1, 2, 3, 4)
val odd = nums.filter(_ % 2 != 0)     // List(1, 3)
val even = nums.filterNot(_ % 2 != 0) // List(2, 4)
8.count
count(p: (A) ⇒ Boolean): Int
Counts the elements in the list that satisfy the condition p; equivalent to filter(p).length.
val nums = List(-1, -2, 0, 1, 2)
val plusCnt1 = nums.count(_ > 0)         // 2
val plusCnt2 = nums.filter(_ > 0).length // 2
9.diff, union, intersect
diff: diff(that: collection.Seq[A]): List[A]
Keeps the elements of this list that are not in the other list, that is, removes from this collection its intersection with the other one.
union: union(that: collection.Seq[A]): List[A]
Concatenates this list with another list.
intersect: intersect(that: collection.Seq[A]): List[A]
The intersection with another list.
val nums1 = List(1, 2, 3)
val nums2 = List(2, 3, 4)
val diff1 = nums1 diff nums2             // List(1)
val diff2 = nums2.diff(nums1)            // List(4)
val union1 = nums1 union nums2           // List(1, 2, 3, 2, 3, 4)
val union2 = nums2 ++ nums1              // List(2, 3, 4, 1, 2, 3)
val intersection = nums1 intersect nums2 // List(2, 3)
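On lists these operations follow multiset rules rather than set rules, which is easy to miss when the inputs contain duplicates. The sketch below, with made-up sample lists, shows that diff and intersect take occurrence counts into account, and that a set-style union can be obtained by concatenating and then calling distinct.

```scala
object SetOpsDemo {
  val a = List(1, 1, 2, 3)
  val b = List(1, 3, 3)

  // diff removes one occurrence from a for each occurrence in b.
  val difference: List[Int] = a.diff(b)
  // intersect keeps an element at most as many times as it occurs in b.
  val common: List[Int] = a.intersect(b)
  // Concatenation keeps duplicates; distinct turns it into a set-style union.
  val setUnion: List[Int] = (a ++ b).distinct

  def main(args: Array[String]): Unit = {
    println(difference) // List(1, 2)
    println(common)     // List(1, 3)
    println(setUnion)   // List(1, 2, 3)
  }
}
```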
10.distinct
distinct: List[A]
Keeps the elements of the list without duplicates; equal elements are kept only once.
val list = List("A", "B", "C", "A", "B")
val distincted = list.distinct // List("A", "B", "C")
11.groupBy, grouped
groupBy: groupBy[K](f: (A) ⇒ K): Map[K, List[A]]
Groups the list by the key produced by applying f to each element.
grouped: grouped(size: Int): Iterator[List[A]]
Splits the list into groups of a fixed size.
val data = List(("Homeway", "Male"), ("Xsdym", "Female"), ("Mr.wang", "Male"))
val group1 = data.groupBy(_._2)                       // = Map("Male" -> List(("Homeway", "Male"), ("Mr.wang", "Male")), "Female" -> List(("Xsdym", "Female")))
val group2 = data.groupBy { case (name, sex) => sex } // = Map("Male" -> List(("Homeway", "Male"), ("Mr.wang", "Male")), "Female" -> List(("Xsdym", "Female")))
val fixSizeGroup = data.grouped(2).toList             // = List(List(("Homeway", "Male"), ("Xsdym", "Female")), List(("Mr.wang", "Male")))
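In data processing, groupBy is usually followed by an aggregation over each group. The following sketch shows one way to do that with a plain map over the resulting Map; the object and value names are made up for this example, and the data mirrors the sample above with the spelling "Female".

```scala
object GroupByDemo {
  val data = List(("Homeway", "Male"), ("Xsdym", "Female"), ("Mr.wang", "Male"))

  // Number of people per group.
  val countBySex: Map[String, Int] =
    data.groupBy(_._2).map { case (sex, members) => sex -> members.size }

  // Keep only the names inside each group; the order within a group
  // follows the order of the original list.
  val namesBySex: Map[String, List[String]] =
    data.groupBy(_._2).map { case (sex, members) => sex -> members.map(_._1) }

  def main(args: Array[String]): Unit = {
    println(countBySex("Male")) // 2
    println(namesBySex("Male")) // List(Homeway, Mr.wang)
  }
}
```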
12.scan
scan[B >: A, That](z: B)(op: (B, B) ⇒ B)(implicit cbf: CanBuildFrom[List[A], B, That]): That
Starting from an initial value, accumulates the op operation from left to right and keeps every intermediate result. This is hard to explain in words, so look at the example.
val nums = List(1, 2, 3)
val result = nums.scan(10)(_ + _) // List(10, 10+1, 10+1+2, 10+1+2+3) = List(10, 11, 13, 16)
13.scanLeft, scanRight
scanLeft: scanLeft[B, That](z: B)(op: (B, A) ⇒ B)(implicit bf: CanBuildFrom[List[A], B, That]): That
scanRight: scanRight[B, That](z: B)(op: (A, B) ⇒ B)(implicit bf: CanBuildFrom[List[A], B, That]): That
scanLeft performs the scan from left to right; scanRight performs the scan from right to left.
val nums = List(1.0, 2.0, 3.0)
val result1 = nums.scanLeft(2.0)(math.pow)  // List(2.0, pow(2.0, 1.0), pow(pow(2.0, 1.0), 2.0), pow(pow(pow(2.0, 1.0), 2.0), 3.0)) = List(2.0, 2.0, 4.0, 64.0)
val result2 = nums.scanRight(2.0)(math.pow) // List(pow(1.0, pow(2.0, pow(3.0, 2.0))), pow(2.0, pow(3.0, 2.0)), pow(3.0, 2.0), 2.0) = List(1.0, 512.0, 9.0, 2.0)
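A more everyday use of scanning is computing running totals. The sketch below, with made-up names, shows prefix sums with scanLeft and suffix sums with scanRight, and checks that the last element of a scanLeft equals the result of the corresponding foldLeft.

```scala
object ScanDemo {
  val nums = List(1, 2, 3, 4)

  // Every intermediate sum, starting with the initial value 0.
  val prefixSums: List[Int] = nums.scanLeft(0)(_ + _)
  // Dropping the initial value leaves the running totals.
  val runningTotals: List[Int] = prefixSums.tail
  // Sums of the remaining elements, computed from the right.
  val suffixSums: List[Int] = nums.scanRight(0)(_ + _)
  // foldLeft keeps only the final accumulator.
  val total: Int = nums.foldLeft(0)(_ + _)

  def main(args: Array[String]): Unit = {
    println(prefixSums)    // List(0, 1, 3, 6, 10)
    println(runningTotals) // List(1, 3, 6, 10)
    println(suffixSums)    // List(10, 9, 7, 4, 0)
  }
}
```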
14.take, takeRight, takeWhile
take: take(n: Int): List[A]
Extracts the first n elements of the list.
takeRight: takeRight(n: Int): List[A]
Extracts the last n elements of the list.
takeWhile: takeWhile(p: (A) ⇒ Boolean): List[A]
Extracts elements from left to right until the condition p no longer holds.
val nums = List(1, 1, 1, 1, 4, 4, 4, 4)
val left = nums.take(4)                       // List(1, 1, 1, 1)
val right = nums.takeRight(4)                 // List(4, 4, 4, 4)
val headNums = nums.takeWhile(_ == nums.head) // List(1, 1, 1, 1)
15.drop, dropRight, dropWhile
drop: drop(n: Int): List[A]
Discards the first n elements and returns the remaining elements.
dropRight: dropRight(n: Int): List[A]
Discards the last n elements and returns the remaining elements.
dropWhile: dropWhile(p: (A) ⇒ Boolean): List[A]
Discards elements from left to right until the condition p no longer holds.
val nums = List(1, 1, 1, 1, 4, 4, 4, 4)
val left = nums.drop(4)                       // List(4, 4, 4, 4)
val right = nums.dropRight(4)                 // List(1, 1, 1, 1)
val tailNums = nums.dropWhile(_ == nums.head) // List(4, 4, 4, 4)
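The "While" variants only look at the leading run of elements, which is what distinguishes them from filter. The following sketch uses a made-up list in which a matching element reappears later, and also shows that take and drop accept an n larger than the list length without failing.

```scala
object TakeDropDemo {
  val nums = List(1, 1, 4, 1)

  // Stops at the first element that fails the test.
  val leading: List[Int] = nums.takeWhile(_ == 1)
  // filter, by contrast, inspects every element.
  val allOnes: List[Int] = nums.filter(_ == 1)
  // Drops only the leading run; the later 1 is kept.
  val rest: List[Int] = nums.dropWhile(_ == 1)

  // An n beyond the list length is not an error.
  val tooMany: List[Int] = nums.take(10)
  val nothingLeft: List[Int] = nums.drop(10)

  def main(args: Array[String]): Unit = {
    println(leading)     // List(1, 1)
    println(allOnes)     // List(1, 1, 1)
    println(rest)        // List(4, 1)
    println(nothingLeft) // List()
  }
}
```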
16.span, splitAt, partition
span: span(p: (A) ⇒ Boolean): (List[A], List[A])
Applies the condition p from left to right until p no longer holds, and splits the list into two lists at that point.
splitAt: splitAt(n: Int): (List[A], List[A])
Splits the list into the first n elements and the remaining elements.
partition: partition(p: (A) ⇒ Boolean): (List[A], List[A])
Splits the list into two parts: the first contains the elements that satisfy the condition p, and the second contains the elements that do not.
val nums = List(1, 1, 1, 2, 3, 2, 1)
val (prefix1, suffix1) = nums.span(_ == 1)      // prefix1 = List(1, 1, 1), suffix1 = List(2, 3, 2, 1)
val (prefix2, suffix2) = nums.splitAt(3)        // prefix2 = List(1, 1, 1), suffix2 = List(2, 3, 2, 1)
val (prefix3, suffix3) = nums.partition(_ == 1) // prefix3 = List(1, 1, 1, 1), suffix3 = List(2, 3, 2)
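One way to remember these three methods is that each returns, in a single call, the pair of results of two simpler methods. The sketch below checks those equivalences on the same sample list; the object name and the isOne predicate are made up for this example.

```scala
object SplitDemo {
  val nums = List(1, 1, 1, 2, 3, 2, 1)
  val isOne = (x: Int) => x == 1

  // span = (takeWhile, dropWhile)
  val spanSame: Boolean =
    nums.span(isOne) == ((nums.takeWhile(isOne), nums.dropWhile(isOne)))
  // splitAt = (take, drop)
  val splitAtSame: Boolean =
    nums.splitAt(3) == ((nums.take(3), nums.drop(3)))
  // partition = (filter, filterNot)
  val partitionSame: Boolean =
    nums.partition(isOne) == ((nums.filter(isOne), nums.filterNot(isOne)))

  def main(args: Array[String]): Unit = {
    println(spanSame && splitAtSame && partitionSame) // true
  }
}
```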
17.padTo
padTo(len: Int, elem: A): List[A]
Extends the list to the specified length, padding with elem. If the list is already at least that long, nothing is done.
val nums = List(1, 1, 1)
val padded = nums.padTo(6, 2) // List(1, 1, 1, 2, 2, 2)
18.combinations,permutations
combinations: combinations(n: Int): Iterator[List[A]]
Takes n elements from the list and returns the distinct combinations; the result is an iterator.
permutations: permutations: Iterator[List[A]]
Arranges the elements of the list and returns the distinct permutations; the result is an iterator.
val nums = List(1, 1, 3)
val combinations = nums.combinations(2).toList // List(List(1, 1), List(1, 3))
val permutations = nums.permutations.toList    // List(List(1, 1, 3), List(1, 3, 1), List(3, 1, 1))
19.zip, zipAll, zipWithIndex, unzip, unzip3
zip: zip[B](that: GenIterable[B]): List[(A, B)]
Zips this list with another list, pairing the elements at corresponding positions. The length of the result is that of the shorter of the two lists.
zipAll: zipAll[B](that: collection.Iterable[B], thisElem: A, thatElem: B): List[(A, B)]
Zips this list with another list, pairing the elements at corresponding positions. If the lengths differ, thisElem is used as padding when this list is the shorter one, and thatElem when the other list is the shorter one.
zipWithIndex: zipWithIndex: List[(A, Int)]
Pairs each list element with its index.
unzip: unzip[A1, A2](implicit asPair: (A) ⇒ (A1, A2)): (List[A1], List[A2])
Reverses a zip operation.
unzip3: unzip3[A1, A2, A3](implicit asTriple: (A) ⇒ (A1, A2, A3)): (List[A1], List[A2], List[A3])
The unzip operation for 3-element tuples.
val alphabet = List("A", "B", "C")
val nums = List(1, 2)
val zipped = alphabet zip nums                 // List(("A", 1), ("B", 2))
val zippedAll = alphabet.zipAll(nums, "*", -1) // List(("A", 1), ("B", 2), ("C", -1))
val zippedIndex = alphabet.zipWithIndex        // List(("A", 0), ("B", 1), ("C", 2))
val (list1, list2) = zipped.unzip              // list1 = List("A", "B"), list2 = List(1, 2)
val (l1, l2, l3) = List((1, "one", '1'), (2, "two", '2'), (3, "three", '3')).unzip3 // l1 = List(1, 2, 3), l2 = List("one", "two", "three"), l3 = List('1', '2', '3')
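Two further uses of zipping may help here: pairing a list with its own tail to compare neighbouring elements, and zipAll in the case where this list is the shorter one, so that thisElem is used as padding. The sketch below uses made-up names and data.

```scala
object ZipDemo {
  val squares = List(1, 4, 9, 16)

  // Pair each element with its successor, then take the difference.
  val gaps: List[Int] =
    squares.zip(squares.tail).map { case (prev, next) => next - prev }

  val short = List(1)
  val long = List("a", "b", "c")
  // short runs out first, so thisElem (0) fills its side;
  // thatElem ("?") is never needed here.
  val padded: List[(Int, String)] = short.zipAll(long, 0, "?")

  // zipWithIndex followed by map gives access to the position.
  val labelled: List[String] = long.zipWithIndex.map { case (s, i) => s"$i:$s" }

  def main(args: Array[String]): Unit = {
    println(gaps)     // List(3, 5, 7)
    println(padded)   // List((1,a), (0,b), (0,c))
    println(labelled) // List(0:a, 1:b, 2:c)
  }
}
```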
20.slice
slice(from: Int, until: Int): List[A]
Extracts the elements of the list from position from up to position until (exclusive).
val nums = List(1, 2, 3, 4, 5)
val sliced = nums.slice(2, 4) // List(3, 4)
21.sliding
sliding(size: Int, step: Int): Iterator[List[A]]
Groups the list into windows of the fixed size given by size, advancing by step each time. step defaults to 1, and the result is an iterator.
val nums = List(1, 1, 2, 2, 3, 3, 4, 4)
val groupStep2 = nums.sliding(2, 2).toList // List(List(1, 1), List(2, 2), List(3, 3), List(4, 4))
val groupStep1 = nums.sliding(2).toList    // List(List(1, 1), List(1, 2), List(2, 2), List(2, 3), List(3, 3), List(3, 4), List(4, 4))
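A typical application of sliding windows is a moving average. The following sketch, with made-up names and data, computes a 3-point moving average, and also shows that when a step is given and the list length is not a multiple of it, the last window can be shorter than size.

```scala
object SlidingDemo {
  val series = List(1.0, 2.0, 3.0, 4.0, 5.0)

  // Average of each window of three consecutive values.
  val movingAvg: List[Double] =
    series.sliding(3).map(w => w.sum / w.size).toList

  // With step 2 over five elements, the final window holds one element.
  val chunks: List[List[Int]] = List(1, 2, 3, 4, 5).sliding(2, 2).toList

  def main(args: Array[String]): Unit = {
    println(movingAvg) // List(2.0, 3.0, 4.0)
    println(chunks)    // List(List(1, 2), List(3, 4), List(5))
  }
}
```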
22.updated
updated(index: Int, elem: A): List[A]
Replaces the element at the given index and returns a new list; the original list is unchanged.
val nums = List(1, 2, 3, 3)
val fixed = nums.updated(3, 4) // List(1, 2, 3, 4)