2. Simple Scala Examples
Reference tutorial: https://yq.aliyun.com/topic/69
2.1 Interactive Programming
spark-shell is Spark's interactive mode. It lets you type code and have it executed immediately, without creating program source files, which makes debugging convenient and helps you learn Spark quickly.
[root@node1 spark-2.2.0]# bin/spark-shell
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/09/03 06:32:38 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/09/03 06:32:56 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Spark context Web UI available at http://192.168.80.131:4040
Spark context available as 'sc' (master = local[*], app id = local-1504434761542).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
Type in expressions to have them evaluated.
Type :help for more information.
scala>
spark-shell has a built-in Scala environment. Once the scala> prompt appears, you can type Scala statements directly and press Enter to execute them.
2.3 Data Types
Scala makes heavy reuse of Java types. Scala's Int corresponds to Java's primitive int, Float to float, and Boolean to boolean, and Scala arrays are mapped to Java arrays. Scala also reuses many standard Java library types. For example, a string literal in Scala is a java.lang.String, and any thrown exception must be a subclass of java.lang.Throwable.
scala> var x: Int = 10
x: Int = 10
scala> var y: Double = 3.14
y: Double = 3.14
scala> var s: String = "Hello,scala!"
s: String = Hello,scala!
scala> var b: Boolean = true
b: Boolean = true
scala>
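The mapping onto Java types can be checked at runtime. The following is a minimal sketch, assuming a plain Scala REPL or spark-shell; the value names are illustrative only.

```scala
// Scala's String is java.lang.String.
val text: String = "Hello,scala!"
val isJavaString: Boolean = text.getClass == classOf[java.lang.String]

// An Array[Int] is represented as a Java int[] at runtime.
val arrayClassName: String = Array(1, 2, 3).getClass.getSimpleName

// Exceptions are ordinary java.lang.Throwable subclasses,
// so this assignment type-checks.
val err: Throwable = new IllegalArgumentException("demo")
```

The last line compiles only because IllegalArgumentException is a subclass of Throwable.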
Note: The semicolon at the end of a Scala statement is optional and is usually omitted.
2.2 Scala Variables
In Scala, the keyword var declares a variable and the keyword val declares a constant. A declaration does not have to state the data type; when the type is omitted, it is inferred from the initial value.
Therefore, a variable or constant declared without a data type must be given an initial value, or an error is reported.
scala> var x=1000
x: Int = 1000
scala> var y=3.14
y: Double = 3.14
scala> val s="Hello,world"
s: String = Hello,world
scala>
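The difference between var and val, together with type inference, might be sketched as follows. The names counter, greeting, and ratio are made up for illustration, and the rejected reassignment is left as a comment because it would not compile.

```scala
// var: the type is inferred as Int, and the value can be reassigned.
var counter = 1000
counter = counter + 1

// val: the type is inferred as String, and the value is fixed.
val greeting = "Hello,world"
// greeting = "other"   // would not compile: reassignment to val

// An explicit type annotation is still allowed when wanted.
val ratio: Double = 3
```

Here ratio is a Double even though the literal 3 is an integer, because the declared type takes precedence over inference.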
Note: A val can be initialized only once, and assigning to it again is an error. A var behaves like a Java variable and can be modified at any time. val reflects the functional programming style, in which a value is not modified once it has been assigned.
2.3 Basic Operations
(1) Arithmetic operations
scala> 1+2
res1: Int = 3
scala> 2*3-5/2
res2: Int = 4
scala> 7%3
res3: Int = 1
(2) Relational operations
scala> var a=5
a: Int = 5
scala> var b=3
b: Int = 3
scala> a==b
res7: Boolean = false
scala> a>=b
res8: Boolean = true
scala> a!=b
res9: Boolean = true
(3) Logical operations
scala> var a=true;
a: Boolean = true
scala> var b=false;
b: Boolean = false
scala> a && b
res4: Boolean = false
scala> a || b
res5: Boolean = true
scala> !(a&&b)
res6: Boolean = true
scala>
(4) Assignment operators
scala> var x=2
x: Int = 2
scala> x+=3
scala> println(x)
5
scala> x*=3
scala> println(x)
15
scala>
Note: Scala does not provide the ++ and -- operators. println is a standard output function that Scala imports by default, so it can be used directly.
(5) Operator overloading
Scala allows a method call of the form a.fun(b) to be abbreviated as a fun b.
scala> 1.to(10)
res15: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> 1 to 10
res16: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> "Hello" + ",world"
res17: String = Hello,world
scala> "Hello".+(",world")
res18: String = Hello,world
scala>
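Because an operator is just a method name, a user-defined class can supply its own +. The sketch below assumes a hypothetical Vec2 class that is not part of the original tutorial; it only illustrates that a + b and a.+(b) are the same call.

```scala
// Hypothetical 2D vector type, used only for illustration.
case class Vec2(x: Int, y: Int) {
  // Defining a method named + makes the + operator available.
  def +(other: Vec2): Vec2 = Vec2(x + other.x, y + other.y)
}

val infix = Vec2(1, 2) + Vec2(3, 4)     // operator notation
val dotted = Vec2(1, 2).+(Vec2(3, 4))   // explicit method call

// The same rule applies to named methods such as to.
val range = (1 to 5).toList
```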
In other words, the + operation on strings is actually a call to the + method, which is how operators are overloaded. The to operation generates a range.
2.4 Branch Statements
scala> var x=3
x: Int = 3
scala> if (x<10)
     |   println("x<10")
x<10
scala> if (x<10) {
     |   println("x<10")
     | } else {
     |   println("x>=10")
     | }
x<10
scala>
scala> var x=10
x: Int = 10
scala> if (x<0) {
     |   println("x<0")
     | } else if (x<10) {
     |   println("0<=x<10")
     | } else {
     |   println("x>=10")
     | }
x>=10
scala>
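In Scala an if/else is also an expression that yields a value, so the branches above can return a result instead of printing it. A minimal sketch, with classify as a made-up helper name:

```scala
// The whole if/else chain is one expression; its value is the
// value of whichever branch is taken.
def classify(x: Int): String =
  if (x < 0) "x<0"
  else if (x < 10) "0<=x<10"
  else "x>=10"

val low = classify(3)
val high = classify(10)
```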
2.4 Loop Statements
(1) while loop
scala> var i=1
i: Int = 1
scala> var sum=0
sum: Int = 0
scala> while (i<=100) {
     |   sum+=i
     |   i=i+1
     | }
scala> println(sum)
5050
(2) for loop
Scala's for loop is similar to Java's enhanced for loop. Its basic form is for (a <- collection), where a stands for each element of the collection in turn. The <- arrow plays the role of the colon in Java's enhanced for loop and is called a generator.
scala> var sum=0
sum: Int = 0
scala> for (i <- 1 to 100) {
     |   sum+=i
     | }
scala> println(sum)
5050
scala>
Note: In a for loop, Scala uses the <- symbol to assign values to the loop variable i, and 1 to 100 specifies a range.
Scala also provides until, which works like to except that it does not include the last element.
scala> for (i <- 1 until 10) {
     |   println("i is " + i)
     | }
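The difference between to and until is easiest to see by turning both ranges into lists. This is a small sketch with illustrative names; the bounds are arbitrary.

```scala
// to includes the upper bound, until excludes it.
val inclusive = (1 to 5).toList
val exclusive = (1 until 5).toList

// Summing 1..100 with until needs an upper bound of 101.
var total = 0
for (i <- 1 until 101) total += i
```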
2.5 Functions
First of all, functions are first-class citizens in Scala, on an equal footing with variables. A function can be defined on its own, independent of any class, interface, or object, and it can be used on its own and assigned to a variable.
Computations in Spark are expressed using Scala's functional programming.
(1) Library functions
scala> import scala.math._
import scala.math._
scala> val x=2
x: Int = 2
scala> sqrt(x)
res4: Double = 1.4142135623730951
Note: In Scala, the _ character is a wildcard, similar to * in Java.
(2) Custom functions
A function definition begins with def. Each parameter must be followed by a type annotation prefixed with a colon, because the Scala compiler cannot infer parameter types. The format of a Scala function definition is as follows:
def functionName([parameter list]): [return type] = {
  function body
  return [expr]
}
Define a function that returns the larger of two values:
scala> def max(x: Int, y: Int): Int = if (x < y) y else x
max: (x: Int, y: Int)Int
scala> max(3,5)
res9: Int = 5
Note: A Scala function does not need a return statement; the value of the last expression is returned by default.
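The same rule applies to a multi-line body: the value of the last expression in the block is the result. A minimal sketch, with absDiff as a hypothetical function name:

```scala
// No return keyword: the final if/else expression is the result.
def absDiff(a: Int, b: Int): Int = {
  val d = a - b
  if (d < 0) -d else d
}

val forward = absDiff(3, 5)
val backward = absDiff(5, 3)
```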
If each parameter appears only once in the function body, underscores can be used as placeholders instead.
scala> def mul(x: Int, y: Int) = x*y
mul: (x: Int, y: Int)Int
scala> mul(2,3)
res26: Int = 6
scala> def mul = (_: Int) * (_: Int)
mul: (Int, Int) => Int
scala> mul(3,4)
res27: Int = 12
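Underscore placeholders are most often seen in short function literals passed to collection methods. The sketch below uses standard List methods; the value names are illustrative.

```scala
// One underscore stands for the single parameter.
val doubled = List(1, 2, 3).map(_ * 2)

// Two underscores stand for the first and second parameters in order.
val total = List(1, 2, 3, 4).reduce(_ + _)

// With an expected function type, the parameter types can be omitted.
val product: (Int, Int) => Int = _ * _
```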
(3) Variable-length parameters
Scala allows the last parameter of a function to be repeated, so that the function accepts a variable number of arguments.
scala> def prints(args: String*) = {
     |   for (arg <- args) {
     |     println(arg)
     |   }
     | }
prints: (args: String*)Unit
scala> prints("aa", "bb", "cc")
aa
bb
cc
scala>
Note: If a function does not return a value, its return type is Unit, which is similar to void in Java.
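Inside the function a repeated parameter behaves like a sequence, and an existing collection can be passed by expanding it with : _*. A short sketch, assuming a made-up sumAll function:

```scala
// nums is seen as a Seq[Int] inside the body.
def sumAll(nums: Int*): Int = nums.sum

val direct = sumAll(1, 2, 3)

// : _* expands a collection into individual arguments.
val fromList = sumAll(List(4, 5, 6): _*)

// Zero arguments are allowed as well.
val none = sumAll()
```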
(4) Function assignment
You can assign a function to a variable:
val variableName = functionName _
There must be a space between the function name and the underscore; the underscore indicates that the function itself is meant rather than a call to it.
scala> val fmax = max _
fmax: (Int, Int) => Int = <function2>
scala> fmax(3,5)
res7: Int = 5
scala>
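When the variable has an explicit function type, the trailing underscore can be left out, because the compiler then knows a function value is expected. A minimal sketch; larger is a stand-in name for the max function so that the block is self-contained.

```scala
def larger(x: Int, y: Int): Int = if (x < y) y else x

// Explicit conversion with the trailing underscore.
val f1 = larger _

// Conversion driven by the declared function type.
val f2: (Int, Int) => Int = larger

val results = List(f1(3, 5), f2(7, 2))
```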
(5) Anonymous functions
Format of an anonymous function:
val variableName = (parameter: Type) => function body
scala> var increase = (x: Int) => x + 1
increase: Int => Int = <function1>
scala> var n=1;
n: Int = 1
scala> println(increase(n))
2
Program description: (x: Int) => x + 1 defines an anonymous function. The => symbol separates the parameters on its left from the body on its right. The anonymous function is assigned to the variable increase, and the variable can then be called like an ordinary function.
(6) Higher-order functions
Because function parameters can be variables and functions can be assigned to variables, functions and variables have equal standing, so a function parameter can itself be a function. Scala allows higher-order functions, which take other functions as parameters or return functions as results.
The definition syntax of an ordinary function is as follows:
def funName(para1: Type1, para2: Type2): Type = { do some things }
A higher-order function generalizes the parameters of an ordinary function: a parameter can itself be a function. The parameter name is then the name of the function, but how should the type of such a parameter be written? This turns the problem into finding the type of a function, which is determined by its input and output types.
Take a look at the following example:
scala> import scala.math._
import scala.math._
scala> def valueFor(f: (Double) => Double, value: Double) = {
     |   f(value)
     | }
valueFor: (f: Double => Double, value: Double)Double
scala> valueFor(ceil _, 0.25)
res1: Double = 1.0
scala> valueFor(sqrt _, 0.25)
res2: Double = 0.5
scala>
Description: The first parameter of valueFor is a function parameter: f is the parameter name and (Double) => Double is its type. The second parameter is an ordinary parameter named value of type Double. The definition of valueFor can be abbreviated to def valueFor(f: (Double) => Double, value: Double) = f(value)
Look at one more example:
The map method takes a function as its argument, applies it to each element of the array, and returns a new array.
scala> import scala.math._
import scala.math._
scala> val num = 3.14
num: Double = 3.14
scala> val func = ceil _
func: Double => Double = <function1>
scala> val array = Array(1.0, 3.14, 4).map(func)
array: Array[Double] = Array(1.0, 4.0, 4.0)
scala> for (i <- array) print(i + " ")
1.0 4.0 4.0
scala>
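A higher-order function can also return a function as its result. The sketch below assumes two made-up helpers, makeAdder and applyTwice, to show a function being produced, passed as an argument, and handed to map.

```scala
// Returns a new function that adds n to its argument.
def makeAdder(n: Int): Int => Int = (x: Int) => x + n

// Takes a function as a parameter and applies it twice.
def applyTwice(f: Int => Int, value: Int): Int = f(f(value))

val addThree = makeAdder(3)
val twice = applyTwice(addThree, 10)

// The returned function can be handed to map like any other.
val shifted = Array(1, 2, 3).map(addThree).toList
```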
(7) Closures
A closure can be thought of simply as a function that can access variables defined outside its own body, such as the local variables of an enclosing function.
scala> var factor = 3
factor: Int = 3
scala> val multiplier = (i: Int) => i * factor
multiplier: Int => Int = <function1>
scala> println(multiplier(1))
3
scala> println(multiplier(2))
6
scala>
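A closure refers to the captured variable itself rather than to a copy of its value, so a later change to factor is visible through multiplier. A minimal sketch that repeats the definitions so it can be run on its own:

```scala
var factor = 3
val multiplier = (i: Int) => i * factor

val before = multiplier(2)   // uses factor = 3

// Changing the captured variable changes what the closure computes.
factor = 10
val after = multiplier(2)    // uses factor = 10
```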
(8) Currying
Currying converts a function that takes two parameters into a function that takes one parameter. The new function returns a function that takes the original second parameter as its argument.
Definition of a curried function:
scala> def mul(x: Int) = (y: Int) => x*y
mul: (x: Int)Int => Int
scala> mul(2)(3)
res19: Int = 6
scala> (mul(2))(3)
res23: Int = 6
scala>
Note: mul(2)(3) is actually evaluated as (mul(2))(3). mul(2) returns the function (y: Int) => 2*y, and this new function receives the argument 3 to produce the result 6.
Scala provides a shorthand for defining curried functions:
scala> def mul(x: Int)(y: Int) = x*y
mul: (x: Int)(y: Int)Int
scala> mul(5)(3)
res20: Int = 15
Look at one more example:
scala> def strcat(s1: String)(s2: String) = s1 + s2
strcat: (s1: String)(s2: String)String
scala> strcat("Hello")("World")
res22: String = HelloWorld
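Supplying only the first argument list of a curried function gives a reusable one-parameter function. The sketch below also uses the standard curried method on function values; the names twice, add, and curriedAdd are illustrative.

```scala
def mul(x: Int)(y: Int): Int = x * y

// Fixing x = 2 leaves a function of y. The declared type tells the
// compiler that a function value is wanted here.
val twice: Int => Int = mul(2)

// An ordinary two-parameter function value can be curried too.
val add = (x: Int, y: Int) => x + y
val curriedAdd: Int => Int => Int = add.curried
```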
2.6 String
scala> val msg = "Hello"
msg: String = Hello
scala> println(msg.length)
5
scala> println(msg.charAt(0))
H
scala> println(msg.compareTo("Hi"))
-4
scala> println(msg.equals("Hello"))
true
scala> val s = msg + ",spark"
s: String = Hello,spark
scala> println(s.substring(6))
spark
2.7 Array
scala> var a1 = Array("QQ", "Baidu", "Google")
a1: Array[String] = Array(QQ, Baidu, Google)
scala> for (x <- a1) println(x)
QQ
Baidu
Google
scala> a1.foreach(x => println(x))
QQ
Baidu
Google
scala> a1.foreach(println(_))
QQ
Baidu
Google
scala> a1.foreach(println)
QQ
Baidu
Google
scala> var a2 = new Array[String](3)
a2: Array[String] = Array(null, null, null)
scala> a2(0) = "Hello"
scala> a2(1) = "Spark"
scala> a2(2) = "!"
scala> println(a2(1))
Spark
Array is a fixed-length array, while ArrayBuffer is a variable-length array. ArrayBuffer corresponds to ArrayList in Java.
scala> import scala.collection.mutable.ArrayBuffer
import scala.collection.mutable.ArrayBuffer
scala> val ab = ArrayBuffer[Int]()
ab: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer()
scala> ab += 1
res7: ab.type = ArrayBuffer(1)
scala> ab += (2,3,4,5)
res8: ab.type = ArrayBuffer(1, 2, 3, 4, 5)
scala> val array = Array(7,11,13,17)
array: Array[Int] = Array(7, 11, 13, 17)
scala> ab ++= array
res9: ab.type = ArrayBuffer(1, 2, 3, 4, 5, 7, 11, 13, 17)
scala> ab.foreach(println)
1
2
3
4
5
7
11
13
17
scala> val a1 = ab.toArray
a1: Array[Int] = Array(1, 2, 3, 4, 5, 7, 11, 13, 17)
scala> val buffer = a1.toBuffer
buffer: scala.collection.mutable.Buffer[Int] = ArrayBuffer(1, 2, 3, 4, 5, 7, 11, 13, 17)
scala>
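The round trip between ArrayBuffer and Array can be condensed into a short, self-contained sketch. The values are arbitrary and the names buf, fixed, and growable are illustrative.

```scala
import scala.collection.mutable.ArrayBuffer

val buf = ArrayBuffer[Int]()
buf += 1                 // append one element
buf ++= Seq(2, 3)        // append every element of a collection
buf ++= Array(7, 11)     // arrays can be appended the same way

// toArray copies the contents into a fixed-length Array.
val fixed: Array[Int] = buf.toArray

// toBuffer copies them back into a growable buffer.
val growable = fixed.toBuffer
growable += 13
```

Because both conversions copy, appending to growable leaves fixed and buf unchanged.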
Note: += adds elements at the end of the ArrayBuffer; wrap the elements in parentheses when adding several at once. ++= appends all elements of any collection to the ArrayBuffer. toArray converts an ArrayBuffer to an Array, and toBuffer converts an Array to a buffer.
2.8 List
Scala lists are similar to arrays in that all elements have the same type, but they differ in two ways. First, a list is immutable: its elements cannot be changed once it is defined. Second, a list has a recursive structure (a linked list), whereas an array does not.
A list whose elements have type T is written as List[T].
Scala's List, scala.List, differs from Java's java.util.List in that it is always immutable (a Java list can be mutable). More generally, Scala's List is designed for functional-style programming.
scala> val fruit: List[String] = List("apples", "oranges", "pears")
fruit: List[String] = List(apples, oranges, pears)
scala> println(fruit(0))
apples
scala> println(fruit.head)
apples
scala> println(fruit.tail)
List(oranges, pears)
scala> println(fruit.isEmpty)
false
scala> fruit.length
res0: Int = 3
scala> fruit.foreach(x => println(x))
apples
oranges
pears
scala> fruit.foreach(println)
apples
oranges
pears
scala>
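The recursive structure of a list can be made visible with the :: operator and Nil, which build a list one head element at a time. The sketch below is only an illustration; count is a made-up helper that walks the list the same way head and tail do.

```scala
// A list is either Nil (empty) or a head element followed by a tail list.
val fruit = "apples" :: "oranges" :: "pears" :: Nil

// Recursion over that structure: an empty list has 0 elements,
// any other list has 1 plus the size of its tail.
def count(xs: List[String]): Int = xs match {
  case Nil       => 0
  case _ :: tail => 1 + count(tail)
}

// Prepending creates a new list; the original is left untouched.
val more = "kiwis" :: fruit
```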
can use