--By default, the SparkContext object is initialized with the name sc when spark-shell starts. Use the following command to create an SQLContext:
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

--employee.json: place this file in the directory where the current scala> prompt was started.
{"id": "1201", "name": "Satish", "age": "25"}
{"id": "1202", "name": "Krishna", "age": "28"}
{"id": "1203", "name": "Amith", "age": "39"}
{"id": "1204", "name": "Javed", "age": "23"}
{"id": "1205", "name": "Prudvi", "age": "23"}

--Read the JSON document named employee.json. The data is displayed as a table with the fields id, name, and age.
val dfs = sqlContext.read.json("/root/wangbin/employee.json")

--Show the data
dfs.show()

--View the schema
dfs.printSchema()

--Select a single column
dfs.select("name").show()

--Find employees older than 23 (age > 23)
dfs.filter(dfs("age") > 23).show()

--Count the number of employees of the same age
dfs.groupBy("age").count().show()
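To make clear what the filter and groupBy/count steps compute, here is a plain-Scala analogue on an in-memory list, with no Spark required. The Employee case class and the concrete ages are illustrative assumptions for this sketch, not part of the Spark API:

```scala
// Illustrative stand-in for the rows of employee.json (ages assumed for the example)
case class Employee(id: String, name: String, age: Int)

val employees = List(
  Employee("1201", "Satish", 25),
  Employee("1202", "Krishna", 28),
  Employee("1203", "Amith", 39),
  Employee("1204", "Javed", 23),
  Employee("1205", "Prudvi", 23)
)

// Analogue of dfs.filter(dfs("age") > 23): keep only rows whose age exceeds 23
val older = employees.filter(_.age > 23)

// Analogue of dfs.groupBy("age").count(): count rows per distinct age
val countsByAge: Map[Int, Int] =
  employees.groupBy(_.age).map { case (age, rows) => age -> rows.size }
```

The Spark versions run the same logic, but distributed across the cluster's partitions rather than over a local collection.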
JSON data processing with Spark