1. Install JDK 1.8 (not described here)
2. Enter pip install pyspark at the terminal (the simplest installation method, as given on the PySpark website)
The process is as follows:
Collecting pyspark
  Downloading https://files.pythonhosted.org/packages/ee/2f/709df6e8dc00624689aa0a11c7a4c06061a7d00037e370584b9f011df44c/pyspark-2.3.1.tar.gz (211.9MB)
    100% |████████████████████████████████| 211.9MB 8.3kB/s
Requirement already satisfied: py4j==0.10.7 in ./anaconda3/lib/python3.6/site-packages (from pyspark)
Building wheels for collected packages: pyspark
  Running setup.py bdist_wheel for pyspark ... done
  Stored in directory: /home/tan/.cache/pip/wheels/37/48/54/f1b63f0dbb729e20c92f1bbcf1c53c03b300e0b93ca1781526
Successfully built pyspark
Installing collected packages: pyspark
Successfully installed pyspark-2.3.1
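Once pip reports "Successfully installed", you can confirm the package is visible to Python. The helper below is a small sketch (the function name is mine, not part of PySpark); it only checks importability without actually starting Spark:

```python
# Quick check (a sketch): confirm the pyspark package is importable after
# "pip install pyspark". find_spec returns None when a package is absent.
import importlib.util

def pyspark_installed() -> bool:
    """Return True if the pyspark package can be found on sys.path."""
    return importlib.util.find_spec("pyspark") is not None

print("pyspark installed:", pyspark_installed())
```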
After the installation is complete, entering pyspark at the terminal fails to start PySpark with an error:
tan@tan-Precision-Tower-3620:~$ pyspark
JAVA_HOME is not set
Workaround:
Find the installation path of PySpark (re-running pip install shows where it is already installed):
tan@tan-Precision-Tower-3620:~$ pip install pyspark
Requirement already satisfied: pyspark in ./anaconda3/lib/python3.6/site-packages
Requirement already satisfied: py4j in ./anaconda3/lib/python3.6/site-packages (from pyspark)
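As an alternative to reading pip's output, the install location can also be found from Python itself. This is a sketch; the function name is mine, and the printed path will differ on your machine:

```python
# Locate where pip installed a package by asking the import system directly.
import importlib.util
import os

def package_dir(name):
    """Return the directory containing the named package, or None if absent."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)

# For pyspark this prints something like
# /home/tan/anaconda3/lib/python3.6/site-packages/pyspark
print(package_dir("pyspark"))
```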
Once the path is found, add the JDK installation path to the load-spark-env.sh file under pyspark/bin:
export JAVA_HOME=/home/tan/jdk1.8.0_181
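The edit above can also be scripted. The following is a minimal sketch, assuming the example paths from this walkthrough (adjust both to your own machine); the helper name is mine and it avoids appending the line twice:

```python
# Sketch: append the JAVA_HOME export to PySpark's load-spark-env.sh,
# skipping the write if the line is already present.
def add_java_home(spark_env_path, jdk_path):
    """Append an export line for JAVA_HOME if it is not already there."""
    line = "export JAVA_HOME={}\n".format(jdk_path)
    with open(spark_env_path, "a+") as f:
        f.seek(0)                 # "a+" opens at end; rewind to read contents
        if line not in f.read():
            f.write(line)         # writes always land at the end in append mode

# Example call (paths are assumptions from this walkthrough, not verified here):
# add_java_home("/home/tan/anaconda3/lib/python3.6/site-packages/pyspark/bin/load-spark-env.sh",
#               "/home/tan/jdk1.8.0_181")
```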
Once saved, enter pyspark at the terminal again; this time PySpark starts successfully:
tan@tan-Precision-Tower-3620:~$ pyspark
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
2018-07-29 12:37:48 WARN  Utils:66 - Your hostname, tan-Precision-Tower-3620 resolves to a loopback address: 127.0.1.1; using 192.168.0.100 instead (on interface enp0s31f6)
2018-07-29 12:37:48 WARN  Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2018-07-29 12:37:48 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.3.1
      /_/

Using Python version 3.6.4 (default, Jan 16 2018 18:10:19)
SparkSession available as 'spark'.
>>>
End
Installation of PySpark under Ubuntu