Configuring a Spark Environment on Mac, Scala + Python (Spark 1.6.0)

Source: Internet
Author: User
Tags: pyspark, python

1. Download the Spark installation package from the official website and extract it to your installation directory (this assumes the JDK is already installed; if not, install it first). Spark official website: http://spark.apache.org/downloads.html

2. Open the system command line, go to the installation directory, e.g. "/installation directory/spark-1.6.0-bin-hadoop2.6", and enter the command "./bin/pyspark" to verify that pyspark runs; then enter "./bin/spark-shell" to check that the Scala environment runs. On success, the Spark logo is displayed and you get an interactive Python or Scala prompt.
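As a quick sanity check (an addition of mine, not in the original walkthrough), you can type a one-line job at the pyspark >>> prompt; the shell has already created a SparkContext named sc:

# `sc` is predefined by the pyspark shell.
# A trivial RDD job: sum the integers 0..99.
sc.parallelize(range(100)).sum()   # should return 4950

If that returns 4950 without errors, the Python side of Spark is working.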

3. For the Python version, first download PyCharm and install it. Create a new project, open Edit Configurations, find Environment Variables, and click the edit box next to it. In the variables list add PYTHONPATH, whose value is the Spark directory/python, and SPARK_HOME, whose value is the Spark installation directory. Click OK to exit.
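If you prefer not to depend on the PyCharm dialog, here is a minimal sketch of the same configuration done inside the script itself, before pyspark is imported (the install path below is an assumption; substitute your own):

import glob
import os
import sys

# Assumed Spark install location -- adjust to your own directory.
SPARK_HOME = "/spark/spark-1.6.0-bin-hadoop2.6"

os.environ["SPARK_HOME"] = SPARK_HOME
# Make the bundled pyspark package importable.
sys.path.insert(0, os.path.join(SPARK_HOME, "python"))
# If py4j is not installed system-wide, add the bundled zip as well (see step 4).
for zip_path in glob.glob(os.path.join(SPARK_HOME, "python", "lib", "py4j-*.zip")):
    sys.path.insert(0, zip_path)

from pyspark import SparkContext  # should now import cleanly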

4. To use Python you also need the py4j package; running "easy_install py4j" (or "pip install py4j") on the command line is enough. Alternatively, go into the python folder under the Spark installation directory, open the lib folder, copy the py4j zip archive inside it up one level into the python folder, and extract it there.
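Whichever route you take, a quick check (again my addition) that the py4j bridge is visible to Python:

# Verify that both py4j and pyspark import cleanly.
import py4j
from pyspark import SparkContext
print(py4j.__file__)   # shows which copy of py4j was picked up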

5. Write a small demo in PyCharm and click Run. An example demo:

"""simpleapp.py"""
from pyspark import SparkContext

logFile = "/spark/spark-1.6.0-bin-hadoop2.6/README.md"  # should be some file on your system
sc = SparkContext('local', 'Simple App')
logData = sc.textFile(logFile).cache()

# Count the lines containing the letter 'a' and the letter 'b'.
numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()

print("Lines with a: %i, lines with b: %i" % (numAs, numBs))
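Besides clicking Run in PyCharm, the same script can be launched from the command line with the spark-submit launcher that ships with the distribution, e.g. "./bin/spark-submit simpleapp.py" run from the Spark installation directory.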
 

6. For the Scala environment, you need IntelliJ IDEA, made by the same company as PyCharm; search for it by name and download the free Community edition from the official website. The first time you open it you will be prompted to install plugins; choose the Scala plugin (Spark 1.6 corresponds to Scala 2.10), about 47 MB. After the plugin downloads, you can create a new Scala project.

7. Click the File option on the IntelliJ IDEA menu bar and select Project Structure. In the dialog that pops up, click Libraries on the left, then hit the green "+" above and add the Spark assembly jar found under the lib folder of the Spark directory. Click Apply.

8. Then find a demo on the Spark official website and replace the Spark path inside with your own. Open Edit Configurations, click the plus sign in the upper-left corner, select Application, and the Run Configuration dialog appears. Under Program arguments, manually enter "local"; the main class field is filled in automatically once you point the configuration at the main function. VM options must also be set for stand-alone (local) execution (e.g. -Dspark.master=local); leaving it unset causes an error.

9. Click OK to finish the configuration, and run the program.
