Start Jupyter Notebook in PySpark

Source: Internet
Author: User
Tags: jupyter, jupyter notebook, pyspark

In the end I decided to go with Python for learning Spark programming.

Writing functions in Java is more verbose, Scala's learning curve is steep, and the combination of SBT, Eclipse, and Maven is rather frustrating: it often fails to find the main class to execute.

I haven't used Python before, but it has a good reputation and makes data processing easy.

I had already looked into integrating the PyDev plugin into Eclipse to write Python programs.

Today I tried the Python development environment that comes bundled with Anaconda, and it felt good.

IPython Notebook (now Jupyter Notebook) is especially convenient for visualization.

But how do you start it from PySpark?

The English-language articles I found all describe configuration under Linux. First run

ipython profile create spark

This creates some of the configuration scripts needed at startup. After setting them up, run

ipython notebook --profile spark

and the notebook should start in PySpark. It did not work for me, though.
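For reference, here is a minimal sketch of what a startup script in such a profile typically does. The post does not show the script itself, so everything here is an assumption: that a SPARK_HOME environment variable points at the Spark installation, that the Python sources live under SPARK_HOME/python, and that the bundled Py4J archive sits in SPARK_HOME/python/lib. The function names are made up for illustration.

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the entries PySpark needs on sys.path (assumed Spark layout)."""
    python_dir = os.path.join(spark_home, "python")
    # The bundled Py4J version differs between Spark releases, so glob for it.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


def add_pyspark_to_path(spark_home):
    """Prepend any missing PySpark entries to sys.path and return what was added."""
    added = []
    for path in spark_python_paths(spark_home):
        if path not in sys.path:
            sys.path.insert(0, path)
            added.append(path)
    return added


# Only act when SPARK_HOME is actually set; otherwise do nothing.
spark_home = os.environ.get("SPARK_HOME")
if spark_home:
    add_pyspark_to_path(spark_home)
```

Once these paths are in place, `import pyspark` should resolve inside the notebook kernel. Creating the SparkContext is a separate step that this sketch leaves out.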

Then I found an easier way.

PySpark checks two variables at startup. Adding them directly to the Windows environment variables switches its Python interpreter to Jupyter Notebook.

The first variable is PYSPARK_DRIVER_PYTHON, set to jupyter

The other variable is PYSPARK_DRIVER_PYTHON_OPTS, set to notebook

If you then start pyspark from the command line (double-clicking to start does not work), it opens the notebook web service, and Python scripts written in the notebook run on Spark.
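If you would rather not change the system-wide environment variables, the same two variables can be set for a single launch only. The sketch below does that from Python. The helper names are made up for illustration, and it assumes the pyspark launcher is on PATH.

```python
import os
import subprocess


def jupyter_pyspark_env(base_env=None):
    """Return a copy of the environment with the two PySpark driver variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYSPARK_DRIVER_PYTHON"] = "jupyter"
    env["PYSPARK_DRIVER_PYTHON_OPTS"] = "notebook"
    return env


def launch_pyspark_notebook(pyspark_cmd="pyspark"):
    """Run the pyspark launcher with the modified environment; returns its exit code."""
    # shell=True lets the shell resolve the launcher (pyspark.cmd on Windows) from PATH.
    return subprocess.call(pyspark_cmd, shell=True, env=jupyter_pyspark_env())
```

Calling `launch_pyspark_notebook()` from a script started on the command line should have the same effect as setting the variables by hand. The call blocks until the notebook server is shut down.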

Reference documents:

http://www.cnblogs.com/NaughtyBaby/p/5469469.html
http://blog.csdn.net/sadfasdgaaaasdfa/article/details/47090513
http://blog.cloudera.com/blog/2014/08/how-to-use-ipython-notebook-with-apache-spark/

Machine Learning with Spark, by Nick Pentreath
