Configure IPython Notebook to Run Python Spark Programs

Source: Internet
Author: User
Tags: deprecated, jupyter, jupyter notebook, pyspark

1.1, Install Anaconda

Download the installer that matches your platform from Anaconda's official website, https://www.anaconda.com.

1.1.1, Download Anaconda
$ cd /opt/local/src/
$ wget -c https://repo.anaconda.com/archive/Anaconda3-5.2.0-Linux-x86_64.sh
1.1.2, Run the Anaconda installer
# -b runs the installer in batch (non-interactive) mode; -p sets the installation directory
$ bash Anaconda3-5.2.0-Linux-x86_64.sh -p /opt/local/anaconda -b
1.1.3, Configure the Anaconda environment variables
    • Add the following lines to ~/.bashrc (shown here with tail)
$ tail -n 8 ~/.bashrc
# Anaconda3
export ANACONDA_PATH=/opt/local/anaconda
export PATH=$ANACONDA_PATH/bin:$PATH
# PySpark
export PYSPARK_DRIVER_PYTHON=$ANACONDA_PATH/bin/ipython
export PYSPARK_PYTHON=$ANACONDA_PATH/bin/python
    • Apply the environment variables
$ source ~/.bashrc
    • Verify
$ python --version
Python 3.6.5 :: Anaconda, Inc.
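
Beyond checking the Python version, it can help to confirm that the two PySpark variables resolve to real executables. The following is a minimal sketch, not part of the original setup: the function name check_pyspark_env is hypothetical, and it only assumes the variable names PYSPARK_DRIVER_PYTHON and PYSPARK_PYTHON exported above.

```python
import os
import shutil


def check_pyspark_env(env=None):
    """Map each PySpark variable to its resolved executable path.

    The value is None when the variable is unset or does not point to
    an executable file. (Hypothetical helper, for illustration only.)
    """
    env = os.environ if env is None else env
    report = {}
    for name in ("PYSPARK_DRIVER_PYTHON", "PYSPARK_PYTHON"):
        value = env.get(name)
        # shutil.which accepts a bare command name as well as a full path.
        report[name] = shutil.which(value) if value else None
    return report


if __name__ == "__main__":
    for name, resolved in check_pyspark_env().items():
        print(f"{name}: {resolved or 'NOT SET or not executable'}")
```

Run it with the Anaconda python after sourcing ~/.bashrc; both variables should print a path under /opt/local/anaconda/bin if the configuration above was applied.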
1.2, Use PySpark in IPython Notebook
1.2.1, Create a working directory
$ mkdir ~/ipynotebook
$ cd ~/ipynotebook
1.2.2, Run PySpark from IPython Notebook
    • Start IPython Notebook
$ PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" pyspark
[TerminalIPythonApp] WARNING | Subcommand `ipython notebook` is deprecated and will be removed in future versions.
[TerminalIPythonApp] WARNING | You likely want to use `jupyter notebook` in the future
[I 14:21:56.030 NotebookApp] JupyterLab beta preview extension loaded from /opt/local/anaconda/lib/python3.6/site-packages/jupyterlab
[I 14:21:56.030 NotebookApp] JupyterLab application directory is /opt/local/anaconda/share/jupyter/lab
[I 14:21:56.037 NotebookApp] Serving notebooks from local directory: /home/hadoop/ipynotebook
[I 14:21:56.037 NotebookApp] 0 active kernels
[I 14:21:56.037 NotebookApp] The Jupyter Notebook is running at:
[I 14:21:56.037 NotebookApp] http://localhost:8888/?token=5b68718fdabe4488decf07703a3bd76bf46d5dc733a6617d
[I 14:21:56.037 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 14:21:56.040 NotebookApp] Copy/paste this URL into your browser when you connect for the first time, to login with a token:
    http://localhost:8888/?token=5b68718fdabe4488decf07703a3bd76bf46d5dc733a6617d&token=5b68718fdabe4488decf07703a3bd76bf46d5dc733a6617d
[I 14:21:56.683 NotebookApp] Accepting one-time-token-authenticated connection from 127.0.0.1

The default browser then opens the http://localhost:8888 page automatically.

    • Write a program in IPython Notebook
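
The original page does not preserve the program that was written at this step. As an illustrative stand-in, here is a minimal word-count sketch for a notebook cell. It assumes the notebook was started through pyspark as above, so that the launcher has already defined a SparkContext named sc; the function name count_words and the sample sentences are made up for this example.

```python
from operator import add


def count_words(sc, lines):
    """Count word occurrences with a flatMap / map / reduceByKey pipeline."""
    pairs = (
        sc.parallelize(lines)                # distribute the input lines
        .flatMap(lambda line: line.split())  # one record per word
        .map(lambda word: (word, 1))         # pair each word with a count of 1
        .reduceByKey(add)                    # sum the counts per word
        .collect()                           # bring (word, count) pairs to the driver
    )
    return dict(pairs)


# `sc` only exists when the kernel was launched through `pyspark`,
# so the demo is skipped everywhere else.
if "sc" in globals():
    sample = ["spark on yarn", "spark standalone", "pyspark in a notebook"]
    print(count_words(sc, sample))
```

Passing sc as a parameter rather than reading it as a global keeps the function reusable in the YARN and Standalone sessions described later, where only the master differs.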

1.2.3, Run PySpark on Hadoop YARN from IPython Notebook
    • Start IPython Notebook
$ PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" HADOOP_CONF_DIR=/opt/local/hadoop/etc/hadoop MASTER=yarn-client pyspark
[TerminalIPythonApp] WARNING | Subcommand `ipython notebook` is deprecated and will be removed in future versions.
[TerminalIPythonApp] WARNING | You likely want to use `jupyter notebook` in the future
[I 14:50:48.149 NotebookApp] JupyterLab beta preview extension loaded from /opt/local/anaconda/lib/python3.6/site-packages/jupyterlab
[I 14:50:48.149 NotebookApp] JupyterLab application directory is /opt/local/anaconda/share/jupyter/lab
[I 14:50:48.157 NotebookApp] Serving notebooks from local directory: /home/hadoop/ipynotebook
[I 14:50:48.157 NotebookApp] 0 active kernels
[I 14:50:48.157 NotebookApp] The Jupyter Notebook is running at:
[I 14:50:48.157 NotebookApp] http://localhost:8888/?token=8fe2c599dc39a23104dd6a058a0e05de3d9e88cfeda71b45
[I 14:50:48.157 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 14:50:48.161 NotebookApp] Copy/paste this URL into your browser when you connect for the first time, to login with a token:
    http://localhost:8888/?token=8fe2c599dc39a23104dd6a058a0e05de3d9e88cfeda71b45&token=8fe2c599dc39a23104dd6a058a0e05de3d9e88cfeda71b45
    • Write a program in IPython Notebook

    • View the running application in YARN
$ yarn application -list
18/06/24 14:53:06 INFO client.RMProxy: Connecting to ResourceManager at node/192.168.20.10:8032
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):1
                Application-Id      Application-Name        Application-Type          User       Queue               State         Final-State         Progress                        Tracking-URL
application_1529805293111_0001          PySparkShell                   SPARK        hadoop     default             RUNNING           UNDEFINED              10%                    http://node:4040
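
The application reported by yarn application -list can also be identified from inside the notebook. The sketch below is an illustration under assumptions: describe_context is a hypothetical helper name, and it relies on the notebook having been started through pyspark so that sc is defined. It reads the master, applicationId, appName and (where available) uiWebUrl attributes of the SparkContext.

```python
def describe_context(sc):
    """Collect the identifiers that tie a notebook session to the cluster."""
    return {
        "master": sc.master,                  # cluster manager the session is attached to
        "application_id": sc.applicationId,   # compare with the Application-Id column
        "app_name": sc.appName,               # compare with the Application-Name column
        "ui": getattr(sc, "uiWebUrl", None),  # driver web UI; None if the attribute is missing
    }


# `sc` only exists when the kernel was launched through `pyspark`.
if "sc" in globals():
    for key, value in describe_context(sc).items():
        print(f"{key}: {value}")
```

If the notebook is attached to YARN, the printed application_id should be the same string that appears in the Application-Id column of the listing.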
1.2.4, Run PySpark on Spark Standalone from IPython Notebook
    • Start Spark Standalone
$ /opt/local/spark/sbin/start-master.sh
$ /opt/local/spark/sbin/start-slaves.sh
$ jps
13249 Jps
13027 Master
13188 Worker
    • Start IPython Notebook
$ PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" MASTER=spark://node:7077 pyspark --num-executors 1 --total-executor-cores 1 --executor-memory 512m
[TerminalIPythonApp] WARNING | Subcommand `ipython notebook` is deprecated and will be removed in future versions.
[TerminalIPythonApp] WARNING | You likely want to use `jupyter notebook` in the future
[I 15:11:59.211 NotebookApp] JupyterLab beta preview extension loaded from /opt/local/anaconda/lib/python3.6/site-packages/jupyterlab
[I 15:11:59.212 NotebookApp] JupyterLab application directory is /opt/local/anaconda/share/jupyter/lab
[I 15:11:59.230 NotebookApp] Serving notebooks from local directory: /home/hadoop/ipynotebook
[I 15:11:59.230 NotebookApp] 0 active kernels
[I 15:11:59.230 NotebookApp] The Jupyter Notebook is running at:
[I 15:11:59.230 NotebookApp] http://localhost:8888/?token=1972eb523fea28d541985df7ed2ce55cc2bfada7e31eb9ea
[I 15:11:59.230 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 15:11:59.233 NotebookApp] Copy/paste this URL into your browser when you connect for the first time, to login with a token:
    http://localhost:8888/?token=1972eb523fea28d541985df7ed2ce55cc2bfada7e31eb9ea&token=1972eb523fea28d541985df7ed2ce55cc2bfada7e31eb9ea
[I 15:12:02.594 NotebookApp] Accepting one-time-token-authenticated connection from 127.0.0.1
    • Write a program in IPython Notebook

    • View the Spark Standalone web UI
1.3, Summary

Before starting IPython Notebook, change to its working directory. This guide uses ~/ipynotebook; adjust the path to your own setup.

1.3.1, Start IPython Notebook locally
PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" pyspark
# or
PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" pyspark --master local[*]
1.3.2, Start IPython Notebook on Hadoop YARN
PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" HADOOP_CONF_DIR=/opt/local/hadoop/etc/hadoop MASTER=yarn-client pyspark
# or
PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" HADOOP_CONF_DIR=/opt/local/hadoop/etc/hadoop pyspark --master yarn --deploy-mode client
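1.3.3, Start IPython Notebook on Spark Standalone
PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" MASTER=spark://node:7077 pyspark --num-executors 1 --total-executor-cores 1 --executor-memory 512m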
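
The three launch modes differ only in the master and a few environment variables, so they can be captured in one small helper. This is a sketch under assumptions, not something from the original article: the names NOTEBOOK_ENV, build_launch and launch are hypothetical, and the environment variables and pyspark flags are the ones used in the commands of this guide.

```python
import os
import subprocess

# Environment that makes `pyspark` start the notebook instead of a plain shell.
NOTEBOOK_ENV = {
    "PYSPARK_DRIVER_PYTHON": "ipython",
    "PYSPARK_DRIVER_PYTHON_OPTS": "notebook",
}


def build_launch(mode, master_url=None, hadoop_conf_dir=None):
    """Return (argv, extra_env) for starting the notebook in the given mode."""
    env = dict(NOTEBOOK_ENV)
    if mode == "local":
        argv = ["pyspark", "--master", "local[*]"]
    elif mode == "yarn":
        if not hadoop_conf_dir:
            raise ValueError("yarn mode needs hadoop_conf_dir")
        env["HADOOP_CONF_DIR"] = hadoop_conf_dir
        argv = ["pyspark", "--master", "yarn", "--deploy-mode", "client"]
    elif mode == "standalone":
        if not master_url:
            raise ValueError("standalone mode needs master_url")
        argv = ["pyspark", "--master", master_url]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return argv, env


def launch(mode, workdir, **kwargs):
    """Start pyspark from the notebook working directory; blocks until it exits."""
    argv, extra_env = build_launch(mode, **kwargs)
    return subprocess.call(
        argv,
        cwd=os.path.expanduser(workdir),
        env={**os.environ, **extra_env},
    )
```

For example, launch("yarn", "~/ipynotebook", hadoop_conf_dir="/opt/local/hadoop/etc/hadoop") would correspond to the YARN command in 1.3.2, and launch("standalone", "~/ipynotebook", master_url="spark://node:7077") to the Standalone case. Separating build_launch from launch keeps the command construction checkable without actually starting Spark.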
