Developing Spark with PyCharm on Windows

Tags: pyspark

1. Deploy a local Spark environment
1.1 Install the JDK. Download and install JDK 1.7, and configure the environment variables.
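
A quick way to confirm the JDK is installed and visible is to shell out to it from Python (a minimal sketch; note that java -version prints its output to stderr):

    import os
    import subprocess

    # JAVA_HOME should point at the JDK installation directory
    print(os.environ.get("JAVA_HOME"))

    # Returns 0 if the java binary is found on the PATH
    print(subprocess.call(["java", "-version"]))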
1.2 Configure the Spark environment variables. Go to http://spark.apache.org/downloads.html and download the build that matches your Hadoop version. I downloaded spark-1.6.0-bin-hadoop2.6.tgz: Spark version 1.6, built for Hadoop 2.6.

Unzip the downloaded file; assume the extracted directory is D:\spark-1.6.0-bin-hadoop2.6. Add D:\spark-1.6.0-bin-hadoop2.6\bin to the system PATH variable, and create a new SPARK_HOME variable with the value D:\spark-1.6.0-bin-hadoop2.6.
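
To check that the variables took effect, open a new command prompt and run a quick check from Python (a minimal sketch, assuming the directory layout described above):

    import os

    # SPARK_HOME should be the Spark unpack directory
    spark_home = os.environ.get("SPARK_HOME")
    print(spark_home)

    # The bin directory added to PATH contains the Windows launcher scripts
    print(os.path.exists(os.path.join(spark_home, "bin", "spark-submit.cmd")))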


1.3 Install the Hadoop-related packages

Spark is built on Hadoop and calls into the Hadoop libraries at runtime. If the Hadoop runtime environment is not configured, you will see error messages at startup, although they do not prevent Spark from running.

Download a compiled Hadoop 2.6 package from https://www.barik.net/archive/2015/01/19/172716/ (I downloaded hadoop-2.6.0.tar.gz). Unzip it, add D:\hadoop-2.6.0\bin to the system PATH variable, and create a new HADOOP_HOME variable with the value D:\hadoop-2.6.0. Then download the winutils component from GitHub at https://github.com/srccodes/hadoop-common-2.2.0-bin; if it does not match your Hadoop version (here, 2.6), there is also a CSDN download at http://download.csdn.net/detail/luoyepiaoxin/8860033.

My approach was to copy all the files in that CSDN package into the %HADOOP_HOME%\bin directory.
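
Before moving on, it is worth confirming that winutils.exe actually ended up in the right place, since that is the binary Spark shells out to on Windows (a minimal sketch, assuming HADOOP_HOME is set as described above):

    import os

    hadoop_home = os.environ.get("HADOOP_HOME")

    # If this file is missing, Spark startup prints the well-known
    # "Could not locate executable ...\bin\winutils.exe" warning
    winutils = os.path.join(hadoop_home, "bin", "winutils.exe")
    print(os.path.exists(winutils))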


2. The Python environment

Spark offers two interactive shells: pyspark (based on Python) and spark-shell (based on Scala). The two environments are independent of each other, so if you only use the pyspark interactive environment and never spark-shell, you do not even need to install Scala.

2.1 Download and install Anaconda

Anaconda is a distribution that bundles the Python interpreter with most of the common Python libraries, so installing Anaconda means you do not have to install Python, pandas, or numpy separately. Download it from https://www.continuum.io/downloads, then add python to the PATH environment variable.
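
To confirm that the Anaconda interpreter is the one being picked up, a minimal sketch:

    import sys

    # Should point into the Anaconda installation directory
    print(sys.executable)

    # numpy and pandas ship with Anaconda, so both imports should succeed
    import numpy
    import pandas
    print(numpy.__version__)
    print(pandas.__version__)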

3. Start pyspark to verify the installation

Start pyspark from the Windows command line by typing pyspark (the launcher is on the PATH via %SPARK_HOME%\bin). After the startup messages, you should reach a Python >>> prompt with a SparkContext already available as sc.
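
Once the >>> prompt appears, a one-liner is enough to verify the shell end to end (a minimal smoke test, not from the original post; sc is the SparkContext the pyspark shell creates automatically):

    # Should print 4950, the sum of 0..99
    print(sc.parallelize(range(100)).sum())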

 


4. Configure the development environment in PyCharm

4.1 Configure PyCharm
For more detail, see https://stackoverflow.com/questions/34685905/how-to-link-pycharm-with-pyspark

Open PyCharm and create a project, then select "Run" > "Edit Configurations".

Select "Environment variables"Add the Spark_home directory to the Pythonpath directory.

    • spark_home:spark installation directory

    • pythonpath:spark The Python directory under the installation directory
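
One detail the run configuration often needs in addition: pyspark depends on the py4j library that ships inside Spark, so PYTHONPATH usually wants two entries rather than one. A minimal sketch of the equivalent setup in code (the py4j-0.9-src.zip name matches the Spark 1.6 layout and is an assumption; check your own python\lib directory):

    import glob
    import os
    import sys

    spark_home = os.environ["SPARK_HOME"]

    # %SPARK_HOME%\python holds the pyspark package itself
    sys.path.append(os.path.join(spark_home, "python"))

    # py4j, the Python-to-JVM bridge pyspark uses, ships as a zip
    # under python\lib (e.g. py4j-0.9-src.zip in Spark 1.6)
    sys.path.extend(glob.glob(os.path.join(spark_home, "python", "lib", "py4j-*-src.zip")))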



4.2 Test procedure

To verify that the environment is set up correctly, run the following test code:

import os
import sys

# Path for Spark source folder (raw strings keep the Windows backslashes intact)
os.environ['SPARK_HOME'] = r"D:\javaPackages\spark-1.6.0-bin-hadoop2.6"

# Append pyspark to the Python path
sys.path.append(r"D:\javaPackages\spark-1.6.0-bin-hadoop2.6\python")

try:
    from pyspark import SparkContext
    from pyspark import SparkConf
    print("Successfully imported Spark Modules")
except ImportError as e:
    print("Can not import Spark Modules", e)
    sys.exit(1)
If the program prints "Successfully imported Spark Modules", the environment is working. (The original post included a screenshot with a yellow box highlighting the specific Spark and Python environments.)

The test program code comes from this GitHub gist: https://gist.github.com/bigaidream/40fe0f8267a80e7c9cf8
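
Once the imports succeed, you can go one step further and actually run a small job from PyCharm. A minimal sketch (the local[*] master and the toy data are additions for illustration, not part of the original test program):

    from pyspark import SparkConf, SparkContext

    # Run Spark in-process on all local cores; no cluster required
    conf = SparkConf().setMaster("local[*]").setAppName("PyCharmSmokeTest")
    sc = SparkContext(conf=conf)

    # Square the numbers 1..4 and add them up; should print 30
    print(sc.parallelize([1, 2, 3, 4]).map(lambda x: x * x).sum())

    sc.stop()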


