Yesterday I spent an afternoon installing Spark and hooking the PySpark shell up to a Jupyter notebook, then followed the book "Learning Spark: Lightning-Fast Big Data Analysis" to get a first taste of Spark's power.
My setup is Windows 7, Spark 1.6, Anaconda 3, and Python 3. The code is as follows:
lines = sc.textFile("D://program files//spark//spark-1.6.0-bin-hadoop2.6//README.md")
print("Number of lines of text", lines.count())

from pyspark import SparkContext

logFile = "D://program files//spark//spark-1.6.0-bin-hadoop2.6//README.md"  # should be some file on your system
sc = SparkContext("local", "Simple App")
logData = sc.textFile(logFile).cache()
numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()
pythonLines = lines.filter(lambda line: "Python" in line)
print("lines with a: %i, lines with b: %i" % (numAs, numBs))
Running this produces the following ValueError:
Number of lines of text
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-70ecab39b7ea> in <module>()
      5
      6 logFile = "D://program files//spark//spark-1.6.0-bin-hadoop2.6//README.md"  # should be some file on your system
----> 7 sc = SparkContext("local", "Simple App")
      8 logData = sc.textFile(logFile).cache()
      9

D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    111         self._callsite = first_spark_call() or CallSite(None, None, None)
--> 112         SparkContext._ensure_initialized(self, gateway=gateway)
    113         try:
    114             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\context.py in _ensure_initialized(cls, instance, gateway)
    259                         " created by %s at %s:%s"
    260                         % (currentAppName, currentMaster,
--> 261                            callsite.function, callsite.file, callsite.linenum))
    262                 else:
    263                     SparkContext._active_spark_context = instance

ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(app=PySparkShell, master=local[*]) created by <module> at D:\Program Files\anaconda3\lib\site-packages\IPython\utils\py3compat.py:186
I Googled the problem and finally found the answer on Stack Overflow. The message ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(app=PySparkShell, master=local[*]) created by <module> at D:\Program Files\anaconda3\lib\site-packages\IPython\utils\py3compat.py:186 means that only one SparkContext (sc) may exist at a time: the PySpark shell has already created one on startup, so creating a new one raises an error. The fix is to shut down the existing context before creating a new one. How do we shut it down? With the sc.stop() method.
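To see why the second SparkContext fails, it helps to know that pyspark enforces the one-context rule with a class-level guard (the _ensure_initialized check visible in the traceback). Below is a minimal pure-Python sketch of that pattern; the class and attribute names are hypothetical, not pyspark's actual internals, and no Spark installation is needed to run it:

```python
class Context:
    """Sketch of a one-instance-at-a-time guard, similar in spirit
    to pyspark's SparkContext._ensure_initialized check."""
    _active = None  # class-level slot holding the single live instance

    def __init__(self, app_name):
        if Context._active is not None:
            # Refuse to create a second context while one is alive.
            raise ValueError(
                "Cannot run multiple contexts at once; existing context "
                "(app=%s)" % Context._active.app_name)
        self.app_name = app_name
        Context._active = self

    def stop(self):
        # Release the slot so a new context may be created.
        Context._active = None


first = Context("PySparkShell")
try:
    Context("Simple App")        # second context -> ValueError
except ValueError as e:
    print("error:", e)
first.stop()                     # stop the existing context...
second = Context("Simple App")   # ...and now creation succeeds
print("new context:", second.app_name)
```

This is exactly the situation in the notebook: the shell's sc occupies the slot, and sc.stop() frees it.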
Let's change the code and run it to see the result:
lines = sc.textFile("D://program files//spark//spark-1.6.0-bin-hadoop2.6//README.md")
print("Number of lines of text", lines.count())
sc.stop()  # stop the existing SparkContext

from pyspark import SparkContext

logFile = "D://program files//spark//spark-1.6.0-bin-hadoop2.6//README.md"  # should be some file on your system
sc = SparkContext("local", "Simple App")
logData = sc.textFile(logFile).cache()
numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()
pythonLines = lines.filter(lambda line: "Python" in line)
print("lines with a: %i, lines with b: %i" % (numAs, numBs))
The results are as follows:
Number of lines of text
lines with a: 58, lines with b: 26
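As a sanity check, the same a/b counting logic can be run in plain Python on a small inline sample (no Spark needed); this makes clear what the filter/count pipeline actually computes:

```python
# A few sample lines standing in for README.md contents.
text = """Apache Spark is a fast and general engine
for large-scale data processing.
It combines SQL, streaming, and complex analytics.
Run programs up to 100x faster."""

lines = text.splitlines()
# Count lines containing the letter 'a' and the letter 'b',
# mirroring logData.filter(lambda s: 'a' in s).count() etc.
num_as = sum(1 for s in lines if 'a' in s)
num_bs = sum(1 for s in lines if 'b' in s)
print("lines with a: %i, lines with b: %i" % (num_as, num_bs))
# -> lines with a: 4, lines with b: 1
```

The Spark version does the same thing, except the filter and count are distributed across the cluster and evaluated lazily.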
The problem turned out to be easy to fix. I'm noting it down here in case it comes up again.